so we have several google docs for the script of kitsune tails. i download them as txt files, and a batch file concatenates them with the type command. then i feed the concatenated full script to our script tool
anyway turns out when you download txt files from google docs they stick a fucking byte order mark in em. so concatenating them just sticks random BOMs all through your document. i never noticed because of a fluke of the tool
so anyway now i've added code to strip BOMs from the start of any line but jesus christ this is why there should not be byte order marks, ever, and if your software is putting them into plaintext files in this, the year of our lord 2024, then you deserve to get reality checked right into the pavement
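for anyone wanting to do the same, here's a rough sketch of what "strip BOMs from the start of any line" could look like in python. this is not the actual code from the script tool; the name `strip_line_boms` is made up for illustration, and it assumes the text has already been decoded from UTF-8 so the BOM shows up as the single character U+FEFF

```python
BOM = "\ufeff"  # U+FEFF; encoded as UTF-8 this is the bytes EF BB BF


def strip_line_boms(text: str) -> str:
    """Drop any BOM characters at the start of each line (hypothetical helper).

    A U+FEFF in the middle of a line is left alone, since only the
    leading ones come from naive file concatenation.
    """
    # Split on "\n" only, so "\r\n" line endings pass through unchanged.
    return "\n".join(line.lstrip(BOM) for line in text.split("\n"))
```

this only handles BOMs that land at a line start, which is where they end up when every concatenated file ends in a newline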
@eniko As an example, if a UTF-8 encoded CSV has no BOM, then Excel won’t read the non-ASCII characters in it correctly, because it assumes that the text uses the system’s default code page instead. No matter what, you’ll inflict pain on a lot of people, so the question is rather which option causes less trouble overall. I’d wager it’s best to do whatever plays nice with MS Office…
@eniko For UTF-8, I used to think it was useful, back in the early Unicode days when the web was basically made of extended ASCII text.
Now, though, it's annoying whenever it shows up because mostly everything is UTF-8 and now I have to have code to deal with "is there a specific 3 bytes at the top of a text file"
@JoshJers i'm using a batch file to concatenate text files downloaded from google docs and of course google puts a BOM in each text file and the TYPE command just keeps it in there so now i have random byte order marks in my resulting file
fortunately i run that through another tool which can filter them back out but jesus christ it didn't have to be this way
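the other option is to drop the BOM while concatenating instead of filtering afterwards. a minimal sketch, assuming the downloads are UTF-8; `concat_without_boms` and the file names in the usage line are hypothetical, not what the batch file actually does

```python
import codecs
from pathlib import Path


def concat_without_boms(paths, out_path):
    """Concatenate files byte-for-byte, skipping a leading UTF-8 BOM in each."""
    with open(out_path, "wb") as out:
        for p in paths:
            data = Path(p).read_bytes()
            # codecs.BOM_UTF8 is the three bytes EF BB BF
            if data.startswith(codecs.BOM_UTF8):
                data = data[len(codecs.BOM_UTF8):]
            out.write(data)
```

usage would be something like `concat_without_boms(["act1.txt", "act2.txt"], "full_script.txt")`. working on raw bytes means line endings and everything else come through untouched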