Karl Williamson <public@khwilliamson.com> wrote
   on Wed, 17 Aug 2011 22:00:38 MDT:

> It may be my turn to be mistaken.  I don't see anything like that in the
> current Standard; perhaps I got the impression that they were frowned
> upon by off-hand remarks in the Unicode mailing list; or perhaps I
> dreamt it all up.

They are certainly discouraged in UTF-8 streams, where they not only serve
no purpose but also interfere with catenating streams together in a chain:

    cat file1.utf8 file2.utf8 file3.utf8 > all.utf8

*only* works correctly when those files have no out-of-band metadata BOMs
at their fronts, with the possible exception of the first.  Confusion of
metadata BOMs for data changes the entire length of the string.

If each file has 10 characters (not counting BOMs), then the final file
*must* have 30 characters (not counting BOMs).  It's a simple matter of
arithmetic.

This is the same glaring flaw that occurs when Microsoft people create
a malformed text file that doesn't end in a newline.

    cat file1.txt file2.txt file3.txt > all.txt

If the first three files hold 10 lines apiece, then the final file *must*
hold 30 lines.  However, if either or both of the first two files have been
negligently shorted their final newline, this is completely screwed up,
and you accidentally create a single line in the output where there had
been two of them in the input, and your output's line count no longer
corresponds to that of your input.  This is stupid.

That's why you should always put a newline at the end of every text file,
and why you should never put a BOM at the start of (nor anywhere in) a
UTF-8 file.

Sloppy Microsoft people tend to be guilty of both sins and often
simultaneously, thereby needlessly making all of our lives more difficult.

Just say no.

--tom
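
The arithmetic above can be checked with a short Python sketch. Everything in it is illustrative: the helper names strip_bom and concat_utf8 are made up for this example, and the "files" are in-memory byte strings standing in for file1.utf8 and friends. It assumes the plain "utf-8" codec, which keeps U+FEFF as an ordinary character when decoding.

```python
import codecs

BOM = codecs.BOM_UTF8  # the three bytes EF BB BF


def strip_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM, if one is present (hypothetical helper)."""
    return data[len(BOM):] if data.startswith(BOM) else data


def concat_utf8(chunks) -> bytes:
    """Concatenate UTF-8 byte strings, dropping the leading BOM of each (hypothetical helper)."""
    return b"".join(strip_bom(chunk) for chunk in chunks)


# Three stand-in "files": 10 characters each, every one with a BOM in front.
files = [BOM + "abcdefghij".encode("utf-8") for _ in range(3)]

naive = b"".join(files)      # what a plain `cat` of the three files produces
clean = concat_utf8(files)   # BOMs removed before joining

naive_len = len(naive.decode("utf-8"))  # 33: each BOM decodes to one U+FEFF
clean_len = len(clean.decode("utf-8"))  # 30: the expected 3 * 10

# The same arithmetic for lines: two 3-line "files", the first one
# missing its final newline.
joined = b"1\n2\n3" + b"4\n5\n6\n"
joined_lines = len(joined.splitlines())  # 5, not 6: "3" and "4" fused into "34"

print(naive_len, clean_len, joined_lines)
```

Stripping the BOM from every piece, not just the later ones, is a choice made here to match the advice that a UTF-8 file should carry no BOM at all.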