At 02:55 PM 2/9/00 -0500, Mark Mielke wrote:
>On Wed, Feb 09, 2000 at 11:15:34AM -0800, "Larry Wall" wrote:
> > Gurusamy Sarathy writes:
> > : On Wed, 09 Feb 2000 10:19:38 PST, Larry Wall wrote:
> > : >     use byte;    # old perl semantics
> > : Hey, shouldn't we s/byte/bytes/ like we did for warnings?
> >
> > Yes, I cringe every time I write "use byte".
> > I keep wondering if there's something out there better than "bytes", though.
> >...
> >     octets
> >...
>
>I prefer octets, as that is what all the RFCs seem to call them. This
>would be similar to "use integer;" for numbers, "use octet;" for strings
>would limit it to 8 bit characters. (Note that it isn't "use integers;",
>it is "use integer;")
>
>I'd stay away from calling them ascii or latin1, as they are merely
>code dictionaries mapping codes to symbols. Just because ascii only
>defines up to 128 codes, doesn't mean that it cannot be represented in
>any 7+ bit integer. Proof? UTF-8 encompasses ascii. Ascii is a
>subset of UTF-8.
>
>Use of ascii, latin1, etc. should be reserved for character
>conversions. Just because ascii happens to be a subset of latin1, and
>a subset of UTF-8 is mostly just a coincidence. The functionality
>provided by the current "use byte;" is that any such assumptions about
>encodings are discarded, and all strings are viewed plainly as a
>string of octets with complete disregard to encoding.

hmmm...

If it's all about character sets / encodings...

    use charset 'ascii';
    use charset 'utf8';
    use charset 'unicode';
    no charset;            # Or...
    use charset 'none';    # Implies 8 bits?

Or, maybe, s/charset/encoding/

--Gregor
+--------------------------------------------------------------+
| Gregor N. Purdy                     gregor@focusresearch.com |
|                                                              |
| Swiss army chainsaw operator.     y2k: perl -pe 'tr/yY/kK/' |
+--------------------------------------------------------------+