echo: net_dev
to: Jonathan de Boyne Pollard
from: Leonard Erickson
date: 2000-02-22 01:24:02
subject: Multi-byte character sets

-=> Quoting Jonathan de Boyne Pollard to Leonard Erickson <=-

 LE> Frankly, I think we should just go to Unicode. 

 JdBP> By Unicode I presume you mean UCS-2.  That would mean a new PKT file
 JdBP> format, of course.  It would also be highly inefficient for text that
 JdBP> was mostly ISO 8859-1, since every other byte would be zero. 

I've been told there's a format where you give an "intro code" that IDs
the character subset (essentially that first byte) and then only have
to use 16-bit chars for stuff that *isn't* in that set. Sort of a
"condensed mode".
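
To make the idea concrete, here is a toy sketch of such a "condensed
mode" in Python. It is NOT the wire format of any real standard: the
ESC value, the layout, and the names condense/expand are all made up
for illustration. The first byte names the subset (the shared high
byte), each character inside that subset then costs one byte, and
anything outside it is escaped and sent as a full 16-bit value.

```python
ESC = 0x1B  # hypothetical marker: "the next two bytes are a full 16-bit char"

def condense(text, window):
    """Encode as: window byte, then 1 byte per in-window char, 3 bytes otherwise."""
    out = bytearray([window])
    for ch in text:
        code = ord(ch)
        if code > 0xFFFF:
            raise ValueError("this sketch only handles 16-bit characters")
        high, low = code >> 8, code & 0xFF
        if high == window and low != ESC:
            out.append(low)                  # in the subset: low byte only
        else:
            out += bytes([ESC, high, low])   # outside it: escape + both bytes
    return bytes(out)

def expand(data):
    """Reverse condense(): rebuild the text from the condensed bytes."""
    window, i, chars = data[0], 1, []
    while i < len(data):
        if data[i] == ESC:
            chars.append(chr((data[i + 1] << 8) | data[i + 2]))
            i += 3
        else:
            chars.append(chr((window << 8) | data[i]))
            i += 1
    return "".join(chars)

msg = "Unicode \u2192 ok"              # one arrow character outside the subset
packed = condense(msg, 0x00)
print(len(packed), len(msg.encode("utf-16-be")))
print(expand(packed) == msg)
```

The trade-off is the usual one: text that stays inside the chosen
subset shrinks to about half the 16-bit size, while every character
outside it costs three bytes instead of two.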

Also, from what I've seen of Unicode, it's a message in full 16-bit
format that is mostly *ASCII* where the high byte would be zero. The
characters present in ISO 8859-1 that aren't present in ASCII are
spread over *several* Unicode "sets".
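
The ASCII half of that is easy to check. A minimal sketch, assuming
Python and its built-in "utf-16-be" codec as a stand-in for UCS-2 (the
two agree for 16-bit characters); high_bytes is just a name made up for
this example:

```python
def high_bytes(text):
    """Return the set of high bytes used when text is stored as 16-bit units."""
    data = text.encode("utf-16-be")  # big-endian: high byte first, no BOM
    return set(data[0::2])           # every other byte, starting with the first

print(high_bytes("Plain ASCII text"))
```

Feeding it a sample from any other character set shows which high
bytes that set actually uses.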


--- FMailX 1.48a
* Origin: Shadowshack (1:105/51)
SEEN-BY: 201/0 100 200 209 300 329 400 407 411 505 600 203/600 204/450 700
SEEN-BY: 205/0 206/0 396/1 490/21 633/267 270
@PATH: 105/50 360 72 396/1 201/505 633/267

SOURCE: echomail via fidonet.ozzmosis.com
