| echo: | |
|---|---|
| to: | |
| from: | |
| date: | |
| subject: | Re: not all is lost but far too much for far too long |
Re: Re: not all is lost but far too much for far too long
By: Ozz Nixon to Maurice Kinal on Fri Jun 28 2019 09:23 pm

> On 2019-06-28 02:01:09 +0000, Maurice Kinal -> Torsten Bamberg said:
>
> > FTN Header versus actual message body conveying Unicode.
>
> When I telnet to a SQL server that speaks Unicode only, it always
> returns the following characters (pascal): #239#187#191

Using telnet to connect to services that don't speak Telnet is generally a bad idea. Use netcat (nc) instead.

> When I telnet to a web page that speaks Unicode, it too returns
> #239#187#191 plus the etc.
>
>
> So... would it not stand true that systems that are posting UTF8 do the
> same introduction on the message body? Then authors *know* it
> potentially has Unicode and leave it damn well alone, and also parse it
> based upon UTF8 instead of 8bit char...

It's an idea. But that's not how *other* charsets/encodings work, and certainly not how MIME-encoded messages (e.g. email) work: header fields are used instead.

> This is how I am coding things here, just based upon NexusSQL,
> PremierSQL, MS SQL, Apache and Nexus Web Service. I do not have access
> to my Oracle box nor the MySQL 5 server to see if they do the same
> during the initial connection negotiation(s).
>
> A quick google: It's the utf8 byte order mark.

That's actually a misnomer (there is no "byte order" in UTF-8). The actual Unicode code point is Zero Width No-Break Space: https://www.compart.com/en/unicode/U+FEFF

> Some editors save the
> BOM inside the file (in order to be used as a header) which regularly
> causes confusion because it is optional.
>
> So, if we wanted to help enforce at a reader (or even tosser level) how
> to handle, I would offer this up as a required BOM to the message body
> that is UTF8.

And why is that better than a header field ("control paragraph" as defined in FTS-5003) which indicates UTF-8?
--- SBBSecho 3.07-Linux
 * Origin: Vertrauen - [vert/cvs/bbs].synchro.net (1:103/705)
SEEN-BY: 103/705 154/10 203/0 218/700 221/0 229/426 240/5832 261/38 280/464
SEEN-BY: 280/5003 5006 5555 292/854 310/31 396/45 423/120 633/267 280 640/1384
SEEN-BY: 712/620 848 770/1 2452/250 5020/545
@PATH: 103/705 280/464 712/848 633/267 |
|
| SOURCE: echomail via fidonet.ozzmosis.com | |
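
To make the two options in the message concrete, here is a minimal Python sketch, not taken from any of the software mentioned in the thread. It shows that the bytes #239#187#191 are simply U+FEFF encoded as UTF-8, and how a reader might prefer a declared charset over sniffing for that prefix. The function names, the `CHRS: UTF-8 4` layout of the control paragraph (my reading of FTS-5003), and the CP437 fallback are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# The three bytes quoted in the message: #239#187#191 in Pascal notation.
UTF8_BOM = bytes([239, 187, 191])


def split_bom(body: bytes) -> Tuple[bool, bytes]:
    """Return (had_bom, body_without_bom) for a raw message body."""
    if body.startswith(UTF8_BOM):
        return True, body[len(UTF8_BOM):]
    return False, body


def declared_charset(kludges: List[str]) -> Optional[str]:
    """Find a charset named in a CHRS control paragraph, if any.

    Assumes lines shaped like '\\x01CHRS: UTF-8 4' (hypothetical layout).
    """
    for line in kludges:
        text = line.lstrip("\x01")
        if text.upper().startswith("CHRS:"):
            fields = text[5:].split()
            if fields:
                return fields[0].upper()
    return None


def decode_body(body: bytes, kludges: List[str], fallback: str = "cp437") -> str:
    """Decode a body, trusting the declared charset first and a BOM second.

    The cp437 fallback is an assumption, not something the thread specifies.
    """
    had_bom, rest = split_bom(body)
    if declared_charset(kludges) == "UTF-8" or had_bom:
        return rest.decode("utf-8", errors="replace")
    return body.decode(fallback, errors="replace")


if __name__ == "__main__":
    print(repr(UTF8_BOM.decode("utf-8")))  # the BOM bytes are U+FEFF
    print(decode_body("é".encode("utf-8"), ["\x01CHRS: UTF-8 4"]))
    print(decode_body(b"\xef\xbb\xbfok", []))
```

With a declared charset the body needs no special prefix at all, which is the point of the closing question in the reply; the BOM check is only a fallback for messages that arrive without one.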