echo: ftsc_public
to: Maurice Kinal
from: Ozz Nixon
date: 2019-06-28 21:23:36
subject: Re: not all is lost but far too much for far too long

On 2019-06-28 02:01:09 +0000, Maurice Kinal -> Torsten Bamberg said:

FTN Header versus actual message body conveying Unicode.

When I telnet to a SQL server that speaks Unicode only, it always 
returns the following characters (in Pascal notation): #239#187#191, 
i.e. the bytes EF BB BF in hex.

When I telnet to a web page that speaks Unicode, it too returns 
#239#187#191, followed by the HTML markup etc.

So... would it not make sense for systems that post UTF-8 to put the 
same introduction at the start of the message body? Then software 
authors *know* it potentially holds Unicode, leave it damn well alone, 
and parse it as UTF-8 instead of as 8-bit chars...
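
A minimal sketch of that reader-side check, in Python rather than 
Pascal. The name decode_body and the CP437 fallback are assumptions 
made for illustration; the post does not say which 8-bit charset a 
body without the marker should be read as.

```python
import codecs

# Assumption: bodies without a BOM are legacy 8-bit text. CP437 is only
# an illustrative default, not something the post specifies.
LEGACY_CODEC = "cp437"


def decode_body(raw: bytes) -> str:
    """Decode a message body, using a leading UTF-8 BOM as the signal."""
    if raw.startswith(codecs.BOM_UTF8):  # bytes 239, 187, 191
        # Strip the three marker bytes, then parse the rest as UTF-8.
        payload = raw[len(codecs.BOM_UTF8):]
        return payload.decode("utf-8", errors="replace")
    # No marker: treat every byte as one 8-bit character.
    return raw.decode(LEGACY_CODEC)
```

A body that starts with #239#187#191 is parsed as UTF-8; anything else 
goes down the 8-bit path untouched.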

This is how I am coding things here, just based upon NexusSQL, 
PremierSQL, MS SQL, Apache and Nexus Web Service. I do not have access 
to my Oracle box nor the MySQL 5 server to see if they do the same 
during the initial connection negotiation(s).

A quick google: it's the UTF-8 byte order mark (BOM). Some editors save 
the BOM inside the file (in order to be used as a header), which 
regularly causes confusion because it is optional.

So, if we wanted to help enforce, at the reader (or even the tosser) 
level, how such messages are handled, I would offer this up: a required 
BOM at the start of any message body that is UTF-8.
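
On the posting side, the proposal could look roughly like this. It is 
a sketch under assumptions: mark_utf8_body is a hypothetical helper 
name, and nothing here is taken from ExchangeBBS or any existing 
tosser.

```python
import codecs


def mark_utf8_body(text: str) -> bytes:
    """Encode a body as UTF-8 with exactly one leading BOM."""
    # Drop a BOM character the editor may already have left in the
    # text, so the marker is never written twice.
    if text.startswith("\ufeff"):
        text = text[1:]
    # codecs.BOM_UTF8 is the three bytes 239, 187, 191.
    return codecs.BOM_UTF8 + text.encode("utf-8")
```

Stripping an existing U+FEFF first keeps the marker from doubling up 
when a body is re-saved or re-posted.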

Ozz

--- ExchangeBBS NNTP Server v3.1/Linux64
* Origin: (1:1/123)
SEEN-BY: 14/5 15/2 103/705 154/10 203/0 221/0 226/17 227/114 229/123 200 354
SEEN-BY: 229/426 452 1014 240/5832 249/206 317 400 261/38 280/464 5003 5006
SEEN-BY: 280/5555 292/854 310/31 317/3 322/757 342/200 393/68 396/45 423/120
SEEN-BY: 633/267 280 640/1384 712/620 848 770/1 2452/250 5020/545
@PATH: 1/123 229/426 280/464 712/848 633/267

SOURCE: echomail via fidonet.ozzmosis.com
