echo: ftsc_public
to: Maurice Kinal
from: Ozz Nixon
date: 2019-06-28 21:23:00
subject: Re: not all is lost but f

On 2019-06-28 02:01:09 +0000, Maurice Kinal -> Torsten Bamberg said:

FTN Header versus actual message body conveying Unicode.

When I telnet to a SQL server that speaks Unicode only, it always 
returns the following characters (pascal): #239#187#191
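
For reference, the Pascal literal #239#187#191 is the byte sequence 
EF BB BF in hex, which is U+FEFF encoded as UTF-8. A quick check, 
sketched in Python purely for illustration (the variable name is made 
up here):

```python
import codecs

# Pascal's #239#187#191 is three characters given by decimal byte value.
pascal_bytes = bytes([239, 187, 191])

print(pascal_bytes.hex())                        # efbbbf
print(pascal_bytes == codecs.BOM_UTF8)           # True
print("\ufeff".encode("utf-8") == pascal_bytes)  # True
```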

When I telnet to a web page that speaks Unicode, it too returns 
#239#187#191 plus the  etc.

So... would it not stand true that systems that are posting UTF8 do the 
same introduction on the message body? Then authors *know* it 
potentially has Unicode and leave it damn well alone, and also parse it 
based upon UTF8 instead of 8bit char...
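
A minimal sketch of what that could look like on the reader side, in 
Python for illustration only. The function name decode_body and the 
CP437 fallback are assumptions made for this example, not anything 
taken from an FTSC document or an existing reader:

```python
import codecs

def decode_body(raw: bytes, fallback: str = "cp437") -> str:
    """Decode a message body; a leading UTF-8 BOM selects UTF-8."""
    if raw.startswith(codecs.BOM_UTF8):
        # Marker present: drop the 3 bytes and parse the rest as UTF-8.
        return raw[len(codecs.BOM_UTF8):].decode("utf-8", errors="replace")
    # No marker: treat the body as 8-bit text (CP437 is assumed here).
    return raw.decode(fallback, errors="replace")

marked = codecs.BOM_UTF8 + "Grüße".encode("utf-8")
print(decode_body(marked))    # Grüße
print(decode_body(b"\x84"))   # ä (byte 0x84 in CP437)
```

Using errors="replace" keeps a damaged body readable instead of 
raising an exception; a stricter reader might prefer to reject it.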

This is how I am coding things here, just based upon NexusSQL, 
PremierSQL, MS SQL, Apache and Nexus Web Service. I do not have access 
to my Oracle box nor the MySQL 5 server to see if they do the same 
during the initial connection negotiation(s).

A quick Google search: it's the UTF-8 byte order mark (BOM). Some 
editors save the BOM at the start of the file (to serve as a header), 
which regularly causes confusion because the BOM is optional.

So, if we wanted to help enforce how to handle this at the reader (or 
even the tosser) level, I would offer this up: a required BOM at the 
start of any message body that is UTF8.
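
On the posting side, the proposal could be sketched like this (Python, 
illustration only; mark_utf8_body is a hypothetical helper name, and 
nothing here is an existing FTN convention):

```python
import codecs

def mark_utf8_body(body: str) -> bytes:
    """Encode a message body as UTF-8 with exactly one leading BOM."""
    encoded = body.encode("utf-8")
    if encoded.startswith(codecs.BOM_UTF8):
        # Already marked (the text began with U+FEFF): do not add another.
        return encoded
    return codecs.BOM_UTF8 + encoded

wire = mark_utf8_body("Grüße aus Hamburg")
print(wire[:3].hex())             # efbbbf
print(wire.decode("utf-8-sig"))   # Grüße aus Hamburg
```

The startswith check keeps the helper safe to call twice on the same 
text, which matters if both an editor and a tosser try to add the 
marker.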

Ozz

--- ExchangeBBS NNTP Server v3.1/Linux64
       
* Origin: (1:1/123)

SOURCE: echomail via QWK@dmine.net
