Hello mark,
On Tuesday June 09 2020 20:25, you wrote to me:
MvdV>> Instead of an 'H' we see an 'i' with accent grave. The infamous
MvdV>> hex 8D in CP437.
Note the use of the word "infamous".
ml> that partucular character has problems several ways... in this case,
ml> though, it is known as the soft_cr...
I know that.
This is what I wrote in Fidonews Vol 29, nr 1, in an article titled "A PLEA FOR
UTF-8 IN FIDONET Part 2" (Part one was published in Vol 28, nr 52):
=== quote ===
So what else would we need to make FidoNet work with UTF-8? As far as
the transport layer is concerned nothing really. FidoNet is fully 8
bit transparent except for the NULL as the terminating character for
strings. There is no conflict as in UTF-8 the NULL character has the
exact same meaning as in ASCII. Oh wait, there is this tiny little
snag: the archaic soft return. In their infinite wisdom, the founding
fathers decided that the character 0x8D had special meaning; that of
soft return. Probably a remnant from the Wordstar days. In hindsight
totally superfluous and a conflict with many code page schemes that
treat it as a printable character. It also conflicts with UTF-8. 0x8D
is a valid byte in a well formed UTF-8 string. Fortunately most bronze
age software allows configuring 0x8D as a printable character instead
of soft return, so this should no longer be a problem. Be sure however
to configure your tosser to not strip soft returns.
=== unquote ===
ml> so if we strip it, then we break other languages... german is one,
German is hardly affected. The most used encodings in Germany are CP850 and
Latin-1. In CP850 it is the 'i' with accent grave, which is not used in German.
In Latin-1 it is the C1 control code RI (Reverse Index), whatever that means.
The Russians are most affected, as in Cyrillic CP866 it is the capital letter
'Н', pronounced as the 'N' in "Putin", as I already explained.
In UTF-8 it affects more than one language as it can occur in a well formed
UTF-8 sequence.
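
For anyone who wants to check this at home, here is a small Python sketch. The
function names are mine and purely illustrative; the codec names are the ones
Python ships with. It shows how the single byte hex 8D reads under the code
pages mentioned above, and that the same byte turns up as a continuation byte
in perfectly well formed UTF-8.

```python
import unicodedata

# The FidoNet "soft CR" byte discussed in this thread.
SOFT_CR = b"\x8d"


def decode_soft_cr():
    """Decode byte 0x8D with each single-byte charset mentioned above."""
    return {cp: SOFT_CR.decode(cp) for cp in ("cp437", "cp850", "cp866", "latin-1")}


def utf8_has_soft_cr(text):
    """Return True if the UTF-8 encoding of text contains the byte 0x8D."""
    return 0x8D in text.encode("utf-8")


if __name__ == "__main__":
    for codepage, char in decode_soft_cr().items():
        # Control characters have no Unicode name, hence the fallback label.
        name = unicodedata.name(char, "<control character>")
        print(f"{codepage:8} U+{ord(char):04X} {name}")

    # A few letters whose UTF-8 form contains 0x8D as a continuation byte.
    for sample in ("\u00cd", "\u010d", "\u044d"):
        encoded = sample.encode("utf-8")
        print(unicodedata.name(sample), encoded.hex(), utf8_has_soft_cr(sample))
```

Note that the Cyrillic capital 'Н' itself is hex D0 9D in UTF-8, so it is not
the letter that suffers there; other letters are.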
ml> IIRC... if we leave it, some of today's readers/editors will show it
ml> instead of word wrapping on it and not displaying it at all...
Those readers have long been phased out in this part of the world, mainly
because most consider it a printable character. GoldED has the option:
DISPSOFTCR yes
Tossers should never strip it. Period. Although FMail still has an option to do
so, possibly for historic reasons.
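
To illustrate why stripping is harmful, here is a rough Python sketch.
strip_soft_cr() is a hypothetical stand-in for what a stripping tosser does;
it is not taken from FMail or any other real tosser. Text without hex 8D
passes through untouched, CP866 text silently loses a letter, and UTF-8 text
can become undecodable.

```python
# The FidoNet "soft CR" byte.
SOFT_CR = b"\x8d"


def strip_soft_cr(raw):
    """Hypothetical tosser behaviour: delete every 0x8D byte from the message."""
    return raw.replace(SOFT_CR, b"")


def survives_stripping(text, encoding):
    """Return True if text is unchanged after encode, strip, decode."""
    stripped = strip_soft_cr(text.encode(encoding))
    try:
        return stripped.decode(encoding) == text
    except UnicodeDecodeError:
        # Removing a continuation byte left a malformed UTF-8 sequence.
        return False


if __name__ == "__main__":
    samples = [
        ("Nein", "cp850"),                          # no 0x8D involved
        ("\u041d\u0435\u0442", "cp866"),            # Russian "Нет"
        ("\u010desk\u00fd", "utf-8"),               # Czech "český"
    ]
    for text, encoding in samples:
        print(ascii(text), encoding, survives_stripping(text, encoding))
```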
ml> this BBS editor has it as an option... i'll flip it after writing this
ml> reply and we'll see how my future messages look...
Your editor did not strià it or I would not have seen the 'i' with ccent gr ve
in your mess ge.
But you re ignoring the àoint. Which w s that your softw re wrongly m rks your
reply to my message as CP437. It should haven been CP866.
I h ve hidden, some more e ster eggs, this time not involving hex 8D.
Enjoy.
Cheers, Michiel
--- GoldED+/W32-MSVC 1.1.5-b20170303
* Origin: http://www.vlist.eu (2:280/5555)