echo: rberrypi
to: TIMS
from: MAYAYANA
date: 2021-02-13 14:42:00
subject: Re: LXTerm to accept ANSI

"TimS"  wrote

| >   UTF-8 allows ANSI character sets to still be used. But it also
| > provides a way to fully support multi-byte characters only
| > where necessary. It's the one solution to support all languages
| > without changing the default of 1 character to 1 byte.
|
| It's only a default for ASCII, and the characters that ASCII supports. And
| when you say it allows ANSI character sets to be used, I take it you mean
| the characters that different ANSI pages supported, which under UTF-8 will
| most likely be 2-byte chars, rather than 1-byte but 8-bit values.
|

   Most ANSI character sets are also 1 byte to 1 character.
It's only the DBCS languages that can't fit that model.
So first we had ASCII. Then we had ANSI with codepages,
and most languages could be fully represented in HTML
by declaring the charset in a META content-type tag. **All of
that is 1 byte to 1 character.** Only the DBCS languages were
an exception.
And they used a system similar to UTF-8.
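
To put numbers on the byte counts being argued about, here is a small Python sketch. The sample characters and the codec choices (cp1252 as a Western "ANSI" codepage, cp932 as a Japanese DBCS codepage) are assumptions picked for illustration, not anything named in the thread.

```python
# Byte length of single characters under a few encodings.
# Sample characters and codec names are illustrative assumptions.

def byte_len(ch, encoding):
    """Return how many bytes one character occupies in the given encoding."""
    return len(ch.encode(encoding))

samples = {
    "A": ["ascii", "cp1252", "utf-8"],   # plain ASCII letter
    "\u00e9": ["cp1252", "utf-8"],       # e with acute accent
    "\u3042": ["cp932", "utf-8"],        # hiragana "a"
}

for ch, encodings in samples.items():
    for enc in encodings:
        print(f"{ch!r} in {enc}: {byte_len(ch, enc)} byte(s)")
```

The ASCII letter stays at 1 byte everywhere. The accented letter is 1 byte in the codepage but 2 bytes in UTF-8, and the Japanese character is 2 bytes in the DBCS codepage and 3 in UTF-8.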

    So it didn't require any fundamental change
in character encoding, editors, or file formats. So-called
wide character encoding, with 2 or more bytes per character,
existed, but was not really used. 1 byte to 1 character was
nearly universal.

    So the only reason UTF-8 was needed was to fully
accommodate DBCS languages, pile-of-shit emojis, etc.
Most English pages are essentially ASCII, which is UTF-8
conforming. And charset can be specified for ANSI
interpretation.
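
The "ASCII is UTF-8 conforming" point can be checked directly. This is a minimal sketch; the helper name is_pure_ascii and the sample string are made up for the example.

```python
# Pure ASCII bytes decode to the same text as ASCII, UTF-8, or cp1252.
# is_pure_ascii is a hypothetical helper, not from any library.

def is_pure_ascii(data: bytes) -> bool:
    """True if every byte is in the 7-bit ASCII range (0x00 to 0x7F)."""
    return all(b < 0x80 for b in data)

page = b"Most English pages are essentially ASCII."

if is_pure_ascii(page):
    # All three decodings agree, so the declared charset makes no difference.
    same = page.decode("ascii") == page.decode("utf-8") == page.decode("cp1252")
    print("charset-independent:", same)
```

As long as no byte has the high bit set, it does not matter which of these charsets the page declares.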

  So all I was saying was that UTF-8 was far easier than
any other approach, using "wide characters", when it came
time to fully support all languages under one system. Even now
I'm not sure how much it's really used. Browsers properly
display curly quotes, but I actually only have one Unicode
font on my system, which is Arial Unicode MS, weighing in at
24 MB. Nothing else will render most UTF-8 characters. For example,
the RichEdit window in Windows has supported UTF-8 for
some time. And I can use the ability in my own software.
But it will only render if I use that Arial Unicode font. With
any other font it renders as ANSI using the English codepage.
Just as a browser will do if charset isn't specced to be UTF-8.
(Though UTF-8 may be default these days. I don't know. I
still use:


Not that it really matters. It's pretty much all ASCII.
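
The "renders as ANSI using the English codepage" effect mentioned above is easy to reproduce. A rough sketch, assuming cp1252 is the English codepage in question; misread_as_ansi is a hypothetical helper name.

```python
# What UTF-8 text looks like when its bytes are read as a single-byte codepage.
# cp1252 as "the English codepage" is an assumption; the helper is hypothetical.

def misread_as_ansi(text, codepage="cp1252"):
    """Encode text as UTF-8, then decode those bytes as a 1-byte codepage."""
    # errors="replace" covers the few byte values cp1252 leaves undefined.
    return text.encode("utf-8").decode(codepage, errors="replace")

curly = "it\u2019s"               # "it's" with a curly apostrophe
garbled = misread_as_ansi(curly)

print(len(curly), len(garbled))   # the 3-byte apostrophe becomes 3 characters
print(ascii(garbled))             # escaped form, safe on any console
```

The curly apostrophe is 3 bytes in UTF-8, and each byte gets shown as its own codepage character, which is the familiar garbage seen when a charset is not specified or is wrong.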

--- SoupGate-Win32 v1.05
* Origin: Agency HUB, Dunedin - New Zealand | FidoUsenet Gateway (3:770/3)

SOURCE: echomail via QWK@docsplace.org