"The Natural Philosopher" wrote
| > Not that it really matters. It's pretty much all ASCII.
| >
| >
| Schrödingers cat would disagree - or ½ of him would.
|
:) I always wonder how people end up using these characters.
There are ways to do it. I can copy the character from existing
text. On Windows I think there's Charmap, though I've never
used it. Schrodinger will just have to get by without his umlaut.
Just as "naive" has survived without one.
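For anyone who wants to drop the umlaut in bulk rather than by hand, here is a minimal sketch using Python's standard unicodedata module. The helper name strip_marks is my own invention for illustration, and the approach only handles letters that decompose into a base letter plus a combining mark.

```python
import unicodedata

def strip_marks(text):
    """Hypothetical helper: remove accents by dropping combining marks."""
    # NFD splits "ö" into "o" followed by a combining diaeresis (U+0308).
    decomposed = unicodedata.normalize("NFD", text)
    # combining() returns 0 for ordinary characters, non-zero for marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(strip_marks("Schrödinger"))  # Schrodinger
print(strip_marks("naïve"))        # naive
```

Letters that have no decomposition (the German sharp s, for one) pass through unchanged, so this is not a general transliterator.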
Then there's the matter of the mechanical entry system. My
keyboard only has ASCII and a few extras.
Where this really helps is with languages like Chinese. But it
mostly helps only them. We English speakers deal with pretty much
all ASCII. And that's not the 1/2 of it. As you noted, if you want
to write Unicode you also need a font that covers the characters.
Browsers make it look simple, but for general text files it's not so
simple. For example, I like to use Verdana for most text. But the
font doesn't cover all of Unicode. And Windows will often display
UTF-8 text as ANSI when nothing marks the file as UTF-8.
If I visit xinhuanet.com I see Chinese characters. (Even though
it's all Greek to me.) If I check the source code I see Chinese. If
I download that and open it in my code editor as UTF-8 with
Verdana font, I see some of the languages. It looks like I'm
getting Russian and Arabic, for example. But the Chinese is all
little boxes. If I open it in Notepad, since it's plain text with no
byte order mark as a file header, it shows as English ANSI with lots
of little boxes.
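A rough sketch in Python of what seems to be going on. The sniff function is a hypothetical stand-in for a simple editor's guess, not Notepad's actual logic, and cp1252 is my assumption for what "English ANSI" means here.

```python
import codecs

text = "中文 Русский"          # sample text, chosen only for illustration
raw = text.encode("utf-8")      # the bytes as saved, with no byte order mark

def sniff(data):
    """Hypothetical helper: guess an encoding from the first bytes only."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"      # BOM present: treat as UTF-8 and skip the BOM
    return "cp1252"             # assumed fallback: the Western "ANSI" codepage

# Without a BOM the guess is ANSI, and every multi-byte character
# turns into two or three unrelated accented letters and symbols.
garbled = raw.decode(sniff(raw), errors="replace")
print(garbled)

# With a BOM in front, the same bytes decode back to the original text.
marked = codecs.BOM_UTF8 + raw
print(marked.decode(sniff(marked)))
```

The bytes on disk are identical in both cases apart from the three-byte marker; only the guess about the encoding changes.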
So it's a good solution for webpages, but once you get into
entering, editing and storing multi-lingual text it gets very
complicated. Only for those of us who speak English is it
reasonable to say that UTF-8 makes everything easy. It does,
but only because it's usually exactly the same byte string as
ASCII. In fact, if I happen to come across UTF-8 text or HTML
code I'll generally convert it to ASCII/ANSI for convenience. It's
too much trouble trying to access it across different programs and
displays as UTF-8. On Linux, where that's standard, it's fine. But
we have to remember that this is just a file encoding, a way of
representing text as bytes. UTF-8 by itself is no miracle.
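The "same byte string as ASCII" point is easy to check. A small Python sketch, with cp1252 assumed as the ANSI codepage and the sample strings made up for illustration:

```python
english = "Plain English text."

# For pure ASCII text the three encodings give identical bytes.
same = (english.encode("ascii")
        == english.encode("utf-8")
        == english.encode("cp1252"))
print(same)  # True

name = "Schrödinger"
utf8 = name.encode("utf-8")    # ö becomes two bytes: 0xC3 0xB6
ansi = name.encode("cp1252")   # ö is the single byte 0xF6
print(len(utf8), len(ansi))    # 12 11

# Converting to ANSI is lossy for anything outside the codepage:
# with errors="replace" the character is written as "?".
lost = "中".encode("cp1252", errors="replace")
print(lost)  # b'?'
```

So the conversion to ANSI costs nothing for English text, but it only works as long as every character in the file exists in the codepage.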
Microsoft's is one of the sites that has used UTF-8 for years.
It's all English on their English pages, but they spec it as
UTF-8 and use curly quotes and non-breaking spaces encoded as
UTF-8. Neither is necessary and it complicates things. Both of
these will work with an English codepage. The first should work
anywhere:
&ldquo;curly&nbsp;quotes&rdquo;
“curly quotes”
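To put numbers on that, a sketch in Python, assuming the first form is the HTML-entity spelling and the second is the literal characters stored in the Windows-1252 codepage:

```python
import html

entity_form = "&ldquo;curly&nbsp;quotes&rdquo;"

# The entity spelling is pure ASCII, so any codepage can carry it.
ascii_bytes = entity_form.encode("ascii")

# What a browser ends up with after resolving the entities:
# U+201C, a non-breaking space (U+00A0), U+201D.
decoded = html.unescape(entity_form)

# Stored as literal characters, each one is a single byte in cp1252...
ansi_bytes = decoded.encode("cp1252")
print(ansi_bytes)  # b'\x93curly\xa0quotes\x94'

# ...but three bytes per quote and two for the space in UTF-8.
utf8_bytes = decoded.encode("utf-8")
print(len(ansi_bytes), len(utf8_bytes))  # 14 19
```

The cp1252 bytes are only readable where that codepage is in use, which is why the entity form is the one that should work anywhere.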