echo: os2
to: Leonard Erickson
from: George White
date: 1999-09-13 09:44:00
subject: Character sets

Hi Leonard,

You wrote to me about my message to Murray Lesser:

LE> -=> Quoting George White to Murray Lesser <=-

LE> ML>    Nowadays, the Average Idiot Home User hasn't heard of any character
LE> ML>code other than ASCII, if even that :-(.  As you well know, the stupid
LE> ML>collating sequence of ASCII is due to the great desire of AT&T to pack
LE> ML>everything they thought anybody would ever want into seven bits, while
LE> ML>letting the presence of a single bit differentiate between lower- and
LE> ML>upper-case alpha characters (a desirable characteristic only for fully
LE> ML>mechanical terminals).

LE>ASCII *was* defined in 1963 (or was it 68?) you know. It was originally
LE>intended as a standard for moving data between different brands of
LE>mainframes (which all had their *own* character sets back then)

I hope Murray sees this; I wasn't interested in this aspect of the
message. As you have raised it with me: according to a reference I have
here ("C Programmer's Guide to Serial Communications" by Joe Campbell),
the official title of the ASCII specification is "ANSI Standard
X3.4-1977 (Revised 1983), Code for Information Interchange", so from
this I believe it's more recent than the dates you remember.

LE>Also, that uppercase/lower case distinction being one bit was important
LE>for programmers writing *tight* code. I used to use that single bit
LE>trick in programs back when 16k of DRAM cost several hundred dollars.

It certainly can make life easier for the programmer, and I've taken
advantage of it in my own code :-).
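
A minimal sketch of the single-bit trick being discussed, in Python
purely for illustration. The helper names are hypothetical and are not
taken from anyone's actual code; the only fact relied on is that in
ASCII "A" is 0x41 and "a" is 0x61, so the two cases differ in bit 0x20:

```python
# Sketch of the ASCII single-bit case trick (hypothetical helper names).
CASE_BIT = 0x20  # 'A' is 0x41, 'a' is 0x61: they differ only in this bit


def is_ascii_letter(ch: str) -> bool:
    """True only for the 52 ASCII letters A-Z and a-z."""
    return "A" <= ch <= "Z" or "a" <= ch <= "z"


def to_upper(ch: str) -> str:
    """Clear the case bit of an ASCII letter; leave anything else alone."""
    return chr(ord(ch) & ~CASE_BIT) if is_ascii_letter(ch) else ch


def to_lower(ch: str) -> str:
    """Set the case bit of an ASCII letter; leave anything else alone."""
    return chr(ord(ch) | CASE_BIT) if is_ascii_letter(ch) else ch


def toggle_case(ch: str) -> str:
    """Flip the case bit of an ASCII letter; leave anything else alone."""
    return chr(ord(ch) ^ CASE_BIT) if is_ascii_letter(ch) else ch


def same_letter(a: str, b: str) -> bool:
    """Case-insensitive compare of two ASCII letters using one OR each."""
    return (
        is_ascii_letter(a)
        and is_ascii_letter(b)
        and (ord(a) | CASE_BIT) == (ord(b) | CASE_BIT)
    )
```

The letter check matters: applied blindly, the same bit operation would
also turn "[" into "{" or "@" into "`", which is why tight code of the
era usually knew in advance that it was looking at a letter.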

LE> ML>Making sense (from a computer-architectural
LE> ML>standpoint) has never been a requirement for standards committees!  (I
LE> ML>have a great dislike, based on many years of past participation, of most
LE> ML>computer-related standardizing activities.  The standards committee
LE> ML>members seem to be more interested in showing how smart they are than in
LE> ML>following Euripides' "legacy" law of computer architecture:  "The gods
LE> ML>visit the sins of the fathers upon the children.")

LE> GW> I think of the ANSI screen control sequences as a classic example of
LE> GW> that "cleverness", even though they are really DEC terminal control
LE> GW> sequences.

LE>They are derived from DEC control sequences for the VT-52. The ESC was
LE>turned into ESC[ so that DEC wouldn't have an unfair advantage in the
LE>market. The VT-100 came *after* the X3.64 standard was defined.

That counts in my book as "really". Just changing "ESC" to "ESC[" does
nothing to change the basis of the control sequences. How a change as
simple to incorporate as that can be considered as removing an "unfair
advantage" escapes me, when Joe Campbell (op. cit.) describes writing
the code to interpret the sequences (an input driver) as "a Herculean
labor" (it's an American book and I'm retaining the American spelling
used therein). The full quote on programming for ANSI is:

  "From the programmer's point of view, writing a driver for ANSI output
is slightly more difficult than for other kinds of terminals because
ANSI expresses all numeric parameters as ASCII digits instead of binary
numbers. But if ANSI output is not particularly difficult, writing code
for an ANSI input driver is a Herculean labor. Because the identifying
code (the "Final") occurs last or next to last (if an Intermediate is
present), there is no way to ascertain at the beginning of a control
sequence how long it will be. Remember, there may be a variable number
of ASCII numeric parameters and/or selective parameters. If necessary
parameters are missing from the control string, defaults must be
supplied. All input from the CSI through the Final (or
Final-Intermediate combination) must therefor be buffered, then parsed
into functions and parameters."
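
The buffering problem Campbell describes can be sketched as follows.
This is an illustration only, not Campbell's code: the names cup,
parse_csi and DEFAULTS are hypothetical, the default table covers just
three Finals, and the byte ranges assume the ECMA-48 layout (digits and
";" as parameters, 0x20-0x2F as Intermediates, 0x40-0x7E as the Final).
Private parameter characters such as "?" are deliberately rejected.

```python
ESC = "\x1b"

# Hypothetical default table: per-Final defaults for each positional parameter.
DEFAULTS = {"H": (1, 1), "A": (1,), "J": (0,)}


def cup(row: int, col: int) -> str:
    """Output side: cursor position, numbers sent as ASCII decimal digits."""
    return f"{ESC}[{row};{col}H"


def parse_csi(chars):
    """Input side: read one 'ESC [' sequence from an iterable of characters.

    Returns (final, intermediates, values). Nothing can be interpreted until
    the Final arrives, so everything before it is buffered first.
    """
    it = iter(chars)
    if next(it, None) != ESC or next(it, None) != "[":
        raise ValueError("not a CSI sequence")

    param_buf, inter_buf, final = [], [], None
    for ch in it:
        if ch in "0123456789;":      # numeric parameters and their separator
            param_buf.append(ch)
        elif " " <= ch <= "/":       # 0x20-0x2F: Intermediate characters
            inter_buf.append(ch)
        elif "@" <= ch <= "~":       # 0x40-0x7E: the Final ends the sequence
            final = ch
            break
        else:
            raise ValueError(f"character {ch!r} not handled in this sketch")
    if final is None:
        raise ValueError("input ended before the Final")

    # Only now, knowing the Final, can missing parameters be defaulted.
    fields = "".join(param_buf).split(";")
    defaults = DEFAULTS.get(final, ())
    fields += [""] * (len(defaults) - len(fields))
    values = [
        int(f) if f else (defaults[i] if i < len(defaults) else 0)
        for i, f in enumerate(fields)
    ]
    return final, "".join(inter_buf), values
```

For example, parse_csi(cup(12, 40)) gives ("H", "", [12, 40]), while a
bare ESC[H gives ("H", "", [1, 1]) because both parameters fall back to
their defaults. The output function is one line; the input function has
to buffer, classify and default, which is the asymmetry the quote
complains about.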

LE> ML>    As to why the PC architecture included the use of ASCII as an
LE> ML>internal character code: since I had nothing to do with the IBM PC
LE> ML>development, I have only some conjectures.  The PC was developed by a
LE> ML>semi-independent group in Florida.  The then upper management of IBM
LE> ML>didn't believe it was ever going to go anywhere, so probably didn't care

LE> GW> Original estimate was 250,000 units over 5 years according to an
LE> GW> article in Byte in 1990. It should have been dead, buried, and history
LE> GW> by now...

LE> ML>that the original perpetrators were using their 8-bit version of ASCII
LE> ML>instead of EBCDIC!  The character-code choice may have had something to
LE> ML>do with the fact that the 7-bit version of ASCII (embedded in eight
LE> ML>bits) was being used in most desktop machines up to that date.  (My CP/M
LE> ML>machine, vintage 1979, used a so-called ASCII terminal that had no way
LE> ML>to input the high-order eighth bit.)  Only some word processors of the
LE> ML>time used that bit for anything.

LE>Excuse me, but you've made the same mistake 3 times so far. There is no
LE>such thing as "8-bit" ASCII nor a "the 7-bit version of ASCII". ASCII
LE>is *defined* as a 7-bit set. Any 8-bit set with the lower 128
LE>characters matching ASCII is an "extended ASCII" and *not* any sort of
LE>official variant of ASCII.

I haven't made any mistakes I'm aware of. Murray might have, but I'll
let him defend himself; he's quite well able to.
I am well aware that ASCII is a purely 7-bit code, and that the normal
8-bit IBM PC character set includes _most_ (but definitely not all) of
the 7-bit ASCII character set. I've been programming using ASCII since
before the IBM PC was even thought of...

LE>It *can* be a standard in its own right, such as the various ISO
LE>8859-xx character sets.

Agree totally, but again they are definitely _NOT_ ASCII.
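
The distinction can be checked mechanically. A small sketch, in Python
for illustration, with hypothetical function names; it relies only on
Python's standard "ascii", "latin-1" (ISO 8859-1) and "cp037" (an
EBCDIC code page) codecs:

```python
def is_ascii(data: bytes) -> bool:
    """ASCII defines only codes 0-127, so the eighth bit must be clear."""
    return all(b < 0x80 for b in data)


def lower_half_matches_ascii(codec: str) -> bool:
    """True if a codec decodes bytes 0-127 to the same characters as ASCII.

    An 8-bit set that passes this check is an "extended ASCII" in the
    sense used above; that still does not make it ASCII itself.
    """
    low = bytes(range(128))
    return low.decode(codec) == low.decode("ascii")
```

Here lower_half_matches_ascii("latin-1") is true and
lower_half_matches_ascii("cp037") is false, while
is_ascii(b"caf\xe9") is false because 0xE9 lies outside the 7-bit
range that ASCII defines.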

LE> ML>(to some extent) "upward compatible" from CP/M.  IMO, the PC fathers did
LE> ML>the best that they could to make ASCII barely usable, supplying both the
LE> ML>missing "upper" 128 characters and also text-mode graphic symbols for
LE> ML>the otherwise useless (by then) ASCII control characters.

LE>ASCII *has* no "upper 128".

Sort this out with Murray; I totally agree with you.

LE>And by the time the PC was being designed it was quite clear that the
LE>32-126 range had to be the same as ASCII for a machine to do well in
LE>the market, and that the more common control chars had to be supported.
LE>*Why* they made DEL a printable char I have no idea.

I have no idea either...

George
___
 X SLMR 2.1a X Study the past, if you would divine the future.

--- Maximus/2 3.01
* Origin: Air Applewood, OS/2 Gateway to Essex 44-1279-792300 (2:257/609)

SOURCE: echoes via The OS/2 BBS
