echo: osdebate
to: Adam
from: Rich
date: 2006-06-19 15:49:44
subject: Re: PCI hardware ID

From: "Rich" 

    XML is easy.  An XML document must either declare its encoding or be
UTF-8 or UTF-16 (the latter marked by a byte order mark), so there is no
ambiguity.
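
To illustrate why XML is the easy case, here is a minimal Python sketch of settling the encoding from the first bytes of a document. The function name `sniff_xml_encoding` and the regular expression are made up for this example, not taken from any library. It relies on the XML 1.0 rule that a document with neither a byte order mark nor an encoding declaration is UTF-8, and as a simplification it ignores UTF-32, EBCDIC, and UTF-16 without a byte order mark.

```python
import codecs
import re

# Encoding pseudo-attribute of the XML declaration, matched on raw bytes.
# This only works when the declaration itself is ASCII-compatible.
_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def sniff_xml_encoding(data: bytes) -> str:
    """Guess the encoding of an XML document (hypothetical helper)."""
    # A byte order mark settles the question first.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # Otherwise look for an explicit encoding declaration.
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    # No byte order mark and no declaration: treated as UTF-8.
    return "utf-8"
```

A real parser does more (for example, checking that the declared encoding agrees with the byte order mark), so treat this only as an outline of the decision order.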

Rich

  "Adam" <""4thwormcastfromthemolehill\"{at}the field.near the bridge"> wrote in message news:449724d4$1{at}w3.nls.net...
  Rich wrote:
  >    Two different issues.  IE has to apply heuristics to file types
  > because the servers that return this content often return bogus
  > types, either incorrect or invalid.  It's not easy to fix.  IE had
  > to do this because Netscape didn't enforce types, and to be
  > compatible IE couldn't either.  Because the types aren't enforced,
  > lots of servers still get this wrong.  Because they do, types
  > can't be enforced.
  >
  >    Text vs. Unicode is, first of all, a false distinction.  It's all
  > text.  In the case of Notepad, you mean UTF-16 text vs. UTF-8 text
  > vs. ANSI text, as this is the distinction that Notepad makes on
  > load.  Even that misses the complexity, as what people call ANSI is
  > actually any of 14 distinct and incompatible ANSI encodings, and is
  > often one of many OEM encodings, which may be distinct from any of
  > the ANSI ones.  It is complicated because for many of the ANSI
  > encodings, including the one used in the U.S. and Western Europe,
  > anything could be valid.  Because of this it is not always possible
  > to make a distinction between UTF-16 and ANSI, as a file could
  > validly be either.  UTF-8 is restrictive, so it is easy to tell if
  > something is valid UTF-8.  That could still be a problem, as valid
  > UTF-8 could be valid ANSI too.  Instead some heuristics are applied.
  > For example, if you see 0D 00 0A 00 then the file is probably
  > UTF-16, while if you see 0D 0A it may be ANSI, though U+0A0D might
  > be a valid Unicode character.  I didn't look.
  >
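
The heuristics described in the quoted text can be sketched roughly as follows. This is not Notepad's actual algorithm, only an illustration of the order of checks mentioned above; the names `guess_text_encoding` and `_has_aligned` are invented for the example, and the `cp1252` fallback is an assumption standing in for whichever ANSI code page the system uses.

```python
import codecs


def _has_aligned(data: bytes, pattern: bytes) -> bool:
    """True if pattern occurs at an even offset, i.e. on a 16-bit boundary."""
    start = data.find(pattern)
    while start != -1:
        if start % 2 == 0:
            return True
        start = data.find(pattern, start + 1)
    return False


def guess_text_encoding(data: bytes) -> str:
    """Guess among UTF-16, UTF-8 and an ANSI code page (hypothetical helper)."""
    # 1. A byte order mark is the strongest hint.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    # 2. CR LF stored as 16-bit units (0D 00 0A 00) suggests UTF-16.
    if _has_aligned(data, b"\x0d\x00\x0a\x00"):
        return "utf-16-le"
    if _has_aligned(data, b"\x00\x0d\x00\x0a"):
        return "utf-16-be"
    # 3. UTF-8 is restrictive, so a strict decode is a good validity test.
    #    Pure ASCII passes too, although it is equally valid ANSI.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass
    # 4. Assumed fallback: a single-byte code page accepts almost any bytes.
    return "cp1252"
```

The alignment check in step 2 matters because big-endian UTF-16 text also contains the bytes 0D 00 0A 00, just starting at an odd offset. Even so, the result is a guess: a file with no line breaks and no byte order mark can still validly be either UTF-16 or ANSI.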

  Yeah hidden chars esp in XML can be a bitch.

  Especially when XML chars are rendered from HTML streams (file or
  other) processed from CDATA.

  Roll on XHTML.

  Adam

--- BBBS/NT v4.01 Flag-5
* Origin: Barktopia BBS Site http://HarborWebs.com:8081 (1:379/45)
SEEN-BY: 633/267 270
@PATH: 379/45 1 106/2000 633/267

SOURCE: echomail via fidonet.ozzmosis.com
