Hola Holger!
HG> OK, the code 218 128 162 that i interpreted as hyphen actually
HG> is the longer 'dash'.
I am not sure what you mean, but using 218 (DA) as the leading byte restricts
you to a 2-byte (16-bit) sequence, not the 3-byte (24-bit) sequence that the
euro sign requires in UTF-8. The leading byte works like this:
dec 218 = bin 11011010
                ^
The first zero shows that there are two leading ones, which means only one
trailing byte follows. So the character is 218 128, and the 162 is left over.
A 3-byte (24-bit) sequence *must* begin with at least 11100000, which is dec
224 or E0. For the UTF-8 euro sign the leading byte is:
dec 226 = bin 11100010
                 ^
and as you can see, the first zero comes after three leading ones, which means
three bytes or 24 bits.
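As a rough sketch of that counting rule in Python (the function names are my
own, made up for illustration, and it does not reject every invalid leading
byte):

```python
def leading_ones(byte: int) -> int:
    """Count the 1 bits at the top of a byte, stopping at the first 0."""
    count = 0
    mask = 0x80
    while mask and byte & mask:
        count += 1
        mask >>= 1
    return count


def sequence_length(lead: int) -> int:
    """Total bytes in the UTF-8 sequence announced by a leading byte.

    Simplified: C0, C1 and F5-F7 are also invalid leading bytes in real
    UTF-8, but this sketch only looks at the count of leading ones.
    """
    ones = leading_ones(lead)
    if ones == 0:
        return 1  # 0xxxxxxx is plain ASCII
    if ones == 1 or ones > 4:
        raise ValueError(f"{lead} is not a valid leading byte")
    return ones


for lead in (218, 226):
    print(lead, format(lead, "08b"), sequence_length(lead))
# 218 11011010 2
# 226 11100010 3
```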
For the record, 218 128 is U+0680, which we already know to be a 2-byte
(16-bit) Arabic character. Also for the record, all trailing bytes must be in
the range 80 - BF, or dec 128 to dec 191. Both of your posted trailing bytes
are in that range, even though the leading byte can only use one of them.
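If you want to check this yourself, here is a minimal Python sketch. The two
helper names are made up for illustration. Note that a strict decoder like
Python's built-in one rejects the leftover 162 rather than silently ignoring
it:

```python
def is_trailing(byte: int) -> bool:
    """True if the byte has the 10xxxxxx form, i.e. 80-BF (dec 128-191)."""
    return 0x80 <= byte <= 0xBF


def decode_two_byte(lead: int, trail: int) -> int:
    """Combine a 110xxxxx leading byte and one trailing byte into a code point."""
    if not 0xC2 <= lead <= 0xDF:
        raise ValueError("not a 2-byte leading byte")
    if not is_trailing(trail):
        raise ValueError("trailing byte outside 80-BF")
    # 5 payload bits from the leading byte, 6 from the trailing byte
    return ((lead & 0x1F) << 6) | (trail & 0x3F)


print(f"U+{decode_two_byte(218, 128):04X}")   # U+0680
print(is_trailing(128), is_trailing(162))     # True True

# The full posted sequence: 218 128 decodes, then 162 has no leading byte.
try:
    bytes([218, 128, 162]).decode("utf-8")
except UnicodeDecodeError as err:
    print("decoder stopped at byte offset", err.start)
```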
HG> God natt min vän
Thank you. Buenas noches mi amigo. :-)
La vida es buena,
Maurice
... Un Møøse una vez mordió a mi hermana ...
--- GNU bash, version 5.0.2(1)-release (aarch64-raspi3b+-linux-gnu)
* Origin: Little Mikey's EuroPoint - Ladysmith BC, Canada (2:280/464.113)