Hey Holger!
HG> The idea is to convert what I see in text to/from CP437
HG> characters.
Understood, but the message to you will be in utf-8, which may or may not be
convertible to cp437. There are Latin-based characters that do not have a
cp437 equivalent, so the conversion will be lossy when one of those occurs.
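One way to spot the lossy cases ahead of time is to let iconv try the conversion and look at its exit status. This is only a minimal sketch: the function name fits_cp437 is made up for illustration, and it assumes an iconv that knows the names UTF-8 and CP437 and returns non-zero when a character cannot be converted (glibc's does).

```shell
# Hypothetical helper: succeeds only if the UTF-8 string in $1
# converts to cp437 without iconv reporting an error.
fits_cp437() {
    printf '%s' "$1" | iconv -f UTF-8 -t CP437 >/dev/null 2>&1
}

u_umlaut=$(printf '\303\274')   # U+00FC, has a cp437 slot
a_macron=$(printf '\304\201')   # U+0101, Latin-based but not in cp437

for ch in "$u_umlaut" "$a_macron"; do
    if fits_cp437 "$ch"; then
        echo "ok"
    else
        echo "lossy"
    fi
done
```

The test characters are written as octal byte escapes so the sketch does not depend on the encoding of the script itself.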
MK> Unicode -> two byte utf-8 -> cp437 character
MK> ============================================
MK> U+00FC -> 0xC3 + 0xBC -> 0x81
MK> U+00DC -> 0xC3 + 0x9C -> 0x9A
MK> U+00DF -> 0xC3 + 0x9F -> 0xE1
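The middle column in the table above matches the two-byte utf-8 encoding of each code point, and those bytes can be derived with plain shell arithmetic. A minimal sketch, assuming a POSIX shell; the function name utf8_two_bytes is hypothetical and it only handles U+0080 through U+07FF, the range that needs exactly two bytes.

```shell
# Hypothetical helper: print the two UTF-8 bytes for a code point
# in the range U+0080..U+07FF.
utf8_two_bytes() {
    cp=$(( $1 ))
    if [ "$cp" -lt 128 ] || [ "$cp" -gt 2047 ]; then
        echo "not a two-byte code point" >&2
        return 1
    fi
    b1=$(( 0xC0 | (cp >> 6) ))     # lead byte 110xxxxx: upper five bits
    b2=$(( 0x80 | (cp & 0x3F) ))   # continuation byte 10xxxxxx: lower six bits
    printf 'U+%04X -> 0x%02X + 0x%02X\n' "$cp" "$b1" "$b2"
}

utf8_two_bytes 0x00FC   # prints: U+00FC -> 0xC3 + 0xBC
utf8_two_bytes 0x00DC   # prints: U+00DC -> 0xC3 + 0x9C
utf8_two_bytes 0x00DF   # prints: U+00DF -> 0xC3 + 0x9F
```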
HG> Thanks for the above, it requires a bit of hex to decimal
HG> conversion
I use glibc's 'iconv' along with all the mappings in /usr/share/i18n. It seems
to me that IBM had a similar conversion routine but I don't recall offhand what
it is called. That is the simplest solution. However I also happen to have
'printf' available on the command line, and for simple hex to decimal
conversion something like this is doable:
maurice@zoltan [ ~ ]$ /usr/bin/printf "%d -> %d + %d -> %d\n" 0xfc 0xc3 0xbc 0x81 \
    0xdc 0xc3 0x9c 0x9a 0xdf 0xc3 0x9f 0xe1
252 -> 195 + 188 -> 129
220 -> 195 + 156 -> 154
223 -> 195 + 159 -> 225
Is that better?
HG> =C3=A5 > å That is lower case 'a' with a ring on top
maurice@zoltan [ ~ ]$ /usr/bin/printf "%d\n" \'$(echo -e "\u00e5" | iconv -t cp437)
134
HG> =C3=A4 > ä ------- " ------- 'a' with two dots above
maurice@zoltan [ ~ ]$ /usr/bin/printf "%d\n" \'$(echo -e "\u00e4" | iconv -t cp437)
132
HG> =C3=B6 > ö ------- " ------- 'o' ------- " -------
maurice@zoltan [ ~ ]$ /usr/bin/printf "%d\n" \'$(echo -e "\u00f6" | iconv -t cp437)
148
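If you would rather not retype that for every character, the same lookup can be wrapped in a small function. A minimal sketch, assuming iconv accepts the names UTF-8 and CP437; the name cp437_codes is made up for illustration. It reads the converted bytes with od, so it does not rely on how printf treats a leading quote, and it also copes with more than one character.

```shell
# Hypothetical helper: print the decimal cp437 byte value(s) for the
# UTF-8 string in $1, separated by spaces.
cp437_codes() {
    printf '%s' "$1" | iconv -f UTF-8 -t CP437 | od -An -v -tu1 | xargs
}

cp437_codes "$(printf '\303\245')"   # U+00E5, prints 134
cp437_codes "$(printf '\303\244')"   # U+00E4, prints 132
cp437_codes "$(printf '\303\266')"   # U+00F6, prints 148
```

The arguments are given as octal escapes for the utf-8 bytes, so the result does not depend on the terminal or editor encoding.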
HG> & > & I guess this is called ampersand and will not cause any
HG> difficulties in translation
That is a 7-bit ascii character and is exactly the same in both utf-8 and
cp437, as well as almost every other charset that claims to be ascii
compatible. 0x26 or decimal 38 if you prefer.
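That is easy to check for yourself by comparing the bytes before and after conversion. A minimal sketch, assuming iconv accepts the names UTF-8 and CP437; the function name same_bytes_in_cp437 is hypothetical.

```shell
# Hypothetical helper: succeed if the string in $1 has the same bytes
# before and after conversion from utf-8 to cp437.
same_bytes_in_cp437() {
    before=$(printf '%s' "$1" | od -An -v -tu1 | xargs)
    after=$(printf '%s' "$1" | iconv -f UTF-8 -t CP437 2>/dev/null | od -An -v -tu1 | xargs)
    [ "$before" = "$after" ]
}

same_bytes_in_cp437 '&' && echo "& is unchanged"
```

For a character outside the 7-bit range, such as U+00FC, the check fails because two utf-8 bytes become a single cp437 byte.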
Life is good,
Maurice
... Don't cry for me I have vi.
--- GNU bash, version 4.3.30(1)-release (x86_64-atom-linux-gnu)
* Origin: Pointy Stick Society - Ladysmith BC, Canada (1:153/7001.0)