echo: golded
to: ANDREW CLARKE
from: MICHIEL VAN DER VLIST
date: 2015-06-06 16:21:00
subject: UTF-8

Hello Andrew,

On Saturday June 06 2015 21:15, you wrote to Nicholas Boel:

 ac> It supports writing UTF-8 messages, provided a UTF-8 capable external
 ac> editor is used, as you say, but that's about it.

No, to write a message in UTF-8, you do not need to use an external editor.
What you need is the proper translation table from your local set to UTF-8.
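
As a rough illustration of what such a table does per character, here is a small Python sketch. It is not part of GoldED and does not read a .chs file; it assumes Python's built-in "cp850" codec, and the helper name utf8_bytes_for is made up for this example. Each CP850 byte from 128 upward is replaced by the two- or three-byte UTF-8 sequence of the same character.

```python
def utf8_bytes_for(cp850_byte: int) -> bytes:
    """Return the UTF-8 sequence for a single CP850 byte value.

    Hypothetical helper: it uses Python's own cp850 codec, not a .chs table.
    """
    return bytes([cp850_byte]).decode("cp850").encode("utf-8")


# Print a few entries in the same "\xNN \xNN" notation a .chs table uses.
for value in (0x82, 0x9C, 0xB3):
    seq = utf8_bytes_for(value)
    print(f"{value:3d} {value:02X} -> " + " ".join(f"\\x{b:02X}" for b in seq))
```

Note that the standard codec maps byte 207 (hex CF) to the currency sign, so a table that substitutes another character there will differ from this output for that one byte.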

It is reading messages in UTF-8 that is problematic.


Here is my translation table from CP850 to UTF-8:
850_utf.chs


=== Cut ===
; file 850_utf8.chs
;
; This file is a charset conversion module in text form.
;
; This module converts IBM CP850 characters to UTF-8 characters.
;
; By Michiel van der Vlist, 2:280/5555
;
; This is a modified version that translates the currency symbol ¤ (Alt 207)
; (hex CF) into the Euro sign.
;
; Format: ID, version, level,
;         from charset, to charset,
;         128 entries: first, second and optional third byte
;         "END"
; Lines beginning with a ";" and anything following a ";" after an entry are comments
;
;
; cedilla = ,   ; dieresis = ..       ; acute = '
; grave = `     ; circumflex = ^      ; ring = o
; tilde = ~     ; caron = v
; All of these are above the character, apart from the cedilla which is below.
;
; \ is the escape character: \0 means decimal zero,
; \dnnn where nnn is a decimal number is the ordinal value of the character
; \xnn where nn is a hexadecimal number
; e.g.: \d32 is the ASCII space character
; Two \\ is the character "\" itself.
;
4                 ; ID number
1                 ; version number
;
2 4               ; level number
;
CP850             ; from set
UTF-8             ; to set
;                 ; dec hx description
;
\xC3 \x87         ; 128 80 latin capital letter c with cedilla
\xC3 \xBC         ; 129 81 latin small letter u with diaeresis
\xC3 \xA9         ; 130 82 latin small letter e with acute
\xC3 \xA2         ; 131 83 latin small letter a with circumflex
\xC3 \xA4         ; 132 84 latin small letter a with diaeresis
\xC3 \xA0         ; 133 85 latin small letter a with grave
\xC3 \xA5         ; 134 86 latin small letter a with ring above
\xC3 \xA7         ; 135 87 latin small letter c with cedilla
\xC3 \xAA         ; 136 88 latin small letter e with circumflex
\xC3 \xAB         ; 137 89 latin small letter e with diaeresis
\xC3 \xA8         ; 138 8A latin small letter e with grave
\xC3 \xAF         ; 139 8B latin small letter i with diaeresis
\xC3 \xAE         ; 140 8C latin small letter i with circumflex
\xC3 \xAC         ; 141 8D latin small letter i with grave
\xC3 \x84         ; 142 8E latin capital letter a with diaeresis
\xC3 \x85         ; 143 8F latin capital letter a with ring above
\xC3 \x89         ; 144 90 latin capital letter e with acute
\xC3 \xA6         ; 145 91 latin small letter ae
\xC3 \x86         ; 146 92 latin capital letter ae
\xC3 \xB4         ; 147 93 latin small letter o with circumflex
\xC3 \xB6         ; 148 94 latin small letter o with diaeresis
\xC3 \xB2         ; 149 95 latin small letter o with grave
\xC3 \xBB         ; 150 96 latin small letter u with circumflex
\xC3 \xB9         ; 151 97 latin small letter u with grave
\xC3 \xBF         ; 152 98 latin small letter y with diaeresis
\xC3 \x96         ; 153 99 latin capital letter o with diaeresis
\xC3 \x9C         ; 154 9A latin capital letter u with diaeresis
\xC3 \xB8         ; 155 9B latin small letter o with stroke
\xC2 \xA3         ; 156 9C pound sign
\xC3 \x98         ; 157 9D latin capital letter o with stroke
\xC3 \x97         ; 158 9E multiplication sign
\xC6 \x92         ; 159 9F dutch guilder sign (ibm437 159) (f with hook)
\xC3 \xA1         ; 160 A0 latin small letter a with acute
\xC3 \xAD         ; 161 A1 latin small letter i with acute
\xC3 \xB3         ; 162 A2 latin small letter o with acute
\xC3 \xBA         ; 163 A3 latin small letter u with acute
\xC3 \xB1         ; 164 A4 latin small letter n with tilde
\xC3 \x91         ; 165 A5 latin capital letter n with tilde
\xC2 \xAA         ; 166 A6 feminine ordinal indicator
\xC2 \xBA         ; 167 A7 masculine ordinal indicator
\xC2 \xBF         ; 168 A8 inverted question mark
\xC2 \xAE         ; 169 A9 registered sign
\xC2 \xAC         ; 170 AA not sign
\xC2 \xBD         ; 171 AB vulgar fraction one half
\xC2 \xBC         ; 172 AC vulgar fraction one quarter
\xC2 \xA1         ; 173 AD inverted exclamation mark
\xC2 \xAB         ; 174 AE left-pointing double angle quotation mark
\xC2 \xBB         ; 175 AF right-pointing double angle quotation mark
\xE2 \x96 \x91    ; 176 B0 light shade
\xE2 \x96 \x92    ; 177 B1 medium shade
\xE2 \x96 \x93    ; 178 B2 dark shade
\xE2 \x94 \x82    ; 179 B3 box drawings light vertical
\xE2 \x94 \xA4    ; 180 B4 box drawings light vertical and left
\xC3 \x81         ; 181 B5 latin capital letter a with acute
\xC3 \x82         ; 182 B6 latin capital letter a with circumflex
\xC3 \x80         ; 183 B7 latin capital letter a with grave
\xC2 \xA9         ; 184 B8 copyright sign
\xE2 \x95 \xA3    ; 185 B9 box drawings double vertical and left
\xE2 \x95 \x91    ; 186 BA box drawings double vertical
\xE2 \x95 \x97    ; 187 BB box drawings double down and left
\xE2 \x95 \x9D    ; 188 BC box drawings double up and left
\xC2 \xA2         ; 189 BD cent sign
\xC2 \xA5         ; 190 BE yen sign
\xE2 \x94 \x90    ; 191 BF box drawings light down and left
\xE2 \x94 \x94    ; 192 C0 box drawings light up and right
\xE2 \x94 \xB4    ; 193 C1 box drawings light up and horizontal
\xE2 \x94 \xAC    ; 194 C2 box drawings light down and horizontal
\xE2 \x94 \x9C    ; 195 C3 box drawings light vertical and right
\xE2 \x94 \x80    ; 196 C4 box drawings light horizontal
\xE2 \x94 \xBC    ; 197 C5 box drawings light vertical and horizontal
\xC3 \xA3         ; 198 C6 latin small letter a with tilde
\xC3 \x83         ; 199 C7 latin capital letter a with tilde
\xE2 \x95 \x9A    ; 200 C8 box drawings double up and right
\xE2 \x95 \x94    ; 201 C9 box drawings double down and right
\xE2 \x95 \xA9    ; 202 CA box drawings double up and horizontal
\xE2 \x95 \xA6    ; 203 CB box drawings double down and horizontal
\xE2 \x95 \xA0    ; 204 CC box drawings double vertical and right
\xE2 \x95 \x90    ; 205 CD box drawings double horizontal
\xE2 \x95 \xAC    ; 206 CE box drawings double vertical and horizontal
;\xC2 \xA4         ; 207 CF currency sign
\xE2 \x82 \xAC    ; 207 CF Euro sign.
\xC3 \xB0         ; 208 D0 latin small letter eth (icelandic)
\xC3 \x90         ; 209 D1 latin capital letter eth (icelandic)
\xC3 \x8A         ; 210 D2 latin capital letter e with circumflex
\xC3 \x8B         ; 211 D3 latin capital letter e with diaeresis
\xC3 \x88         ; 212 D4 latin capital letter e with grave
\xC4 \xB1         ; 213 D5 latin small letter i dotless
\xC3 \x8D         ; 214 D6 latin capital letter i with acute
\xC3 \x8E         ; 215 D7 latin capital letter i with circumflex
\xC3 \x8F         ; 216 D8 latin capital letter i with diaeresis
\xE2 \x94 \x98    ; 217 D9 box drawings light up and left
\xE2 \x94 \x8C    ; 218 DA box drawings light down and right
\xE2 \x96 \x88    ; 219 DB full block
\xE2 \x96 \x84    ; 220 DC lower half block
\xC2 \xA6         ; 221 DD broken bar
\xC3 \x8C         ; 222 DE latin capital letter i with grave
\xE2 \x96 \x80    ; 223 DF upper half block
\xC3 \x93         ; 224 E0 latin capital letter o with acute
\xC3 \x9F         ; 225 E1 latin small letter sharp s (german)
\xC3 \x94         ; 226 E2 latin capital letter o with circumflex
\xC3 \x92         ; 227 E3 latin capital letter o with grave
\xC3 \xB5         ; 228 E4 latin small letter o with tilde
\xC3 \x95         ; 229 E5 latin capital letter o with tilde
\xC2 \xB5         ; 230 E6 micro sign
\xC3 \xBE         ; 231 E7 latin small letter thorn (icelandic)
\xC3 \x9E         ; 232 E8 latin capital letter thorn (icelandic)
\xC3 \x9A         ; 233 E9 latin capital letter u with acute
\xC3 \x9B         ; 234 EA latin capital letter u with circumflex
\xC3 \x99         ; 235 EB latin capital letter u with grave
\xC3 \xBD         ; 236 EC latin small letter y with acute
\xC3 \x9D         ; 237 ED latin capital letter y with acute
\xC2 \xAF         ; 238 EE macron
\xC2 \xB4         ; 239 EF acute accent
\xC2 \xAD         ; 240 F0 soft hyphen
\xC2 \xB1         ; 241 F1 plus-minus sign
\xE2 \x80 \x97    ; 242 F2 double low line
\xC2 \xBE         ; 243 F3 vulgar fraction three quarters
\xC2 \xB6         ; 244 F4 pilcrow sign
\xC2 \xA7         ; 245 F5 section sign
\xC3 \xB7         ; 246 F6 division sign
\xC2 \xB8         ; 247 F7 cedilla
\xC2 \xB0         ; 248 F8 degree sign
\xC2 \xA8         ; 249 F9 diaeresis
\xC2 \xB7         ; 250 FA middle dot
\xC2 \xB9         ; 251 FB superscript one
\xC2 \xB3         ; 252 FC superscript three
\xC2 \xB2         ; 253 FD superscript two
\xE2 \x96 \xA0    ; 254 FE black square
\xC2 \xA0         ; 255 FF no-break space
END
=== Cut ===
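
For anyone who wants to check a table like this outside the editor, the format described in the header comments can be read with a few lines of Python. This is only a sketch under stated assumptions, not how GoldED itself parses the file: it assumes the first five non-comment lines are the header (ID, version, level, from set, to set), that entries start at byte 128 in order, and that a ";" always starts a comment. The names parse_token, load_table and convert are made up for this example.

```python
def parse_token(token: str) -> int:
    """Decode one entry token (\\xNN, \\dNNN, \\0, \\\\ or a literal char) to a byte value."""
    if token == "\\0":
        return 0
    if token == "\\\\":
        return ord("\\")
    if token.startswith("\\x"):
        return int(token[2:], 16)
    if token.startswith("\\d"):
        return int(token[2:], 10)
    if len(token) == 1:
        return ord(token)
    raise ValueError(f"unrecognised token: {token!r}")


def load_table(text: str) -> dict:
    """Return {source byte: replacement bytes} for the entries in a .chs text."""
    # Assumption: everything from ";" to end of line is a comment.
    lines = [raw.split(";", 1)[0].strip() for raw in text.splitlines()]
    lines = [line for line in lines if line]
    table = {}
    code = 128  # assumption: the first entry belongs to byte 128
    for line in lines[5:]:  # assumption: five header lines precede the entries
        if line == "END":
            break
        table[code] = bytes(parse_token(tok) for tok in line.split())
        code += 1
    return table


def convert(data: bytes, table: dict) -> bytes:
    """Translate bytes through the table; bytes without an entry pass through unchanged."""
    out = bytearray()
    for b in data:
        out += table.get(b, bytes([b]))
    return bytes(out)


# A three-entry excerpt of the table above, enough for a quick check.
SAMPLE = r"""
4                 ; ID number
1                 ; version number
2 4               ; level number
CP850             ; from set
UTF-8             ; to set
\xC3 \x87         ; 128 80 latin capital letter c with cedilla
\xC3 \xBC         ; 129 81 latin small letter u with diaeresis
\xC3 \xA9         ; 130 82 latin small letter e with acute
END
"""

table = load_table(SAMPLE)
print(convert(b"caf\x82", table).decode("utf-8"))
```

Feeding the whole file to load_table should give 128 entries; a different count would point to a missing or duplicated line.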


Cheers, Michiel

--- GoldED+/W32-MSVC 1.1.5-b20130111
* Origin: http://www.vlist.eu (2:280/5555)

SOURCE: echomail via QWK@docsplace.org
