UTF-8 Encoding Debugging Chart

Here is an encoding problem chart that helps debug common UTF-8 character encoding problems. The chart can help with these three typical problem scenarios:

  • Encoding Problem 1: Treating UTF-8 Bytes as Windows-1252 or ISO-8859-1
  • Encoding Problem 2: Incorrect Double Mis-Conversion
  • Encoding Problem 3: ISO-8859-1 vs Windows-1252
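The second and third scenarios can be illustrated with a short Python sketch. It assumes that "double mis-conversion" means the UTF-8 bytes were misread as Windows-1252 and the garbled text was then encoded as UTF-8 and misread again; the linked explanation is not reproduced here, so treat that reading as an assumption. The helper name `misconvert` is hypothetical.

```python
def misconvert(text: str, times: int = 1) -> str:
    """Encode as UTF-8 and wrongly decode as Windows-1252, `times` times over."""
    for _ in range(times):
        text = text.encode("utf-8").decode("cp1252")
    return text

# Assumed shape of Problem 2: the same mistake applied twice.
print(misconvert("é"))      # Ã©
print(misconvert("é", 2))   # ÃƒÂ©

# Problem 3: the two encodings disagree on bytes 0x80 to 0x9F.
euro_byte = b"\x80"
print(repr(euro_byte.decode("latin-1")))  # '\x80' (a C1 control character)
print(euro_byte.decode("cp1252"))         # €
```

Each extra round roughly doubles the number of stray characters, which is one way to tell a double conversion from a single one.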

Debugging Chart Mapping Windows-1252 Characters to UTF-8 Bytes to Latin-1 Characters

The following chart shows the characters in Windows-1252 from 128 to 255 (hex 80 to FF). For each character it lists the Unicode code point and the hex value of each byte in the character's UTF-8 encoding. Those UTF-8 bytes are also displayed as if they were Windows-1252 characters. Use the chart to debug cases where a sequence of two or three Latin characters appears where a single character was expected. If the sequence you see matches an "Actual" sequence in the chart, and the "Expected" value for that entry is the character you expected to see, then the problem is caused by UTF-8 bytes being interpreted as Windows-1252 (or ISO 8859-1) bytes. See Encoding Problem: Treating UTF-8 Bytes as Windows-1252 or ISO-8859-1 for a more detailed explanation.
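The mechanism described above can be reproduced in a few lines of Python. This is a minimal sketch, not part of the original page: the function names are hypothetical, and it relies on Python's built-in `cp1252` codec, which raises an error on the five bytes that Windows-1252 leaves unassigned (0x81, 0x8D, 0x8F, 0x90, 0x9D), so characters such as "Á" (%C3 %81) would need separate handling.

```python
def misread_as_cp1252(text: str) -> str:
    """Encode text as UTF-8, then wrongly decode the bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")

def repair(garbled: str) -> str:
    """Undo the mistake: recover the original bytes, then decode them as UTF-8."""
    return garbled.encode("cp1252").decode("utf-8")

garbled = misread_as_cp1252("café €5")
print(garbled)          # cafÃ© â‚¬5
print(repair(garbled))  # café €5
```

The repair only works when every byte survived the round trip, which is why matching the garbled sequence against the chart first is a useful check.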

Table for Debugging Common UTF-8 Character Encoding Problems.
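Unicode | Windows-1252 | Expected | Actual | UTF-8 Bytes | Unicode | Windows-1252 | Expected | Actual | UTF-8 Bytes
--- | --- | --- | --- | --- | --- | --- | --- | --- | ---
U+20AC | 0x80 | € | â‚¬ | %E2 %82 %AC | U+00C0 | 0xC0 | À | Ã€ | %C3 %80
(unassigned) | 0x81 | | | | U+00C1 | 0xC1 | Á | Ã | %C3 %81
U+201A | 0x82 | ‚ | â€š | %E2 %80 %9A | U+00C2 | 0xC2 | Â | Ã‚ | %C3 %82
U+0192 | 0x83 | ƒ | Æ’ | %C6 %92 | U+00C3 | 0xC3 | Ã | Ãƒ | %C3 %83
U+201E | 0x84 | „ | â€ž | %E2 %80 %9E | U+00C4 | 0xC4 | Ä | Ã„ | %C3 %84
U+2026 | 0x85 | … | â€¦ | %E2 %80 %A6 | U+00C5 | 0xC5 | Å | Ã… | %C3 %85
U+2020 | 0x86 | † | â€ | %E2 %80 %A0 | U+00C6 | 0xC6 | Æ | Ã† | %C3 %86
U+2021 | 0x87 | ‡ | â€¡ | %E2 %80 %A1 | U+00C7 | 0xC7 | Ç | Ã‡ | %C3 %87
U+02C6 | 0x88 | ˆ | Ë† | %CB %86 | U+00C8 | 0xC8 | È | Ãˆ | %C3 %88
U+2030 | 0x89 | ‰ | â€° | %E2 %80 %B0 | U+00C9 | 0xC9 | É | Ã‰ | %C3 %89
U+0160 | 0x8A | Š | Å | %C5 %A0 | U+00CA | 0xCA | Ê | ÃŠ | %C3 %8A
U+2039 | 0x8B | ‹ | â€¹ | %E2 %80 %B9 | U+00CB | 0xCB | Ë | Ã‹ | %C3 %8B
U+0152 | 0x8C | Œ | Å’ | %C5 %92 | U+00CC | 0xCC | Ì | ÃŒ | %C3 %8C
(unassigned) | 0x8D | | | | U+00CD | 0xCD | Í | Ã | %C3 %8D
U+017D | 0x8E | Ž | Å½ | %C5 %BD | U+00CE | 0xCE | Î | ÃŽ | %C3 %8E
(unassigned) | 0x8F | | | | U+00CF | 0xCF | Ï | Ã | %C3 %8F
(unassigned) | 0x90 | | | | U+00D0 | 0xD0 | Ð | Ã | %C3 %90
U+2018 | 0x91 | ‘ | â€˜ | %E2 %80 %98 | U+00D1 | 0xD1 | Ñ | Ã‘ | %C3 %91
U+2019 | 0x92 | ’ | â€™ | %E2 %80 %99 | U+00D2 | 0xD2 | Ò | Ã’ | %C3 %92
U+201C | 0x93 | “ | â€œ | %E2 %80 %9C | U+00D3 | 0xD3 | Ó | Ã“ | %C3 %93
U+201D | 0x94 | ” | â€ | %E2 %80 %9D | U+00D4 | 0xD4 | Ô | Ã” | %C3 %94
U+2022 | 0x95 | • | â€¢ | %E2 %80 %A2 | U+00D5 | 0xD5 | Õ | Ã• | %C3 %95
U+2013 | 0x96 | – | â€“ | %E2 %80 %93 | U+00D6 | 0xD6 | Ö | Ã– | %C3 %96
U+2014 | 0x97 | — | â€” | %E2 %80 %94 | U+00D7 | 0xD7 | × | Ã— | %C3 %97
U+02DC | 0x98 | ˜ | Ëœ | %CB %9C | U+00D8 | 0xD8 | Ø | Ã˜ | %C3 %98
U+2122 | 0x99 | ™ | â„¢ | %E2 %84 %A2 | U+00D9 | 0xD9 | Ù | Ã™ | %C3 %99
U+0161 | 0x9A | š | Å¡ | %C5 %A1 | U+00DA | 0xDA | Ú | Ãš | %C3 %9A
U+203A | 0x9B | › | â€º | %E2 %80 %BA | U+00DB | 0xDB | Û | Ã› | %C3 %9B
U+0153 | 0x9C | œ | Å“ | %C5 %93 | U+00DC | 0xDC | Ü | Ãœ | %C3 %9C
(unassigned) | 0x9D | | | | U+00DD | 0xDD | Ý | Ã | %C3 %9D
U+017E | 0x9E | ž | Å¾ | %C5 %BE | U+00DE | 0xDE | Þ | Ãž | %C3 %9E
U+0178 | 0x9F | Ÿ | Å¸ | %C5 %B8 | U+00DF | 0xDF | ß | ÃŸ | %C3 %9F
U+00A0 | 0xA0 | NBSP | Â | %C2 %A0 | U+00E0 | 0xE0 | à | Ã | %C3 %A0
U+00A1 | 0xA1 | ¡ | Â¡ | %C2 %A1 | U+00E1 | 0xE1 | á | Ã¡ | %C3 %A1
U+00A2 | 0xA2 | ¢ | Â¢ | %C2 %A2 | U+00E2 | 0xE2 | â | Ã¢ | %C3 %A2
U+00A3 | 0xA3 | £ | Â£ | %C2 %A3 | U+00E3 | 0xE3 | ã | Ã£ | %C3 %A3
U+00A4 | 0xA4 | ¤ | Â¤ | %C2 %A4 | U+00E4 | 0xE4 | ä | Ã¤ | %C3 %A4
U+00A5 | 0xA5 | ¥ | Â¥ | %C2 %A5 | U+00E5 | 0xE5 | å | Ã¥ | %C3 %A5
U+00A6 | 0xA6 | ¦ | Â¦ | %C2 %A6 | U+00E6 | 0xE6 | æ | Ã¦ | %C3 %A6
U+00A7 | 0xA7 | § | Â§ | %C2 %A7 | U+00E7 | 0xE7 | ç | Ã§ | %C3 %A7
U+00A8 | 0xA8 | ¨ | Â¨ | %C2 %A8 | U+00E8 | 0xE8 | è | Ã¨ | %C3 %A8
U+00A9 | 0xA9 | © | Â© | %C2 %A9 | U+00E9 | 0xE9 | é | Ã© | %C3 %A9
U+00AA | 0xAA | ª | Âª | %C2 %AA | U+00EA | 0xEA | ê | Ãª | %C3 %AA
U+00AB | 0xAB | « | Â« | %C2 %AB | U+00EB | 0xEB | ë | Ã« | %C3 %AB
U+00AC | 0xAC | ¬ | Â¬ | %C2 %AC | U+00EC | 0xEC | ì | Ã¬ | %C3 %AC
U+00AD | 0xAD | SHY | Â | %C2 %AD | U+00ED | 0xED | í | Ã | %C3 %AD
U+00AE | 0xAE | ® | Â® | %C2 %AE | U+00EE | 0xEE | î | Ã® | %C3 %AE
U+00AF | 0xAF | ¯ | Â¯ | %C2 %AF | U+00EF | 0xEF | ï | Ã¯ | %C3 %AF
U+00B0 | 0xB0 | ° | Â° | %C2 %B0 | U+00F0 | 0xF0 | ð | Ã° | %C3 %B0
U+00B1 | 0xB1 | ± | Â± | %C2 %B1 | U+00F1 | 0xF1 | ñ | Ã± | %C3 %B1
U+00B2 | 0xB2 | ² | Â² | %C2 %B2 | U+00F2 | 0xF2 | ò | Ã² | %C3 %B2
U+00B3 | 0xB3 | ³ | Â³ | %C2 %B3 | U+00F3 | 0xF3 | ó | Ã³ | %C3 %B3
U+00B4 | 0xB4 | ´ | Â´ | %C2 %B4 | U+00F4 | 0xF4 | ô | Ã´ | %C3 %B4
U+00B5 | 0xB5 | µ | Âµ | %C2 %B5 | U+00F5 | 0xF5 | õ | Ãµ | %C3 %B5
U+00B6 | 0xB6 | ¶ | Â¶ | %C2 %B6 | U+00F6 | 0xF6 | ö | Ã¶ | %C3 %B6
U+00B7 | 0xB7 | · | Â· | %C2 %B7 | U+00F7 | 0xF7 | ÷ | Ã· | %C3 %B7
U+00B8 | 0xB8 | ¸ | Â¸ | %C2 %B8 | U+00F8 | 0xF8 | ø | Ã¸ | %C3 %B8
U+00B9 | 0xB9 | ¹ | Â¹ | %C2 %B9 | U+00F9 | 0xF9 | ù | Ã¹ | %C3 %B9
U+00BA | 0xBA | º | Âº | %C2 %BA | U+00FA | 0xFA | ú | Ãº | %C3 %BA
U+00BB | 0xBB | » | Â» | %C2 %BB | U+00FB | 0xFB | û | Ã» | %C3 %BB
U+00BC | 0xBC | ¼ | Â¼ | %C2 %BC | U+00FC | 0xFC | ü | Ã¼ | %C3 %BC
U+00BD | 0xBD | ½ | Â½ | %C2 %BD | U+00FD | 0xFD | ý | Ã½ | %C3 %BD
U+00BE | 0xBE | ¾ | Â¾ | %C2 %BE | U+00FE | 0xFE | þ | Ã¾ | %C3 %BE
U+00BF | 0xBF | ¿ | Â¿ | %C2 %BF | U+00FF | 0xFF | ÿ | Ã¿ | %C3 %BF

In the Actual column, a final byte that is unassigned in Windows-1252 (0x81, 0x8D, 0x8F, 0x90, 0x9D) or that maps to an invisible character (0xA0, no-break space; 0xAD, soft hyphen) is not shown, so only the leading characters appear. NBSP and SHY in the Expected column stand for the no-break space and the soft hyphen.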
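The rows of the chart can be regenerated programmatically, which is handy for checking a suspicious entry. The following Python sketch is an illustration rather than the tool used to build the original chart; `chart_row` is a hypothetical helper, and it assumes Python's `cp1252` codec, which treats 0x81, 0x8D, 0x8F, 0x90 and 0x9D as unassigned.

```python
def chart_row(byte_value: int):
    """Return (code point, expected, actual, UTF-8 bytes) for one Windows-1252 byte,
    or None when the byte is unassigned in Python's cp1252 codec."""
    try:
        expected = bytes([byte_value]).decode("cp1252")
    except UnicodeDecodeError:
        return None  # 0x81, 0x8D, 0x8F, 0x90, 0x9D
    utf8 = expected.encode("utf-8")
    # errors="replace" substitutes U+FFFD for any UTF-8 byte that cp1252 cannot map
    actual = utf8.decode("cp1252", errors="replace")
    code_point = f"U+{ord(expected):04X}"
    percent_bytes = " ".join(f"%{b:02X}" for b in utf8)
    return code_point, expected, actual, percent_bytes

for value in range(0x80, 0x100):
    row = chart_row(value)
    if row is None:
        print(f"0x{value:02X} (unassigned)")
    else:
        print(f"0x{value:02X}", *row)
```

Where the chart shows only the leading characters of an "Actual" sequence, this sketch prints U+FFFD for an unassigned final byte, and the no-break space or soft hyphen itself for 0xA0 and 0xAD.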

Copyright © 2011 Tex Texin. All rights reserved.

Source: https://i18nqa.com/debug/utf8-debug.html

Posted by: replaceasingle.blogspot.com