Quick Answer
What are HTML character sets?
An HTML character set (character encoding) defines how the bytes
of a web document map to characters. UTF-8 is the recommended
encoding: it can represent every Unicode character, so it covers
all languages and symbols. Declare it with <meta charset="UTF-8">
inside the <head> of the page. Legacy encodings such as ISO-8859-1
(Latin-1) and ASCII still exist, but UTF-8 is preferred for new
documents.
Recommended: UTF-8
UTF-8 is the recommended character encoding for HTML documents. It
can encode every Unicode character, which covers nearly all of the
world's writing systems and symbols, and it is the most widely used
character encoding on the web. Always use UTF-8 for new documents.
<meta charset="UTF-8">
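
To see why the declared charset has to match the bytes that are actually served, here is a small Python sketch. Python is used purely for illustration; the helper name `decode_as` and the sample string are assumptions, not part of any HTML API. It encodes text as UTF-8 and then decodes it both correctly and as Windows-1252, which is roughly what happens when a browser is told (or guesses) the wrong encoding.

```python
# Illustrative sketch: what a mismatched charset declaration does to text.
# The helper name and the sample string are hypothetical.

def decode_as(raw: bytes, encoding: str) -> str:
    """Decode raw bytes the way a reader assuming `encoding` would."""
    return raw.decode(encoding)

sample = "café €"
raw = sample.encode("utf-8")         # bytes as saved in a UTF-8 file

correct = decode_as(raw, "utf-8")    # declaration matches the bytes
garbled = decode_as(raw, "cp1252")   # wrong assumption: Windows-1252

print(correct)  # café €
print(garbled)  # cafÃ© â‚¬
```

The garbled output is the familiar "mojibake" pattern: each multi-byte UTF-8 sequence is shown as two or three unrelated single-byte characters.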
| Character Set | Description |
| --- | --- |
| UTF-8 | Variable-width Unicode encoding (1 to 4 bytes per character) that is backward compatible with ASCII. Covers every Unicode character. Recommended for all HTML documents. |
| UTF-16 | Unicode Transformation Format, 16-bit. Uses one or two 16-bit code units per character. Used internally by Windows and Java. |
| UTF-32 | Unicode Transformation Format, 32-bit. Fixed-width encoding using 32 bits per character. |
| ISO-8859-1 | Latin alphabet No. 1. Covers Western European languages. Also known as Latin-1. |
| Windows-1252 | Windows Latin-1. Similar to ISO-8859-1, but with additional printable characters (such as the euro sign and curly quotes) in the 0x80-0x9F range. |
| ASCII | American Standard Code for Information Interchange. 7-bit character set covering basic English letters, digits, and punctuation. |
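
The differences between these encodings can be checked with a short Python sketch. It is only an illustration: `encoded_size` and `can_encode` are hypothetical helper names, and the little-endian codec variants are used so that no byte order mark is counted in the sizes.

```python
# Illustrative sketch: compare how the encodings above store characters.
# Helper names are hypothetical.

def encoded_size(ch: str, encoding: str) -> int:
    """Number of bytes `ch` occupies in `encoding`."""
    return len(ch.encode(encoding))

def can_encode(ch: str, encoding: str) -> bool:
    """True if `encoding` has a byte sequence for `ch`."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# Unicode encodings: every character fits, only the size differs.
for ch in ["A", "é", "€", "😀"]:
    sizes = {enc: encoded_size(ch, enc)
             for enc in ("utf-8", "utf-16-le", "utf-32-le")}
    print(ch, sizes)

# Single-byte encodings only cover a limited repertoire.
for enc in ("ascii", "iso-8859-1", "cp1252"):
    print(enc, "é:", can_encode("é", enc), "€:", can_encode("€", enc))
```

UTF-8 uses one byte for ASCII characters and up to four for others, UTF-16 needs four bytes for characters outside the Basic Multilingual Plane such as emoji, and UTF-32 always uses four. The euro sign is available in Windows-1252 but not in ISO-8859-1 or ASCII.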

| Character Set | Description |
| --- | --- |
| ISO-8859-1 | Latin alphabet No. 1 (Western European languages) |
| ISO-8859-2 | Latin alphabet No. 2 (Central and Eastern European languages) |
| ISO-8859-3 | Latin alphabet No. 3 (South European languages) |
| ISO-8859-4 | Latin alphabet No. 4 (North European languages) |
| ISO-8859-5 | Latin/Cyrillic alphabet |
| ISO-8859-6 | Latin/Arabic alphabet |
| ISO-8859-7 | Latin/Greek alphabet |
| ISO-8859-8 | Latin/Hebrew alphabet |
| ISO-8859-9 | Latin alphabet No. 5 (Turkish) |
| ISO-8859-10 | Latin alphabet No. 6 (Nordic languages) |
| ISO-8859-15 | Latin alphabet No. 9 (Western European languages with Euro sign) |
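
Each ISO-8859 part is a single-byte encoding that reuses the same upper byte values for different characters, so one byte can mean different things depending on the declared charset. The following Python sketch illustrates this; the helper name `decode_byte` is hypothetical, and the byte values are just two convenient examples.

```python
# Illustrative sketch: the same byte decoded under different ISO-8859 parts.
# The helper name is hypothetical.

def decode_byte(value: int, encoding: str) -> str:
    """Decode a single byte value using the given single-byte encoding."""
    return bytes([value]).decode(encoding)

# Byte 0xE4 is a different letter in the Latin, Cyrillic, and Greek parts.
for enc in ("iso-8859-1", "iso-8859-5", "iso-8859-7"):
    print(enc, decode_byte(0xE4, enc))

# Byte 0xA4 is the generic currency sign in Latin-1 and the euro in Latin-9.
for enc in ("iso-8859-1", "iso-8859-15"):
    print(enc, decode_byte(0xA4, enc))
```

This is why a page saved in one ISO-8859 part but declared as another shows the wrong letters without raising any visible error.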

| Character Set | Description |
| --- | --- |
| Windows-1250 | Central and Eastern European languages |
| Windows-1251 | Cyrillic languages |
| Windows-1252 | Western European languages |
| Windows-1253 | Greek |
| Windows-1254 | Turkish |
| Windows-1255 | Hebrew |
| Windows-1256 | Arabic |
| Windows-1257 | Baltic languages |
| Windows-1258 | Vietnamese |

| Character Set | Description |
| --- | --- |
| GB2312 | Simplified Chinese characters |
| Big5 | Traditional Chinese characters |
| Shift_JIS | Japanese characters |
| EUC-JP | Japanese characters (Extended Unix Code) |
| EUC-KR | Korean characters |
| KOI8-R | Russian Cyrillic |
| TIS-620 | Thai characters |
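
Content stored in one of these legacy encodings can usually be converted to UTF-8 by decoding it with the original charset and re-encoding the result. Below is a minimal Python sketch, assuming the source encoding is known in advance. The function name `to_utf8` and the sample strings are hypothetical, and the codec names are Python's spellings of the charsets listed above.

```python
# Illustrative sketch: convert legacy-encoded bytes to UTF-8.
# Assumes the source encoding is already known; names are hypothetical.

def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Re-encode bytes from a known legacy encoding as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")

samples = {
    "shift_jis": "日本語",
    "euc_kr": "한국어",
    "koi8_r": "Привет",
    "big5": "繁體字",
}

for codec, text in samples.items():
    legacy = text.encode(codec)          # bytes as a legacy page would store them
    converted = to_utf8(legacy, codec)   # same text, now UTF-8 bytes
    print(codec, len(legacy), "->", len(converted), "bytes")
```

After conversion, the page should declare <meta charset="UTF-8"> so the declaration matches the new bytes.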