HTML Encoding (Character Sets)
To display an HTML page correctly, a web browser must know which character set to use.
The HTML charset Attribute
The character set is specified in the <meta> tag:
Example
<meta charset="UTF-8">
The HTML5 specification encourages web developers to use the UTF-8 character set.
UTF-8 covers almost all of the characters and symbols in the world!
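To illustrate why the browser needs to know the character set, here is a minimal Python sketch that decodes the same bytes in two ways. The sample word and variable names are illustrative assumptions, not part of the original page; "cp1252" is Python's name for Windows-1252.

```python
# The same bytes, interpreted with two different character sets.
text = "café"
raw = text.encode("utf-8")        # the bytes a UTF-8 page would contain

correct = raw.decode("utf-8")     # decoder matches the real encoding
garbled = raw.decode("cp1252")    # decoder wrongly assumes Windows-1252

print(correct)  # café
print(garbled)  # cafÃ©
```

The second result is the kind of garbled text a page shows when its declared charset does not match the bytes it was saved with.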
The ASCII Character Set
ASCII was the first character encoding standard for the web. It defined 128 different characters that could be used on the internet:
- English letters (A-Z and a-z)
- Numbers (0-9)
- Special characters like ! $ + - ( ) @ < >.
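As a rough illustration of the 128-character limit, the Python sketch below looks up a few ASCII numbers and shows that a character outside the set cannot be encoded as ASCII. The sample strings and variable names are assumptions chosen for the example.

```python
# ASCII assigns the numbers 0-127 to characters.
code_of_A = ord("A")      # 65
char_of_97 = chr(97)      # "a"

# Every ASCII character fits in a single byte with a value below 128.
ascii_bytes = "Hello!".encode("ascii")
all_below_128 = all(b < 128 for b in ascii_bytes)

# A character outside the ASCII set cannot be encoded with it.
try:
    "é".encode("ascii")
    encodable = True
except UnicodeEncodeError:
    encodable = False

print(code_of_A, char_of_97, all_below_128, encodable)
# 65 a True False
```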
The ANSI Character Set
ANSI (Windows-1252) was the original Windows character set:
- Identical to ASCII for the values from 0 to 127
- Special characters from 128 to 159
- Identical to ISO-8859-1 and to Unicode code points from 160 to 255 (UTF-8 stores each of these as two bytes)
<meta charset="Windows-1252">
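The 128 to 159 range can be explored with a short Python sketch. It relies on Python's "cp1252" codec as a stand-in for Windows-1252; browsers may treat the unassigned positions differently, so take the list of unassigned values as a property of that codec only.

```python
# Bytes 128-159 carry printable characters in Windows-1252.
euro = bytes([128]).decode("cp1252")
trade_mark = bytes([153]).decode("cp1252")

# A few positions in that range are unassigned in Python's codec table;
# decoding them raises UnicodeDecodeError.
unassigned = []
for value in range(128, 160):
    try:
        bytes([value]).decode("cp1252")
    except UnicodeDecodeError:
        unassigned.append(value)

print(euro, trade_mark, unassigned)
# € ™ [129, 141, 143, 144, 157]
```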
The ISO-8859-1 Character Set
ISO-8859-1 was the default character set for HTML 4. This character set supported 256 different character codes. HTML 4 also supported UTF-8.
- Identical to ASCII for the values from 0 to 127
- Does not use the characters from 128 to 159
- Identical to ANSI and to Unicode code points from 160 to 255 (UTF-8 stores each of these as two bytes)
HTML 4 Example
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1">
HTML 5 Example
<meta charset="ISO-8859-1">
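The relationship between ISO-8859-1 and ANSI can be checked with a small Python sketch. The variable names are illustrative assumptions; "iso-8859-1" and "cp1252" are the codec names Python uses for the two character sets.

```python
# Bytes 160-255 decode to the same characters in both character sets.
upper = bytes(range(160, 256))
latin1_text = upper.decode("iso-8859-1")
ansi_text = upper.decode("cp1252")
same_upper_range = latin1_text == ansi_text

# In ISO-8859-1 each byte value equals the Unicode code point it maps to.
code_points_match = [ord(c) for c in latin1_text] == list(range(160, 256))

# Bytes 128-159 decode to non-printing control codes in ISO-8859-1.
c1 = bytes(range(128, 160)).decode("iso-8859-1")
any_printable = any(ch.isprintable() for ch in c1)

print(same_upper_range, code_points_match, any_printable)
# True True False
```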
The UTF-8 Character Set
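UTF-8 is an encoding of the Unicode character set. Unicode:
- Is identical to ASCII for the values from 0 to 127
- Does not use the characters from 128 to 159
- Is identical to ANSI and 8859-1 from 160 to 255
- Continues from the value 256 up to 1,114,111 (U+10FFFF), covering far more than 10,000 characters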
<meta charset="UTF-8">
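UTF-8 uses a variable number of bytes per character, which the following Python sketch makes visible. The four sample characters are arbitrary choices for illustration.

```python
# UTF-8 stores a character in one to four bytes, depending on its number.
samples = ["A", "é", "€", "😀"]

byte_lengths = {}
for ch in samples:
    encoded = ch.encode("utf-8")
    byte_lengths[ch] = len(encoded)
    print(f"U+{ord(ch):04X} -> {len(encoded)} byte(s): {encoded.hex()}")

# U+0041 -> 1 byte(s): 41
# U+00E9 -> 2 byte(s): c3a9
# U+20AC -> 3 byte(s): e282ac
# U+1F600 -> 4 byte(s): f09f9880
```

Only the values 0 to 127 are stored as a single byte, which is why plain ASCII text is also valid UTF-8.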
Differences Between Character Sets
The following table displays the differences between the character sets described above:
Number | ASCII | ANSI | ISO-8859-1 | UTF-8 | Description |
---|---|---|---|---|---|
32 | | | | | space |
33 | ! | ! | ! | ! | exclamation mark |
34 | " | " | " | " | quotation mark |
35 | # | # | # | # | number sign |
36 | $ | $ | $ | $ | dollar sign |
37 | % | % | % | % | percent sign |
38 | & | & | & | & | ampersand |
39 | ' | ' | ' | ' | apostrophe |
40 | ( | ( | ( | ( | left parenthesis |
41 | ) | ) | ) | ) | right parenthesis |
42 | * | * | * | * | asterisk |
43 | + | + | + | + | plus sign |
44 | , | , | , | , | comma |
45 | - | - | - | - | hyphen-minus |
46 | . | . | . | . | full stop |
47 | / | / | / | / | solidus |
48 | 0 | 0 | 0 | 0 | digit zero |
49 | 1 | 1 | 1 | 1 | digit one |
50 | 2 | 2 | 2 | 2 | digit two |
51 | 3 | 3 | 3 | 3 | digit three |
52 | 4 | 4 | 4 | 4 | digit four |
53 | 5 | 5 | 5 | 5 | digit five |
54 | 6 | 6 | 6 | 6 | digit six |
55 | 7 | 7 | 7 | 7 | digit seven |
56 | 8 | 8 | 8 | 8 | digit eight |
57 | 9 | 9 | 9 | 9 | digit nine |
58 | : | : | : | : | colon |
59 | ; | ; | ; | ; | semicolon |
60 | < | < | < | < | less than |
61 | = | = | = | = | equals sign |
62 | > | > | > | > | greater than |
63 | ? | ? | ? | ? | question mark |
64 | @ | @ | @ | @ | commercial at |
65 | A | A | A | A | Latin A |
66 | B | B | B | B | Latin B |
67 | C | C | C | C | Latin C |
68 | D | D | D | D | Latin D |
69 | E | E | E | E | Latin E |
70 | F | F | F | F | Latin F |
71 | G | G | G | G | Latin G |
72 | H | H | H | H | Latin H |
73 | I | I | I | I | Latin I |
74 | J | J | J | J | Latin J |
75 | K | K | K | K | Latin K |
76 | L | L | L | L | Latin L |
77 | M | M | M | M | Latin M |
78 | N | N | N | N | Latin N |
79 | O | O | O | O | Latin O |
80 | P | P | P | P | Latin P |
81 | Q | Q | Q | Q | Latin Q |
82 | R | R | R | R | Latin R |
83 | S | S | S | S | Latin S |
84 | T | T | T | T | Latin T |
85 | U | U | U | U | Latin U |
86 | V | V | V | V | Latin V |
87 | W | W | W | W | Latin W |
88 | X | X | X | X | Latin X |
89 | Y | Y | Y | Y | Latin Y |
90 | Z | Z | Z | Z | Latin Z |
91 | [ | [ | [ | [ | left square bracket |
92 | \ | \ | \ | \ | reverse solidus |
93 | ] | ] | ] | ] | right square bracket |
94 | ^ | ^ | ^ | ^ | circumflex accent |
95 | _ | _ | _ | _ | low line |
96 | ` | ` | ` | ` | grave accent |
97 | a | a | a | a | Latin small a |
98 | b | b | b | b | Latin small b |
99 | c | c | c | c | Latin small c |
100 | d | d | d | d | Latin small d |
101 | e | e | e | e | Latin small e |
102 | f | f | f | f | Latin small f |
103 | g | g | g | g | Latin small g |
104 | h | h | h | h | Latin small h |
105 | i | i | i | i | Latin small i |
106 | j | j | j | j | Latin small j |
107 | k | k | k | k | Latin small k |
108 | l | l | l | l | Latin small l |
109 | m | m | m | m | Latin small m |
110 | n | n | n | n | Latin small n |
111 | o | o | o | o | Latin small o |
112 | p | p | p | p | Latin small p |
113 | q | q | q | q | Latin small q |
114 | r | r | r | r | Latin small r |
115 | s | s | s | s | Latin small s |
116 | t | t | t | t | Latin small t |
117 | u | u | u | u | Latin small u |
118 | v | v | v | v | Latin small v |
119 | w | w | w | w | Latin small w |
120 | x | x | x | x | Latin small x |
121 | y | y | y | y | Latin small y |
122 | z | z | z | z | Latin small z |
123 | { | { | { | { | left curly bracket |
124 | \| | \| | \| | \| | vertical line |
125 | } | } | } | } | right curly bracket |
126 | ~ | ~ | ~ | ~ | tilde |
127 | | | | | DEL |
128 | | € | | | euro sign |
129 | | | | | NOT USED |
130 | | ‚ | | | single low-9 quotation mark |
131 | | ƒ | | | Latin small f with hook |
132 | | „ | | | double low-9 quotation mark |
133 | | … | | | horizontal ellipsis |
134 | | † | | | dagger |
135 | | ‡ | | | double dagger |
136 | | ˆ | | | modifier letter circumflex accent |
137 | | ‰ | | | per mille sign |
138 | | Š | | | Latin S with caron |
139 | | ‹ | | | single left-pointing angle quotation mark |
140 | | Œ | | | Latin capital ligature OE |
141 | | | | | NOT USED |
142 | | Ž | | | Latin Z with caron |
143 | | | | | NOT USED |
144 | | | | | NOT USED |
145 | | ‘ | | | left single quotation mark |
146 | | ’ | | | right single quotation mark |
147 | | “ | | | left double quotation mark |
148 | | ” | | | right double quotation mark |
149 | | • | | | bullet |
150 | | – | | | en dash |
151 | | — | | | em dash |
152 | | ˜ | | | small tilde |
153 | | ™ | | | trade mark sign |
154 | | š | | | Latin small s with caron |
155 | | › | | | single right-pointing angle quotation mark |
156 | | œ | | | Latin small ligature oe |
157 | | | | | NOT USED |
158 | | ž | | | Latin small z with caron |
159 | | Ÿ | | | Latin Y with diaeresis |
160 | | | | | no-break space |
161 | | ¡ | ¡ | ¡ | inverted exclamation mark |
162 | | ¢ | ¢ | ¢ | cent sign |
163 | | £ | £ | £ | pound sign |
164 | | ¤ | ¤ | ¤ | currency sign |
165 | | ¥ | ¥ | ¥ | yen sign |
166 | | ¦ | ¦ | ¦ | broken bar |
167 | | § | § | § | section sign |
168 | | ¨ | ¨ | ¨ | diaeresis |
169 | | © | © | © | copyright sign |
170 | | ª | ª | ª | feminine ordinal indicator |
171 | | « | « | « | left-pointing double angle quotation mark |
172 | | ¬ | ¬ | ¬ | not sign |
173 | | | | | soft hyphen |
174 | | ® | ® | ® | registered sign |
175 | | ¯ | ¯ | ¯ | macron |
176 | | ° | ° | ° | degree sign |
177 | | ± | ± | ± | plus-minus sign |
178 | | ² | ² | ² | superscript two |
179 | | ³ | ³ | ³ | superscript three |
180 | | ´ | ´ | ´ | acute accent |
181 | | µ | µ | µ | micro sign |
182 | | ¶ | ¶ | ¶ | pilcrow sign |
183 | | · | · | · | middle dot |
184 | | ¸ | ¸ | ¸ | cedilla |
185 | | ¹ | ¹ | ¹ | superscript one |
186 | | º | º | º | masculine ordinal indicator |
187 | | » | » | » | right-pointing double angle quotation mark |
188 | | ¼ | ¼ | ¼ | vulgar fraction one quarter |
189 | | ½ | ½ | ½ | vulgar fraction one half |
190 | | ¾ | ¾ | ¾ | vulgar fraction three quarters |
191 | | ¿ | ¿ | ¿ | inverted question mark |
192 | | À | À | À | Latin A with grave |
193 | | Á | Á | Á | Latin A with acute |
194 | | Â | Â | Â | Latin A with circumflex |
195 | | Ã | Ã | Ã | Latin A with tilde |
196 | | Ä | Ä | Ä | Latin A with diaeresis |
197 | | Å | Å | Å | Latin A with ring above |
198 | | Æ | Æ | Æ | Latin AE |
199 | | Ç | Ç | Ç | Latin C with cedilla |
200 | | È | È | È | Latin E with grave |
201 | | É | É | É | Latin E with acute |
202 | | Ê | Ê | Ê | Latin E with circumflex |
203 | | Ë | Ë | Ë | Latin E with diaeresis |
204 | | Ì | Ì | Ì | Latin I with grave |
205 | | Í | Í | Í | Latin I with acute |
206 | | Î | Î | Î | Latin I with circumflex |
207 | | Ï | Ï | Ï | Latin I with diaeresis |
208 | | Ð | Ð | Ð | Latin Eth |
209 | | Ñ | Ñ | Ñ | Latin N with tilde |
210 | | Ò | Ò | Ò | Latin O with grave |
211 | | Ó | Ó | Ó | Latin O with acute |
212 | | Ô | Ô | Ô | Latin O with circumflex |
213 | | Õ | Õ | Õ | Latin O with tilde |
214 | | Ö | Ö | Ö | Latin O with diaeresis |
215 | | × | × | × | multiplication sign |
216 | | Ø | Ø | Ø | Latin O with stroke |
217 | | Ù | Ù | Ù | Latin U with grave |
218 | | Ú | Ú | Ú | Latin U with acute |
219 | | Û | Û | Û | Latin U with circumflex |
220 | | Ü | Ü | Ü | Latin U with diaeresis |
221 | | Ý | Ý | Ý | Latin Y with acute |
222 | | Þ | Þ | Þ | Latin Thorn |
223 | | ß | ß | ß | Latin small sharp s |
224 | | à | à | à | Latin small a with grave |
225 | | á | á | á | Latin small a with acute |
226 | | â | â | â | Latin small a with circumflex |
227 | | ã | ã | ã | Latin small a with tilde |
228 | | ä | ä | ä | Latin small a with diaeresis |
229 | | å | å | å | Latin small a with ring above |
230 | | æ | æ | æ | Latin small ae |
231 | | ç | ç | ç | Latin small c with cedilla |
232 | | è | è | è | Latin small e with grave |
233 | | é | é | é | Latin small e with acute |
234 | | ê | ê | ê | Latin small e with circumflex |
235 | | ë | ë | ë | Latin small e with diaeresis |
236 | | ì | ì | ì | Latin small i with grave |
237 | | í | í | í | Latin small i with acute |
238 | | î | î | î | Latin small i with circumflex |
239 | | ï | ï | ï | Latin small i with diaeresis |
240 | | ð | ð | ð | Latin small eth |
241 | | ñ | ñ | ñ | Latin small n with tilde |
242 | | ò | ò | ò | Latin small o with grave |
243 | | ó | ó | ó | Latin small o with acute |
244 | | ô | ô | ô | Latin small o with circumflex |
245 | | õ | õ | õ | Latin small o with tilde |
246 | | ö | ö | ö | Latin small o with diaeresis |
247 | | ÷ | ÷ | ÷ | division sign |
248 | | ø | ø | ø | Latin small o with stroke |
249 | | ù | ù | ù | Latin small u with grave |
250 | | ú | ú | ú | Latin small u with acute |
251 | | û | û | û | Latin small u with circumflex |
252 | | ü | ü | ü | Latin small u with diaeresis |
253 | | ý | ý | ý | Latin small y with acute |
254 | | þ | þ | þ | Latin small thorn |
255 | | ÿ | ÿ | ÿ | Latin small y with diaeresis |
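Rows like the ones in this table can be approximated programmatically. The Python sketch below is one possible way to do it; the helper names decode_or_blank and table_row are hypothetical, and the last column shows the Unicode character with the given number (the character UTF-8 would encode), not a raw UTF-8 byte.

```python
def decode_or_blank(value, codec):
    """Return the printable character for one byte value, or '' if none."""
    try:
        ch = bytes([value]).decode(codec)
    except UnicodeDecodeError:
        return ""
    return ch if ch.isprintable() else ""


def table_row(value):
    """Build (number, ASCII, ANSI, ISO-8859-1, Unicode) for a value 0-255."""
    unicode_char = chr(value)
    return (
        value,
        decode_or_blank(value, "ascii"),
        decode_or_blank(value, "cp1252"),      # Python's name for Windows-1252
        decode_or_blank(value, "iso-8859-1"),
        unicode_char if unicode_char.isprintable() else "",
    )


print(table_row(65))   # (65, 'A', 'A', 'A', 'A')
print(table_row(128))  # (128, '', '€', '', '')
print(table_row(233))  # (233, '', 'é', 'é', 'é')
```

Blank cells mark values that are unassigned or non-printing in that character set, which mirrors the empty cells in the table.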