ISO-8859-1
ISO-8859-1 is also known as Latin-1 (formally "Latin alphabet no. 1"). The first 128 characters in the code match the ASCII code. These codes, together with the printable characters at 160 to 255, occupy the same positions in the Windows-1252 character set.
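As a quick check of the overlap described above, the following Python sketch compares three standard-library codecs: `ascii`, `latin-1`, and `cp1252` (Python's name for Windows-1252). The variable names are illustrative only.

```python
# Bytes 0-127 decode identically as ASCII and as ISO-8859-1.
low = bytes(range(128))
ascii_matches = low.decode("ascii") == low.decode("latin-1")

# Bytes 160-255 decode identically as ISO-8859-1 and as Windows-1252.
high = bytes(range(160, 256))
cp1252_matches = high.decode("latin-1") == high.decode("cp1252")

# Bytes 128-159 are where the two sets differ: Windows-1252 assigns
# printable characters there, such as the euro sign at byte 0x80.
euro_in_cp1252 = bytes([0x80]).decode("cp1252")

print(ascii_matches, cp1252_matches, euro_in_cp1252)
```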
Character set
Special characters should all be translated into their appropriate ISO Latin-1 code equivalent: either the numeric code or the entity reference code. For instance, every ampersand needs to be converted to “&amp;” throughout a book. The same characters are available in Unicode at code points U+00A0 to U+00FF, whose hexadecimal values match the hex codes in the table; UTF-8 stores each of them as two bytes. The table below includes a few ASCII characters that can be problematic to enter directly because of their specialized use in HTML. All other ASCII characters are supported as well, although they are not shown.
Number Code | Hex Code | Word Code | Description | Character |
---|---|---|---|---|
&#34; | x22 | &quot; | quotation mark | " |
&#38; | x26 | &amp; | ampersand | & |
&#39; | x27 | &apos; | apostrophe[1] | ' |
&#60; | x3C | &lt; | less-than sign | < |
&#62; | x3E | &gt; | greater-than sign | > |
&#126; | x7E | &sim; | large tilde | ~ |
&#128; to &#159; are not defined in this character set; see Windows-1252 codes. | ||||
&#160; | xA0 | &nbsp; | non-breaking space[2] | |
&#161; | xA1 | &iexcl; | inverted exclamation | ¡ |
&#162; | xA2 | &cent; | cent sign | ¢ |
&#163; | xA3 | &pound; | pound sterling | £ |
&#164; | xA4 | &curren; | currency sign | ¤ |
&#165; | xA5 | &yen; | yen sign | ¥ |
&#166; | xA6 | &brvbar; | broken vertical bar | ¦ |
&#167; | xA7 | &sect; | section sign | § |
&#168; | xA8 | &uml; | umlaut (dieresis) | ¨ |
&#169; | xA9 | &copy; | copyright | © |
&#170; | xAA | &ordf; | feminine ordinal | ª |
&#171; | xAB | &laquo; | left angle quote | « |
&#172; | xAC | &not; | not sign | ¬ |
&#173; | xAD | &shy; | soft hyphen[3] | |
&#174; | xAE | &reg; | registered trademark | ® |
&#175; | xAF | &macr; | macron accent | ¯ |
&#176; | xB0 | &deg; | degree sign | ° |
&#177; | xB1 | &plusmn; | plus or minus | ± |
&#178; | xB2 | &sup2; | superscript two | ² |
&#179; | xB3 | &sup3; | superscript three | ³ |
&#180; | xB4 | &acute; | acute accent | ´ |
&#181; | xB5 | &micro; | micro sign | µ |
&#182; | xB6 | &para; | paragraph sign | ¶ |
&#183; | xB7 | &middot; | middle dot | · |
&#184; | xB8 | &cedil; | cedilla | ¸ |
&#185; | xB9 | &sup1; | superscript one | ¹ |
&#186; | xBA | &ordm; | masculine ordinal | º |
&#187; | xBB | &raquo; | right angle quote | » |
&#188; | xBC | &frac14; | one fourth | ¼ |
&#189; | xBD | &frac12; | one half | ½ |
&#190; | xBE | &frac34; | three fourths | ¾ |
&#191; | xBF | &iquest; | inverted question mark | ¿ |
&#192; | xC0 | &Agrave; | capital A, grave accent | À |
&#193; | xC1 | &Aacute; | capital A, acute accent | Á |
&#194; | xC2 | &Acirc; | capital A, circumflex accent | Â |
&#195; | xC3 | &Atilde; | capital A, tilde | Ã |
&#196; | xC4 | &Auml; | capital A, dieresis or umlaut mark | Ä |
&#197; | xC5 | &Aring; | capital A, ring | Å |
&#198; | xC6 | &AElig; | capital AE diphthong (ligature) | Æ |
&#199; | xC7 | &Ccedil; | capital C, cedilla | Ç |
&#200; | xC8 | &Egrave; | capital E, grave accent | È |
&#201; | xC9 | &Eacute; | capital E, acute accent | É |
&#202; | xCA | &Ecirc; | capital E, circumflex accent | Ê |
&#203; | xCB | &Euml; | capital E, dieresis or umlaut mark | Ë |
&#204; | xCC | &Igrave; | capital I, grave accent | Ì |
&#205; | xCD | &Iacute; | capital I, acute accent | Í |
&#206; | xCE | &Icirc; | capital I, circumflex accent | Î |
&#207; | xCF | &Iuml; | capital I, dieresis or umlaut mark | Ï |
&#208; | xD0 | &ETH; | capital ETH | Ð |
&#209; | xD1 | &Ntilde; | capital N, tilde | Ñ |
&#210; | xD2 | &Ograve; | capital O, grave accent | Ò |
&#211; | xD3 | &Oacute; | capital O, acute accent | Ó |
&#212; | xD4 | &Ocirc; | capital O, circumflex accent | Ô |
&#213; | xD5 | &Otilde; | capital O, tilde | Õ |
&#214; | xD6 | &Ouml; | capital O, dieresis or umlaut mark | Ö |
&#215; | xD7 | &times; | multiply sign | × |
&#216; | xD8 | &Oslash; | capital O, slash | Ø |
&#217; | xD9 | &Ugrave; | capital U, grave accent | Ù |
&#218; | xDA | &Uacute; | capital U, acute accent | Ú |
&#219; | xDB | &Ucirc; | capital U, circumflex accent | Û |
&#220; | xDC | &Uuml; | capital U, dieresis or umlaut mark | Ü |
&#221; | xDD | &Yacute; | capital Y, acute accent | Ý |
&#222; | xDE | &THORN; | capital THORN | Þ |
&#223; | xDF | &szlig; | small sharp s, German (sz ligature) | ß |
&#224; | xE0 | &agrave; | small a, grave accent | à |
&#225; | xE1 | &aacute; | small a, acute accent | á |
&#226; | xE2 | &acirc; | small a, circumflex accent | â |
&#227; | xE3 | &atilde; | small a, tilde | ã |
&#228; | xE4 | &auml; | small a, dieresis or umlaut mark | ä |
&#229; | xE5 | &aring; | small a, ring | å |
&#230; | xE6 | &aelig; | small ae diphthong (ligature) | æ |
&#231; | xE7 | &ccedil; | small c, cedilla | ç |
&#232; | xE8 | &egrave; | small e, grave accent | è |
&#233; | xE9 | &eacute; | small e, acute accent | é |
&#234; | xEA | &ecirc; | small e, circumflex accent | ê |
&#235; | xEB | &euml; | small e, dieresis or umlaut mark | ë |
&#236; | xEC | &igrave; | small i, grave accent | ì |
&#237; | xED | &iacute; | small i, acute accent | í |
&#238; | xEE | &icirc; | small i, circumflex accent | î |
&#239; | xEF | &iuml; | small i, dieresis or umlaut mark | ï |
&#240; | xF0 | &eth; | small eth | ð |
&#241; | xF1 | &ntilde; | small n, tilde | ñ |
&#242; | xF2 | &ograve; | small o, grave accent | ò |
&#243; | xF3 | &oacute; | small o, acute accent | ó |
&#244; | xF4 | &ocirc; | small o, circumflex accent | ô |
&#245; | xF5 | &otilde; | small o, tilde | õ |
&#246; | xF6 | &ouml; | small o, dieresis or umlaut mark | ö |
&#247; | xF7 | &divide; | division sign | ÷ |
&#248; | xF8 | &oslash; | small o, slash | ø |
&#249; | xF9 | &ugrave; | small u, grave accent | ù |
&#250; | xFA | &uacute; | small u, acute accent | ú |
&#251; | xFB | &ucirc; | small u, circumflex accent | û |
&#252; | xFC | &uuml; | small u, dieresis or umlaut mark | ü |
&#253; | xFD | &yacute; | small y, acute accent | ý |
&#254; | xFE | &thorn; | small thorn | þ |
&#255; | xFF | &yuml; | small y, dieresis or umlaut mark | ÿ |
* While most of the symbols in the table are for communication, a few are specifically typographic.
- [1] The apostrophe is defined as an XML character with the mnemonic apos, but this name is not part of HTML 4 and is not recognized by some older browsers and reading systems. Use the numeric code &#39; instead.
- [2] The non-breaking space is used as a forcing space. A line cannot end on a forcing space, so it forces the two words it joins to behave as one.
- [3] The soft hyphen marks the location of a hyphenation point. It can be used to hyphenate a word that falls near the end of a line. Otherwise the character should be ignored and not printed.
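The translation of special characters into numeric or word codes described in this section can be sketched in Python. This is a minimal illustration, not a reference implementation: the helper names `encode_char` and `to_entities` are hypothetical, and the word codes come from the standard library's `html.entities.codepoint2name` table (the HTML 4 entity names). Following the note on the apostrophe, code 39 is always written in numeric form.

```python
from html.entities import codepoint2name

# ASCII characters with a specialized use in HTML (codes 34, 38, 39, 60, 62).
MARKUP_CHARS = {'"', "&", "'", "<", ">"}

def encode_char(ch: str, named: bool) -> str:
    """Return ch unchanged, or as a numeric or word entity code."""
    code = ord(ch)
    if code < 128 and ch not in MARKUP_CHARS:
        return ch                            # plain ASCII passes through
    if named and ch != "'" and code in codepoint2name:
        return f"&{codepoint2name[code]};"   # word code, e.g. &eacute;
    return f"&#{code};"                      # numeric code, e.g. &#233;

def to_entities(text: str, named: bool = False) -> str:
    """Convert markup characters and all non-ASCII characters to entity codes."""
    return "".join(encode_char(ch, named) for ch in text)

print(to_entities("Café & Bar"))              # numeric codes
print(to_entities("Café & Bar", named=True))  # word codes
```

Characters beyond code 255 are outside ISO-8859-1; this sketch still converts them to entity codes rather than rejecting them, which may or may not suit a given target device.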
Unsupported codes
Not all devices or implementations of this standard support all of the characters defined in the standard. Here are some of the exceptions:
- Gemstar devices: no support for 164, 178, 179, 185, 188, 189, 190, 208, 222, 240, 254.
- ETI devices: no support for 164, 188, 189, 190, 208, 222, 240, 254.
- MobiPocket and Amazon Kindle devices: 126 is mapped to – instead of the tilde.
- PML supports all codes.
- ePub supports exactly this set of codes, encoded as UTF-8. Barnes and Noble, for example, supports only this set unless a custom font is embedded.
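The exceptions listed above lend themselves to a simple pre-flight check on a book's text. The sketch below is illustrative only: the `UNSUPPORTED` table copies the decimal codes from the list for Gemstar and ETI devices, and the function name `unsupported_chars` is hypothetical.

```python
# Decimal codes each device family is listed as not supporting.
UNSUPPORTED = {
    "Gemstar": {164, 178, 179, 185, 188, 189, 190, 208, 222, 240, 254},
    "ETI": {164, 188, 189, 190, 208, 222, 240, 254},
}

def unsupported_chars(text: str, device: str) -> list:
    """Return the distinct characters in text that the device is listed
    as not supporting, in order of first appearance."""
    missing = UNSUPPORTED.get(device, set())  # unknown device: nothing flagged
    found = []
    for ch in text:
        if ord(ch) in missing and ch not in found:
            found.append(ch)
    return found

print(unsupported_chars("½ cup, ¼ tsp, 20°", "Gemstar"))
print(unsupported_chars("x² + y²", "ETI"))
```

A device that is absent from the table is treated as supporting everything, which is an assumption of this sketch rather than a statement about any real device.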
Entering the characters
In the English version of Windows, characters from Windows-1252 and ISO-8859-1 can be inserted by holding down the Alt key and typing a zero followed by the character's three-digit decimal code on the numeric keypad. For example, Alt+0233 produces é.
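To look up the key sequence for a given character, the decimal code can be read from its Windows-1252 encoding. The following Python sketch assumes the rule stated above (a zero followed by the three-digit decimal code); the function name `alt_code` is hypothetical.

```python
def alt_code(ch: str) -> str:
    """Return the Alt+0nnn sequence for one Windows-1252 character."""
    if len(ch) != 1:
        raise ValueError("expected a single character")
    # encode() raises UnicodeEncodeError if ch is not in Windows-1252.
    code = ch.encode("cp1252")[0]
    return f"Alt+0{code:03d}"

for ch in "é©ÿ":
    print(ch, alt_code(ch))
```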
Coverage
- Modern languages with complete coverage of their alphabet
ISO-8859-15
The ISO-8859-15 standard (also known as "Latin alphabet no. 9" or simply Latin-9) is designed to address some of the shortcomings of the original ISO-8859-1 standard. It fully supports a few more Western languages and adds the euro symbol by replacing some little-used symbols in 8859-1. The added characters were already included in the Windows-1252 character set.
Changes from ISO-8859-1
Position | 164 | 166 | 168 | 180 | 184 | 188 | 189 | 190 |
---|---|---|---|---|---|---|---|---|
8859-1 | ¤ | ¦ | ¨ | ´ | ¸ | ¼ | ½ | ¾ |
8859-15 | € | Š | š | Ž | ž | Œ | œ | Ÿ |
€ became necessary when the euro was introduced. The other characters had been left out of ISO 8859-1 because that standard was motivated by information exchange rather than typography. Š, š, Ž, and ž are used in some loanwords and in transliterations of Russian names in Finnish and Estonian typography. Œ and œ are French ligatures, and Ÿ is needed in French all-caps text, as it appears in a few proper names such as the city of l'Haÿ-les-Roses.
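The eight changed positions in the table above can be confirmed by decoding every byte value with both character sets and comparing the results. This Python sketch uses the standard-library codecs `latin-1` and `iso8859-15`; the variable names are illustrative.

```python
# Decode each byte value with both codecs and keep the positions that differ.
changed = [
    code
    for code in range(256)
    if bytes([code]).decode("latin-1") != bytes([code]).decode("iso8859-15")
]

# Print position, ISO-8859-1 character, ISO-8859-15 character.
for code in changed:
    old = bytes([code]).decode("latin-1")
    new = bytes([code]).decode("iso8859-15")
    print(code, old, new)
```

Because the two sets share every other position, text that avoids these eight codes reads the same under either encoding.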
Extra languages covered
- Estonian
- Dutch (minus the IJ, ij ligatures)
- Finnish
- French
- Malay
- Tagalog