ISO-8859-1
ISO-8859-1 is also known as Latin-1. Its first 128 characters match the ASCII code, which Unicode calls the Basic Latin block. All of its printable characters are also in the Windows-1252 character set, at the same code positions.
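The overlap with ASCII and Windows-1252 can be checked with a short Python sketch. It assumes the codec names `latin-1` and `cp1252` from Python's standard library stand in for ISO-8859-1 and Windows-1252; the variable names are illustrative only.

```python
# Bytes 0x00-0x7F decode to the same characters under ASCII and ISO-8859-1.
low = bytes(range(0x00, 0x80))
ascii_matches = low.decode("ascii") == low.decode("latin-1")

# Bytes 0xA0-0xFF decode to the same characters under ISO-8859-1 and Windows-1252.
high = bytes(range(0xA0, 0x100))
cp1252_matches = high.decode("latin-1") == high.decode("cp1252")

# The two differ in 0x80-0x9F, where Windows-1252 places printable
# characters such as the euro sign.
euro_byte = "€".encode("cp1252")

print(ascii_matches, cp1252_matches, euro_byte)  # True True b'\x80'
```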
Character set
Special characters should all be translated into their appropriate ISO Latin-1 code equivalent: either the numeric code or the entity reference code. For instance, every ampersand needs to be converted to `&amp;` throughout a book. These characters are also available in UTF-8 documents, where they occupy Unicode code points U+00A0 to U+00FF; the code point numbers are the same hexadecimal values as these codes. The table includes a few ASCII characters that can be problematic to enter directly because of their specialized use in HTML. The remaining ASCII characters are not shown, but all of them are supported.
| Number Code | Hex Code | Word Code | Description | Character |
|---|---|---|---|---|
| `&#34;` | x22 | `&quot;` | quotation mark | " |
| `&#38;` | x26 | `&amp;` | ampersand | & |
| `&#39;` | x27 | `&apos;` | apostrophe[1] | ' |
| `&#60;` | x3C | `&lt;` | less-than sign | < |
| `&#62;` | x3E | `&gt;` | greater-than sign | > |
| `&#126;` | x7E | `&sim;` (renders the tilde operator ∼, U+223C) | tilde | ~ |
| `&#128;` to `&#159;` are not assigned printable characters in this character set; see Windows-1252 codes. | ||||
| `&#160;` | xA0 | `&nbsp;` | non-breaking space[2] | |
| `&#161;` | xA1 | `&iexcl;` | inverted exclamation mark | ¡ |
| `&#162;` | xA2 | `&cent;` | cent sign | ¢ |
| `&#163;` | xA3 | `&pound;` | pound sterling | £ |
| `&#164;` | xA4 | `&curren;` | currency sign | ¤ |
| `&#165;` | xA5 | `&yen;` | yen sign | ¥ |
| `&#166;` | xA6 | `&brvbar;` | broken vertical bar | ¦ |
| `&#167;` | xA7 | `&sect;` | section sign | § |
| `&#168;` | xA8 | `&uml;` | umlaut (dieresis) | ¨ |
| `&#169;` | xA9 | `&copy;` | copyright | © |
| `&#170;` | xAA | `&ordf;` | feminine ordinal | ª |
| `&#171;` | xAB | `&laquo;` | left angle quote | « |
| `&#172;` | xAC | `&not;` | not sign | ¬ |
| `&#173;` | xAD | `&shy;` | soft hyphen[3] | |
| `&#174;` | xAE | `&reg;` | registered trademark | ® |
| `&#175;` | xAF | `&macr;` | macron accent | ¯ |
| `&#176;` | xB0 | `&deg;` | degree sign | ° |
| `&#177;` | xB1 | `&plusmn;` | plus or minus | ± |
| `&#178;` | xB2 | `&sup2;` | superscript two | ² |
| `&#179;` | xB3 | `&sup3;` | superscript three | ³ |
| `&#180;` | xB4 | `&acute;` | acute accent | ´ |
| `&#181;` | xB5 | `&micro;` | micro sign | µ |
| `&#182;` | xB6 | `&para;` | paragraph sign | ¶ |
| `&#183;` | xB7 | `&middot;` | middle dot | · |
| `&#184;` | xB8 | `&cedil;` | cedilla | ¸ |
| `&#185;` | xB9 | `&sup1;` | superscript one | ¹ |
| `&#186;` | xBA | `&ordm;` | masculine ordinal | º |
| `&#187;` | xBB | `&raquo;` | right angle quote | » |
| `&#188;` | xBC | `&frac14;` | one fourth | ¼ |
| `&#189;` | xBD | `&frac12;` | one half | ½ |
| `&#190;` | xBE | `&frac34;` | three fourths | ¾ |
| `&#191;` | xBF | `&iquest;` | inverted question mark | ¿ |
| `&#192;` | xC0 | `&Agrave;` | capital A, grave accent | À |
| `&#193;` | xC1 | `&Aacute;` | capital A, acute accent | Á |
| `&#194;` | xC2 | `&Acirc;` | capital A, circumflex accent | Â |
| `&#195;` | xC3 | `&Atilde;` | capital A, tilde | Ã |
| `&#196;` | xC4 | `&Auml;` | capital A, dieresis or umlaut mark | Ä |
| `&#197;` | xC5 | `&Aring;` | capital A, ring | Å |
| `&#198;` | xC6 | `&AElig;` | capital AE diphthong (ligature) | Æ |
| `&#199;` | xC7 | `&Ccedil;` | capital C, cedilla | Ç |
| `&#200;` | xC8 | `&Egrave;` | capital E, grave accent | È |
| `&#201;` | xC9 | `&Eacute;` | capital E, acute accent | É |
| `&#202;` | xCA | `&Ecirc;` | capital E, circumflex accent | Ê |
| `&#203;` | xCB | `&Euml;` | capital E, dieresis or umlaut mark | Ë |
| `&#204;` | xCC | `&Igrave;` | capital I, grave accent | Ì |
| `&#205;` | xCD | `&Iacute;` | capital I, acute accent | Í |
| `&#206;` | xCE | `&Icirc;` | capital I, circumflex accent | Î |
| `&#207;` | xCF | `&Iuml;` | capital I, dieresis or umlaut mark | Ï |
| `&#208;` | xD0 | `&ETH;` | capital ETH | Ð |
| `&#209;` | xD1 | `&Ntilde;` | capital N, tilde | Ñ |
| `&#210;` | xD2 | `&Ograve;` | capital O, grave accent | Ò |
| `&#211;` | xD3 | `&Oacute;` | capital O, acute accent | Ó |
| `&#212;` | xD4 | `&Ocirc;` | capital O, circumflex accent | Ô |
| `&#213;` | xD5 | `&Otilde;` | capital O, tilde | Õ |
| `&#214;` | xD6 | `&Ouml;` | capital O, dieresis or umlaut mark | Ö |
| `&#215;` | xD7 | `&times;` | multiply sign | × |
| `&#216;` | xD8 | `&Oslash;` | capital O, slash | Ø |
| `&#217;` | xD9 | `&Ugrave;` | capital U, grave accent | Ù |
| `&#218;` | xDA | `&Uacute;` | capital U, acute accent | Ú |
| `&#219;` | xDB | `&Ucirc;` | capital U, circumflex accent | Û |
| `&#220;` | xDC | `&Uuml;` | capital U, dieresis or umlaut mark | Ü |
| `&#221;` | xDD | `&Yacute;` | capital Y, acute accent | Ý |
| `&#222;` | xDE | `&THORN;` | capital THORN | Þ |
| `&#223;` | xDF | `&szlig;` | small sharp s, German (sz ligature) | ß |
| `&#224;` | xE0 | `&agrave;` | small a, grave accent | à |
| `&#225;` | xE1 | `&aacute;` | small a, acute accent | á |
| `&#226;` | xE2 | `&acirc;` | small a, circumflex accent | â |
| `&#227;` | xE3 | `&atilde;` | small a, tilde | ã |
| `&#228;` | xE4 | `&auml;` | small a, dieresis or umlaut mark | ä |
| `&#229;` | xE5 | `&aring;` | small a, ring | å |
| `&#230;` | xE6 | `&aelig;` | small ae diphthong (ligature) | æ |
| `&#231;` | xE7 | `&ccedil;` | small c, cedilla | ç |
| `&#232;` | xE8 | `&egrave;` | small e, grave accent | è |
| `&#233;` | xE9 | `&eacute;` | small e, acute accent | é |
| `&#234;` | xEA | `&ecirc;` | small e, circumflex accent | ê |
| `&#235;` | xEB | `&euml;` | small e, dieresis or umlaut mark | ë |
| `&#236;` | xEC | `&igrave;` | small i, grave accent | ì |
| `&#237;` | xED | `&iacute;` | small i, acute accent | í |
| `&#238;` | xEE | `&icirc;` | small i, circumflex accent | î |
| `&#239;` | xEF | `&iuml;` | small i, dieresis or umlaut mark | ï |
| `&#240;` | xF0 | `&eth;` | small eth | ð |
| `&#241;` | xF1 | `&ntilde;` | small n, tilde | ñ |
| `&#242;` | xF2 | `&ograve;` | small o, grave accent | ò |
| `&#243;` | xF3 | `&oacute;` | small o, acute accent | ó |
| `&#244;` | xF4 | `&ocirc;` | small o, circumflex accent | ô |
| `&#245;` | xF5 | `&otilde;` | small o, tilde | õ |
| `&#246;` | xF6 | `&ouml;` | small o, dieresis or umlaut mark | ö |
| `&#247;` | xF7 | `&divide;` | division sign | ÷ |
| `&#248;` | xF8 | `&oslash;` | small o, slash | ø |
| `&#249;` | xF9 | `&ugrave;` | small u, grave accent | ù |
| `&#250;` | xFA | `&uacute;` | small u, acute accent | ú |
| `&#251;` | xFB | `&ucirc;` | small u, circumflex accent | û |
| `&#252;` | xFC | `&uuml;` | small u, dieresis or umlaut mark | ü |
| `&#253;` | xFD | `&yacute;` | small y, acute accent | ý |
| `&#254;` | xFE | `&thorn;` | small thorn | þ |
| `&#255;` | xFF | `&yuml;` | small y, dieresis or umlaut mark | ÿ |
* While most of the symbols in the table are for communication, a few are specifically typographic.
- [1] The apostrophe is defined as an XML entity with the mnemonic apos, but this name is not part of HTML 4 and is not recognized by some older browsers. Use the numeric code `&#39;` instead.
- [2] The non-breaking space is used as a forcing space. A line cannot terminate on a forcing space, so it serves to force two words to behave as one.
- [3] The soft hyphen marks the location of a hyphenation point. It can be used to hyphenate a word if it appears near the end of a line. Otherwise the character should be ignored and not printed.
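The translation of special characters into numeric or entity codes described in this section can be sketched in Python. This is a minimal illustration, not a production converter: the function names `to_numeric_refs` and `to_named_refs` are hypothetical, and the named form relies on the HTML 4 entity table in the standard `html.entities` module, which has no entry for the apostrophe.

```python
from html.entities import codepoint2name

# Characters that are significant in HTML/XML markup.
MARKUP_CHARS = {'"', "&", "'", "<", ">"}

def to_numeric_refs(text: str) -> str:
    """Replace markup characters and non-ASCII characters with &#NNN; codes."""
    return "".join(
        f"&#{ord(ch)};" if ch in MARKUP_CHARS or ord(ch) > 0x7F else ch
        for ch in text
    )

def to_named_refs(text: str) -> str:
    """Use a named entity where HTML 4 defines one, else fall back to &#NNN;."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch in MARKUP_CHARS or cp > 0x7F:
            name = codepoint2name.get(cp)  # None for the apostrophe (39)
            out.append(f"&{name};" if name else f"&#{cp};")
        else:
            out.append(ch)
    return "".join(out)

print(to_numeric_refs("Café & Co"))  # Caf&#233; &#38; Co
print(to_named_refs("Café & Co"))    # Caf&eacute; &amp; Co
print(to_named_refs("it's"))         # it&#39;s
```

The fallback to a numeric code for the apostrophe follows the footnote above, since the `apos` mnemonic is not safe everywhere.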
Unsupported codes
Not all devices or implementations of this standard support all of the characters defined in the standard. Here are some of the exceptions:
- Gemstar devices: no support for 164, 178, 179, 185, 188, 189, 190, 208, 222, 240, 254.
- ETI devices: no support for 164, 188, 189, 190, 208, 222, 240, 254.
- MobiPocket and Amazon Kindle devices: 126 is mapped to an en dash (–) instead of the tilde.
- PML supports all codes.
- ePub supports exactly this set of codes encoded as UTF-8. Barnes and Noble, for example, only supports this set without embedding a custom font.
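As a rough illustration of how the exception lists above might be used, the following Python sketch scans a text for characters that a given device cannot display. The code numbers are copied from the list; the `UNSUPPORTED` table and the `unsupported_chars` function are hypothetical names chosen for this example.

```python
# Decimal ISO-8859-1 codes each device family does not support (from the list above).
UNSUPPORTED = {
    "Gemstar": {164, 178, 179, 185, 188, 189, 190, 208, 222, 240, 254},
    "ETI": {164, 188, 189, 190, 208, 222, 240, 254},
}

def unsupported_chars(text: str, device: str) -> list:
    """Return the distinct characters in text, in order of first appearance,
    whose code is on the device's unsupported list."""
    missing = UNSUPPORTED[device]
    found = []
    for ch in text:
        if ord(ch) in missing and ch not in found:
            found.append(ch)
    return found

print(unsupported_chars("½ cup, 3² = 9", "Gemstar"))  # ['½', '²']
print(unsupported_chars("½ cup, 3² = 9", "ETI"))      # ['½']
```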
Entering the characters
In the English version of Windows, the characters from Windows-1252 and ISO-8859-1 can be inserted by holding down the Alt key and entering a zero followed by the character's three-digit decimal code on the numpad.
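The keystroke can be derived from a character's code. The sketch below assumes the Alt+0 sequence follows the Windows-1252 code page, as the paragraph above implies for English Windows; `alt_code` is a hypothetical helper name.

```python
def alt_code(ch: str) -> str:
    """Return the Alt+0nnn numpad sequence for a single Windows-1252 character."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    code = ch.encode("cp1252")[0]  # raises UnicodeEncodeError if not in Windows-1252
    return f"Alt+0{code:03d}"      # a zero followed by the three-digit decimal code

print(alt_code("é"))  # Alt+0233
print(alt_code("¢"))  # Alt+0162
```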
Coverage
- Modern languages with complete coverage of their alphabet
ISO-8859-15
The ISO-8859-15 standard (also known as "Latin alphabet no. 9" or simply Latin-9) is designed to address some of the shortcomings in the original ISO-8859-1 standard. The idea is to fully support a few more western languages and add the Euro symbol by replacing some little-used symbols in 8859-1. The added symbols were already included in the Windows-1252 character set.
Changes from ISO-8859-1
| Position | 164 | 166 | 168 | 180 | 184 | 188 | 189 | 190 |
|---|---|---|---|---|---|---|---|---|
| 8859-1 | ¤ | ¦ | ¨ | ´ | ¸ | ¼ | ½ | ¾ |
| 8859-15 | € | Š | š | Ž | ž | Œ | œ | Ÿ |
€ became necessary when the Euro was introduced. The rest were excluded from ISO-8859-1 because that standard was motivated by information exchange, not typography. Š, š, Ž, and ž are used in some loanwords and in the transliteration of Russian names in Finnish and Estonian typography. Œ and œ are French ligatures, and Ÿ is needed in French all-caps text, as it is present in a few proper names such as the city of l'Haÿ-les-Roses.
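The eight changed positions can be confirmed by comparing the two encodings byte by byte. This sketch assumes Python's standard `latin-1` and `iso8859-15` codecs are faithful to the two standards; the variable names are illustrative.

```python
# Map each differing byte position to its (ISO-8859-1, ISO-8859-15) characters.
changed = {}
for position in range(256):
    old = bytes([position]).decode("latin-1")
    new = bytes([position]).decode("iso8859-15")
    if old != new:
        changed[position] = (old, new)

print(list(changed))  # [164, 166, 168, 180, 184, 188, 189, 190]
print(changed[164])   # ('¤', '€')
```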
Extra languages covered
- Estonian
- Dutch (minus the IJ, ij ligatures)
- Finnish
- French
- Malay
- Tagalog