ISO-8859-1

From MobileRead

ISO-8859-1 is also known as Latin-1. The first 128 characters in the code match the ASCII code, which Unicode calls Basic Latin. The printable codes from 160 to 255 are also found at the same positions in the Windows-1252 character set.
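As a small illustration of how these character sets relate, the following Python sketch compares the codecs that ship with the standard library under the names `ascii`, `latin-1`, and `cp1252`. The helper name `differing_bytes` is made up for this example.

```python
def differing_bytes(enc_a, enc_b):
    """Return the byte values that two single-byte encodings decode differently."""
    diffs = []
    for value in range(256):
        raw = bytes([value])
        # errors="replace" yields U+FFFD for bytes that an encoding leaves undefined.
        if raw.decode(enc_a, errors="replace") != raw.decode(enc_b, errors="replace"):
            diffs.append(value)
    return diffs


# The first 128 codes of ISO-8859-1 decode exactly like ASCII.
ascii_bytes = bytes(range(128))
same_as_ascii = ascii_bytes.decode("latin-1") == ascii_bytes.decode("ascii")

# ISO-8859-1 and Windows-1252 disagree only in the range 128 to 159.
diffs = differing_bytes("latin-1", "cp1252")
print(same_as_ascii, min(diffs), max(diffs))
```

Every code outside the range 128 to 159 decodes to the same character in both sets, which is why text in this range of ISO-8859-1 usually survives being read as Windows-1252.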

Character set

Special characters should all be translated into their appropriate ISO Latin code equivalent, either the numeric code or the entity reference code. For instance, any ampersands need to be converted to “&amp;” throughout a book. These codes are also supported in Unicode, and therefore under UTF-8, as the code points U+00A0 to U+00FF, whose hexadecimal values match the codes in this table. The table also includes a few ASCII characters that can be problematic to enter directly because of their specialized use in HTML. All other ASCII characters are supported as well, although they are not shown.
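The translation described above can be sketched in Python. This is only an illustration, not a tool used by MobileRead: the function name `escape_latin1` and its `numeric` flag are made up here, and the entity names come from the standard library table `html.entities.codepoint2name`.

```python
from html.entities import codepoint2name

# ASCII characters that have a special meaning in HTML markup.
MARKUP_CHARS = {ord(c) for c in '&<>"'}


def escape_latin1(text, numeric=False):
    """Replace markup characters and U+00A0..U+00FF with entity references."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp in MARKUP_CHARS or 0xA0 <= cp <= 0xFF:
            if numeric or cp not in codepoint2name:
                out.append(f"&#{cp};")                  # numeric code
            else:
                out.append(f"&{codepoint2name[cp]};")   # entity reference code
        else:
            out.append(ch)
    return "".join(out)


print(escape_latin1("Café & thé"))           # Caf&eacute; &amp; th&eacute;
print(escape_latin1("£5", numeric=True))     # &#163;5

# The code point equals the ISO-8859-1 code; UTF-8 stores it in two bytes.
print(hex(ord("é")), "é".encode("utf-8"))    # 0xe9 b'\xc3\xa9'
```

Characters outside the ranges named above are passed through unchanged, so the sketch assumes the rest of the text is plain ASCII or is handled elsewhere.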

Number code  Hex code  Word code  Description                          Character
&#34;        x22       &quot;     quotation mark                       "
&#38;        x26       &amp;      ampersand                            &
&#39;        x27       &apos;     apostrophe[1]                        '
&#60;        x3C       &lt;       less-than sign                       <
&#62;        x3E       &gt;       greater-than sign                    >
&#126;       x7E       &Tilde;    large tilde                          ~
&#128; to &#159; are not defined in this character set; see the Windows-1252 codes.
&#160;       xA0       &nbsp;     non-breaking space[2]                (not visible)
&#161;       xA1       &iexcl;    inverted exclamation                 ¡
&#162;       xA2       &cent;     cent sign                            ¢
&#163;       xA3       &pound;    pound sterling                       £
&#164;       xA4       &curren;   currency                             ¤
&#165;       xA5       &yen;      yen sign                             ¥
&#166;       xA6       &brvbar;   broken vertical bar                  ¦
&#167;       xA7       &sect;     section sign                         §
&#168;       xA8       &uml;      umlaut (dieresis)                    ¨
&#169;       xA9       &copy;     copyright                            ©
&#170;       xAA       &ordf;     feminine ordinal                     ª
&#171;       xAB       &laquo;    left angle quote                     «
&#172;       xAC       &not;      not sign                             ¬
&#173;       xAD       &shy;      soft hyphen[3]                       (not visible)
&#174;       xAE       &reg;      registered trademark                 ®
&#175;       xAF       &macr;     macron accent                        ¯
&#176;       xB0       &deg;      degree sign                          °
&#177;       xB1       &plusmn;   plus or minus                        ±
&#178;       xB2       &sup2;     superscript two                      ²
&#179;       xB3       &sup3;     superscript three                    ³
&#180;       xB4       &acute;    acute accent                         ´
&#181;       xB5       &micro;    micro sign                           µ
&#182;       xB6       &para;     paragraph sign                       ¶
&#183;       xB7       &middot;   middle dot                           ·
&#184;       xB8       &cedil;    cedilla                              ¸
&#185;       xB9       &sup1;     superscript one                      ¹
&#186;       xBA       &ordm;     masculine ordinal                    º
&#187;       xBB       &raquo;    right angle quote                    »
&#188;       xBC       &frac14;   one fourth                           ¼
&#189;       xBD       &frac12;   one half                             ½
&#190;       xBE       &frac34;   three fourths                        ¾
&#191;       xBF       &iquest;   inverted question mark               ¿
&#192;       xC0       &Agrave;   capital A, grave accent              À
&#193;       xC1       &Aacute;   capital A, acute accent              Á
&#194;       xC2       &Acirc;    capital A, circumflex accent         Â
&#195;       xC3       &Atilde;   capital A, tilde                     Ã
&#196;       xC4       &Auml;     capital A, dieresis or umlaut mark   Ä
&#197;       xC5       &Aring;    capital A, ring                      Å
&#198;       xC6       &AElig;    capital AE diphthong (ligature)      Æ
&#199;       xC7       &Ccedil;   capital C, cedilla                   Ç
&#200;       xC8       &Egrave;   capital E, grave accent              È
&#201;       xC9       &Eacute;   capital E, acute accent              É
&#202;       xCA       &Ecirc;    capital E, circumflex accent         Ê
&#203;       xCB       &Euml;     capital E, dieresis or umlaut mark   Ë
&#204;       xCC       &Igrave;   capital I, grave accent              Ì
&#205;       xCD       &Iacute;   capital I, acute accent              Í
&#206;       xCE       &Icirc;    capital I, circumflex accent         Î
&#207;       xCF       &Iuml;     capital I, dieresis or umlaut mark   Ï
&#208;       xD0       &ETH;      capital ETH                          Ð
&#209;       xD1       &Ntilde;   capital N, tilde                     Ñ
&#210;       xD2       &Ograve;   capital O, grave accent              Ò
&#211;       xD3       &Oacute;   capital O, acute accent              Ó
&#212;       xD4       &Ocirc;    capital O, circumflex accent         Ô
&#213;       xD5       &Otilde;   capital O, tilde                     Õ
&#214;       xD6       &Ouml;     capital O, dieresis or umlaut mark   Ö
&#215;       xD7       &times;    multiply sign                        ×
&#216;       xD8       &Oslash;   capital O, slash                     Ø
&#217;       xD9       &Ugrave;   capital U, grave accent              Ù
&#218;       xDA       &Uacute;   capital U, acute accent              Ú
&#219;       xDB       &Ucirc;    capital U, circumflex accent         Û
&#220;       xDC       &Uuml;     capital U, dieresis or umlaut mark   Ü
&#221;       xDD       &Yacute;   capital Y, acute accent              Ý
&#222;       xDE       &THORN;    capital THORN                        Þ
&#223;       xDF       &szlig;    small sharp s, German (sz ligature)  ß
&#224;       xE0       &agrave;   small a, grave accent                à
&#225;       xE1       &aacute;   small a, acute accent                á
&#226;       xE2       &acirc;    small a, circumflex accent           â
&#227;       xE3       &atilde;   small a, tilde                       ã
&#228;       xE4       &auml;     small a, dieresis or umlaut mark     ä
&#229;       xE5       &aring;    small a, ring                        å
&#230;       xE6       &aelig;    small ae diphthong (ligature)        æ
&#231;       xE7       &ccedil;   small c, cedilla                     ç
&#232;       xE8       &egrave;   small e, grave accent                è
&#233;       xE9       &eacute;   small e, acute accent                é
&#234;       xEA       &ecirc;    small e, circumflex accent           ê
&#235;       xEB       &euml;     small e, dieresis or umlaut mark     ë
&#236;       xEC       &igrave;   small i, grave accent                ì
&#237;       xED       &iacute;   small i, acute accent                í
&#238;       xEE       &icirc;    small i, circumflex accent           î
&#239;       xEF       &iuml;     small i, dieresis or umlaut mark     ï
&#240;       xF0       &eth;      small eth                            ð
&#241;       xF1       &ntilde;   small n, tilde                       ñ
&#242;       xF2       &ograve;   small o, grave accent                ò
&#243;       xF3       &oacute;   small o, acute accent                ó
&#244;       xF4       &ocirc;    small o, circumflex accent           ô
&#245;       xF5       &otilde;   small o, tilde                       õ
&#246;       xF6       &ouml;     small o, dieresis or umlaut mark     ö
&#247;       xF7       &divide;   division sign                        ÷
&#248;       xF8       &oslash;   small o, slash                       ø
&#249;       xF9       &ugrave;   small u, grave accent                ù
&#250;       xFA       &uacute;   small u, acute accent                ú
&#251;       xFB       &ucirc;    small u, circumflex accent           û
&#252;       xFC       &uuml;     small u, dieresis or umlaut mark     ü
&#253;       xFD       &yacute;   small y, acute accent                ý
&#254;       xFE       &thorn;    small thorn                          þ
&#255;       xFF       &yuml;     small y, dieresis or umlaut mark     ÿ

* While most of the symbols in the table are for communication there are a few that are specifically typographic.

  1. The apostrophe is defined as an XML entity with the mnemonic apos (&apos;), but this entity is not part of HTML 4 and is not recognized by some older browsers. Use the numeric code 39 (&#39;) instead.
  2. The non-breaking space is used as a forcing space. A line cannot terminate on a forcing space so it serves to force two words to behave as one.
  3. The soft hyphen marks the location of a hyphen point. It can be used to hyphenate a word if it appears near the end of a line. Otherwise the character should be ignored and not printed.
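For readers who want to check or regenerate the rows of the table, here is a minimal Python sketch. It assumes the standard library tables `html.entities.codepoint2name` and `unicodedata`; the function name `latin1_rows` is hypothetical. The description column produced here is the official Unicode character name, which is worded differently from the descriptions in the table.

```python
import unicodedata
from html.entities import codepoint2name


def latin1_rows(start=0xA0, end=0xFF):
    """Build (number code, hex code, word code, name, character) tuples."""
    rows = []
    for cp in range(start, end + 1):
        rows.append((
            f"&#{cp};",                  # numeric code
            f"x{cp:02X}",                # hexadecimal code
            f"&{codepoint2name[cp]};",   # entity reference (word) code
            unicodedata.name(chr(cp)),   # official Unicode name
            chr(cp),                     # the character itself
        ))
    return rows


# Print the three fraction rows as a spot check against the table.
for row in latin1_rows(0xBC, 0xBE):
    print(*row)
```

The default range covers only the 96 codes from 160 to 255, because every one of them has an entity name in `codepoint2name`; most ASCII codes do not and would raise a `KeyError` in this sketch.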

Unsupported codes

Not all devices or implementations of this standard support all of the characters defined in the standard. Here are some of the exceptions:

  • Gemstar devices: no support for 164, 178, 179, 185, 188, 189, 190, 208, 222, 240, 254.
  • ETI devices: no support for 164, 188, 189, 190, 208, 222, 240, 254.
  • MobiPocket and Amazon Kindle devices: 126 is mapped to – instead of the tilde.
  • PML supports all codes.
  • ePub supports all of these codes when the text is encoded as UTF-8. Barnes and Noble, for example, supports only this set unless a custom font is embedded.
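A book can be checked against the exceptions listed above before conversion. The following Python sketch copies the Gemstar and ETI code lists from this section; the dictionary `UNSUPPORTED` and the function `find_unsupported` are names invented for the illustration.

```python
# Decimal codes each device family cannot display, copied from the list above.
UNSUPPORTED = {
    "Gemstar": {164, 178, 179, 185, 188, 189, 190, 208, 222, 240, 254},
    "ETI": {164, 188, 189, 190, 208, 222, 240, 254},
}


def find_unsupported(text, device):
    """Return (index, character, code) for each character the device lacks."""
    blocked = UNSUPPORTED[device]
    return [(i, ch, ord(ch)) for i, ch in enumerate(text) if ord(ch) in blocked]


print(find_unsupported("Add ½ cup", "ETI"))       # [(4, '½', 189)]
print(find_unsupported("x² + y²", "Gemstar"))     # [(1, '²', 178), (6, '²', 178)]
print(find_unsupported("x² + y²", "ETI"))         # []
```

Each reported character would then need a substitute, for example "1/2" in place of ½.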

Entering the characters

In the English version of Windows, the characters from Windows-1252 and ISO-8859-1 can be inserted by holding down the Alt key and entering a zero followed by the character's three-digit decimal code on the numpad. For example, Alt+0233 enters é.
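The key sequence for any character can be worked out from its Windows-1252 code. The short Python sketch below does this with the standard `cp1252` codec; the function name `alt_code` is made up for this example.

```python
def alt_code(ch):
    """Return the Alt+0nnn numpad sequence for one Windows-1252 character."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    # encode() raises UnicodeEncodeError if the character is not in Windows-1252.
    code = ch.encode("cp1252")[0]
    return f"Alt+0{code:03d}"


print(alt_code("é"))   # Alt+0233
print(alt_code("©"))   # Alt+0169
```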

Coverage

Modern languages with complete coverage of their alphabet:
  • Afrikaans
  • Albanian
  • Breton
  • Danish
  • English (US and modern British)
  • Faroese
  • Galician
  • German
  • Icelandic
  • Irish (new orthography)
  • Italian
  • Latin (basic classical orthography)
  • Luxembourgish (basic classical orthography)
  • Norwegian (Bokmål and Nynorsk)
  • Occitan
  • Portuguese (European and Brazilian)
  • Rhaeto-Romanic
  • Scottish Gaelic
  • Spanish
  • Swahili
  • Swedish
  • Walloon
  • Basque
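Whether a particular text is fully covered can be tested by trying to encode it. This is a minimal Python sketch that relies on the standard `latin-1` codec; the function name `uncovered` is hypothetical.

```python
def uncovered(text, encoding="latin-1"):
    """Return the distinct characters of text that the encoding cannot represent."""
    missing = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


print(uncovered("Bokmål og nynorsk"))   # []
print(uncovered("Antonín Dvořák"))      # ['ř']
```

An empty result means every character in the sample exists in ISO-8859-1. It does not prove that the whole alphabet of a language is covered, only that this text is.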

ISO-8859-15

The ISO-8859-15 standard (also known as "Latin alphabet no. 9" or simply Latin-9) is designed to address some of the shortcomings in the original ISO-8859-1 standard. The idea is to fully support a few more western languages and add the Euro symbol by replacing some little-used symbols in 8859-1. The added symbols were already included in the Windows-1252 character set, though at different code positions.

Changes from ISO-8859-1

Position  164  166  168  180  184  188  189  190
8859-1    ¤    ¦    ¨    ´    ¸    ¼    ½    ¾
8859-15   €    Š    š    Ž    ž    Œ    œ    Ÿ

The € sign became necessary when the Euro was introduced. The other characters had been excluded from ISO 8859-1 because that standard was motivated by information exchange and not typography. Š, š, Ž, and ž are used in some loanwords and in the transliteration of Russian names in Finnish and Estonian typography. Œ and œ are French ligatures, and Ÿ is needed in French all-caps text, as it is present in a few proper names such as the city of l'Haÿ-les-Roses.
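The eight changed positions can be confirmed by decoding every byte value with both character sets. The sketch below uses the Python standard library codecs `latin-1` and `iso8859-15`; the function name `changed_positions` is invented for the example.

```python
def changed_positions(old="latin-1", new="iso8859-15"):
    """Map each differing byte value to its (old, new) pair of characters."""
    changes = {}
    for value in range(256):
        raw = bytes([value])
        before, after = raw.decode(old), raw.decode(new)
        if before != after:
            changes[value] = (before, after)
    return changes


for position, (before, after) in changed_positions().items():
    print(position, before, after)
```

The loop prints one line per changed position, starting with 164, where ¤ is replaced by €.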

Extra languages covered

  • Estonian
  • Dutch (minus the IJ, ij ligatures)
  • Finnish
  • French
  • Malay
  • Tagalog