Alphabet
Alphabets are the building blocks of writing. The western (or Latin) alphabet can be seen under ASCII. The table in ISO 8859-5 shows the Cyrillic alphabet Unicode set (also see Windows-1251). The characters can also be entered using their Named character references, which were used to generate the content of this page.
Overview
The alphabet differed from earlier writing in that it was designed to capture the sounds of speech and transfer them to a written form. This method allows a common alphabet to represent more than one language and requires only a few symbols. Earlier writing used symbols that represented full words or syllables, which required many more symbols to capture a language and likely needed a new symbol every time a new word was needed. Since these systems were not based on sounds per se, they were locked into one or only a few related languages. Today the Latin alphabet is likely the most widely used alphabet. Modern usage has augmented the alphabet in some languages by using accent marks to adjust the sound being represented. Eastern Europe uses the Cyrillic alphabet to capture the sounds of its languages and of some languages in nearby Asia. Several other alphabets exist that are not shown here.
History
The oldest of the phonic alphabets shown here is the Hebrew, which differs from later alphabets in that it has neither vowels nor lower case. The symbols shown here are from the modern Hebrew alphabet, which differ completely from their earlier forms and were taken from the Aramaic. The Greek followed later. It included vowels and eventually both cases, although initially it had only upper case and no spaces between words. Today most western languages use the same or a similar alphabet, often called the western or Latin alphabet, whose basic characters are encoded in ASCII. The Latin alphabet is also used by some eastern nations.
The Latin alphabet was derived from an earlier Old Italian alphabet, which was derived from the Greek. The original Latin alphabet had 22 letters (varying to 20, with the Y and Z from the Greek coming and going). These early Italians felt that the C could be used instead of the Greek G, which is why we have the ABC's instead of the ABG's. Later the Romans decided to import some Greek words, which required adding the G back in along with the Y and Z. During the Middle Ages, German words crept into the language, requiring a W. In the 16th century it was determined that the I and V were being used as both consonants and vowels, so the J and U were added. Later in the 16th century cursive writing came into wide use, and because the V appeared in a lot of words but was hard to write cursively, the V and U were swapped to make writing easier. You can still find V's on some buildings that are pronounced as U's today. This also accounts for the W looking like two V's, except in cursive, where it is two U's.
Alphabets became the preferred choice for writing and developed features that are not related to phonics. These features, such as lower case and upper case as well as punctuation, are specific to typography and ease of use. They were introduced during the reign of Charlemagne by Alcuin of York in 778 to ease the copying of Bibles, improve accuracy, and make reading easier. He was the leading scholar and teacher at the Carolingian court.
The alphabet is a subset of a character set, which includes the alphabet, numbers, punctuation, and special purpose characters and symbols. Many languages have adopted the Latin alphabet even though it does not meet their phonic needs. They modify the alphabet with accent symbols to provide phonic corrections. This was first done by the Romance languages (French, Italian, Spanish), as they descend from the Latin language. French has the same 26 letters as English. Italian has only 21 letters: j, k, w, x, and y are not in its alphabet, but they are sometimes seen in adopted foreign words. Spanish officially has 27 letters, adding the Ñ, which is Unicode \U00D1 (ñ is \U00F1).
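To illustrate how an accented letter relates to its base letter, the following minimal Python sketch uses the standard unicodedata module to look up Ñ and ñ and split each into a plain letter plus a combining tilde. The function name describe is made up for this example. Note that Python writes the four-digit values as \u00D1 and \u00F1, since its \U escape expects eight hex digits.

```python
import unicodedata

def describe(ch):
    """Return (code point, Unicode name, decomposed code points) for one character."""
    # NFD splits a precomposed letter into its base letter and combining marks.
    decomposed = unicodedata.normalize("NFD", ch)
    parts = [f"U+{ord(c):04X}" for c in decomposed]
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), parts)

print(describe("\u00D1"))  # the Spanish capital N with tilde
print(describe("\u00F1"))  # the Spanish small n with tilde
```

The decomposed form shows that the accent is stored as a separate combining character (U+0303) that follows the base letter.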
Greek
In its modern form, Greek is the official language of Greece and Cyprus and one of the 24 official languages of the European Union. It is spoken by at least 13.5 million people today in Greece, Cyprus, Italy, Albania, Turkey, and the many other countries of the Greek diaspora. The Greek alphabet, which contains 24 letters, is also used in mathematics to represent values in formulas.
In the table below, each decimal entity number needs an & in front. The page HTML entities lists the Greek alphabet with its entity names.
ucase | decimal | Unicode | name | lcase | decimal | unicode |
---|---|---|---|---|---|---|
Α | #913; | \U0391 | ALPHA | α | #945; | \U03B1 |
Β | #914; | \U0392 | BETA | β | #946; | \U03B2 |
Γ | #915; | \U0393 | GAMMA | γ | #947; | \U03B3 |
Δ | #916; | \U0394 | DELTA | δ | #948; | \U03B4 |
Ε | #917; | \U0395 | EPSILON | ε | #949; | \U03B5 |
Ζ | #918; | \U0396 | ZETA | ζ | #950; | \U03B6 |
Η | #919; | \U0397 | ETA | η | #951; | \U03B7 |
Θ | #920; | \U0398 | THETA | θ | #952; | \U03B8 |
Ι | #921; | \U0399 | IOTA | ι | #953; | \U03B9 |
Κ | #922; | \U039A | KAPPA | κ | #954; | \U03BA |
Λ | #923; | \U039B | LAMBDA | λ | #955; | \U03BB |
Μ | #924; | \U039C | MU | μ | #956; | \U03BC |
Ν | #925; | \U039D | NU | ν | #957; | \U03BD |
Ξ | #926; | \U039E | XI | ξ | #958; | \U03BE |
Ο | #927; | \U039F | OMICRON | ο | #959; | \U03BF |
Π | #928; | \U03A0 | PI | π | #960; | \U03C0 |
Ρ | #929; | \U03A1 | RHO | ρ | #961; | \U03C1 |
| | | | Final Sigma | ς | #962; | \U03C2 |
Σ | #931; | \U03A3 | SIGMA | σ | #963; | \U03C3 |
Τ | #932; | \U03A4 | TAU | τ | #964; | \U03C4 |
Υ | #933; | \U03A5 | UPSILON | υ | #965; | \U03C5 |
Φ | #934; | \U03A6 | PHI | φ | #966; | \U03C6 |
Χ | #935; | \U03A7 | CHI | χ | #967; | \U03C7 |
Ψ | #936; | \U03A8 | PSI | ψ | #968; | \U03C8 |
Ω | #937; | \U03A9 | OMEGA | ω | #969; | \U03C9 |
| | | | Theta symbol | ϑ | #977; | \U03D1 |
| | | | Digamma | ϝ | #989; | \U03DD |
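The decimal and Unicode columns of the table are two spellings of the same code point. As a minimal sketch, the Python below derives both from a character and uses the standard html module to turn a reference back into the character. The helper names decimal_reference and hex_code_point are made up for this example.

```python
import html

def decimal_reference(ch):
    """Build the HTML decimal character reference, including the leading &."""
    return f"&#{ord(ch)};"

def hex_code_point(ch):
    """Format the code point in the usual U+XXXX notation."""
    return f"U+{ord(ch):04X}"

alpha = "\u0391"  # GREEK CAPITAL LETTER ALPHA
print(decimal_reference(alpha), hex_code_point(alpha))

# html.unescape resolves both decimal and named character references.
print(html.unescape("&#945;") == html.unescape("&alpha;"))
```

The same two helpers work for any row in the tables on this page, since they depend only on the code point.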
For variations and ancient forms see Greek Unicode Entities at Penn State University. An online keyboard is available at https://www.lexilogos.com/keyboard/greek_ancient.htm. You can click on the characters to create the text (or use the equivalents on your keyboard) and then copy the entire session to paste where you need it. Note that this includes accent marks. You can also use the Latin keyboard to type Greek using https://www.translatum.gr/converter/beta-code.htm. Note that in this alphabet, as in the Latin, each lower case letter sits 32 code points (binary 0010 0000) above its upper case letter. For the Latin letters and for Α through Ο this changes a single binary digit; from Π onward, adding 32 carries into the next digit, so two digits change.
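The 0010 0000 relationship between the cases can be checked directly. The sketch below, with a made-up helper name case_gap, reports both the arithmetic difference and the bits that differ for each letter. It suggests that the difference of 32 holds throughout, while a pure single-bit flip holds only where adding 32 does not carry.

```python
def case_gap(upper):
    """Return (numeric difference, differing bits) between a letter and its lower case."""
    lower = upper.lower()
    return ord(lower) - ord(upper), ord(lower) ^ ord(upper)

latin = [chr(c) for c in range(ord("A"), ord("Z") + 1)]
# U+03A2 is skipped because it is not an assigned letter.
greek = [chr(c) for c in range(0x0391, 0x03AA) if c != 0x03A2]

for letter in ("A", "\u0391", "\u03A0"):  # Latin A, Greek Alpha, Greek Pi
    diff, bits = case_gap(letter)
    print(letter, diff, f"{bits:08b}")

print(all(case_gap(ch)[0] == 32 for ch in latin + greek))
```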
Hebrew
Hebrew is written right to left, which makes entering mixed text on a line difficult. An easier entry method is to use the decimal or Unicode values. The & is needed in front to make the decimal numbers work. Note that several letters have an end-of-word (final) variation. This was due to not having spaces between words initially. The basic alphabet contains 22 letters, all consonants.
letter | decimal | Unicode | name | equivalent |
---|---|---|---|---|
א | #1488; | \U05D0 | ALEPH | a |
ב | #1489; | \U05D1 | BETH | b |
ג | #1490; | \U05D2 | GIMEL | g |
ד | #1491; | \U05D3 | DALET | d |
ה | #1492; | \U05D4 | HE | h |
ו | #1493; | \U05D5 | VAV | v |
ז | #1494; | \U05D6 | ZAYIN | z |
ח | #1495; | \U05D7 | HET | H |
ט | #1496; | \U05D8 | TET | T |
י | #1497; | \U05D9 | YOD | y |
ך | #1498; | \U05DA | final KAF | |
כ | #1499; | \U05DB | KAF | k |
ל | #1500; | \U05DC | LAMED | l |
ם | #1501; | \U05DD | final MEM | |
מ | #1502; | \U05DE | MEM | m |
ן | #1503; | \U05DF | final NUN | |
נ | #1504; | \U05E0 | NUN | n |
ס | #1505; | \U05E1 | SAMEKH | s |
ע | #1506; | \U05E2 | AYIN | j |
ף | #1507; | \U05E3 | final PE | |
פ | #1508; | \U05E4 | PE | p |
ץ | #1509; | \U05E5 | final TSADI | |
צ | #1510; | \U05E6 | TSADI | ts |
ק | #1511; | \U05E7 | QOF | q |
ר | #1512; | \U05E8 | RESH | r |
ת | #1514; | \U05EA | TAV | t |
שׁ | #64298; | \UFB2A | SHIN with shin dot | sh |
שׂ | #64299; | \UFB2B | SHIN with sin dot | s |
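Entering the letters by code point avoids the right-to-left editing problem, because the source text stays in left-to-right ASCII. The following Python sketch builds on the table above: it writes letters as escapes, checks their direction with the standard unicodedata module, and swaps a word's last letter for its final form. The names FINAL_FORMS, with_final_form, and is_right_to_left are made up for this example, and the final-form rule is a simplification that ignores exceptions in real text.

```python
import unicodedata

# Regular letter -> end-of-word form, taken from the table above.
FINAL_FORMS = {
    "\u05DB": "\u05DA",  # KAF -> final KAF
    "\u05DE": "\u05DD",  # MEM -> final MEM
    "\u05E0": "\u05DF",  # NUN -> final NUN
    "\u05E4": "\u05E3",  # PE -> final PE
    "\u05E6": "\u05E5",  # TSADI -> final TSADI
}

def with_final_form(word):
    """Replace the last letter of word with its final form, if it has one."""
    if not word:
        return word
    return word[:-1] + FINAL_FORMS.get(word[-1], word[-1])

def is_right_to_left(ch):
    """True when Unicode classifies the character as right-to-left."""
    return unicodedata.bidirectional(ch) == "R"

word = "\u05DE\u05DC\u05DB"  # MEM, LAMED, KAF in logical (typing) order
print([unicodedata.name(c) for c in with_final_form(word)])
print(is_right_to_left("\u05D0"), is_right_to_left("a"))
```

The string is stored in logical order, first letter first. Reversing it for display is left to the rendering software.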
Western with Script glyphs
Normally, glyph variations in the western alphabet are produced by using different fonts, which control such differences as serif or sans-serif. However, Unicode also assigns completely separate code points to certain representations of the western alphabet. This collection is taken from Named character references, where each name consists of the letter followed by scr and a semicolon. The names can be used in an HTML5-based document, but the characters will not sort properly, because some of them lie outside the contiguous range U+1D49C – U+1D4CF. For example, ℬ would have to be U+1D49D to sort properly, but that code point is reserved and displays nothing. There is no clean solution.
UName | Unicode | Letter | lname | Unicode | letter |
---|---|---|---|---|---|
Ascr; | U+1D49C | 𝒜 | ascr; | U+1D4B6 | 𝒶 |
Bscr; | U+0212C | ℬ | bscr; | U+1D4B7 | 𝒷 |
Cscr; | U+1D49E | 𝒞 | cscr; | U+1D4B8 | 𝒸 |
Dscr; | U+1D49F | 𝒟 | dscr; | U+1D4B9 | 𝒹 |
Escr; | U+02130 | ℰ | escr; | U+0212F | ℯ |
Fscr; | U+02131 | ℱ | fscr; | U+1D4BB | 𝒻 |
Gscr; | U+1D4A2 | 𝒢 | gscr; | U+0210A | ℊ |
Hscr; | U+0210B | ℋ | hscr; | U+1D4BD | 𝒽 |
Iscr; | U+02110 | ℐ | iscr; | U+1D4BE | 𝒾 |
Jscr; | U+1D4A5 | 𝒥 | jscr; | U+1D4BF | 𝒿 |
Kscr; | U+1D4A6 | 𝒦 | kscr; | U+1D4C0 | 𝓀 |
Lscr; | U+02112 | ℒ | lscr; | U+1D4C1 | 𝓁 |
Mscr; | U+02133 | ℳ | mscr; | U+1D4C2 | 𝓂 |
Nscr; | U+1D4A9 | 𝒩 | nscr; | U+1D4C3 | 𝓃 |
Oscr; | U+1D4AA | 𝒪 | oscr; | U+02134 | ℴ |
Pscr; | U+1D4AB | 𝒫 | pscr; | U+1D4C5 | 𝓅 |
Qscr; | U+1D4AC | 𝒬 | qscr; | U+1D4C6 | 𝓆 |
Rscr; | U+0211B | ℛ | rscr; | U+1D4C7 | 𝓇 |
Sscr; | U+1D4AE | 𝒮 | sscr; | U+1D4C8 | 𝓈 |
Tscr; | U+1D4AF | 𝒯 | tscr; | U+1D4C9 | 𝓉 |
Uscr; | U+1D4B0 | 𝒰 | uscr; | U+1D4CA | 𝓊 |
Vscr; | U+1D4B1 | 𝒱 | vscr; | U+1D4CB | 𝓋 |
Wscr; | U+1D4B2 | 𝒲 | wscr; | U+1D4CC | 𝓌 |
Xscr; | U+1D4B3 | 𝒳 | xscr; | U+1D4CD | 𝓍 |
Yscr; | U+1D4B4 | 𝒴 | yscr; | U+1D4CE | 𝓎 |
Zscr; | U+1D4B5 | 𝒵 | zscr; | U+1D4CF | 𝓏 |
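The table can be reproduced in code by offsetting from the first capital (U+1D49C) and the first small letter (U+1D4B6), with an exception list for the letters that live elsewhere. The Python below is a minimal sketch of that idea. The names EXCEPTIONS and to_script are made up for this example, and the exception values are copied from the table above.

```python
SCRIPT_CAPITAL_A = 0x1D49C
SCRIPT_SMALL_A = 0x1D4B6

# Letters whose script form sits outside the contiguous range.
EXCEPTIONS = {
    "B": 0x212C, "E": 0x2130, "F": 0x2131, "H": 0x210B,
    "I": 0x2110, "L": 0x2112, "M": 0x2133, "R": 0x211B,
    "e": 0x212F, "g": 0x210A, "o": 0x2134,
}

def to_script(ch):
    """Map a plain Latin letter to its script glyph; other characters pass through."""
    if ch in EXCEPTIONS:
        return chr(EXCEPTIONS[ch])
    if "A" <= ch <= "Z":
        return chr(SCRIPT_CAPITAL_A + ord(ch) - ord("A"))
    if "a" <= ch <= "z":
        return chr(SCRIPT_SMALL_A + ord(ch) - ord("a"))
    return ch

print("".join(to_script(c) for c in "Alphabet"))
```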
Western with OPF glyphs
Note that this collection is taken from Named character references, where each name consists of the letter followed by opf (open face, called double-struck in Unicode) and a semicolon. The names can be used in an HTML5-based document, but the characters will not sort properly, because some of them lie outside the contiguous range U+1D538 – U+1D56B. For example, ℂ would have to be U+1D53A to sort properly, but that code point is reserved and displays nothing. There is no clean solution. These characters are found in Unicode grouped under Mathematical Alphanumeric Symbols.
UName | Unicode | Letter | lname | Unicode | letter |
---|---|---|---|---|---|
Aopf; | U+1D538 | 𝔸 | aopf; | U+1D552 | 𝕒 |
Bopf; | U+1D539 | 𝔹 | bopf; | U+1D553 | 𝕓 |
Copf; | U+02102 | ℂ | copf; | U+1D554 | 𝕔 |
Dopf; | U+1D53B | 𝔻 | dopf; | U+1D555 | 𝕕 |
Eopf; | U+1D53C | 𝔼 | eopf; | U+1D556 | 𝕖 |
Fopf; | U+1D53D | 𝔽 | fopf; | U+1D557 | 𝕗 |
Gopf; | U+1D53E | 𝔾 | gopf; | U+1D558 | 𝕘 |
Hopf; | U+0210D | ℍ | hopf; | U+1D559 | 𝕙 |
Iopf; | U+1D540 | 𝕀 | iopf; | U+1D55A | 𝕚 |
Jopf; | U+1D541 | 𝕁 | jopf; | U+1D55B | 𝕛 |
Kopf; | U+1D542 | 𝕂 | kopf; | U+1D55C | 𝕜 |
Lopf; | U+1D543 | 𝕃 | lopf; | U+1D55D | 𝕝 |
Mopf; | U+1D544 | 𝕄 | mopf; | U+1D55E | 𝕞 |
Nopf; | U+02115 | ℕ | nopf; | U+1D55F | 𝕟 |
Oopf; | U+1D546 | 𝕆 | oopf; | U+1D560 | 𝕠 |
Popf; | U+02119 | ℙ | popf; | U+1D561 | 𝕡 |
Qopf; | U+0211A | ℚ | qopf; | U+1D562 | 𝕢 |
Ropf; | U+0211D | ℝ | ropf; | U+1D563 | 𝕣 |
Sopf; | U+1D54A | 𝕊 | sopf; | U+1D564 | 𝕤 |
Topf; | U+1D54B | 𝕋 | topf; | U+1D565 | 𝕥 |
Uopf; | U+1D54C | 𝕌 | uopf; | U+1D566 | 𝕦 |
Vopf; | U+1D54D | 𝕍 | vopf; | U+1D567 | 𝕧 |
Wopf; | U+1D54E | 𝕎 | wopf; | U+1D568 | 𝕨 |
Xopf; | U+1D54F | 𝕏 | xopf; | U+1D569 | 𝕩 |
Yopf; | U+1D550 | 𝕐 | yopf; | U+1D56A | 𝕪 |
Zopf; | U+02124 | ℤ | zopf; | U+1D56B | 𝕫 |
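One possible workaround for the sorting problem, offered here as a sketch rather than a clean solution, is to sort on a compatibility-normalized key. Unicode's NFKC normalization folds these glyphs back to the plain Latin letters, so the out-of-range characters such as ℂ land in alphabetical position. The function name plain_key is made up for this example.

```python
import unicodedata

def plain_key(text):
    """Sort key that folds styled mathematical letters to plain Latin letters."""
    return unicodedata.normalize("NFKC", text)

# Double-struck D, C, B, A; only C (U+2102) lies outside the U+1D538 range.
letters = ["\U0001D53B", "\u2102", "\U0001D539", "\U0001D538"]

by_code_point = sorted(letters)              # C sorts first: U+2102 < U+1D538
by_letter = sorted(letters, key=plain_key)   # A, B, C, D

print([f"U+{ord(c):04X}" for c in by_code_point])
print([plain_key(c) for c in by_letter])
```

The stored characters are unchanged; only the comparison uses the folded form.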
Western with FR glyphs
Note that this collection is taken from Named character references, where each name consists of the letter followed by fr (Fraktur) and a semicolon. The names can be used in an HTML5-based document, but the characters will not sort properly, because some of them lie outside the contiguous range U+1D504 – U+1D537. For example, ℭ would have to be U+1D506 to sort properly, but that code point is reserved and displays nothing. There is no clean solution.
UName | Unicode | Letter | lname | Unicode | letter |
---|---|---|---|---|---|
Afr; | U+1D504 | 𝔄 | afr; | U+1D51E | 𝔞 |
Bfr; | U+1D505 | 𝔅 | bfr; | U+1D51F | 𝔟 |
Cfr; | U+0212D | ℭ | cfr; | U+1D520 | 𝔠 |
Dfr; | U+1D507 | 𝔇 | dfr; | U+1D521 | 𝔡 |
Efr; | U+1D508 | 𝔈 | efr; | U+1D522 | 𝔢 |
Ffr; | U+1D509 | 𝔉 | ffr; | U+1D523 | 𝔣 |
Gfr; | U+1D50A | 𝔊 | gfr; | U+1D524 | 𝔤 |
Hfr; | U+0210C | ℌ | hfr; | U+1D525 | 𝔥 |
Ifr; | U+02111 | ℑ | ifr; | U+1D526 | 𝔦 |
Jfr; | U+1D50D | 𝔍 | jfr; | U+1D527 | 𝔧 |
Kfr; | U+1D50E | 𝔎 | kfr; | U+1D528 | 𝔨 |
Lfr; | U+1D50F | 𝔏 | lfr; | U+1D529 | 𝔩 |
Mfr; | U+1D510 | 𝔐 | mfr; | U+1D52A | 𝔪 |
Nfr; | U+1D511 | 𝔑 | nfr; | U+1D52B | 𝔫 |
Ofr; | U+1D512 | 𝔒 | ofr; | U+1D52C | 𝔬 |
Pfr; | U+1D513 | 𝔓 | pfr; | U+1D52D | 𝔭 |
Qfr; | U+1D514 | 𝔔 | qfr; | U+1D52E | 𝔮 |
Rfr; | U+0211C | ℜ | rfr; | U+1D52F | 𝔯 |
Sfr; | U+1D516 | 𝔖 | sfr; | U+1D530 | 𝔰 |
Tfr; | U+1D517 | 𝔗 | tfr; | U+1D531 | 𝔱 |
Ufr; | U+1D518 | 𝔘 | ufr; | U+1D532 | 𝔲 |
Vfr; | U+1D519 | 𝔙 | vfr; | U+1D533 | 𝔳 |
Wfr; | U+1D51A | 𝔚 | wfr; | U+1D534 | 𝔴 |
Xfr; | U+1D51B | 𝔛 | xfr; | U+1D535 | 𝔵 |
Yfr; | U+1D51C | 𝔜 | yfr; | U+1D536 | 𝔶 |
Zfr; | U+02128 | ℨ | zfr; | U+1D537 | 𝔷 |
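The gaps in the range can be found programmatically. The sketch below assumes that a code point with no name in Python's standard unicodedata database is unassigned, and lists every such code point between U+1D504 and U+1D537. The function name unassigned_in_range is made up for this example.

```python
import unicodedata

def unassigned_in_range(first, last):
    """Return the code points in [first, last] that have no Unicode name."""
    return [
        cp for cp in range(first, last + 1)
        if unicodedata.name(chr(cp), None) is None
    ]

holes = unassigned_in_range(0x1D504, 0x1D537)
print([f"U+{cp:05X}" for cp in holes])
```

The five gaps correspond to the letters C, H, I, R, and Z, which are the rows in the table above whose capital form has a U+021xx value.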