Alphabet


Alphabets are the building blocks of writing. The western (or Latin) alphabet can be seen under ASCII. The table in ISO 8859-5 shows the Cyrillic alphabet (also see Windows-1251). The characters can also be entered using their Named character references, which were used to generate the content of this page.

Overview

The alphabet differed from earlier writing in that it was designed to capture the sounds of speech and transfer them to a written form. This method allows a common alphabet to represent more than one language and requires only a few symbols. Earlier writing used symbols that represented full words or syllables, which required many more symbols to capture a language and likely needed a new symbol every time a new word was needed. Since they were not based on sounds per se, they were locked into one or only a few related languages. Today the Latin alphabet is likely the most widely used alphabet. Modern usage has augmented the alphabet in some languages by using accent marks to tweak the sound being represented. Eastern Europe uses the Cyrillic alphabet to capture the sounds of the languages near Asia. Several other alphabets exist that are not shown here.

History

The oldest of the phonetic alphabets shown here is the Hebrew, which differs from later alphabets in that it has neither vowels nor lower case. The symbols shown here are from the modern Hebrew alphabet, which differ completely from their earlier form and were taken from the Aramaic. The Greek followed later. It included vowels and eventually both cases, although initially it had only upper case and no spaces between words. Today most western languages use the same or a similar alphabet, often called the western or Latin alphabet, whose basic characters are encoded in ASCII. The Latin alphabet is also used by some eastern nations.

The Latin alphabet was derived from an earlier Old Italian alphabet, which was derived from the Greek. The original Latin alphabet had 22 letters (varying to 20 with the Y and Z, from the Greek, coming and going). These early Italians felt that the C could be used instead of the Greek G, which is why we have the ABCs instead of the ABGs. Later the Romans decided to import some Greek words, which required adding the G back in and also the Y and Z. During the Middle Ages, German words crept into the language, requiring a W. In the 16th century it was determined that the I and V were being used as both consonants and vowels, so the J and U were added. Later in the 16th century cursive writing came into wide use, and the V, which appeared in a lot of words, was hard to write cursively, so the V and U were swapped to make writing easier. You can still find V's on some buildings that are pronounced as U's today. This also accounts for the W looking like two V's, except when written cursively, when it is two U's.

Alphabets became the choice for writing and developed specific changes that are not related to phonics. These changes, such as lower case and upper case as well as punctuation, are specific to typography and ease of use. These were introduced during the reign of Charlemagne by Alcuin of York in 778 to ease copying Bibles, add more accuracy, and make reading easier. He was the leading scholar and teacher at the Carolingian court.

An alphabet is a subset of a character set, which includes the alphabet, numbers, punctuation, and special purpose characters and symbols. Many languages have adopted the Latin alphabet even though it does not meet their phonetic needs. They modify the alphabet with accent symbols to provide phonetic corrections. This was initially instituted by the Romance languages (French, Italian, Spanish), as they were based on the Latin language. French has the same 26 letters as English. Italian has only 21 letters; it does not have j, k, w, x or y in its alphabet, but they are sometimes seen in adopted foreign words. Spanish officially has 27 letters, adding the Ñ, which is Unicode \U00D1, while ñ is \U00F1.
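
As a small illustration of the \U00D1 and \U00F1 values quoted above, the following Python sketch (standard library only; the variable names are just for illustration) checks both code points and shows that Unicode can also represent the accented letter as a plain n followed by a separate accent mark.

```python
import unicodedata

upper, lower = "\u00D1", "\u00F1"  # the Spanish letter in upper and lower case

# Code points as four hexadecimal digits, matching the notation in the text.
codes = [f"{ord(ch):04X}" for ch in (upper, lower)]

# Canonical decomposition (NFD) splits the letter into a base letter and an accent.
decomposed = unicodedata.normalize("NFD", lower)
parts = [unicodedata.name(ch) for ch in decomposed]

print(codes)  # ['00D1', '00F1']
print(parts)  # ['LATIN SMALL LETTER N', 'COMBINING TILDE']
```

Recomposing with `unicodedata.normalize("NFC", decomposed)` gives back the single character, so both spellings describe the same letter.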

Greek

In its modern form, Greek is the official language of Greece and Cyprus and one of the 24 official languages of the European Union. It is spoken by at least 13.5 million people today in Greece, Cyprus, Italy, Albania, Turkey, and the many other countries of the Greek diaspora. It is also used in mathematics to represent values in formulas. This alphabet contains 24 characters.

In the table below the decimal entity number needs an & in front. The page HTML entities contains the Greek alphabet with its entity names.

ucase decimal Unicode name lcase decimal Unicode
Α #913; \U0391 ALPHA α #945; \U03B1
Β #914; \U0392 BETA β #946; \U03B2
Γ #915; \U0393 GAMMA γ #947; \U03B3
Δ #916; \U0394 DELTA δ #948; \U03B4
Ε #917; \U0395 EPSILON ε #949; \U03B5
Ζ #918; \U0396 ZETA ζ #950; \U03B6
Η #919; \U0397 ETA η #951; \U03B7
Θ #920; \U0398 THETA θ #952; \U03B8
Ι #921; \U0399 IOTA ι #953; \U03B9
Κ #922; \U039A KAPPA κ #954; \U03BA
Λ #923; \U039B LAMBDA λ #955; \U03BB
Μ #924; \U039C MU μ #956; \U03BC
Ν #925; \U039D NU ν #957; \U03BD
Ξ #926; \U039E XI ξ #958; \U03BE
Ο #927; \U039F OMICRON ο #959; \U03BF
Π #928; \U03A0 PI π #960; \U03C0
Ρ #929; \U03A1 RHO ρ #961; \U03C1
Final Sigma ς #962; \U03C2
Σ #931; \U03A3 SIGMA σ #963; \U03C3
Τ #932; \U03A4 TAU τ #964; \U03C4
Υ #933; \U03A5 UPSILON υ #965; \U03C5
Φ #934; \U03A6 PHI φ #966; \U03C6
Χ #935; \U03A7 CHI χ #967; \U03C7
Ψ #936; \U03A8 PSI ψ #968; \U03C8
Ω #937; \U03A9 OMEGA ω #969; \U03C9
Theta symbol ϑ #977; \U03D1
Digamma ϝ #989; \U03DD
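
The decimal and Unicode columns of the table are the same code point written in base 10 and base 16. A minimal Python sketch of that relationship follows; `table_row` is a hypothetical helper written for this example, not something the page itself uses.

```python
import html

# Capital alpha as listed in the table: decimal 913, hexadecimal 0391.
alpha_from_decimal = html.unescape("&#913;")
alpha_from_hex = html.unescape("&#x0391;")

def table_row(ch):
    """Format one character as a decimal reference plus its hexadecimal code."""
    code = ord(ch)
    return f"&#{code}; \\U{code:04X}"

print(alpha_from_decimal == alpha_from_hex == "\u0391")  # True
print(table_row("\u03c9"))  # &#969; \U03C9  (small omega)
```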

For variations and ancient forms see: Greek Unicode Entities at Penn State University. An online keyboard is available at https://www.lexilogos.com/keyboard/greek_ancient.htm. Click the characters to create the text (or use the equivalents on your keyboard), then copy the result and paste it where you need it. Note that this includes accent marks. You can also use the Latin keyboard to type Greek using https://www.translatum.gr/converter/beta-code.htm. Note that in this alphabet, as in the Latin, each lower case letter is exactly 32 (hex 20) above its upper case form. For the Latin letters and for Α through Ο this amounts to a single binary digit (0010 0000); from Π onward the addition carries into the next bit.
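
The relationship between the two cases can be checked directly from the code points in the table. The Python sketch below (the helper name `case_gap` is only for illustration) prints the arithmetic difference and the differing bits for a few pairs. In every pair the lower case code point is exactly 32 above the upper case one; for the Latin letters and for alpha through omicron that is a single bit, while from pi onward two bits change.

```python
def case_gap(upper, lower):
    """Return (lower - upper, upper XOR lower) for two single characters."""
    return ord(lower) - ord(upper), ord(upper) ^ ord(lower)

pairs = [
    ("A", "a"),            # Latin A
    ("Z", "z"),            # Latin Z
    ("\u0391", "\u03B1"),  # Greek alpha
    ("\u039F", "\u03BF"),  # Greek omicron
    ("\u03A0", "\u03C0"),  # Greek pi
    ("\u03A9", "\u03C9"),  # Greek omega
]

results = [case_gap(up, lo) for up, lo in pairs]
for (up, lo), (diff, xor) in zip(pairs, results):
    print(up, lo, diff, f"{xor:08b}")
# Every difference is 32; the XOR is 00100000 for the first four pairs
# and 01100000 for pi and omega.
```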

Hebrew

The characters run right to left, making entry difficult for mixed text on a line. An easier entry method is to use the decimal or Unicode entries. The & is needed to make the decimal numbers work. Note that several letters have end-of-word (final) variations. This was due to not having spaces between words initially. The basic alphabet contains 22 letters, all consonants.

letter decimal Unicode name equivalent
א #1488; \U05D0 ALEPH a
ב #1489; \U05D1 BETH b
ג #1490; \U05D2 GIMEL g
ד #1491; \U05D3 DALET d
ה #1492; \U05D4 HE h
ו #1493; \U05D5 VAV v
ז #1494; \U05D6 ZAYIN z
ח #1495; \U05D7 HET H
ט #1496; \U05D8 TET T
י #1497; \U05D9 YOD y
ך #1498; \U05DA final KAF
כ #1499; \U05DB KAF k
ל #1500; \U05DC LAMED l
ם #1501; \U05DD final MEM
מ #1502; \U05DE MEM m
ן #1503; \U05DF final NUN
נ #1504; \U05E0 NUN n
ס #1505; \U05E1 SAMEKH s
ע #1506; \U05E2 AYIN j
ף #1507; \U05E3 final PE
פ #1508; \U05E4 PE p
ץ #1509; \U05E5 final TSADI
צ #1510; \U05E6 TSADI ts
ק #1511; \U05E7 QOF q
ר #1512; \U05E8 RESH r
ש #1513; \U05E9 SHIN
ת #1514; \U05EA TAV t
שׁ #64298; \UFB2A SHIN with shin dot sh
שׂ #64299; \UFB2B SHIN with sin dot s
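
Entering the letters as decimal references avoids typing right to left: the references are written in reading order and the display engine handles the direction. A minimal Python sketch, assuming the three-letter word built from ALEPH, MEM and final NUN as an arbitrary example:

```python
import html
import unicodedata

# Decimal values from the table: ALEPH 1488, MEM 1502, final NUN 1503.
refs = "&#1488;&#1502;&#1503;"
word = html.unescape(refs)

codes = [f"{ord(ch):04X}" for ch in word]
names = [unicodedata.name(ch) for ch in word]
# Bidirectional class "R" marks a right-to-left character.
bidi = {unicodedata.bidirectional(ch) for ch in word}

print(codes)      # ['05D0', '05DE', '05DF']
print(names[-1])  # HEBREW LETTER FINAL NUN
print(bidi)       # {'R'}
```

The names printed by `unicodedata.name` are the official Unicode names, which spell a few letters differently from the table (for example ALEF rather than ALEPH).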

Western with Script glyphs

Normally, glyph variations of the western alphabet are handled by choosing a different font, which controls differences such as serif or sans-serif. However, Unicode also assigns separate code points to certain representations of the western alphabet. Note that this collection is taken from Named character references, where each name includes the letters scr followed by a semicolon. The characters can be used in an HTML5 based document, but they will not sort properly, because some of them fall outside the range U+1D49C – U+1D4CF. For example ℬ is U+0212C; the position it would occupy in the range, U+1D49D, is reserved and has no character assigned. There is no clean solution.

UName Unicode Letter lname Unicode letter
Ascr; U+1D49C 𝒜 ascr; U+1D4B6 𝒶
Bscr; U+0212C ℬ bscr; U+1D4B7 𝒷
Cscr; U+1D49E 𝒞 cscr; U+1D4B8 𝒸
Dscr; U+1D49F 𝒟 dscr; U+1D4B9 𝒹
Escr; U+02130 ℰ escr; U+0212F ℯ
Fscr; U+02131 ℱ fscr; U+1D4BB 𝒻
Gscr; U+1D4A2 𝒢 gscr; U+0210A ℊ
Hscr; U+0210B ℋ hscr; U+1D4BD 𝒽
Iscr; U+02110 ℐ iscr; U+1D4BE 𝒾
Jscr; U+1D4A5 𝒥 jscr; U+1D4BF 𝒿
Kscr; U+1D4A6 𝒦 kscr; U+1D4C0 𝓀
Lscr; U+02112 ℒ lscr; U+1D4C1 𝓁
Mscr; U+02133 ℳ mscr; U+1D4C2 𝓂
Nscr; U+1D4A9 𝒩 nscr; U+1D4C3 𝓃
Oscr; U+1D4AA 𝒪 oscr; U+02134 ℴ
Pscr; U+1D4AB 𝒫 pscr; U+1D4C5 𝓅
Qscr; U+1D4AC 𝒬 qscr; U+1D4C6 𝓆
Rscr; U+0211B ℛ rscr; U+1D4C7 𝓇
Sscr; U+1D4AE 𝒮 sscr; U+1D4C8 𝓈
Tscr; U+1D4AF 𝒯 tscr; U+1D4C9 𝓉
Uscr; U+1D4B0 𝒰 uscr; U+1D4CA 𝓊
Vscr; U+1D4B1 𝒱 vscr; U+1D4CB 𝓋
Wscr; U+1D4B2 𝒲 wscr; U+1D4CC 𝓌
Xscr; U+1D4B3 𝒳 xscr; U+1D4CD 𝓍
Yscr; U+1D4B4 𝒴 yscr; U+1D4CE 𝓎
Zscr; U+1D4B5 𝒵 zscr; U+1D4CF 𝓏
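
The gaps in the script range can be found programmatically. The Python sketch below is one possible check, not part of the original page: it computes each capital's expected position by offset from U+1D49C and asks the standard `unicodedata` module whether a character is assigned there. The constant and function names are made up for this example.

```python
import string
import unicodedata

SCRIPT_CAPITAL_A = 0x1D49C  # first code point of the range in the table

def script_capital(letter):
    """Return the script capital at the expected offset, or None if unassigned."""
    candidate = chr(SCRIPT_CAPITAL_A + ord(letter) - ord("A"))
    # General category "Cn" means no character is assigned to the code point.
    return None if unicodedata.category(candidate) == "Cn" else candidate

holes = [c for c in string.ascii_uppercase if script_capital(c) is None]
print(holes)                       # ['B', 'E', 'F', 'H', 'I', 'L', 'M', 'R']
print(unicodedata.name("\u212C"))  # SCRIPT CAPITAL B
```

The eight letters reported as holes are the same capitals whose table rows carry a code point below U+10000.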

Western with OPF glyphs

Note that this collection is taken from Named character references, where each name includes the letters opf followed by a semicolon. The characters can be used in an HTML5 based document, but they will not sort properly, because some of them fall outside the range U+1D538 – U+1D56B. For example ℂ is U+02102; the position it would occupy in the range, U+1D53A, is reserved and has no character assigned. There is no clean solution. Most of these characters are found in Unicode grouped under Mathematical Alphanumeric Symbols; the exceptions are grouped under Letterlike Symbols.

UName Unicode Letter lname Unicode letter
Aopf; U+1D538 𝔸 aopf; U+1D552 𝕒
Bopf; U+1D539 𝔹 bopf; U+1D553 𝕓
Copf; U+02102 ℂ copf; U+1D554 𝕔
Dopf; U+1D53B 𝔻 dopf; U+1D555 𝕕
Eopf; U+1D53C 𝔼 eopf; U+1D556 𝕖
Fopf; U+1D53D 𝔽 fopf; U+1D557 𝕗
Gopf; U+1D53E 𝔾 gopf; U+1D558 𝕘
Hopf; U+0210D ℍ hopf; U+1D559 𝕙
Iopf; U+1D540 𝕀 iopf; U+1D55A 𝕚
Jopf; U+1D541 𝕁 jopf; U+1D55B 𝕛
Kopf; U+1D542 𝕂 kopf; U+1D55C 𝕜
Lopf; U+1D543 𝕃 lopf; U+1D55D 𝕝
Mopf; U+1D544 𝕄 mopf; U+1D55E 𝕞
Nopf; U+02115 ℕ nopf; U+1D55F 𝕟
Oopf; U+1D546 𝕆 oopf; U+1D560 𝕠
Popf; U+02119 ℙ popf; U+1D561 𝕡
Qopf; U+0211A ℚ qopf; U+1D562 𝕢
Ropf; U+0211D ℝ ropf; U+1D563 𝕣
Sopf; U+1D54A 𝕊 sopf; U+1D564 𝕤
Topf; U+1D54B 𝕋 topf; U+1D565 𝕥
Uopf; U+1D54C 𝕌 uopf; U+1D566 𝕦
Vopf; U+1D54D 𝕍 vopf; U+1D567 𝕧
Wopf; U+1D54E 𝕎 wopf; U+1D568 𝕨
Xopf; U+1D54F 𝕏 xopf; U+1D569 𝕩
Yopf; U+1D550 𝕐 yopf; U+1D56A 𝕪
Zopf; U+02124 ℤ zopf; U+1D56B 𝕫
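
One possible workaround for the sorting problem, offered here as a sketch rather than a recommendation from the original page, is to sort on a normalized key instead of the raw code point. Unicode compatibility normalization (NFKC) folds these styled letters back to their plain Latin counterparts; the function name `sort_key` is an assumption for this example.

```python
import unicodedata

def sort_key(symbol):
    """Fold a styled letter to its plain counterpart via NFKC normalization."""
    return unicodedata.normalize("NFKC", symbol)

# Zopf, Aopf, Copf, Bopf in deliberately scrambled order.
letters = ["\u2124", "\U0001D538", "\u2102", "\U0001D539"]

by_code_point = sorted(letters)
by_letter = sorted(letters, key=sort_key)

print([sort_key(ch) for ch in by_code_point])  # ['C', 'Z', 'A', 'B']
print([sort_key(ch) for ch in by_letter])      # ['A', 'B', 'C', 'Z']
```

The key discards the styling, so script, double-struck and Fraktur forms of the same letter would compare as equal; a secondary key would be needed to keep them apart.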

Western with FR glyphs

Note that this collection is taken from Named character references, where each name includes the letters fr followed by a semicolon. The characters can be used in an HTML5 based document, but they will not sort properly, because some of them fall outside the range U+1D504 – U+1D537. For example ℭ is U+0212D; the position it would occupy in the range, U+1D506, is reserved and has no character assigned. There is no clean solution.

UName Unicode Letter lname Unicode letter
Afr; U+1D504 𝔄 afr; U+1D51E 𝔞
Bfr; U+1D505 𝔅 bfr; U+1D51F 𝔟
Cfr; U+0212D ℭ cfr; U+1D520 𝔠
Dfr; U+1D507 𝔇 dfr; U+1D521 𝔡
Efr; U+1D508 𝔈 efr; U+1D522 𝔢
Ffr; U+1D509 𝔉 ffr; U+1D523 𝔣
Gfr; U+1D50A 𝔊 gfr; U+1D524 𝔤
Hfr; U+0210C ℌ hfr; U+1D525 𝔥
Ifr; U+02111 ℑ ifr; U+1D526 𝔦
Jfr; U+1D50D 𝔍 jfr; U+1D527 𝔧
Kfr; U+1D50E 𝔎 kfr; U+1D528 𝔨
Lfr; U+1D50F 𝔏 lfr; U+1D529 𝔩
Mfr; U+1D510 𝔐 mfr; U+1D52A 𝔪
Nfr; U+1D511 𝔑 nfr; U+1D52B 𝔫
Ofr; U+1D512 𝔒 ofr; U+1D52C 𝔬
Pfr; U+1D513 𝔓 pfr; U+1D52D 𝔭
Qfr; U+1D514 𝔔 qfr; U+1D52E 𝔮
Rfr; U+0211C ℜ rfr; U+1D52F 𝔯
Sfr; U+1D516 𝔖 sfr; U+1D530 𝔰
Tfr; U+1D517 𝔗 tfr; U+1D531 𝔱
Ufr; U+1D518 𝔘 ufr; U+1D532 𝔲
Vfr; U+1D519 𝔙 vfr; U+1D533 𝔳
Wfr; U+1D51A 𝔚 wfr; U+1D534 𝔴
Xfr; U+1D51B 𝔛 xfr; U+1D535 𝔵
Yfr; U+1D51C 𝔜 yfr; U+1D536 𝔶
Zfr; U+02128 ℨ zfr; U+1D537 𝔷
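
The named character references in these tables can be resolved outside a browser as well. As a minimal sketch, Python's standard `html.unescape` knows the HTML5 names, so the rows above can be spot-checked; the sample list of names is arbitrary.

```python
import html

names = ["Afr", "Bfr", "Cfr", "zfr"]

# Wrap each name as a character reference (&name;) and resolve it.
resolved = {name: html.unescape(f"&{name};") for name in names}

rows = [f"{name}; U+{ord(ch):05X}" for name, ch in resolved.items()]
for row in rows:
    print(row)
# Afr; U+1D504
# Bfr; U+1D505
# Cfr; U+0212D
# zfr; U+1D537
```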