Windows-1251 (also known as code page CP1251) is a popular 8-bit character encoding designed to cover languages that use the Cyrillic alphabet.
Introduction
The original ASCII code was designed to work in 7 bits, which offers 128 separate characters. The first 32 (0 - 31) were reserved for control codes, but most of the rest are printable. Character 127 was defined as the delete (DEL) control code, but some devices show it as a small box. Since most data in a computer is stored in 8-bit bytes, there are another 128 codes (128 - 255) that can be used. Windows-1251 uses these codes for the Cyrillic alphabet. It is much more commonly used than the ISO 8859-5 standard, which was intended for the same purpose but was never really adopted by these users. (ISO 8859-5 is missing some Ukrainian characters.) In the future, both may eventually give way to Unicode, which is the preferred character set.
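As a rough illustration of the difference between the two 8-bit encodings, the following sketch uses Python's built-in `cp1251` and `iso8859_5` codecs. The choice of the Ukrainian letter Ґ as the test character is an assumption made for this example; it is one letter that Windows-1251 includes and that Python's ISO 8859-5 codec cannot encode.

```python
# Compare how two 8-bit Cyrillic encodings handle the Ukrainian letter
# Ґ (U+0490). The test character is chosen for illustration only.
text = "\u0490"  # Ґ

# Windows-1251 stores the letter in a single byte above 127.
cp1251_bytes = text.encode("cp1251")
print(list(cp1251_bytes))

# ISO 8859-5 has no slot for this letter, so encoding fails.
try:
    text.encode("iso8859_5")
    iso_supported = True
except UnicodeEncodeError:
    iso_supported = False
print(iso_supported)

# ASCII characters keep their 7-bit codes in Windows-1251.
ascii_bytes = "A".encode("cp1251")
print(list(ascii_bytes))
```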
1251 Code page layout
The following table shows Windows-1251. Each character is shown with its decimal code and its Unicode equivalent.
Note that the codes shown are for reference only. Entering them will not normally generate these byte values; depending on the reader, the equivalent ISO or UTF-8 values will likely be generated instead (see special characters). The full Windows-1251 set also includes the ASCII values. Only the codes unique to this encoding (128 - 255) are shown here.
| Unicode | Character | Code | Description[1] |
| --- | --- | --- | --- |
| U+0402 | Ђ | 128 | |
| U+0403 | Ѓ | 129 | |
| U+201A | ‚ | 130 | single low-9 quote |
| U+0453 | ѓ | 131 | |
| U+201E | „ | 132 | double low-9 quote |
| U+2026 | … | 133 | ellipsis |
| U+2020 | † | 134 | dagger |
| U+2021 | ‡ | 135 | double dagger |
| U+20AC | € | 136 | Euro sign[2] |
| U+2030 | ‰ | 137 | per mille |
| U+0409 | Љ | 138 | |
| U+2039 | ‹ | 139 | left arrow quote |
| U+040A | Њ | 140 | |
| U+040C | Ќ | 141 | |
| U+040B | Ћ | 142 | |
| U+040F | Џ | 143 | |
| U+0452 | ђ | 144 | |
| U+2018 | ‘ | 145 | left single curly quote |
| U+2019 | ’ | 146 | right single curly quote |
| U+201C | “ | 147 | left double curly quote |
| U+201D | ” | 148 | right double curly quote |
| U+2022 | • | 149 | bullet |
| U+2013 | – | 150 | normal dash |
| U+2014 | — | 151 | wide dash |
| | | 152 | undefined[2] |
| U+2122 | ™ | 153 | trade mark |
| U+0459 | љ | 154 | |
| U+203A | › | 155 | right arrow quote |
| U+045A | њ | 156 | |
| U+045C | ќ | 157 | |
| U+045B | ћ | 158 | |
| U+045F | џ | 159 | |
| U+00A0 | | 160 | NBSP |
| U+040E | Ў | 161 | |
| U+045E | ў | 162 | |
| U+0408 | Ј | 163 | |
| U+00A4 | ¤ | 164 | currency |
| U+0490 | Ґ | 165 | |
| U+00A6 | ¦ | 166 | broken vertical bar |
| U+00A7 | § | 167 | section sign |
| U+0401 | Ё | 168 | |
| U+00A9 | © | 169 | copyright |
| U+0404 | Є | 170 | |
| U+00AB | « | 171 | left angle quote |
| U+00AC | ¬ | 172 | not sign |
| U+00AD | | 173 | SHY (soft hyphen) |
| U+00AE | ® | 174 | registered trademark |
| U+0407 | Ї | 175 | |
| U+00B0 | ° | 176 | Degree sign |
| U+00B1 | ± | 177 | Plus-minus sign |
| U+0406 | І | 178 | |
| U+0456 | і | 179 | |
| U+0491 | ґ | 180 | |
| U+00B5 | µ | 181 | Micro sign |
| U+00B6 | ¶ | 182 | paragraph sign |
| U+00B7 | · | 183 | middle dot |
| U+0451 | ё | 184 | |
| U+2116 | № | 185 | Numero sign[3] |
| U+0454 | є | 186 | |
| U+00BB | » | 187 | right angle quote |
| U+0458 | ј | 188 | |
| U+0405 | Ѕ | 189 | |
| U+0455 | ѕ | 190 | |
| U+0457 | ї | 191 | |

| Unicode | Character | Code | Name |
| --- | --- | --- | --- |
| U+0410 | А | 192 | A |
| U+0411 | Б | 193 | Be |
| U+0412 | В | 194 | Ve |
| U+0413 | Г | 195 | Ge |
| U+0414 | Д | 196 | De |
| U+0415 | Е | 197 | E |
| U+0416 | Ж | 198 | Zhe |
| U+0417 | З | 199 | Ze |
| U+0418 | И | 200 | I |
| U+0419 | Й | 201 | short I |
| U+041A | К | 202 | Ka |
| U+041B | Л | 203 | El |
| U+041C | М | 204 | Em |
| U+041D | Н | 205 | En/Ne |
| U+041E | О | 206 | O |
| U+041F | П | 207 | Pe |
| U+0420 | Р | 208 | Er/Re |
| U+0421 | С | 209 | Es |
| U+0422 | Т | 210 | Te |
| U+0423 | У | 211 | U |
| U+0424 | Ф | 212 | Ef/Fe |
| U+0425 | Х | 213 | Kha |
| U+0426 | Ц | 214 | Tse |
| U+0427 | Ч | 215 | Che |
| U+0428 | Ш | 216 | Sha |
| U+0429 | Щ | 217 | Shcha, Shta |
| U+042A | Ъ | 218 | hard sign (yer) |
| U+042B | Ы | 219 | * Russian |
| U+042C | Ь | 220 | * Russian |
| U+042D | Э | 221 | * Russian |
| U+042E | Ю | 222 | Yu |
| U+042F | Я | 223 | Ya |
| U+0430 | а | 224 | |
| U+0431 | б | 225 | |
| U+0432 | в | 226 | |
| U+0433 | г | 227 | |
| U+0434 | д | 228 | |
| U+0435 | е | 229 | |
| U+0436 | ж | 230 | |
| U+0437 | з | 231 | |
| U+0438 | и | 232 | |
| U+0439 | й | 233 | |
| U+043A | к | 234 | |
| U+043B | л | 235 | |
| U+043C | м | 236 | |
| U+043D | н | 237 | |
| U+043E | о | 238 | |
| U+043F | п | 239 | |
| U+0440 | р | 240 | |
| U+0441 | с | 241 | |
| U+0442 | т | 242 | |
| U+0443 | у | 243 | |
| U+0444 | ф | 244 | |
| U+0445 | х | 245 | |
| U+0446 | ц | 246 | |
| U+0447 | ч | 247 | |
| U+0448 | ш | 248 | |
| U+0449 | щ | 249 | |
| U+044A | ъ | 250 | |
| U+044B | ы | 251 | |
| U+044C | ь | 252 | |
| U+044D | э | 253 | |
| U+044E | ю | 254 | |
| U+044F | я | 255 | |
- [1] Unless otherwise indicated, the codes listed in the description are the same as in Windows-1252 and/or ISO-8859-1.
- [2] Not the same code as in Windows-1252.
- [3] Not the same as in ISO-8859-1.
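The layout can be cross-checked against a codec implementation. The sketch below assumes Python's built-in `cp1251` codec as the reference; the helper name `cp1251_table` is made up for this example. It decodes each byte value from 128 to 255 and reports the Unicode code point, recording `None` for any value the codec treats as undefined.

```python
def cp1251_table(start=128, stop=256):
    """Return (code, character, 'U+XXXX') rows for the given byte range.

    Hypothetical helper for illustration. Byte values that Python's
    cp1251 codec cannot decode are reported as (code, None, None).
    """
    rows = []
    for code in range(start, stop):
        try:
            char = bytes([code]).decode("cp1251")
        except UnicodeDecodeError:
            rows.append((code, None, None))
            continue
        rows.append((code, char, f"U+{ord(char):04X}"))
    return rows


# Print the first few rows in the same order as the table columns.
for code, char, codepoint in cp1251_table()[:8]:
    print(codepoint, repr(char), code)
```

Looking up individual rows this way is a quick check on a hand-maintained table, for instance that code 133 corresponds to U+2026 and code 236 to U+043C.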
Coverage
Languages written in the Cyrillic alphabet, such as Russian. These include:
- Bulgarian
- Belarusian
- Russian
- Macedonian
- Serbian Cyrillic
- Ukrainian
- Azeri
- Kyrgyz
- Mongolian
- Tatar
- Uzbek
It is the most widely used 8-bit character set for encoding Bulgarian, Serbian and Macedonian.
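Whether a particular piece of text is covered can be tested directly. The following is a minimal sketch that assumes Python's `cp1251` codec; the function name `fits_cp1251` and the sample words are illustrative choices, not part of any standard.

```python
def fits_cp1251(text):
    """Return True if every character of text can be encoded in Windows-1251.

    Hypothetical helper for illustration; relies on Python's cp1251 codec.
    """
    try:
        text.encode("cp1251")
    except UnicodeEncodeError:
        return False
    return True


# Sample words chosen for this example only.
samples = {
    "Russian": "Привет",
    "Bulgarian": "Здравей",
    "Ukrainian": "Привіт",
    "Serbian": "Ћирилица",
}

for language, word in samples.items():
    print(language, word, fits_cp1251(word))
```

A check like this is safer than assuming coverage from the language name alone, since a text may contain letters or symbols that have no slot in the 8-bit table.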
Language specific characters
- Ukrainian and Belarusian characters: there are four special letters, Ґ, Є, І, and Ї, in both upper and lower case.
- Serbian and Macedonian characters: there are five special characters, Ђ, Љ, Њ, Ћ, and Џ, in both upper and lower case.
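The byte values of the letters listed above can be looked up as follows. This sketch assumes Python's `cp1251` codec, and the helper name `cp1251_codes` is invented for the example; the letters are written as escapes so that Cyrillic І cannot be confused with Latin I.

```python
def cp1251_codes(text):
    """Map each character of text to its single-byte Windows-1251 code.

    Hypothetical helper for illustration; relies on Python's cp1251 codec.
    """
    return {ch: ch.encode("cp1251")[0] for ch in text}


# Ґ Є І Ї and their lower-case forms ґ є і ї
ukrainian_belarusian = "\u0490\u0404\u0406\u0407\u0491\u0454\u0456\u0457"
# Ђ Љ Њ Ћ Џ and their lower-case forms ђ љ њ ћ џ
serbian_macedonian = "\u0402\u0409\u040a\u040b\u040f\u0452\u0459\u045a\u045b\u045f"

for ch, code in cp1251_codes(ukrainian_belarusian + serbian_macedonian).items():
    print(ch, code)
```

Unlike the basic alphabet, where lower case is always upper case plus 32, these letters are scattered through the 128 - 191 range, so case conversion needs a lookup rather than simple arithmetic.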
For more information