Base64

From MobileRead

Base64 is an encoding scheme that allows binary files to be represented as ASCII text. It is used, for example, to carry images in email.

Overview

Base64 uses only printable characters to encode the data. It uses 64 of them, which is where the name comes from. Each of the 64 characters carries 6 bits, so 8-bit bytes cannot map one-to-one onto characters and the data must grow: 4 characters represent 3 bytes, increasing the size by about 33%. For email the actual increase is closer to 37%, because line lengths must be limited (MIME allows at most 76 characters per encoded line), which adds a carriage return and line feed to every line, and because of the added header data.
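
The size arithmetic above can be checked with a short Python sketch. The function names `encoded_length` and `wrapped_length` are made up for this illustration, and the wrapped figure assumes every line, including the last, ends with a two-character CR LF pair. Header overhead is not modelled.

```python
import base64

def encoded_length(n_bytes: int) -> int:
    """Characters of padded Base64 output for n_bytes of input, without line breaks."""
    # Every group of up to 3 input bytes becomes exactly 4 output characters.
    return (n_bytes + 2) // 3 * 4

def wrapped_length(n_bytes: int, line_len: int = 76) -> int:
    """Same as encoded_length, plus a CR LF pair after every line (assumed layout)."""
    chars = encoded_length(n_bytes)
    lines = (chars + line_len - 1) // line_len
    return chars + 2 * lines

n = 3000
# Cross-check the formula against the standard library encoder.
assert len(base64.b64encode(bytes(n))) == encoded_length(n)

print(f"unwrapped: {encoded_length(n) / n:.3f}x")        # about 1.333
print(f"wrapped at 76: {wrapped_length(n) / n:.3f}x")     # about 1.369
print(f"wrapped at 72: {wrapped_length(n, 72) / n:.3f}x") # about 1.371
```

Either line length gives an increase of roughly 37% before headers are counted.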

Base64 is used to transmit binary data (such as ZIP files and images) inside email messages, and to embed images in RTF, AbiWord (ABW), and FB2 documents. It is also used in data URIs, which allow images to be included inline in an HTML or XML document. It is a MIME (Multipurpose Internet Mail Extensions) encoding standard for data transfer.

The mapping usually uses the 26 uppercase letters, the 26 lowercase letters, and the digits 0-9, which accounts for only 62 characters. The last two vary with the implementation; + and / are the most common choices. Because the number of input bytes is not always a multiple of three, the "=" sign is used to pad the final group of characters when needed. When the decoder encounters "=" it knows the input data has ended.

Encoded data can be split over multiple files, which are concatenated before decoding. Some tools split large files automatically.
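
As an illustration of padding, alphabet variants, data URIs, and rejoining split data, here is a minimal sketch using Python's standard `base64` module. The helper `data_uri` and the sample inputs are invented for this example, and the "-" and "_" alphabet shown is just one common alternative to + and /.

```python
import base64

# Padding: the input length modulo 3 decides how many "=" signs are appended.
for raw in (b"M", b"Ma", b"Man"):
    print(raw, base64.b64encode(raw).decode("ascii"))  # TQ==, TWE=, TWFu

# The last two characters of the alphabet differ between variants.
tricky = b"\xfb\xff"
standard = base64.b64encode(tricky).decode("ascii")          # "+/8="
url_safe = base64.urlsafe_b64encode(tricky).decode("ascii")  # "-_8="

def data_uri(mime_type: str, raw: bytes) -> str:
    """Hypothetical helper: wrap raw bytes in a base64 data URI."""
    payload = base64.b64encode(raw).decode("ascii")
    return f"data:{mime_type};base64,{payload}"

# Encoded text that was split into pieces is joined again before decoding.
encoded = base64.b64encode(b"hello, base64 world").decode("ascii")
parts = [encoded[:10], encoded[10:]]
restored = base64.b64decode("".join(parts))
```

The string returned by `data_uri("image/png", png_bytes)` could be placed in the `src` attribute of an HTML `img` element, where `png_bytes` stands for the contents of an image file.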

The Base64 index table:

Value Char   Value Char   Value Char   Value Char
    0 A         16 Q         32 g         48 w
    1 B         17 R         33 h         49 x
    2 C         18 S         34 i         50 y
    3 D         19 T         35 j         51 z
    4 E         20 U         36 k         52 0
    5 F         21 V         37 l         53 1
    6 G         22 W         38 m         54 2
    7 H         23 X         39 n         55 3
    8 I         24 Y         40 o         56 4
    9 J         25 Z         41 p         57 5
   10 K         26 a         42 q         58 6
   11 L         27 b         43 r         59 7
   12 M         28 c         44 s         60 8
   13 N         29 d         45 t         61 9
   14 O         30 e         46 u         62 +
   15 P         31 f         47 v         63 /
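
To show how the index table is applied, the following sketch encodes bytes by hand. It is an illustrative implementation only: the names `ALPHABET` and `encode` are chosen for this example, and real code would normally call a library routine such as Python's `base64.b64encode`.

```python
import base64
import string

# The index table as a string: position i holds the character for value i.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def encode(data: bytes) -> str:
    """Encode bytes with the table above, padding the final group with '='."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        pad = 3 - len(chunk)                      # 0, 1 or 2 missing bytes
        # Pack the (zero-filled) 3 bytes into one 24-bit integer.
        n = int.from_bytes(chunk + b"\x00" * pad, "big")
        # Slice the 24 bits into four 6-bit values and look each one up.
        chars = [ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        if pad:
            # Characters that carry no input bits are replaced by padding.
            chars[4 - pad:] = "=" * pad
        out.append("".join(chars))
    return "".join(out)

print(encode(b"Man"))  # TWFu
print(encode(b"Ma"))   # TWE=
print(encode(b"M"))    # TQ==
```

For "Man" the three bytes 0x4D 0x61 0x6E split into the 6-bit values 19, 22, 5 and 46, which the table maps to T, W, F and u.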

Similar coding schemes

Other attempts to encode 8-bit and even Unicode data as 7-bit ASCII include:
