The Base32 encoding
Base32 is an encoding that uses an alphabet of 32 distinct characters. Generally, Base32 represents byte strings, much as hexadecimal does. Base32 has a number of advantages as an encoding. For one, it is more space efficient than hexadecimal: each character carries 5 bits instead of 4, so the same data takes about 20% less space (roughly 80% of the hexadecimal length).
It also:
- is entirely one case, making it easier to dictate over the phone or to remember
- can be written into a URL without percent-encoding
- can be used as a filename
- often omits easily confusable character pairs (1 & I, 0 & O, 8 & B)
This makes Base32 a great choice when raw data needs to be passed between people. However, its cousin, Base64, is more effective when arbitrary data needs to be represented using a limited character set because of its efficient use of space.
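The space trade-off between the three encodings can be checked with a short sketch. It assumes Python's standard `base64` module and an arbitrary 20-byte sample value (`data` is purely illustrative); other input lengths give slightly different ratios once padding is involved.

```python
import base64

# 20 arbitrary bytes (160 bits); the exact values do not matter here.
data = bytes(range(20))

hex_text = data.hex()                              # 4 bits per character
b32_text = base64.b32encode(data).decode("ascii")  # 5 bits per character
b64_text = base64.b64encode(data).decode("ascii")  # 6 bits per character

# Character counts for the same 20 bytes in each encoding.
print(len(hex_text), len(b32_text), len(b64_text))
```

For this input the Base32 text is 32 characters against 40 for hexadecimal, while Base64 is shorter still at 28 characters (including one padding character).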
Padding & the alphabet
In some Base32 implementations, a padding character is appended so that the encoded string's length is a multiple of 8 characters. Other implementations omit padding, since the number of missing characters can be inferred from the length of the encoded string.
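As a minimal sketch of both conventions, the helpers below strip and restore padding around Python's standard `base64.b32encode` and `base64.b32decode`, which produce and expect padded input by default. The names `strip_padding` and `restore_padding` are hypothetical helpers introduced here for illustration, not part of any library.

```python
import base64

def strip_padding(encoded: str) -> str:
    """Drop trailing '=' characters from a padded Base32 string."""
    return encoded.rstrip("=")

def restore_padding(encoded: str) -> str:
    """Re-append '=' until the length is a multiple of 8 characters."""
    return encoded + "=" * (-len(encoded) % 8)

padded = base64.b32encode(b"hi").decode("ascii")   # padded form
unpadded = strip_padding(padded)                   # compact form
roundtrip = base64.b32decode(restore_padding(unpadded))

print(padded, unpadded, roundtrip)
```

Because padded Base32 always comes in blocks of 8 characters, the amount of padding to restore follows from the unpadded length alone.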
A Base32 alphabet uses a set of 32 digits to represent 5-bit values (2^5 = 32). An alphabet can be designed to maximize performance and readability, ensure cohesion with other systems, or be otherwise tailored for a specific use case. Shown below is the standard RFC 4648 Base32 alphabet, though other popular alphabets include z-base-32, Crockford's Base32, and base32hex.
Value | Mark | Value | Mark | Value | Mark | Value | Mark
---|---|---|---|---|---|---|---
0 | A | 8 | I | 16 | Q | 24 | Y
1 | B | 9 | J | 17 | R | 25 | Z
2 | C | 10 | K | 18 | S | 26 | 2
3 | D | 11 | L | 19 | T | 27 | 3
4 | E | 12 | M | 20 | U | 28 | 4
5 | F | 13 | N | 21 | V | 29 | 5
6 | G | 14 | O | 22 | W | 30 | 6
7 | H | 15 | P | 23 | X | 31 | 7
padding | = | | | | | |
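
To show how the table maps 5-bit values to characters, here is a rough sketch of an encoder built directly on the RFC 4648 alphabet. The function name `b32encode_sketch` and its `pad` parameter are assumptions made for this example; in practice a library routine such as Python's `base64.b32encode` would normally be used instead.

```python
import base64

# RFC 4648 alphabet: the character at index i is the mark for value i.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"

def b32encode_sketch(data: bytes, pad: bool = True) -> str:
    """Encode bytes as Base32 text, 5 bits per output character."""
    bits = 0        # buffer holding bits not yet emitted
    bit_count = 0   # number of valid bits in the buffer
    out = []
    for byte in data:
        bits = (bits << 8) | byte
        bit_count += 8
        # Emit one character for every complete 5-bit group.
        while bit_count >= 5:
            bit_count -= 5
            out.append(ALPHABET[(bits >> bit_count) & 0b11111])
        bits &= (1 << bit_count) - 1  # keep only the leftover bits
    if bit_count:
        # Final partial group: fill the low bits with zeros.
        out.append(ALPHABET[(bits << (5 - bit_count)) & 0b11111])
    text = "".join(out)
    if pad:
        text += "=" * (-len(text) % 8)  # pad to a multiple of 8 characters
    return text

sample = b"hello"
print(b32encode_sketch(sample))
print(base64.b32encode(sample).decode("ascii"))  # reference result
```

Swapping `ALPHABET` for another 32-character string is all it takes to experiment with a different alphabet, although variants such as Crockford's Base32 also define their own decoding rules that this sketch does not cover.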