How binary represents numbers
Binary is base-2 — each digit (bit) is either 0 or 1. To understand binary values, compare with decimal (base-10):
- In decimal, each position is a power of 10: ones, tens, hundreds...
- In binary, each position is a power of 2: ones, twos, fours, eights, sixteens...
| Binary | Calculation | Decimal |
|---|---|---|
| 0000 0001 | 1 | 1 |
| 0000 0010 | 2 | 2 |
| 0100 0001 | 64 + 1 | 65 |
| 0110 0001 | 64 + 32 + 1 | 97 |
| 1111 1111 | 128+64+32+16+8+4+2+1 | 255 |
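8 bits = 1 byte. A byte can represent 2^8 = 256 distinct values (0–255). It is also the smallest unit of memory that most computers address directly, which is why file sizes, memory, and text encodings are all measured in bytes.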
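The positional rule above can be sketched in a few lines of Python. This is a minimal illustration, not a canonical implementation; the function name `binary_to_decimal` is made up for this example, and Python's built-in `int(bits, 2)` does the same job.

```python
def binary_to_decimal(bits: str) -> int:
    """Add up the power of 2 for every position that holds a 1."""
    digits = bits.replace(" ", "")  # accept grouped input such as "0100 0001"
    total = 0
    # The rightmost digit is position 0 (the ones place), so walk the string in reverse.
    for position, bit in enumerate(reversed(digits)):
        if bit == "1":
            total += 2 ** position
    return total


# The same rows as the table above.
for row in ["0000 0001", "0000 0010", "0100 0001", "0110 0001", "1111 1111"]:
    print(row, "->", binary_to_decimal(row))
```

Each printed value should match the Decimal column of the table.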
ASCII: the original text encoding
ASCII (American Standard Code for Information Interchange) was first published in 1963 and assigns a number to each printable character and control code. It uses 7 bits (128 values, 0–127):
- 0–31 and 127: Control characters (carriage return, newline, tab, delete...)
- 32: Space
- 48–57: Digits 0–9
- 65–90: Uppercase A–Z (65 = 'A', 90 = 'Z')
- 97–122: Lowercase a–z (97 = 'a', 122 = 'z')
So 'A' = 65 decimal = 01000001 binary, and 'a' = 97 decimal = 01100001 binary. Notice that each lowercase letter is exactly 32 more than its uppercase equivalent. The two differ only in bit 5 (counting from 0 at the rightmost bit, worth decimal 32), which is why toggling that single bit switches between upper and lowercase in ASCII.
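As a rough sketch of that bit trick, the Python snippet below flips bit 5 with XOR. The names `CASE_BIT` and `toggle_case` are invented for this example, and the trick only applies to the ASCII letters A–Z and a–z, so everything else is returned unchanged.

```python
CASE_BIT = 0b0010_0000  # bit 5, worth decimal 32


def toggle_case(ch: str) -> str:
    """Flip the case of one ASCII letter by XOR-ing bit 5 of its code."""
    if len(ch) != 1:
        raise ValueError("expected a single character")
    if not (ch.isascii() and ch.isalpha()):
        return ch  # digits, punctuation, and non-ASCII characters are left alone
    return chr(ord(ch) ^ CASE_BIT)


print(toggle_case("A"), toggle_case("a"), toggle_case("7"))
```

In everyday code `str.upper()` and `str.lower()` are the better choice, since they also handle non-ASCII letters.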
The word "Hello" in binary:
H = 72 = 01001000
e = 101 = 01100101
l = 108 = 01101100
l = 108 = 01101100
o = 111 = 01101111
Convert any text to binary (and back) with the free binary to text converter.
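For readers who prefer to see the conversion in code, here is a small Python sketch that reproduces the "Hello" breakdown. The function names are illustrative only, and the sketch assumes pure ASCII input (anything else raises an error).

```python
def text_to_binary(text: str) -> str:
    """Encode ASCII text as space-separated 8-bit groups."""
    return " ".join(format(byte, "08b") for byte in text.encode("ascii"))


def binary_to_text(bits: str) -> str:
    """Decode space-separated 8-bit groups back into ASCII text."""
    return bytes(int(chunk, 2) for chunk in bits.split()).decode("ascii")


encoded = text_to_binary("Hello")
print(encoded)
print(binary_to_text(encoded))
```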
Extended ASCII and code pages
ASCII only covers English characters. To support other Western European languages (é, ü, ñ...), "extended ASCII" used the 8th bit to add 128 more characters (128–255). Different countries used different "code pages" — mapping numbers 128–255 to their local characters.
The problem: a document encoded with ISO-8859-1 (Western European) would display as garbage on a system using ISO-8859-5 (Cyrillic). This "character encoding mismatch" was a persistent headache through the 1990s.
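The mismatch is easy to reproduce. The sketch below uses Python's built-in `iso-8859-1` and `iso-8859-5` codecs; the sample word "café" is just an arbitrary choice for illustration.

```python
text = "café"

# In ISO-8859-1 every character here fits in one byte; 'é' becomes 0xE9.
raw = text.encode("iso-8859-1")
print(raw.hex(" "))

# Decoding with the code page the bytes were written in recovers the text.
right = raw.decode("iso-8859-1")

# Decoding the same bytes as ISO-8859-5 maps 0xE9 to a Cyrillic letter instead.
wrong = raw.decode("iso-8859-5")

print(right)
print(wrong)
```

The bytes never change; only the table used to interpret values 128–255 does, so the ASCII part ("caf") survives while the final character does not.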
Unicode: the universal standard
Unicode aims to assign a unique code point to every character in every human writing system, covering over 149,000 characters as of Unicode 15.0 (2022). Code points are written as U+XXXX (e.g., U+0041 = 'A', U+00E9 = 'é', U+1F600 = '😀').
Unicode defines the characters. UTF-8 (the most common encoding) is how those code points are stored as bytes:
- Code points 0–127: stored as 1 byte (identical to ASCII — backwards compatible)
- Code points 128–2047: stored as 2 bytes
- Code points 2048–65535: stored as 3 bytes
- Code points 65536–1114111: stored as 4 bytes (emoji, rare historic scripts)
UTF-8's backwards compatibility with ASCII means ASCII-encoded text is also valid UTF-8 — which made the transition from ASCII to Unicode smooth for most systems.
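The byte ranges above can be checked directly in Python. This is a small sketch; `utf8_length` is a made-up helper name, and the four sample characters are picked to land in one range each.

```python
def utf8_length(ch: str) -> int:
    """Return how many bytes one character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


# One sample per range: U+0041, U+00E9, U+20AC, U+1F600.
for ch in ["A", "é", "€", "😀"]:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} {ch!r}: {len(encoded)} byte(s) -> {encoded.hex(' ')}")

# ASCII text produces the same bytes in both encodings.
print("Hello".encode("ascii") == "Hello".encode("utf-8"))
```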
Binary in programming contexts
You rarely write raw binary in code — programmers use hexadecimal (hex) instead. Hex (base-16) uses digits 0–9 and A–F, and each hex digit represents exactly 4 bits (one nibble). This makes the mapping to bytes natural:
- 1 byte = 2 hex digits = 8 binary digits
FF (hex) = 1111 1111 (binary) = 255 (decimal)
41 (hex) = 0100 0001 (binary) = 65 (decimal) = 'A' (ASCII)
Memory addresses, color codes (CSS #FF6B6B), cryptographic hashes, and network protocols are all typically expressed in hexadecimal.
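A short Python sketch of the hex relationships above follows. The helper `hex_color_to_rgb` is hypothetical and assumes a six-digit `#RRGGBB` string; it is only meant to show that each pair of hex digits is one byte.

```python
# Hex literals, binary strings, and decimal values are views of the same number.
value = 0xFF
print(value, format(value, "08b"), format(value, "02X"))

# 0x41 is 65, the ASCII code for 'A'.
print(int("41", 16), chr(0x41))


def hex_color_to_rgb(color: str) -> tuple[int, int, int]:
    """Split a '#RRGGBB' color into three byte values (0-255)."""
    digits = color.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected a color in the form #RRGGBB")
    # Two hex digits per channel, i.e. one byte each.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


print(hex_color_to_rgb("#FF6B6B"))
```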
Why this matters practically
Understanding binary-to-text encoding matters when:
- Debugging character encoding issues ("mojibake" — garbled text from encoding mismatches)
- Working with binary file formats or network protocols at the byte level
- Understanding why emoji take more bytes than ASCII characters in databases (most emoji take 4 bytes in UTF-8, and some are sequences of several code points; an ASCII letter takes 1 byte)
- Investigating why a database column is "too long" when the character count looks fine (byte count vs. character count)
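The byte count versus character count gap from the last point can be sketched as follows. Both helper names are invented for this example, and the 8-byte limit is an arbitrary stand-in for a column size; real databases differ in whether their limits count bytes or characters.

```python
def utf8_size(text: str) -> int:
    """Number of bytes the text occupies when stored as UTF-8."""
    return len(text.encode("utf-8"))


def truncate_utf8(text: str, max_bytes: int) -> str:
    """Cut text to at most max_bytes of UTF-8 without splitting a character."""
    clipped = text.encode("utf-8")[:max_bytes]
    # errors="ignore" drops a trailing partial character left by the cut.
    return clipped.decode("utf-8", errors="ignore")


sample = "naïve 😀"
print(len(sample), "characters,", utf8_size(sample), "bytes")
print(repr(truncate_utf8(sample, 8)))
```

Here the string looks short by character count, yet it does not fit in 8 bytes, so the emoji is dropped whole rather than stored as a broken fragment.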
Related tools
- Free Binary to Text Converter — convert between binary, text, hex, and decimal
- Free Base64 Encoder — encode binary data as text (different encoding)
Written by Achraf A., founder of TheFreeAITools.