What Is ASCII? Character Encoding Explained

ASCII (American Standard Code for Information Interchange) is a character encoding standard, first published in 1963 and revised in 1967 to add lowercase letters, that maps 128 characters to the integers 0–127: uppercase and lowercase letters, the digits 0–9, punctuation marks, and control characters. Most modern character encodings, including UTF-8 and the ISO 8859 family, are built on it.

The ASCII Table

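ASCII uses 7 bits, providing 2^7 = 128 code points. Characters 0–31 are control characters (line feed, tab, and so on). Character 32 is space. Characters 48–57 are '0'–'9'. Characters 65–90 are 'A'–'Z'. Characters 97–122 are 'a'–'z'. The remaining printable codes (33–47, 58–64, 91–96, and 123–126) are punctuation and symbols. Character 127 is DEL, which is also a control character. Each lowercase letter sits exactly 32 above its uppercase counterpart, so the two differ only in bit 5 (value 32, or 0x20, counting bits from 0): 'A' is 65 and 'a' is 97.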
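The ranges and the case bit described above can be checked with a short Python sketch. The names `classify`, `toggle_case`, and `CASE_BIT` are hypothetical helpers introduced here for illustration; only `ord` and `chr` are Python built-ins.

```python
CASE_BIT = 0x20  # bit 5 (value 32): the only bit that differs between 'A' and 'a'


def classify(code: int) -> str:
    """Name the ASCII range that an integer code falls in (illustrative helper)."""
    if not 0 <= code <= 127:
        return "not ASCII"
    if code <= 31 or code == 127:
        return "control"
    if code == 32:
        return "space"
    if 48 <= code <= 57:
        return "digit"
    if 65 <= code <= 90:
        return "uppercase"
    if 97 <= code <= 122:
        return "lowercase"
    return "punctuation"  # 33-47, 58-64, 91-96, 123-126


def toggle_case(ch: str) -> str:
    """Flip bit 5 of an ASCII letter; return any other character unchanged."""
    code = ord(ch)
    if classify(code) in ("uppercase", "lowercase"):
        return chr(code ^ CASE_BIT)
    return ch


print(ord("A"), ord("a"), ord("a") - ord("A"))               # 65 97 32
print(toggle_case("A"), toggle_case("z"), toggle_case("7"))  # a Z 7
print(classify(9), classify(32), classify(127))              # control space control
```

Flipping a single bit with XOR works only because the two letter ranges are aligned exactly 32 apart, which is why the helper first checks that the character is a letter.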

ASCII vs Unicode vs UTF-8

ASCII handles only 128 characters: no accented letters, no CJK characters, no emoji. Unicode has room for 1,114,112 code points (U+0000 through U+10FFFF), of which well over 100,000 are assigned so far, covering nearly all writing systems in current use. UTF-8 is the most common Unicode encoding: it uses 1 byte for ASCII characters, with the same byte values as ASCII, and 2–4 bytes for everything else. As a result, every valid ASCII file is also a valid UTF-8 file, though the reverse does not hold.
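As a rough illustration of the byte lengths described above, the following Python sketch encodes one character from each UTF-8 length class. The helper `utf8_bytes` is a hypothetical name; `str.encode` and `bytes.hex` are standard library calls (the separator argument to `hex` assumes Python 3.8 or later).

```python
def utf8_bytes(ch: str) -> bytes:
    """Return the UTF-8 encoding of a single character (illustrative helper)."""
    return ch.encode("utf-8")


samples = [
    "A",           # U+0041, ASCII letter
    "\u00e9",      # U+00E9, Latin small letter e with acute
    "\u4e2d",      # U+4E2D, a CJK ideograph
    "\U0001F600",  # U+1F600, an emoji
]

for ch in samples:
    encoded = utf8_bytes(ch)
    print(f"U+{ord(ch):04X}: {len(encoded)} byte(s), hex {encoded.hex(' ')}")
# U+0041: 1 byte(s), hex 41
# U+00E9: 2 byte(s), hex c3 a9
# U+4E2D: 3 byte(s), hex e4 b8 ad
# U+1F600: 4 byte(s), hex f0 9f 98 80

# Pure ASCII text produces identical bytes under both encodings.
text = "Hello, ASCII!"
print(text.encode("ascii") == text.encode("utf-8"))  # True
```

Going the other way fails for anything outside the 128 ASCII characters: calling `.encode("ascii")` on a string containing U+00E9 raises `UnicodeEncodeError`.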

Why ASCII Still Matters

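Many internet protocols still rely on ASCII at their core. HTTP header names and traditional email headers are ASCII, with non-ASCII text carried through encoding schemes such as MIME encoded-words. JSON and XML accept full Unicode in keys and tag names, but their structural syntax (braces, quotes, angle brackets) is ASCII. Programming language keywords are almost always ASCII, and even in languages that allow Unicode identifiers, ASCII names remain the convention. Many file format magic bytes spell out ASCII text, such as "%PDF" or "GIF89a", although some formats use purely binary signatures. DNS labels are ASCII (with Punycode for internationalized domains). Understanding ASCII is fundamental to working with binary data, network protocols, and text encoding.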
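Two of the points above, Punycode for DNS labels and ASCII text in magic bytes, can be sketched in Python. The helper `to_ascii_hostname` and the constant names are hypothetical; the `"idna"` codec is part of Python's standard library and implements the older IDNA 2003 rules, so treat its output as illustrative rather than as what every resolver or browser would produce.

```python
def to_ascii_hostname(hostname: str) -> str:
    """Convert an internationalized hostname to its ASCII (Punycode) form."""
    return hostname.encode("idna").decode("ascii")


print(to_ascii_hostname("example.com"))        # example.com (already ASCII)
print(to_ascii_hostname("m\u00fcnchen.de"))    # xn--mnchen-3ya.de

# Well-known file signatures: some are readable ASCII, some are not.
PDF_MAGIC = b"%PDF-"
GIF_MAGIC = b"GIF89a"
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"  # starts with a non-ASCII byte, then "PNG"
JPEG_MAGIC = b"\xff\xd8\xff"      # purely binary

for name, magic in [("PDF", PDF_MAGIC), ("GIF", GIF_MAGIC),
                    ("PNG", PNG_MAGIC), ("JPEG", JPEG_MAGIC)]:
    # bytes.isascii() is True only if every byte is in the range 0-127.
    print(name, magic.isascii())
# PDF True
# GIF True
# PNG False
# JPEG False
```

The PNG signature deliberately begins with a byte above 127, which helps detect transfers that would strip the high bit, while still embedding the ASCII letters "PNG" for human readers.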