
How QR Codes Work

From text to two-dimensional barcode, step by step.

by gui dávid


What is a QR code?

In 1994, an engineer named Masahiro Hara at Denso Wave, a subsidiary of Toyota, had a problem. The one-dimensional barcodes used to track car parts on the assembly line could only hold about 20 characters. As production systems grew more complex, with more parts, more suppliers, and more metadata per component, those 20 characters became a bottleneck. Workers had to scan multiple barcodes per part, slowing everything down.

Hara's insight was geometric. A traditional barcode encodes information along a single axis: the varying widths of vertical lines. What if you used both axes? A two-dimensional grid of black and white squares could hold orders of magnitude more data in the same physical space.

His team spent two years developing what they called the QR code, short for "Quick Response," named for its design goal: fast, reliable scanning even in a noisy factory environment. The three large squares in the corners weren't decorative; they were a deliberate engineering choice to allow scanners to find and orient the code at any angle, under any rotation, in under a second.

Denso Wave made an unusual decision: they released the QR code specification publicly and chose not to enforce their patent rights. This openness is a major reason QR codes are everywhere today, from restaurant menus to airline boarding passes to cryptocurrency wallets.

A QR code is a matrix of black and white squares called modules. Each module is a single bit of visual information: dark or light, on or off. The arrangement of these modules encodes binary data using a combination of efficient encoding schemes and polynomial error correction.

QR codes come in 40 versions. Version 1, the smallest, is a 21×21 grid. Each successive version adds 4 modules per side:

$$\text{size} = 4v + 17$$

where $v$ is the version number. So Version 2 is 25×25, Version 3 is 29×29, and Version 40, the maximum, is a dense 177×177 grid. In practice, most QR codes you encounter are Version 1 through 7. A Version 3 code (29×29 modules) with low error correction can hold 77 alphanumeric characters (or 53 bytes), which covers many of the links you'd want to share.
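The size formula is easy to check in code. This is a minimal sketch; the function name `qr_size` is just an illustrative choice.

```python
def qr_size(version: int) -> int:
    """Modules per side for a QR code of the given version (1 to 40)."""
    if not 1 <= version <= 40:
        raise ValueError("QR versions run from 1 to 40")
    return 4 * version + 17


for v in (1, 2, 3, 40):
    side = qr_size(v)
    print(f"Version {v}: {side}x{side} modules")
```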

Here's how the grid scales with version number. Move the slider to see the structural patterns at each size:

The three large corner squares (finder patterns) stay the same size in every version. Small squares (alignment patterns) appear at Version 2 and multiply as the code grows.

Anatomy of a QR code

A QR code looks like noise at first glance, a scrambled grid of black and white. But it has a precise internal anatomy, and every region has a specific purpose. Understanding the anatomy is the key to understanding how the whole system works.

Finder patterns

The three large squares in the top-left, top-right, and bottom-left corners. Each is a 7×7 module pattern: a black outer border, a white ring, and a solid black 3×3 core. These give the scanner an immediate fix on the code's position, size, and rotation.

The key design insight is the 1:1:3:1:1 ratio. If you scan a horizontal or vertical line through a finder pattern, you'll always encounter this ratio of dark-light-dark-light-dark modules. This ratio is unique enough that a scanner can distinguish it from random noise and detect the code even at an angle or partially obscured. The absence of a fourth finder pattern in the bottom-right corner tells the scanner which way is "up."

Each finder pattern is surrounded by a one-module-wide separator of white modules. These ensure the scanner doesn't confuse the finder pattern's black border with adjacent dark data modules.
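To see where the 1:1:3:1:1 ratio comes from, here is a small sketch that builds the 7×7 finder pattern from concentric rings and measures the run lengths along a line through its centre. The helper names are hypothetical, not taken from any library.

```python
from itertools import groupby


def finder_pattern():
    """7x7 finder pattern as rows of 1 (dark) and 0 (light)."""
    grid = []
    for r in range(7):
        row = []
        for c in range(7):
            ring = max(abs(r - 3), abs(c - 3))  # distance from the centre, in rings
            row.append(0 if ring == 2 else 1)   # only the ring at distance 2 is light
        grid.append(row)
    return grid


def run_lengths(line):
    """Lengths of consecutive runs of equal values, e.g. [1, 1, 0] -> [2, 1]."""
    return [len(list(group)) for _, group in groupby(line)]


pattern = finder_pattern()
middle_row = pattern[3]
middle_col = [row[3] for row in pattern]
print(run_lengths(middle_row), run_lengths(middle_col))
```

Both the row and the column through the centre produce the same run lengths, which is why a scanner can look for the signature along any scan line that crosses the core.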

Timing patterns

Alternating black and white modules running between the finder patterns: one horizontal strip along row 6, one vertical strip along column 6. Think of them as ruler markings: they help the scanner figure out the exact grid spacing, which is especially important for larger codes where lens distortion could make the grid ambiguous. Without timing patterns, a scanner might miscalculate cell boundaries and read the wrong data.

Alignment patterns

Small 5×5 squares that first appear in Version 2. Larger versions have more of them, arranged in a grid pattern across the code's surface. These provide additional reference points that help the scanner correct for perspective distortion, the warping that happens when you photograph a flat QR code from an angle. Each alignment pattern is a miniature target: a black center, white ring, black border.

Format information

Two copies of a 15-bit string placed near the finder patterns. This encodes two critical parameters: the error correction level (L, M, Q, or H) and the mask pattern (0 through 7). It's protected by its own BCH error correction code, and it's duplicated for redundancy. The scanner reads format info first, because it needs to know the EC level and mask before it can decode anything else.
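As a rough sketch of how the 15-bit string is built: 2 bits for the EC level and 3 bits for the mask are followed by 10 BCH check bits, and the result is XORed with a fixed pattern so it is never all zeros. The constants below (the 2-bit level codes, the generator `0x537`, and the XOR pattern `0x5412`) come from the QR specification rather than from this article, and the function name is illustrative.

```python
# 2-bit codes for each EC level, as assigned by the QR specification.
EC_LEVEL_BITS = {"L": 0b01, "M": 0b00, "Q": 0b11, "H": 0b10}


def format_bits(ec_level: str, mask: int) -> int:
    """Return the 15-bit format information value for an EC level and mask."""
    if not 0 <= mask <= 7:
        raise ValueError("mask must be 0..7")
    data = (EC_LEVEL_BITS[ec_level] << 3) | mask  # 5 data bits
    rem = data
    for _ in range(10):                           # BCH(15,5): polynomial division
        rem = (rem << 1) ^ ((rem >> 9) * 0x537)
    return ((data << 10) | rem) ^ 0x5412          # append check bits, apply XOR pattern


print(format(format_bits("L", 0), "015b"))
print(format(format_bits("M", 0), "015b"))
```

With level M and mask 0 the five data bits are all zero, so the output is just the XOR pattern itself. That is the reason the pattern exists: without it, that combination would be fifteen light modules.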

Version information

For Version 7 and above, an 18-bit version information string (6 bits for the version number plus 12 error correction bits) is placed in two locations: a block 3 modules wide and 6 tall immediately to the left of the top-right finder pattern, and a block 6 modules wide and 3 tall immediately above the bottom-left finder pattern. This tells the scanner which version (grid size) it's dealing with before it tries to decode anything. Smaller versions don't need this because the scanner can infer the version from the grid size alone.
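The 18-bit string can be sketched the same way as the format information: the 6-bit version number followed by 12 BCH check bits. The generator constant `0x1F25` is taken from the QR specification, not from this article, and `version_bits` is a hypothetical name.

```python
def version_bits(version: int) -> int:
    """Return the 18-bit version information value (only defined for versions 7 to 40)."""
    if not 7 <= version <= 40:
        raise ValueError("version information exists only for versions 7..40")
    rem = version
    for _ in range(12):                            # BCH(18,6): polynomial division
        rem = (rem << 1) ^ ((rem >> 11) * 0x1F25)
    return (version << 12) | rem                   # 6 data bits + 12 check bits


print(format(version_bits(7), "018b"))
```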

The dark module

A single module that is always black, located at a fixed position (row $(4v + 9)$, column 8, where $v$ is the version number). It exists for historical reasons in the spec and helps the scanner confirm that it's reading a valid QR code. Every QR code has exactly one.

Data and error correction

Everything else. The encoded message and its Reed-Solomon error correction codewords fill the remaining modules in a specific zigzag pattern. This is where the actual information lives: the URL, the text, the Wi-Fi password, whatever you encoded.
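The zigzag can be sketched as a generator of coordinates: start at the bottom-right corner, walk up a two-module-wide column, then down the next one, skipping the vertical timing column. This sketch visits every module outside column 6; a real encoder would additionally skip modules already claimed by finder, alignment, timing, format, and version regions (passed in here as a hypothetical `is_function` predicate in the usage note).

```python
def zigzag_positions(size: int):
    """Yield (row, col) in data placement order, skipping only column 6."""
    for right in range(size - 1, 0, -2):
        if right <= 6:
            right -= 1                      # step over the vertical timing column
        upward = ((right + 1) & 2) == 0     # direction alternates per column pair
        rows = range(size - 1, -1, -1) if upward else range(size)
        for row in rows:
            for col in (right, right - 1):  # right module first, then its left neighbour
                yield row, col


first_steps = list(zigzag_positions(21))[:6]
print(first_steps)
```

To place bits, a caller would filter the sequence, for example `[(r, c) for r, c in zigzag_positions(size) if not is_function(r, c)]`, and write one bit per remaining position.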

Quiet zone

A 4-module-wide border of empty space around the entire code. It's not part of the code itself, but it's essential: without it, a scanner can't distinguish where the QR code ends and the surrounding surface begins. This is why QR codes sometimes fail when printed too close to the edge of a sticker or next to high-contrast graphics.

Click a region to highlight it. In the "All" view, each region is color-coded.

In the All view above, each region is color-coded: finder patterns, timing patterns, alignment patterns, format info, and data (black/white). Click individual buttons to isolate each region. Notice how much of the code is actually structural overhead versus actual data.

Encoding: from text to bits

Now for the interesting part: how do we turn human-readable text into a grid of black and white squares? The process starts with converting the input into a stream of bits, and the QR specification is surprisingly clever about how it does this.

QR codes support four encoding modes, each optimized for different types of data. The encoder analyzes the input and automatically selects the most efficient mode:

Numeric mode

Mode indicator: 0001. Encodes only digits 0–9. Groups of three digits are packed into 10-bit binary numbers, pairs into 7 bits, and singles into 4 bits. This gives roughly 3.33 bits per character, remarkably efficient. A 45-digit number in numeric mode takes about 150 bits; in byte mode, it would take 360.
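A minimal sketch of the digit packing described above (data bits only, without the mode indicator or character count). The function name is illustrative.

```python
def encode_numeric(digits: str) -> str:
    """Pack decimal digits into a bit string: 3 digits -> 10 bits, 2 -> 7, 1 -> 4."""
    if not all(ch in "0123456789" for ch in digits):
        raise ValueError("numeric mode accepts only the digits 0-9")
    widths = {3: 10, 2: 7, 1: 4}
    chunks = []
    for start in range(0, len(digits), 3):
        group = digits[start:start + 3]
        chunks.append(format(int(group), f"0{widths[len(group)]}b"))
    return "".join(chunks)


print(encode_numeric("12345"))          # "123" in 10 bits, then "45" in 7 bits
print(len(encode_numeric("7" * 45)))    # 15 groups of three digits
```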

Alphanumeric mode

Mode indicator: 0010. Encodes a 45-character set: digits 0–9, uppercase letters A–Z, and nine symbols: space $ % * + - . / :. Each character maps to a value 0–44. Pairs of characters are encoded as 11-bit numbers using the formula $v_1 \times 45 + v_2$, and a leftover single character takes 6 bits, giving about 5.5 bits per character. A URL can use this mode only if it is written entirely in uppercase; a single lowercase letter forces byte mode.
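Here is a short sketch of the pair packing. The character order in `ALPHANUMERIC` follows the QR specification's value table; an odd trailing character is written in 6 bits. Only the data bits are produced, not the mode indicator or count field.

```python
# Position in this string is the character's value (0 to 44).
ALPHANUMERIC = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def encode_alphanumeric(text: str) -> str:
    """Pack characters pairwise into 11 bits; a trailing single character takes 6 bits."""
    values = [ALPHANUMERIC.index(ch) for ch in text]  # ValueError on unsupported characters
    chunks = []
    for k in range(0, len(values) - 1, 2):
        chunks.append(format(values[k] * 45 + values[k + 1], "011b"))
    if len(values) % 2:
        chunks.append(format(values[-1], "06b"))
    return "".join(chunks)


print(encode_alphanumeric("AC-42"))
```

For `AC-42`, the pairs are (10, 12) and (41, 4), giving 462 and 1849, and the final `2` is written on its own.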

Byte mode

Mode indicator: 0100. Encodes arbitrary bytes, typically ISO 8859-1 or UTF-8. Each byte is directly encoded as 8 bits. This is the fallback mode for anything that doesn't fit the more efficient modes, like lowercase text, special characters, or binary data.

Kanji mode

Mode indicator: 1000. Encodes Shift JIS double-byte characters at 13 bits each. Designed specifically for Japanese text, a nod to the QR code's origins.

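The complete bit stream is structured as follows: a 4-bit mode indicator, a character count field (whose length varies by mode and version), the encoded data, and a terminator of up to four 0 bits (shortened if fewer bits of capacity remain). After the terminator, the stream is padded with 0 bits to the next byte boundary and then with the alternating pad bytes 11101100 and 00010001 until it fills the available data capacity.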

The character count field length depends on the version range:

| Mode | V1-9 | V10-26 | V27-40 |
|---|---|---|---|
| Numeric | 10 bits | 12 bits | 14 bits |
| Alphanumeric | 9 bits | 11 bits | 13 bits |
| Byte | 8 bits | 16 bits | 16 bits |
| Kanji | 8 bits | 10 bits | 12 bits |
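Putting the header, data, terminator, and padding together, a byte-mode stream could be assembled roughly like this. It is a sketch under stated assumptions: the function names are hypothetical, the capacity is passed in by the caller, and the pad bytes `0xEC` and `0x11` are the alternating filler values from the QR specification.

```python
MODE_INDICATOR = {"numeric": "0001", "alphanumeric": "0010", "byte": "0100", "kanji": "1000"}
# Character count field widths for versions 1-9, 10-26, and 27-40 (the table above).
COUNT_BITS = {
    "numeric": (10, 12, 14),
    "alphanumeric": (9, 11, 13),
    "byte": (8, 16, 16),
    "kanji": (8, 10, 12),
}


def char_count_bits(mode: str, version: int) -> int:
    band = 0 if version <= 9 else 1 if version <= 26 else 2
    return COUNT_BITS[mode][band]


def build_byte_stream(data: bytes, version: int, capacity_codewords: int) -> bytes:
    """Mode indicator + count + data + terminator + padding, as data codewords."""
    capacity_bits = capacity_codewords * 8
    bits = MODE_INDICATOR["byte"]
    bits += format(len(data), f"0{char_count_bits('byte', version)}b")
    bits += "".join(format(b, "08b") for b in data)
    if len(bits) > capacity_bits:
        raise ValueError("data does not fit in the given capacity")
    bits += "0" * min(4, capacity_bits - len(bits))  # terminator, at most four zeros
    bits += "0" * (-len(bits) % 8)                   # zero-fill to a byte boundary
    codewords = [int(bits[k:k + 8], 2) for k in range(0, len(bits), 8)]
    pad = (0xEC, 0x11)
    for i in range(capacity_codewords - len(codewords)):
        codewords.append(pad[i % 2])                 # alternating pad bytes
    return bytes(codewords)


# Version 1 at level H has room for 9 data codewords.
print(build_byte_stream(b"hi", version=1, capacity_codewords=9).hex(" "))
```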

Try it yourself. Type text into the box below and watch the binary encoding change in real time:

Legend: mode indicator, character count, encoded data, terminator.

Notice how the encoding changes as you type. Pure digits (like 12345) trigger numeric mode, the most compact. Uppercase text with basic symbols triggers alphanumeric mode. The moment you type a lowercase letter, the encoder falls back to byte mode, and you can see the bit stream grow longer. This is why many QR codes encode URLs in uppercase: it saves space.
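The mode choice described above can be sketched as a simple check against each character set, most compact first. This is a simplification: it picks one mode for the whole input, ignores kanji, and assumes UTF-8 for byte mode, whereas real encoders may split the input into segments with different modes.

```python
DIGITS = set("0123456789")
ALPHANUMERIC = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def select_mode(text: str) -> str:
    """Pick the most compact single mode that can represent the whole text."""
    if text and all(ch in DIGITS for ch in text):
        return "numeric"
    if text and all(ch in ALPHANUMERIC for ch in text):
        return "alphanumeric"
    return "byte"


def data_bits(text: str) -> int:
    """Number of data bits (excluding mode indicator and count field)."""
    mode, n = select_mode(text), len(text)
    if mode == "numeric":
        return 10 * (n // 3) + (0, 4, 7)[n % 3]
    if mode == "alphanumeric":
        return 11 * (n // 2) + 6 * (n % 2)
    return 8 * len(text.encode("utf-8"))


for sample in ("12345", "HTTPS://GUIDAVID.COM", "https://guidavid.com"):
    print(sample, select_mode(sample), data_bits(sample))
```

The same 20-character URL costs 110 data bits in uppercase and 160 in lowercase, which is the saving the paragraph above refers to.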

Fun fact: the capacity of a Version 1-M QR code in each mode is 34 numeric, 20 alphanumeric, 14 byte, or 8 kanji characters. A typical short URL needs about Version 2 or 3.

Error correction

Here's what makes QR codes genuinely remarkable, and what separates them from a naive "just draw a grid of bits" approach: they still work when they're damaged.

A QR code printed on a sticker can lose a chunk of its surface (scratched, torn, covered by a logo, splashed with coffee) and still be perfectly scannable. This resilience comes from Reed-Solomon error correction, the same family of algorithms used in CDs, DVDs, satellite communications, and deep-space probes.

QR codes offer four levels of error correction, each trading data capacity for resilience:

| Level | Recovery capacity | Typical use |
|---|---|---|
| L (Low) | ~7% of codewords | Clean environments, maximum data |
| M (Medium) | ~15% of codewords | General purpose (default) |
| Q (Quartile) | ~25% of codewords | Industrial, outdoor use |
| H (High) | ~30% of codewords | Logos embedded, harsh conditions |

The tradeoff is direct: higher error correction means more redundancy bytes, which means less room for actual data. A Version 1 code has 26 codewords in total. With EC level L, 19 of them carry data (enough for a short string); with level H, only 9. You're literally trading capacity for durability.

How Reed-Solomon works (intuitively)

Reed-Solomon encoding treats each data byte as a coefficient of a polynomial defined over the Galois field GF(256). This is a finite field with exactly 256 elements, which maps perfectly to the 256 possible values of a byte.

What makes GF(256) special is that addition, subtraction, multiplication, and division are all defined for its elements, and every non-zero element has a multiplicative inverse. Addition and subtraction are both XOR. Multiplication is done using log and antilog tables based on a primitive element $\alpha$:

$$a \times b = \alpha^{\log_\alpha(a) + \log_\alpha(b)}$$
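A minimal sketch of the log/antilog approach, assuming $\alpha = 2$ and the reducing polynomial 0x11D that QR codes use. Because $\alpha^{255} = 1$, the sum of the logs is taken modulo 255. The table and function names are illustrative.

```python
def build_tables():
    """Antilog (EXP) and log (LOG) tables for GF(256) with alpha = 2, modulus 0x11D."""
    exp = [0] * 255
    log = [0] * 256          # log[0] is unused: zero has no logarithm
    x = 1
    for i in range(255):
        exp[i] = x
        log[x] = i
        x <<= 1              # multiply by alpha
        if x & 0x100:
            x ^= 0x11D       # reduce when the value overflows 8 bits
    return exp, log


EXP, LOG = build_tables()


def gf_mul(a: int, b: int) -> int:
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 255]


def gf_inv(a: int) -> int:
    if a == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    return EXP[(255 - LOG[a]) % 255]


print(gf_mul(3, 7), gf_mul(2, 128), 5 ^ 5)  # the last term: addition is XOR, so a + a = 0
```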

Given $k$ data codewords, the encoder constructs a data polynomial:

$$f(x) = d_{k-1}x^{k-1} + d_{k-2}x^{k-2} + \cdots + d_1x + d_0$$

It then divides $f(x) \cdot x^n$ by a generator polynomial $g(x)$, which is constructed as the product of $n$ linear factors:

$$g(x) = \prod_{i=0}^{n-1}(x - \alpha^i) = (x - \alpha^0)(x - \alpha^1) \cdots (x - \alpha^{n-1})$$

where $\alpha = 2$ is a primitive element of GF(256) with the reducing polynomial $x^8 + x^4 + x^3 + x^2 + 1$ (hex value 0x11D). The remainder of this polynomial division gives us $n$ error correction codewords.

The mathematical guarantee: these $n$ EC codewords can correct up to $\lfloor n/2 \rfloor$ symbol errors, or detect up to $n$ errors, or some mix of both. The decoder uses syndrome computation, the Berlekamp-Massey algorithm, and Forney's formula to locate and fix errors, essentially solving a system of equations to find which bytes were corrupted and what they should have been.
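The encoding half of this can be sketched in a few dozen lines: build $g(x)$, take the remainder of $f(x) \cdot x^n$, and confirm that the finished codeword evaluates to zero at $\alpha^0, \ldots, \alpha^{n-1}$ (these evaluations are the syndromes a decoder starts from). The sketch covers encoding and error detection only, not Berlekamp-Massey or Forney; the names and the example bytes are illustrative, not real QR codewords.

```python
def build_tables():
    exp, log, x = [0] * 255, [0] * 256, 1
    for i in range(255):
        exp[i], log[x] = x, i
        x <<= 1
        if x & 0x100:
            x ^= 0x11D
    return exp, log


EXP, LOG = build_tables()


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 255]


def rs_generator(n):
    """Coefficients of g(x) = (x - a^0)...(x - a^(n-1)), highest degree first."""
    g = [1]
    for i in range(n):
        nxt = [0] * (len(g) + 1)
        for j, c in enumerate(g):
            nxt[j] ^= c                      # c * x
            nxt[j + 1] ^= gf_mul(c, EXP[i])  # c * a^i (subtraction is XOR)
        g = nxt
    return g


def rs_remainder(data, n):
    """The n error correction codewords: remainder of data(x) * x^n divided by g(x)."""
    gen = rs_generator(n)
    rem = list(data) + [0] * n
    for i in range(len(data)):
        coef = rem[i]
        if coef:
            for j in range(1, len(gen)):     # gen[0] is 1, so rem[i] cancels itself
                rem[i + j] ^= gf_mul(gen[j], coef)
    return rem[len(data):]


def syndromes(codeword, n):
    """Evaluate the codeword polynomial at a^0..a^(n-1) using Horner's rule."""
    out = []
    for i in range(n):
        y = 0
        for c in codeword:
            y = gf_mul(y, EXP[i]) ^ c
        out.append(y)
    return out


data = list(b"guidavid.com")        # arbitrary example bytes
ec = rs_remainder(data, 10)
codeword = data + ec
print(syndromes(codeword, 10))      # all zero for an undamaged codeword
codeword[3] ^= 0x55                 # damage one byte
print(syndromes(codeword, 10))      # non-zero values reveal the damage
```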

Try it below. Increase the damage slider and watch modules flip randomly. Switch between error correction levels to see how much damage each can tolerate:


With low error correction (L), the code becomes unreliable quickly. Switch to H and watch it hold on far longer, even at 25-30% damage. This is exactly why QR codes with logos embedded in the center typically use level H: the logo effectively destroys the modules underneath it, and the error correction compensates.

Masking

After placing data and error correction codewords into the grid, there's one more critical step that most people never think about: masking.

The problem is subtle. The encoded data might produce module patterns that confuse scanners. Imagine if the data happened to create a large region of all-black modules, or a pattern that looks like a finder or timing pattern in the wrong place. The scanner could misidentify structural elements, miscalculate grid alignment, or fail to decode entirely.

To prevent this, the QR specification defines 8 mask patterns. Each mask is a simple mathematical formula that determines, for any module position $(i, j)$, whether to flip that module (XOR it). The mask is only applied to data and EC modules. Finder patterns, timing patterns, alignment patterns, and format info are never touched.

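The encoder applies all 8 masks, computes a penalty score for each result, and selects the mask with the lowest penalty. The penalty function checks four conditions: runs of five or more same-colored modules in a row or column, 2×2 blocks of a single color, sequences that resemble the finder pattern's 1:1:3:1:1 signature, and an overall dark-to-light balance that strays far from 50%.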
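A simplified sketch of the penalty scoring follows. The weights (3, 3, 40, and 10 per 5% step) are the ones given in the QR specification; the way edges and exact 5% boundaries are handled differs between real implementations, so treat the numbers this produces as approximate. The function names are hypothetical.

```python
# Finder-like sequence 1:1:3:1:1 with four light modules on one side.
FINDER_LIKE = (
    [1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1],
)


def run_penalty(line):
    """Rule 1: a run of 5 same-colored modules costs 3, each extra module 1 more."""
    score, run = 0, 1
    for prev, cur in zip(line, line[1:]):
        if cur == prev:
            run += 1
        else:
            if run >= 5:
                score += run - 2
            run = 1
    if run >= 5:
        score += run - 2
    return score


def finder_penalty(line):
    """Rule 3: 40 points for every finder-like sequence in the line."""
    hits = sum(line[k:k + 11] in FINDER_LIKE for k in range(len(line) - 10))
    return 40 * hits


def penalty(grid):
    """Total penalty for a square grid of 1 (dark) and 0 (light) modules."""
    n = len(grid)
    lines = [list(row) for row in grid] + [list(col) for col in zip(*grid)]
    score = sum(run_penalty(l) + finder_penalty(l) for l in lines)
    for r in range(n - 1):                       # Rule 2: 2x2 blocks of one color
        for c in range(n - 1):
            if grid[r][c] == grid[r][c + 1] == grid[r + 1][c] == grid[r + 1][c + 1]:
                score += 3
    dark, total = sum(map(sum, grid)), n * n     # Rule 4: dark/light balance
    score += 10 * (abs(dark * 100 - total * 50) // (total * 5))
    return score


checkerboard = [[(r + c) % 2 for c in range(6)] for r in range(6)]
all_dark = [[1] * 6 for _ in range(6)]
print(penalty(checkerboard), penalty(all_dark))
```

An encoder would call `penalty` once per masked candidate and keep the mask with the smallest result.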

The mask that produces the "most random-looking" distribution, the one least likely to confuse a scanner, wins. Here are all 8 mask patterns:

Click a mask pattern to see it applied to a QR code. Green modules are flipped.

Each mask has an elegant formula. Mask 0, $(i + j) \bmod 2 = 0$, creates a checkerboard. Mask 1, $i \bmod 2 = 0$, creates horizontal stripes. They range from simple to complex, but each produces a distinct visual pattern that interacts differently with different data.
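All eight conditions fit in a small table. In this sketch $i$ is the row and $j$ is the column, and a module is flipped when the condition is true. The formulas are the eight defined in the QR specification; `apply_mask` and its `is_data` predicate (which would mark the data and EC modules) are hypothetical helpers.

```python
MASKS = [
    lambda i, j: (i + j) % 2 == 0,                        # 0: checkerboard
    lambda i, j: i % 2 == 0,                              # 1: horizontal stripes
    lambda i, j: j % 3 == 0,                              # 2: vertical stripes, every third column
    lambda i, j: (i + j) % 3 == 0,                        # 3: diagonal stripes
    lambda i, j: (i // 2 + j // 3) % 2 == 0,              # 4: 2x3 blocks
    lambda i, j: (i * j) % 2 + (i * j) % 3 == 0,          # 5
    lambda i, j: ((i * j) % 2 + (i * j) % 3) % 2 == 0,    # 6
    lambda i, j: ((i + j) % 2 + (i * j) % 3) % 2 == 0,    # 7
]


def apply_mask(grid, mask_id, is_data):
    """XOR the mask into every module for which is_data(row, col) is true."""
    flip = MASKS[mask_id]
    return [
        [bit ^ int(is_data(i, j) and flip(i, j)) for j, bit in enumerate(row)]
        for i, row in enumerate(grid)
    ]


blank = [[0] * 4 for _ in range(4)]
everything = lambda i, j: True   # pretend every module is a data module
for row in apply_mask(blank, 0, everything):
    print(row)
```

Because masking is an XOR, applying the same mask a second time restores the original grid, which is exactly what a scanner does before reading the data.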

The mask number (0-7) is encoded in the format information near the finder patterns, so the scanner knows which mask to "undo" before reading the data.

Putting it all together

Let's trace the complete pipeline from input text to finished QR code, using "guidavid.com" as our example. Step through each stage:

1. Analyze the input

And that's it. Every black and white module in the final code is precisely determined by math. There's no randomness, no approximation. Given the same input text and EC level, the same QR code is produced every time. Try the generator below:

The QR code above is real and scannable. Point your phone camera at it. The code running on this page uses the qrcodegen reference library by Project Nayuki, which implements the full QR specification: Galois field arithmetic, Reed-Solomon encoding, module placement, all 8 mask patterns, and penalty scoring.

You point your phone at a sticker. The screen loads a menu. You order a coffee. Between the camera and the URL, your phone read the format bits, undid the mask, followed a zigzag, ran polynomial arithmetic over a finite field, and corrected for the scratch across the bottom-left corner. You didn't notice.

The best engineering disappears.