How QR Codes Work: Error Correction Explained

How QR Codes Work: The Engineering Behind Every Scan

You have almost certainly scanned a QR code this week. Restaurant menus, event tickets, product packaging, and payment screens all rely on those familiar black-and-white squares. But have you ever wondered why you can place a company logo in the center of a QR code and it still scans perfectly? The answer lies in Reed-Solomon error correction, a technique from coding theory, and understanding it reveals just how elegantly engineered these symbols really are.

This article explains how QR codes work from the ground up: their origin, their anatomy, the four encoding modes, error correction levels, mask patterns, and why some QR codes refuse to scan. No prior engineering knowledge required.

A Brief Origin Story

QR codes were invented in 1994 by Masahiro Hara, an engineer at Denso Wave, a subsidiary of Denso in the Toyota Group. The original use case was humble: tracking automotive parts on a factory floor. Bar codes at the time could only hold around 20 alphanumeric characters and required precise alignment with the scanner. Hara's team needed something that could hold a full part number, be read from any angle, and decode roughly ten times faster than a standard barcode.

The name "QR" stands for Quick Response, a nod to that speed goal. Denso Wave released the standard publicly and chose not to enforce their patent, which is why QR codes spread globally without licensing friction.

The Anatomy of a QR Code

A QR code is not a random arrangement of black and white squares. Every region has a specific purpose.

Finder Patterns

The three large finder patterns, the square-within-a-square symbols in three corners of every QR code, are the first thing a scanner looks for. Their distinctive 1:1:3:1:1 ratio of dark-light-dark-light-dark modules can be detected at any rotation angle, across a wide range of distances, and even when the code is slightly distorted. They tell the scanner: here is the code, here is its orientation, and here is its scale.

The fourth corner deliberately has no finder pattern. This asymmetry tells the scanner which way is up.
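
As a rough illustration of the ratio test, a scanner can measure the run lengths along one scan line and check whether five consecutive runs fit 1:1:3:1:1. The function name and the 50% tolerance below are assumptions chosen for this sketch, not values from the QR standard.

```python
def looks_like_finder(runs, tolerance=0.5):
    """Check five run lengths (dark, light, dark, light, dark) against 1:1:3:1:1.

    runs: pixel widths of five consecutive runs on one scan line.
    tolerance: allowed deviation per run, as a fraction of one module
               (an assumed value for illustration).
    """
    if len(runs) != 5 or min(runs) <= 0:
        return False
    # The pattern is 1 + 1 + 3 + 1 + 1 = 7 modules wide.
    module = sum(runs) / 7.0
    expected = (1, 1, 3, 1, 1)
    return all(abs(run - e * module) <= tolerance * module
               for run, e in zip(runs, expected))


print(looks_like_finder([4, 4, 12, 4, 4]))      # same ratio at a small scale
print(looks_like_finder([10, 10, 30, 10, 10]))  # same ratio at a larger scale
print(looks_like_finder([4, 4, 4, 4, 4]))       # evenly spaced runs do not match
```

Because the check divides by the estimated module width, it gives the same answer whether the code fills the frame or sits far away, which is the property the ratio is designed to provide.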

Alignment Patterns

Larger QR codes (Version 2 and above) include smaller alignment patterns scattered across the symbol. Their job is to correct for curved or distorted surfaces. If you have ever scanned a QR code printed on a cylindrical bottle or embossed on a rounded surface, alignment patterns are what made that possible.

Timing Patterns

Running between the finder patterns are timing patterns: alternating single-module-wide strips of black and white. They function like a ruler, allowing the scanner to count rows and columns precisely, even if the image is slightly skewed.

Format Information

Two strips wrapping around the finder patterns store format information: the error correction level and the mask pattern in use. These 5 bits are protected by a (15,5) BCH code, a type of error-correcting code that can fix up to 3 bit errors in the 15-bit string, and the string is stored twice in the symbol, so the format can usually be read even when part of the symbol is damaged. The scanner must read the format information before it can decode any data.
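
For readers who want to see the (15,5) code in action, here is a minimal sketch of how the 15-bit format string can be computed. The generator polynomial 0x537, the final XOR mask 0x5412, and the two-bit level codes are the values used by the QR standard; the function and variable names are made up for this example.

```python
# Two-bit codes for the error correction level, as assigned by the QR standard.
EC_LEVEL_BITS = {"L": 0b01, "M": 0b00, "Q": 0b11, "H": 0b10}

BCH_GENERATOR = 0x537   # x^10 + x^8 + x^5 + x^4 + x^2 + x + 1
FORMAT_XOR = 0x5412     # fixed mask so the format string is never all zeros


def format_bits(level, mask_id):
    """Return the 15-bit format string for a level ('L', 'M', 'Q', 'H') and mask 0-7."""
    data = (EC_LEVEL_BITS[level] << 3) | mask_id   # 5 data bits
    remainder = data << 10                         # make room for 10 check bits
    # Polynomial long division over GF(2): cancel each set bit above bit 9.
    for shift in range(4, -1, -1):
        if remainder & (1 << (shift + 10)):
            remainder ^= BCH_GENERATOR << shift
    return ((data << 10) | remainder) ^ FORMAT_XOR


print(format(format_bits("L", 0), "015b"))  # 111011111000100
print(format(format_bits("M", 0), "015b"))  # 101010000010010
```

Only 5 of the 15 bits carry information. The other 10 are redundancy, which is what lets a scanner pick the nearest valid format string when a few bits are misread.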

Data Modules

The rest of the symbol consists of data modules: the actual encoded payload. Data is stored as 8-bit codewords and placed in a specific zigzag pattern, starting from the bottom-right corner and moving upward in two-module-wide columns, then downward in the next pair of columns, and so on. The placement order skips the functional regions (finder patterns, alignment patterns, timing strips, and format strips).
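
The zigzag order is easier to follow in code. The sketch below is deliberately limited to a Version 1 symbol (21x21, no alignment patterns); the helper names are hypothetical, and the functional-region test is a simplification that treats each finder corner, its separator, and the adjacent format strip as one block.

```python
SIZE = 21  # Version 1 is 21x21 modules


def is_function_module(row, col):
    """True for modules reserved by finder, separator, format, or timing patterns."""
    if row == 6 or col == 6:                  # timing strips
        return True
    if row < 9 and col < 9:                   # top-left finder + format strip
        return True
    if row < 9 and col >= SIZE - 8:           # top-right finder + format strip
        return True
    if row >= SIZE - 8 and col < 9:           # bottom-left finder + format strip
        return True
    return False


def zigzag_positions():
    """List (row, col) data positions in placement order, bottom-right first."""
    positions = []
    col = SIZE - 1
    upward = True
    while col > 0:
        if col == 6:          # the vertical timing strip is skipped entirely
            col -= 1
        rows = range(SIZE - 1, -1, -1) if upward else range(SIZE)
        for row in rows:
            for c in (col, col - 1):          # right module first, then left
                if not is_function_module(row, c):
                    positions.append((row, c))
        upward = not upward
        col -= 2
    return positions


order = zigzag_positions()
print(order[:4])          # [(20, 20), (20, 19), (19, 20), (19, 19)]
print(len(order) // 8)    # 26 codewords fit in a Version 1 symbol
```

The count is a useful sanity check: 208 free modules divided by 8 bits gives the 26 codewords (data plus check) that a Version 1 symbol carries.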

The Four Encoding Modes

QR codes support four ways to represent data, each with a different character set and capacity. At Version 1 (the smallest possible QR code, 21x21 modules) with Error Correction Level M, the limits are:

Mode	Character Set	Capacity
Numeric	Digits 0-9	34 digits
Alphanumeric	0-9, A-Z, space, and 8 symbols ($ % * + - . / :)	20 characters
Byte	Any 8-bit data, including UTF-8	14 bytes
Kanji	Japanese double-byte characters	8 characters
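
These limits can be reproduced with a little arithmetic. A Version 1 symbol at Level M has 16 data codewords (128 bits); each mode spends 4 bits on a mode indicator and a few more on a character count, then packs characters at a mode-specific density. The sketch below assumes the bit widths that apply to Versions 1 through 9, and the function name is invented for illustration.

```python
DATA_BITS_V1_M = 16 * 8  # 16 data codewords at Version 1, Level M

# Character-count field widths for Versions 1-9.
COUNT_BITS = {"numeric": 10, "alphanumeric": 9, "byte": 8, "kanji": 8}


def capacity(mode, data_bits=DATA_BITS_V1_M):
    """Maximum number of characters of one mode that fit in data_bits."""
    payload = data_bits - 4 - COUNT_BITS[mode]   # minus mode indicator and count
    if mode == "numeric":
        # 3 digits per 10 bits; a trailing 2 digits need 7 bits, 1 digit needs 4.
        groups, rest = divmod(payload, 10)
        extra = 2 if rest >= 7 else 1 if rest >= 4 else 0
        return groups * 3 + extra
    if mode == "alphanumeric":
        # 2 characters per 11 bits; a trailing single character needs 6 bits.
        pairs, rest = divmod(payload, 11)
        return pairs * 2 + (1 if rest >= 6 else 0)
    if mode == "byte":
        return payload // 8
    return payload // 13                          # Kanji: 13 bits per character


for mode in ("numeric", "alphanumeric", "byte", "kanji"):
    print(mode, capacity(mode))
```

Running it yields 34, 20, 14, and 8, matching the table above.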

Encoders choose the mode (or mix of modes) that packs the most data into the fewest modules. A URL like https://morefreetools.com uses Byte mode because it contains lowercase letters, which are not in the Alphanumeric set.
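
A simplified version of that choice might look like the following. It picks a single mode for the whole string and leaves out Kanji and mixed-mode segments, which real encoders may also use; the function name is hypothetical.

```python
DIGITS = set("0123456789")
# The 45-character Alphanumeric set: digits, uppercase letters, space, 8 symbols.
ALPHANUMERIC = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def choose_mode(text):
    """Pick the densest single mode that can represent every character."""
    if text and all(ch in DIGITS for ch in text):
        return "numeric"
    if text and all(ch in ALPHANUMERIC for ch in text):
        return "alphanumeric"
    return "byte"


print(choose_mode("0123456789"))                  # numeric
print(choose_mode("HTTPS://MOREFREETOOLS.COM"))   # alphanumeric
print(choose_mode("https://morefreetools.com"))   # byte
```

The second call shows a common trick: the same URL written in uppercase fits the Alphanumeric set and needs fewer bits.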

Reed-Solomon Error Correction: Why QR Codes Survive Damage

This is the heart of the article. Understanding Reed-Solomon error correction explains the logo trick, and much more.

Reed-Solomon is not unique to QR codes. The same family of codes protects data on CDs, DVDs, Blu-ray discs, RAID 6 storage arrays, and even signals beamed back from the Voyager probes in deep space. Every time you play a scratched CD and hear no glitches, Reed-Solomon is working in the background.

How It Works (Conceptually)

Think of a RAID-5 disk array. If you have four drives and one fails completely, the RAID system can reconstruct every byte that was on the failed drive using parity data spread across the other three. No data is lost even though an entire drive is gone.
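
The RAID analogy fits in a few lines. This sketch uses a single dedicated parity block built with XOR; real RAID-5 rotates parity across the drives, and the sample data is made up.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data "drives" and one parity "drive".
drives = [b"QR c", b"odes", b" 123"]
parity = xor_blocks(drives)

# Drive 1 fails. XOR of everything that survives rebuilds it.
rebuilt = xor_blocks([drives[0], drives[2], parity])
print(rebuilt == drives[1])  # True
```

XOR parity can only rebuild one missing block whose position is known. Reed-Solomon generalizes the idea so that several damaged pieces can be repaired, including ones whose position is not known in advance.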

Reed-Solomon works on a similar principle, but mathematically. The encoder treats the data bytes as coefficients of a polynomial over the finite field GF(256). It divides that polynomial by a fixed generator polynomial and appends the remainder to the message as check codewords. (An equivalent textbook view evaluates the polynomial at additional points.) The encoded message is now longer than the original data, but it carries enough redundancy that a decoder can reconstruct the original polynomial, and therefore the original data, even if some codewords have been corrupted or erased entirely.

Crucially, Reed-Solomon does not merely detect errors. It corrects them. As long as the number of damaged codewords stays within the error correction budget, the original data is fully recoverable with no loss.
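
To make the encoding side concrete, here is a compact sketch of how check codewords can be computed: arithmetic in GF(256) with the field polynomial 0x11D that QR codes use, a generator polynomial built from consecutive powers of the field's primitive element, and a polynomial long division. Function names are invented for this example, and decoding (the harder half) is left out.

```python
# Log/antilog tables for GF(256) with field polynomial x^8 + x^4 + x^3 + x^2 + 1.
EXP = [0] * 512
LOG = [0] * 256
value = 1
for power in range(255):
    EXP[power] = value
    LOG[value] = power
    value <<= 1
    if value & 0x100:
        value ^= 0x11D
for power in range(255, 512):        # duplicate so gf_mul needs no modulo
    EXP[power] = EXP[power - 255]


def gf_mul(a, b):
    """Multiply two field elements."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def generator_poly(n_check):
    """(x - a^0)(x - a^1)...(x - a^(n-1)), highest-degree coefficient first."""
    gen = [1]
    for i in range(n_check):
        nxt = [0] * (len(gen) + 1)
        for j, coef in enumerate(gen):
            nxt[j] ^= coef                        # coef * x
            nxt[j + 1] ^= gf_mul(coef, EXP[i])    # coef * a^i (minus equals plus here)
        gen = nxt
    return gen


def rs_check_codewords(data, n_check):
    """Remainder of data(x) * x^n_check divided by the generator polynomial."""
    gen = generator_poly(n_check)
    work = list(data) + [0] * n_check
    for i in range(len(data)):
        factor = work[i]
        if factor:
            for j, g in enumerate(gen):
                work[i + j] ^= gf_mul(g, factor)
    return work[len(data):]


def poly_eval(coeffs, x):
    """Evaluate a polynomial (highest-degree first) at x using Horner's rule."""
    acc = 0
    for c in coeffs:
        acc = gf_mul(acc, x) ^ c
    return acc


data = [32, 91, 11, 120, 209, 114, 220, 77, 67, 64, 236, 17, 236, 17, 236, 17]
check = rs_check_codewords(data, 10)
codeword = data + check
# A valid codeword evaluates to zero at every root of the generator polynomial.
print(all(poly_eval(codeword, EXP[i]) == 0 for i in range(10)))  # True
```

The final check is what a decoder starts from: if any of those evaluations (called syndromes) is nonzero, something was damaged, and the pattern of nonzero values tells the decoder where and by how much.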

The Four Error Correction Levels

QR codes offer four error correction levels. Higher levels sacrifice capacity (more modules are used for check codewords) in exchange for resilience.

Level	Name	Recovery Capacity
L	Low	7% of codewords
M	Medium	15% of codewords
Q	Quartile	25% of codewords
H	High	30% of codewords
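
The percentages are rounded figures. For a sense of where they come from, this sketch takes the Version 1 numbers from the QR standard's codeword tables (26 codewords in total, with the listed data codewords and correctable codewords per level) and computes the shares directly. The dictionary and function names are made up for the example.

```python
V1_TOTAL_CODEWORDS = 26

# Level -> (data codewords, codewords the decoder can correct), Version 1 only.
V1_LEVELS = {"L": (19, 2), "M": (16, 4), "Q": (13, 6), "H": (9, 8)}


def summarize(level):
    """Return (check codewords, correctable share in percent) for Version 1."""
    data, correctable = V1_LEVELS[level]
    check = V1_TOTAL_CODEWORDS - data
    return check, round(100 * correctable / V1_TOTAL_CODEWORDS, 1)


for level in "LMQH":
    check, percent = summarize(level)
    print(level, check, "check codewords, about", percent, "% correctable")
```

Notice that correcting a damaged codeword at an unknown position costs roughly two check codewords, which is why Level H spends 17 of its 26 codewords on redundancy to repair 8.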

Why QR Codes with Logos Work

This is the key insight. When a designer places a logo in the center of a QR code, the logo physically covers data modules. From the scanner's perspective, those modules are simply missing, exactly like a failed drive in the RAID analogy.

If the QR code was generated at Level H, roughly 30% of its codewords can be damaged and the data is still fully recoverable. The logo replaces data that the Reed-Solomon layer can reconstruct from the check codewords that remain. This is not a hack or an exploit of some oversight. It is intentional design: the standard was built with exactly this kind of physical damage in mind. Denso Wave was thinking about barcodes printed on dirty factory floors, not about marketing teams adding logos, but the mathematics does not care about intent.

The practical rule: if your QR code will have a logo, always generate it at Level H. Keep the logo well under 30% of the symbol area, because the budget is counted in codewords rather than area and must also absorb print defects and glare, and center it to avoid obscuring the finder patterns in the corners.

Mask Patterns: Preventing Scanner Confusion

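After data and check codewords are placed in the symbol, the encoder applies a mask pattern. There are 8 standard masks, each defined by a simple formula on a module's row and column; wherever the formula holds, the module is flipped (XORed with 1). Functional regions such as the finder patterns are never masked. For example, mask pattern 0 flips every module at a position where (row + column) mod 2 == 0.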
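
The eight formulas are short enough to list in full. In this sketch the grid is a list of rows of 0/1 values, and the optional `is_function` callback (a hypothetical helper) marks modules that must not be masked.

```python
# Mask conditions 0-7 from the QR standard; r is the row, c is the column.
MASKS = [
    lambda r, c: (r + c) % 2 == 0,
    lambda r, c: r % 2 == 0,
    lambda r, c: c % 3 == 0,
    lambda r, c: (r + c) % 3 == 0,
    lambda r, c: (r // 2 + c // 3) % 2 == 0,
    lambda r, c: (r * c) % 2 + (r * c) % 3 == 0,
    lambda r, c: ((r * c) % 2 + (r * c) % 3) % 2 == 0,
    lambda r, c: ((r + c) % 2 + (r * c) % 3) % 2 == 0,
]


def apply_mask(grid, mask_id, is_function=lambda r, c: False):
    """Return a new grid with mask mask_id XORed onto the non-function modules."""
    condition = MASKS[mask_id]
    return [
        [bit ^ int(condition(r, c) and not is_function(r, c))
         for c, bit in enumerate(row)]
        for r, row in enumerate(grid)
    ]


blank = [[0] * 4 for _ in range(4)]
masked = apply_mask(blank, 0)
print(masked)                           # a checkerboard
print(apply_mask(masked, 0) == blank)   # True: masking twice undoes itself
```

Because XOR is its own inverse, the scanner removes the mask by applying the same pattern again, which is why the mask number has to be stored in the format information.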

Why? Scanners can struggle with large uniform regions of the same color. A block of white or a block of black confuses algorithms that look for the specific ratios of the finder patterns. Masking breaks up those regions by introducing a controlled alternating pattern.

The encoder tries all 8 masks and evaluates each result against 4 penalty rules that penalize things like: runs of 5 or more same-color modules in a row or column, 2x2 blocks of the same color, patterns that resemble finder patterns, and unbalanced ratios of dark-to-light modules. The mask with the lowest total penalty score is the one used in the final symbol.
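
Two of the four rules are simple enough to sketch here. The weights (3 points for a run of five plus 1 for each extra module, and 3 points per 2x2 block) follow the QR standard; the finder-lookalike and dark/light balance rules are omitted to keep the example short, and the function names are invented.

```python
def run_penalty(line):
    """Rule 1 for one row or column: 3 points per run of 5, +1 per extra module."""
    penalty = 0
    run = 1
    for previous, current in zip(line, line[1:]):
        if current == previous:
            run += 1
        else:
            if run >= 5:
                penalty += 3 + (run - 5)
            run = 1
    if run >= 5:                      # close the final run
        penalty += 3 + (run - 5)
    return penalty


def rule1(grid):
    """Sum run penalties over every row and every column."""
    columns = [list(col) for col in zip(*grid)]
    return sum(run_penalty(line) for line in grid + columns)


def rule2(grid):
    """3 points for every 2x2 block of a single color (blocks may overlap)."""
    n = len(grid)
    return sum(
        3
        for r in range(n - 1)
        for c in range(n - 1)
        if grid[r][c] == grid[r][c + 1] == grid[r + 1][c] == grid[r + 1][c + 1]
    )


solid = [[1] * 5 for _ in range(5)]
checker = [[(r + c) % 2 for c in range(5)] for r in range(5)]
print(rule1(solid) + rule2(solid))      # 78
print(rule1(checker) + rule2(checker))  # 0
```

A full encoder would compute all four rules for each of the 8 masked candidates and keep the one with the smallest total.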

Version and Capacity: From Version 1 to Version 40

QR codes come in 40 versions. Version 1 is 21x21 modules. Each version increment adds 4 modules to both the width and the height, so Version 40 is 177x177 modules.
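
The size rule reduces to a one-line formula, shown here with a hypothetical helper name.

```python
def modules_per_side(version):
    """Side length in modules for QR versions 1 through 40."""
    if not 1 <= version <= 40:
        raise ValueError("QR versions run from 1 to 40")
    return 17 + 4 * version


print(modules_per_side(1))    # 21
print(modules_per_side(40))   # 177
```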

At Version 40 with Level L error correction (the least redundancy, the most capacity):

7,089 numeric digits
4,296 alphanumeric characters
2,953 bytes (about 2.9 KB of binary data)
1,817 Kanji characters

Higher error correction levels reduce these capacities because more modules are consumed by check codewords. Most QR code generators automatically select the minimum version that fits your data at the chosen error correction level.
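
Version selection is essentially a table lookup. The sketch below covers only Byte mode at Level M for Versions 1 through 5, using capacities from the standard's tables; a real generator would carry the full table for all 40 versions, every level, and every mode. The names are hypothetical.

```python
# Byte-mode capacity at Level M for the first five versions.
BYTE_CAPACITY_M = {1: 14, 2: 26, 3: 42, 4: 62, 5: 84}


def smallest_version(payload):
    """Return the lowest version in the table that holds payload, or None."""
    for version in sorted(BYTE_CAPACITY_M):
        if len(payload) <= BYTE_CAPACITY_M[version]:
            return version
    return None   # too large for this partial table


url = "https://morefreetools.com".encode("utf-8")
print(len(url), smallest_version(url))   # 25 2
```

At 25 bytes the example URL is too long for Version 1 at Level M (14 bytes) but fits Version 2 (26 bytes) with one byte to spare.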

Why Some QR Codes Fail to Scan

Not every QR code scans on the first try. Here are the most common causes of failure:

Insufficient contrast. A QR code needs strong contrast between dark and light modules. Printing dark gray on medium gray, or using a very light color on white, reduces the contrast below what most scanners can handle. A commonly cited rule of thumb is a contrast ratio of at least 4:1.

Missing or narrow quiet zone. Every QR code must be surrounded by a blank border called the quiet zone, at least 4 modules wide on all sides. Printing the code too close to other design elements, or all the way to the edge of a label, eliminates the quiet zone and causes scan failures.

Too much data for the print size. More data, or a higher error correction level, forces the encoder to use a higher version with more modules. If the printed size stays the same, each module becomes smaller, and below a certain size the modules are too fine for a phone camera to resolve.
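
The trade-off between data, quiet zone, and print size is simple arithmetic. This sketch assumes the available width has to hold the symbol plus a 4-module quiet zone on each side; the function name is made up, and no minimum module size is implied because that depends on the printer and the camera.

```python
QUIET_ZONE_MODULES = 4   # required blank border on each side


def module_size_mm(print_width_mm, version):
    """Width of one module when symbol plus quiet zone fill print_width_mm."""
    modules = 17 + 4 * version                 # side length of the symbol
    total = modules + 2 * QUIET_ZONE_MODULES   # add the border on both sides
    return print_width_mm / total


print(module_size_mm(29, 1))             # 1.0 mm per module
print(round(module_size_mm(29, 10), 2))  # 0.45 mm per module
```

On the same 29 mm label, moving from Version 1 to Version 10 shrinks each module to less than half its width, which is why shortening the encoded data is often the easiest way to fix a code that will not scan.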

Reflective surfaces. Printing a QR code on glossy paper, metal, or glass can cause glare that washes out portions of the symbol. Matte finishes are far more reliable for scanning.

Extreme scan angle. Scanning at an angle greater than about 45 degrees introduces enough perspective distortion that the finder patterns and alignment patterns can fail to compensate. Keep the scan as close to perpendicular as practical.

Damaged or dirty symbol. Beyond the error correction budget, physical damage truly does prevent decoding. A QR code at Level H can survive 30% damage; a code at Level L can only survive 7%.

Dynamic vs. Static QR Codes

There is an important practical distinction between two types of QR codes you encounter in the wild.

A static QR code encodes the final destination URL (or other data) directly in the symbol. If the URL changes, you must generate and reprint a new QR code entirely. Static codes are simple and have no ongoing infrastructure dependencies.

A dynamic QR code encodes a short redirect URL (often hosted by a QR code service). When someone scans it, their phone fetches the redirect URL, which points to the actual destination. The actual destination can be changed in the service dashboard at any time without regenerating or reprinting the QR code. Dynamic codes are useful for print campaigns where the destination may need updating after the materials have been distributed.
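
Conceptually, the service behind a dynamic code is a lookup table from short codes to destinations. The sketch below is a toy in-memory version; the class name, the short code, and the example.com URLs are all placeholders rather than any real service's API.

```python
class RedirectTable:
    """Toy model of a dynamic QR redirect service (in-memory only)."""

    def __init__(self):
        self._targets = {}

    def set_target(self, code, url):
        """Create or update the destination for a short code."""
        self._targets[code] = url

    def resolve(self, code):
        """Return (HTTP status, Location) as a redirect server would."""
        url = self._targets.get(code)
        if url is None:
            return 404, None
        return 302, url


table = RedirectTable()
table.set_target("abc123", "https://example.com/spring-menu")
print(table.resolve("abc123"))

# The printed code still encodes the same short URL; only the table entry changes.
table.set_target("abc123", "https://example.com/summer-menu")
print(table.resolve("abc123"))
```

The printed symbol never changes because it only ever contains the short URL. Everything that can change lives in the table, which is also why the code stops working if the service disappears.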

Note that dynamic codes depend on the redirect service remaining operational. A static code works forever, as long as the destination URL does.

Frequently Asked Questions

Why does a QR code with a logo still scan?

A logo placed over a QR code physically destroys some data modules. However, QR codes generated at Error Correction Level H can recover up to 30% of damaged codewords using Reed-Solomon mathematics. The logo essentially triggers the same recovery mechanism as physical damage. As long as the logo covers less than 30% of the symbol and is centered to avoid the corner finder patterns, the code will scan.

What is the Reed-Solomon algorithm in simple terms?

Reed-Solomon is a mathematical technique for adding structured redundancy to data. The encoder represents your data as a polynomial and derives additional check values from it. If some data is later lost or corrupted, the decoder uses those check values to mathematically reconstruct the original polynomial, recovering the lost data. The same technique protects CDs, DVDs, and spacecraft telemetry.

What is the maximum amount of data a QR code can hold?

A Version 40 QR code at Error Correction Level L can hold up to 7,089 numeric digits, 4,296 alphanumeric characters, or 2,953 bytes of binary data. In practice, most QR codes encode short URLs and use far less than the maximum capacity, which allows a smaller version and easier scanning.

Should I use Level H error correction for all QR codes?

Not necessarily. Level H produces a denser, larger symbol for the same data payload because more modules are used for check codewords. If your QR code will not have a logo and will be printed cleanly at a reasonable size, Level M (15% recovery) is usually sufficient and produces a less dense code that is easier to scan. Use Level H when adding a logo, printing on a potentially dirty surface, or placing the code where it is likely to be scratched or worn. For very small prints, a lower level is usually the better choice because it keeps the modules larger.

What is the quiet zone and why does it matter?

The quiet zone is the blank white border surrounding the QR code symbol. It must be at least 4 modules wide on every side. Without the quiet zone, a scanner cannot determine where the symbol ends and the surrounding design begins. The finder patterns rely on detecting a transition from white (quiet zone) to the first dark module. Eliminate the quiet zone and even a perfectly formed QR code will fail to scan reliably.

Ready to generate a QR code with all of this knowledge in hand? Use the QR Code Generator on MoreFreeTools to create codes at any error correction level, add a logo, and download a print-ready file.