Human Vision Fundamentals
The human retina contains two types of light-detecting cells: rods and cones. Rods detect brightness (luminance) and vastly outnumber cones: approximately 120 million rods versus only 6 million cones. Cones detect color (chrominance) and come in three subtypes, commonly labeled red, green, and blue, each most sensitive to a different range of wavelengths.
This 20:1 ratio has a profound consequence: your visual system resolves brightness at far higher spatial detail than it resolves color. You can detect a sharp brightness edge (black text on white paper) down to about 1 arcminute of visual angle. But for a color-only edge (red text on green background at equal brightness), your resolution drops dramatically — you need roughly 4× the size to detect the same level of detail.
This biological fact is the foundation of virtually all lossy image and video compression. JPEG, MPEG, H.264, HEVC, AV1 — they all exploit the same principle. Reduce color resolution while keeping brightness resolution intact, and the human eye doesn't notice. The technical name for this technique is chroma subsampling.
YCbCr Color Space
Your computer screen works in RGB (Red, Green, Blue) color space, where each pixel gets three values that define its color by mixing light. But JPEG doesn't compress RGB directly. Before compression begins, JPEG converts every pixel from RGB to YCbCr:
- Y (Luma) — the brightness component. This carries the luminance information: how light or dark each pixel is. If you extracted just the Y channel, you'd see a perfect grayscale version of the image.
- Cb (Chroma Blue) — the blue-yellow color difference. This describes how much the pixel's color deviates from gray in the blue/yellow direction.
- Cr (Chroma Red) — the red-green color difference. This describes the deviation in the red/green direction.
This separation is the key insight. Once brightness and color are in separate channels, you can treat them independently. Keep Y at full resolution for sharp edges and fine detail. Reduce Cb and Cr resolution because the eye won't notice. That's chroma subsampling.
Why not just compress RGB? In RGB, brightness information is spread across all three channels. You can't reduce color resolution without also reducing brightness resolution. YCbCr cleanly separates the two, enabling selective compression that matches human perception.
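The conversion itself is a fixed linear transform applied to every pixel. The following is a minimal sketch, assuming the full-range BT.601 coefficients that JFIF-style JPEG files commonly use; the function name `rgb_to_ycbcr` is illustrative and not taken from any particular library.

```python
def rgb_to_ycbcr(r: int, g: int, b: int) -> tuple[int, int, int]:
    """Convert one 8-bit RGB pixel to full-range YCbCr (JFIF-style BT.601)."""
    # Luma is a weighted sum: green contributes most, blue least.
    y = 0.299 * r + 0.587 * g + 0.114 * b
    # Chroma values are color differences, offset so that "no color" is 128.
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b

    def clamp(value: float) -> int:
        # Round to the nearest integer and keep the result in 0..255.
        return max(0, min(255, round(value)))

    return clamp(y), clamp(cb), clamp(cr)


print(rgb_to_ycbcr(128, 128, 128))  # neutral gray: both chroma values sit at 128
print(rgb_to_ycbcr(255, 0, 0))      # pure red: Cr is pushed to the top of its range
```

Any gray pixel lands on Cb = Cr = 128, which is why the Y channel alone gives a grayscale image: all of the color information lives in how far Cb and Cr move away from that midpoint.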
The Notation System: What 4:4:4, 4:2:2, and 4:2:0 Mean
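Chroma subsampling uses a three-number notation J:a:b that describes how color samples are distributed in a reference block J pixels wide and 2 rows high. J is the block width (almost always 4), a is the number of chroma samples in the first row, and b is the number of additional chroma samples in the second row (0 means the second row reuses the first row's samples). The notation is notoriously confusing, but here's the practical meaning: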
4:4:4 — No Subsampling (Full Color)
Every single pixel gets its own unique luminance (Y) value and its own unique color (Cb, Cr) values. Nothing is reduced. This produces the highest quality output and the largest files. For every 4 pixels in a row, there are 4 luma samples and 4 chroma samples.
4:2:2 — Horizontal Subsampling
Every pixel keeps its own luminance value, but color resolution is halved horizontally. In each row of 4 pixels, there are 4 luma samples but only 2 chroma samples. Each pair of horizontally adjacent pixels shares one color value. Vertical color resolution remains unchanged.
4:2:0 — Horizontal + Vertical Subsampling
Every pixel keeps its own luminance value, but color resolution is halved both horizontally and vertically. A 2×2 block of four pixels shares a single color value. This is the most aggressive common subsampling scheme — and the most widely used, because it provides dramatic file size savings with negligible visual impact on photographs.
| Subsampling | Color Resolution | Data Reduction | File Size vs 4:4:4 | Best For |
|---|---|---|---|---|
| 4:4:4 | Full (1:1) | None | Baseline (largest) | Text, graphics, screenshots |
| 4:2:2 | Halved horizontally | 50% less chroma data | ~10–15% smaller | Professional video, broadcast |
| 4:2:0 | Halved both ways (2×2 block) | 75% less chroma data | ~25–33% smaller | Photos, web images, consumer video |
The "data reduction" column refers to the chrominance channels only. Since the luminance channel (which carries the most visual information) is never subsampled, the total file size reduction is less than the raw chroma reduction suggests. In practice, switching from 4:4:4 to 4:2:0 saves about 25–33% of total file size at the same quality setting.
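The raw sample counts behind these percentages can be worked out directly from the notation. This is a small sketch, assuming the usual reading of J:a:b as a J-wide, two-row reference block; the helper name `samples_per_block` is made up for this example. It counts samples before any compression, which is why the raw totals differ from the real-world file size savings.

```python
def samples_per_block(j: int, a: int, b: int) -> dict[str, float]:
    """Count raw samples in a J-wide, 2-row reference block for scheme J:a:b."""
    luma = 2 * j                    # Y is never subsampled: J samples per row
    chroma = 2 * (a + b)            # (a + b) samples for each of Cb and Cr
    full_chroma = 2 * (2 * j)       # chroma sample count at 4:4:4
    full_total = 2 * j + full_chroma
    return {
        "luma": luma,
        "chroma": chroma,
        "chroma_reduction": 1 - chroma / full_chroma,
        "total_reduction": 1 - (luma + chroma) / full_total,
    }


for scheme in ((4, 4, 4), (4, 2, 2), (4, 2, 0)):
    stats = samples_per_block(*scheme)
    label = ":".join(str(n) for n in scheme)
    print(
        f"{label}  chroma -{stats['chroma_reduction']:.0%}"
        f"  total raw samples -{stats['total_reduction']:.0%}"
    )
```

For 4:2:0 the block holds 8 luma samples and 4 chroma samples instead of 16, so chroma drops by 75% while the raw total drops by 50%. The compressed file shrinks less than that because chroma already takes up a smaller share of the compressed data.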
Visual Impact: When It Matters and When It Doesn't
The visual impact of chroma subsampling depends entirely on the content of the image. For some images, 4:2:0 is indistinguishable from 4:4:4. For others, the difference is clearly visible.
Photographs: 4:2:0 Is Virtually Invisible
Natural photographs contain smooth color gradients — sky, skin, foliage, water, fabric. These gradients have no sharp color edges for the eye to detect. When you reduce color resolution by half in both dimensions, the color transitions remain smooth because the original gradients were smooth. In side-by-side comparisons, even trained photographers struggle to distinguish 4:2:0 from 4:4:4 in typical photos.
This is why most phone cameras, social media platforms, and web image pipelines use 4:2:0 for JPEG photos by default (some dedicated cameras write 4:2:2 instead). The quality trade-off is overwhelmingly favorable: 25–33% smaller files with no visible difference in typical photos.
Text and Graphics: 4:2:0 Creates Visible Artifacts
Screenshots, UI mockups, logos, diagrams, and text-heavy images are a different story. These contain sharp color transitions — a red letter on a white background, a green button with white text, a blue line on a gray grid. When 4:2:0 subsampling averages the color across a 2×2 pixel block, these sharp edges get blurred.
The result is color fringing — a visible smear of color bleeding beyond the edges of sharp elements. Black text on a white background is usually fine because the contrast is purely in the luminance channel (black and white differ only in brightness). But colored text on a contrasting background is where 4:2:0 artifacts become obvious.
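The bleeding effect is easy to reproduce on a tiny chroma plane. The sketch below is illustrative only: it assumes simple 2×2 averaging on encode and nearest-neighbor expansion on decode, whereas real encoders and decoders may use different filters. The sample values and function names are hypothetical.

```python
def subsample_420(plane: list[list[int]]) -> list[list[int]]:
    """Average each 2x2 block of a chroma plane (even width and height assumed)."""
    return [
        [
            # +2 before the integer division rounds to the nearest value.
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1] + 2) // 4
            for x in range(0, len(plane[0]), 2)
        ]
        for y in range(0, len(plane), 2)
    ]


def upsample_nearest(plane: list[list[int]]) -> list[list[int]]:
    """Expand every chroma sample back into a 2x2 block of identical values."""
    restored = []
    for row in plane:
        wide = [value for value in row for _ in range(2)]
        restored.append(wide)
        restored.append(list(wide))
    return restored


# Hypothetical Cr plane: neutral background (128) with a saturated stripe (240)
# whose left edge falls on an odd column, inside a 2x2 block.
cr = [[128, 128, 128, 240, 240, 240, 128, 128] for _ in range(2)]
restored = upsample_nearest(subsample_420(cr))

print(cr[0])        # [128, 128, 128, 240, 240, 240, 128, 128]
print(restored[0])  # [128, 128, 184, 184, 240, 240, 128, 128]
```

The stripe's right edge lines up with a block boundary and survives intact. The left edge does not: one background pixel picks up color (128 becomes 184) and one stripe pixel loses saturation (240 becomes 184), which is the fringe you see around colored text.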
| Content Type | 4:4:4 | 4:2:2 | 4:2:0 |
|---|---|---|---|
| Landscape photos | Perfect | Perfect | Perfect |
| Portraits / skin tones | Perfect | Perfect | Perfect |
| Product photos | Perfect | Perfect | Perfect |
| Black text on white | Perfect | Perfect | Good |
| Colored text / UI elements | Perfect | Slight fringing | Visible fringing |
| Code screenshots | Perfect | Slight fringing | Noticeable |
| Logos with thin colored lines | Perfect | Minor artifacts | Color bleeding |
File Size Impact
The file size savings from chroma subsampling are substantial and consistent. In an uncompressed image, each pixel has three components: Y, Cb, and Cr. At 4:4:4, all three are stored at full resolution. At 4:2:0, the Cb and Cr channels each contain only 25% of the samples — a 75% reduction in chrominance data.
In the compressed file, the chrominance channels typically account for roughly one-third to one-half of the data (depending on image content and compression), which is less than their two-thirds share of the raw samples. The practical file size impact is therefore:
| Subsampling | Typical File Size | Example (10 MP photo at Q85) |
|---|---|---|
| 4:4:4 | Baseline (100%) | ~3.2 MB |
| 4:2:2 | ~85–90% | ~2.8 MB |
| 4:2:0 | ~67–75% | ~2.3 MB |
For a website serving thousands of images, the difference between 4:4:4 and 4:2:0 adds up dramatically. A photo gallery with 100 images saves roughly 90 MB of bandwidth by using 4:2:0 instead of 4:4:4, with no visible quality difference for photographic content.
Compound with quality: Chroma subsampling savings compound with JPEG quality reduction. At Q85 with 4:2:0, a photo is roughly 5–8x smaller than the original PNG — the quality setting and subsampling work together to reduce file size far beyond what either achieves alone.
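The gallery estimate above follows directly from the example sizes in the table. As a quick worked calculation (the function name is made up for this sketch, and the sizes are the article's example figures, not measurements):

```python
def gallery_savings_mb(image_count: int, size_444_mb: float, size_420_mb: float) -> float:
    """Total megabytes saved by serving 4:2:0 files instead of 4:4:4 files."""
    return image_count * (size_444_mb - size_420_mb)


saved = gallery_savings_mb(image_count=100, size_444_mb=3.2, size_420_mb=2.3)
per_image = 1 - 2.3 / 3.2  # fraction saved on each image

print(f"{saved:.0f} MB saved across the gallery")
print(f"{per_image:.0%} smaller per image")
```

The per-image saving of about 28% sits inside the 25–33% range quoted earlier for 4:2:0.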
ImageMagick Behavior and Defaults
ImageMagick's JPEG encoder automatically selects chroma subsampling based on the -quality setting. This is one of the most commonly overlooked aspects of JPEG compression with ImageMagick:
| Quality Setting | Default Subsampling | Rationale |
|---|---|---|
| Q90–Q100 | 4:4:4 | High quality requested → preserve full color |
| Q1–Q89 | 4:2:0 | Lower quality → aggressive compression |
This automatic switch means that setting quality to 90 vs 89 produces a disproportionately large file size jump. It's not just one quality step — it's the quality step plus the switch from 4:2:0 to 4:4:4 subsampling. Many users notice that Q90 files are surprisingly larger than Q89 files and assume ImageMagick is broken. It's not — the subsampling change is the culprit.
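The rule in the table can be written as a one-line threshold check. This sketch only models the behavior as described in this article; it is not copied from ImageMagick's source, and the function name is hypothetical.

```python
def default_subsampling(quality: int) -> str:
    """Return the default subsampling this article describes for a -quality value."""
    if not 1 <= quality <= 100:
        raise ValueError("JPEG quality must be between 1 and 100")
    # Assumption from the table above: 90 and up keeps full chroma resolution.
    return "4:4:4" if quality >= 90 else "4:2:0"


for quality in (85, 89, 90, 95):
    print(quality, default_subsampling(quality))
```

Seen this way, the jump between Q89 and Q90 is two changes at once: a slightly finer quantization step and a switch to four times as many chroma samples.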
Overriding the default
You can explicitly control chroma subsampling with the -sampling-factor flag:
# 4:4:4 (no subsampling — best for text/graphics)
convert input.png -quality 85 -sampling-factor 1x1 output.jpg
# 4:2:2 (balanced — used in professional video)
convert input.png -quality 85 -sampling-factor 2x1 output.jpg
# 4:2:0 (smallest — best for photos)
convert input.png -quality 85 -sampling-factor 2x2 output.jpg
The sampling factor uses the format HxV where H is horizontal sampling and V is vertical sampling. 1x1 means no subsampling (4:4:4), 2x1 means halved horizontally (4:2:2), and 2x2 means halved both ways (4:2:0).
Q85 + 4:4:4: A common technique for screenshots and text-heavy images is -quality 85 -sampling-factor 1x1. This gives you file size savings from the quality setting while preserving full color resolution for sharp text. Without -sampling-factor 1x1, Q85 defaults to 4:2:0 and colored text will have visible fringing.
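If you drive ImageMagick from a script, it can help to map the J:a:b names onto the HxV values once and build the command from that. The following is a minimal sketch that only assembles the argument list shown in the commands above; the helper name and file names are placeholders, and actually running the command assumes ImageMagick's `convert` is installed and on your PATH.

```python
import shlex

# Mapping taken from the explanation above: J:a:b name -> -sampling-factor value.
SAMPLING_FACTORS = {"4:4:4": "1x1", "4:2:2": "2x1", "4:2:0": "2x2"}


def build_convert_command(src: str, dst: str, quality: int, subsampling: str) -> list[str]:
    """Build the argument list for an explicit-subsampling JPEG conversion."""
    if subsampling not in SAMPLING_FACTORS:
        raise ValueError(f"unsupported subsampling: {subsampling}")
    return [
        "convert", src,
        "-quality", str(quality),
        "-sampling-factor", SAMPLING_FACTORS[subsampling],
        dst,
    ]


command = build_convert_command("screenshot.png", "screenshot.jpg", 85, "4:4:4")
print(shlex.join(command))
# To execute it: import subprocess; subprocess.run(command, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with file names that contain spaces.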
Recommendations by Content Type
Here's a practical decision guide for choosing the right chroma subsampling:
Use 4:2:0 When:
- Photographs of any kind — landscapes, portraits, product shots, food, architecture
- Web images — hero images, blog illustrations, social media posts
- Email attachments — file size reduction helps with attachment limits
- Thumbnails and previews — small display size masks any possible artifacts
- Any image where file size matters more than pixel-perfect color edges
Use 4:4:4 When:
- Screenshots with colored text — syntax-highlighted code, colored UI elements
- Logos and brand assets — precise color boundaries matter for brand integrity
- Technical diagrams — colored lines and labels need sharp edges
- UI mockups and design files — pixel-perfect color reproduction required
- Medical or scientific imaging — color accuracy is critical
Use 4:2:2 When:
- Professional video production — broadcast standard for studio content
- Mixed content images — photos with text overlays where 4:4:4 is overkill but 4:2:0 creates visible fringing
- High-end photography where the absolute maximum quality is needed but 4:4:4 file sizes are impractical
When in doubt: For photographs, use 4:2:0 without hesitation. For anything with text or sharp colored edges, use 4:4:4. The 4:2:2 middle ground is rarely needed for still images — it's primarily a video production format.
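The guide above can be condensed into a small lookup for use in a build script. This is a sketch under stated assumptions: the content-type keys are invented labels for the categories listed above, and falling back to 4:4:4 for unknown content is a cautious choice made for this example, not a rule from the guide.

```python
# Hypothetical content-type labels mapped to the recommendations listed above.
RECOMMENDED_SUBSAMPLING = {
    "photo": "4:2:0",
    "web_image": "4:2:0",
    "thumbnail": "4:2:0",
    "screenshot": "4:4:4",
    "logo": "4:4:4",
    "diagram": "4:4:4",
    "ui_mockup": "4:4:4",
    "video_frame": "4:2:2",
    "photo_with_text": "4:2:2",
}


def recommend_subsampling(content_type: str) -> str:
    """Look up a subsampling scheme; unknown content falls back to 4:4:4."""
    # Assumption: when the content is unknown, prefer larger files over fringing.
    return RECOMMENDED_SUBSAMPLING.get(content_type, "4:4:4")


for kind in ("photo", "screenshot", "photo_with_text", "unknown"):
    print(kind, recommend_subsampling(kind))
```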
Chroma Subsampling Beyond JPEG
Chroma subsampling isn't unique to JPEG. Nearly all lossy visual media uses it:
| Format / Codec | Common Subsampling | Notes |
|---|---|---|
| JPEG | 4:2:0 (default), 4:4:4 | Encoder-dependent default |
| WebP | 4:2:0 (lossy mode) | No option for 4:4:4 in lossy WebP |
| AVIF | 4:2:0, 4:4:4 | 4:4:4 available but increases file size |
| H.264 / H.265 video | 4:2:0 (consumer), 4:2:2 (pro) | 4:4:4 only in specialized profiles |
| Blu-ray / Streaming | 4:2:0 | Standard for all consumer video |
| PNG | None (lossless) | Full RGB preserved, no subsampling |
WebP is notable for not offering 4:4:4 in its lossy mode. If you have a screenshot with colored text and need the smallest file with sharp colors, lossy WebP may not be ideal — use JPEG at 4:4:4, PNG (lossless), or lossless WebP instead.
AVIF, the newest contender, supports 4:4:4 and combines it with more efficient compression than JPEG. For text-heavy images that must be lossy-compressed, AVIF with 4:4:4 can produce smaller files than JPEG 4:4:4 at the same visual quality.