A 15 MB PDF uses most of a typical 25 MB email attachment limit, and stricter mail servers reject it outright. A 50 MB presentation deck loads slowly on mobile connections. PDF compression addresses both problems, typically reducing the size of image-heavy files by 50-80% while keeping the document readable.
How PDF compression works
PDFs contain several types of data, each with different compression potential:
Images (biggest savings)
In image-heavy and scanned PDFs, raster images are the dominant contributor to file size. EverMap's analysis of PDF structure confirms that images consistently dominate in photo-heavy documents, while embedded fonts can take over in text-focused layouts. Compression targets images first:
- Downscaling: a 300 DPI image displayed at 150 DPI has 4x as many pixels as needed (2x in each dimension). Downscaling to match the display resolution cuts size dramatically.
- Re-encoding: lossless PNG images can be re-encoded as JPEG at 85% quality. At that setting, JPEG typically achieves a 60-80% file size reduction with no visible difference at normal viewing sizes. For PNG-to-JPEG specifically (lossless-to-lossy), gains can be even greater.
- Color space: at 8 bits per channel, CMYK uses 4 color channels (32 bits/pixel) vs RGB's 3 channels (24 bits/pixel), so converting CMYK to RGB reduces raw image data by 25%, which compounds with other compression steps.
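The arithmetic behind the downscaling and color-space steps can be checked with a short sketch. It assumes uncompressed data at 8 bits per channel and a hypothetical 8 x 10 inch image; the function name is illustrative only, not part of any PDF tool.

```python
def raw_image_bytes(width_in, height_in, dpi, channels):
    """Uncompressed size of an 8-bit-per-channel image, in bytes."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * channels


# Hypothetical 8 x 10 inch image, used only for illustration.
cmyk_300 = raw_image_bytes(8, 10, 300, channels=4)
cmyk_150 = raw_image_bytes(8, 10, 150, channels=4)
rgb_150 = raw_image_bytes(8, 10, 150, channels=3)

print(cmyk_300 / cmyk_150)      # halving the DPI quarters the pixel count
print(1 - rgb_150 / cmyk_150)   # dropping one of four channels saves 25%
```

These are raw sizes before JPEG or Flate encoding, so real savings will differ, but the ratios show why the steps compound.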
Fonts
PDF fonts can contain thousands of glyphs. Font subsetting strips unused characters: if your document only uses 200 of 2,000 glyphs, the embedded font shrinks by roughly 90%. Google's web.dev guide on font optimization documents real-world reductions in that same range.
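As a rough illustration of the idea, the sketch below collects the distinct characters a document uses and estimates the saving. It assumes every glyph costs about the same number of bytes, which real fonts do not guarantee, and both function names are hypothetical.

```python
def glyphs_needed(text):
    """Return the set of distinct characters a subsetter would have to keep."""
    return set(text)


def estimated_font_reduction(used_glyphs, total_glyphs):
    """Rough size reduction, assuming every glyph costs about the same."""
    return 1 - used_glyphs / total_glyphs


sample = "Invoice 2024-001: total due 1,250.00 EUR"
print(len(glyphs_needed(sample)))                     # distinct characters used
print(f"{estimated_font_reduction(200, 2000):.0%}")   # the 200-of-2,000 case
```

A real subsetter also has to keep glyphs pulled in by ligatures and composite characters, so treat the estimate as a best case.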
Metadata and structure
Compression tools also remove or optimize structural overhead:
- Thumbnail previews (unnecessary for digital use)
- XMP metadata blocks
- Duplicate objects and unreferenced data
- Cross-reference table optimization
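Removing duplicate objects can be pictured as content hashing: identical byte streams are stored once and every reference points to the single copy. The sketch below is a simplified model working on plain byte strings, not on a real PDF object graph; the function name and sample data are hypothetical.

```python
import hashlib


def deduplicate_streams(streams):
    """Keep one copy of each distinct byte stream.

    Returns (unique_streams, index_map), where index_map[i] is the position
    in unique_streams that original stream i should now reference.
    """
    unique = []
    seen = {}        # SHA-256 digest -> index in `unique`
    index_map = []
    for data in streams:
        digest = hashlib.sha256(data).digest()
        if digest not in seen:
            seen[digest] = len(unique)
            unique.append(data)
        index_map.append(seen[digest])
    return unique, index_map


# Hypothetical example: the same logo embedded on three pages.
logo, chart = b"LOGO" * 1000, b"CHART" * 1000
unique, index_map = deduplicate_streams([logo, chart, logo, logo])
print(len(unique), index_map)   # 2 [0, 1, 0, 0]
```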
Typical compression results
| Document Type | Original | Compressed | Reduction |
|---|---|---|---|
| Scanned document (images) | 15 MB | 2-3 MB | 80% |
| Photo-heavy report | 25 MB | 5-8 MB | 70% |
| Text with charts | 5 MB | 1-2 MB | 65% |
| Pure text document | 500 KB | 350 KB | 30% |
These are approximate ranges based on typical real-world documents compressed at /ebook quality (150 DPI). Results vary significantly by original image resolution, scan settings, and whether images were already compressed.
Pure text documents see smaller gains because text is already compact. The biggest wins come from image-heavy PDFs.
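To compare your own results against the table, the reduction figure is simply the share of the original size that was removed. A minimal helper (the name is illustrative) might look like this:

```python
def reduction_percent(original_size, compressed_size):
    """Percentage by which the file shrank (negative if it grew)."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    return (1 - compressed_size / original_size) * 100


print(round(reduction_percent(500, 350)))   # pure-text row, sizes in KB -> 30
print(round(reduction_percent(15, 3)))      # scanned row at 3 MB -> 80
```

Both sizes must use the same unit; the helper only looks at their ratio.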
Method 1: Browser-based compression (recommended)
The fastest way to compress a PDF with zero setup:
- Go to Compress PDF
- Drop your file onto the upload area
- Wait for processing (typically 2-10 seconds)
- Download the compressed file
Why this works well:
- Processing happens in your browser; files never leave your device
- No signup, no watermarks, no limits
- Uses pdfjs-dist to render pages and pdf-lib to reconstruct the PDF
Method 2: Ghostscript (command line)
Ghostscript is a free command-line tool that gives fine-grained control over PDF compression:

    gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.5 \
      -dNOPAUSE -dBATCH -dQUIET \
      -dPDFSETTINGS=/ebook \
      -sOutputFile=compressed.pdf input.pdf

Quality presets
These presets are defined in the official Ghostscript documentation:
| Preset | DPI | Use Case |
|---|---|---|
| /screen | 72 | Smallest file, screen-only viewing |
| /ebook | 150 | Good balance for email/web (recommended) |
| /printer | 300 | High quality for printing |
| /prepress | 300 | Maximum quality, minimal compression |

    # Maximum compression (screen quality)
    gs -sDEVICE=pdfwrite -dPDFSETTINGS=/screen \
      -dNOPAUSE -dBATCH \
      -sOutputFile=small.pdf large.pdf

    # Balanced (ebook quality)
    gs -sDEVICE=pdfwrite -dPDFSETTINGS=/ebook \
      -dNOPAUSE -dBATCH \
      -sOutputFile=medium.pdf large.pdf

Method 3: Python script
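
    import subprocess

    # Preset names accepted by Ghostscript's -dPDFSETTINGS switch.
    PRESETS = {'screen', 'ebook', 'printer', 'prepress', 'default'}

    def compress_pdf(input_path, output_path, quality='ebook'):
        """Compress a PDF using Ghostscript (requires `gs` on PATH)."""
        if quality not in PRESETS:
            raise ValueError(f'unknown quality preset: {quality!r}')
        subprocess.run([
            'gs', '-sDEVICE=pdfwrite',
            f'-dPDFSETTINGS=/{quality}',
            '-dNOPAUSE', '-dBATCH', '-dQUIET',
            f'-sOutputFile={output_path}',
            input_path
        ], check=True)  # raises CalledProcessError if gs exits with an error

    compress_pdf('report.pdf', 'report-compressed.pdf')

Understanding quality trade-offs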
Lossless vs lossy compression
Lossless compression re-encodes data without discarding anything. It typically achieves 10-30% reduction. Techniques include:
- Removing duplicate objects
- Optimizing the cross-reference table
- Compressing streams with Flate/Deflate
Lossy compression removes data that's less perceptible. It achieves 50-90% reduction by:
- Downscaling images
- Re-encoding images as JPEG
- Reducing color depth
Most compression tools use a combination of both.
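The lossless side is easy to demonstrate with Python's zlib module, which implements the same Deflate algorithm that PDF Flate streams use. The sample bytes below only loosely imitate a PDF content stream and are an assumption for illustration.

```python
import zlib

# Repetitive data, loosely standing in for an uncompressed PDF content stream.
stream = b"BT /F1 12 Tf 72 720 Td (Hello) Tj ET\n" * 200

compressed = zlib.compress(stream, level=9)
restored = zlib.decompress(compressed)

print(len(stream), len(compressed))   # original vs compressed size in bytes
print(restored == stream)             # True: every byte comes back
```

Lossy steps such as JPEG re-encoding offer no such round trip, which is why archival copies should use lossless techniques only.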
When quality matters
- Email attachments: use aggressive compression (/ebook or /screen). Recipients are viewing on screen.
- Print documents: use light compression (/printer). Preserve 300 DPI for sharp output.
- Archival: use lossless only. Don't degrade the original.
- Web viewing: aggressive compression is fine. Browser rendering handles lower DPI well.
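If you script your compression, the guidance above can be captured in a small lookup. The use-case keys are hypothetical names chosen for this sketch, and /ebook is used as the default for email and web; swap in /screen for the most aggressive setting.

```python
# Mapping based on the guidance above; None means "skip lossy compression".
PRESET_BY_USE_CASE = {
    "email": "ebook",
    "web": "ebook",
    "print": "printer",
    "archival": None,
}


def choose_preset(use_case):
    """Return a Ghostscript PDFSETTINGS name, or None for lossless-only use."""
    try:
        return PRESET_BY_USE_CASE[use_case]
    except KeyError:
        raise ValueError(f"unknown use case: {use_case!r}") from None


print(choose_preset("print"))      # printer
print(choose_preset("archival"))   # None
```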
Tips for smaller PDFs
1. Compress images before adding them
If you're creating a PDF from images (like scanned documents), optimize the images first. A4 at 300 DPI is about 2480x3508 pixels, so a 4000x3000 JPEG at 85% quality is more than enough for A4 printing.
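To work out how many pixels an image actually needs, convert the print size to inches and multiply by the target DPI. A minimal sketch, with a hypothetical helper name:

```python
MM_PER_INCH = 25.4


def pixels_for_print(width_mm, height_mm, dpi=300):
    """Pixel dimensions needed to print at the given physical size and DPI."""
    return (round(width_mm / MM_PER_INCH * dpi),
            round(height_mm / MM_PER_INCH * dpi))


print(pixels_for_print(210, 297))        # A4 at 300 DPI -> (2480, 3508)
print(pixels_for_print(210, 297, 150))   # A4 at 150 DPI -> (1240, 1754)
```

Anything much larger than these dimensions adds file size without adding visible detail at that print size.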
2. Use vector graphics when possible
Charts, diagrams, and logos stored as SVG/vector graphics are often 10-100x smaller than rasterized versions and scale without quality loss.
3. Subset fonts
If you're generating PDFs programmatically, always subset fonts. The full Inter typeface weighs ~300 KB. Subsetting to only the glyphs used in a document typically brings that down to 15-30 KB, a 90%+ reduction consistent with what web.dev measures for real-world font optimization.
4. Compress after merging
If you merged multiple PDFs, compress the result. Merged files often contain duplicate font data and unoptimized image streams.
5. Remove unnecessary pages
Before compressing, split out pages you don't need. Fewer pages mean a smaller file.
Avoiding compression entirely
Compression is a fix for PDFs that were generated with unnecessary overhead: unoptimized images, full font sets, embedded thumbnails. If you're generating PDFs programmatically and need them lean from the start, the better approach is to optimize the source.
PDF4.dev uses Playwright's Chromium renderer, which produces compact PDF output by default:
- Fonts are subsetted automatically (only characters used in the document are embedded)
- Images are embedded at their native resolution (no accidental upscaling)
- No embedded thumbnails or preview metadata
If you're generating invoices, reports, or certificates and finding yourself compressing the output afterward, the issue is usually in how the HTML was structured, not in the PDF itself. Common causes: base64-embedded images that are larger than displayed, CSS backgrounds that include high-res textures, or fonts loaded from a CDN without subsetting.
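One way to spot the first of those causes is to scan a template for base64 data URIs and report how many bytes each embedded image decodes to. The sketch below uses only the standard library; the regular expression is a simplification that assumes well-formed data URIs, and the function name and sample HTML are hypothetical.

```python
import base64
import re

# Matches data-URI images such as src="data:image/png;base64,....".
DATA_URI = re.compile(r"data:image/([a-zA-Z0-9.+-]+);base64,([A-Za-z0-9+/=]+)")


def embedded_image_sizes(html):
    """Return (format, decoded_size_in_bytes) for each base64 image in the HTML."""
    results = []
    for match in DATA_URI.finditer(html):
        fmt, payload = match.groups()
        results.append((fmt, len(base64.b64decode(payload))))
    return results


# Hypothetical template fragment with a 3000-byte embedded image.
payload = base64.b64encode(b"\x00" * 3000).decode("ascii")
html = f'<img src="data:image/png;base64,{payload}" width="32">'
print(embedded_image_sizes(html))   # [('png', 3000)]
```

An image that decodes to hundreds of kilobytes but is displayed at icon size is a good candidate for resizing before it goes into the template.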
Generate lean PDFs from the start with a well-structured template, and compression becomes an optional step rather than a required one.
PDF4.dev has a free tier. Try the template editor to generate compact, production-ready PDFs without post-processing.
FAQ
Will compression make my PDF look blurry?
At the default /ebook quality (150 DPI), text and most images remain sharp on screen. You'd only notice quality loss if you zoom in to 400%+ or print at large format. For email and web, the difference is imperceptible.
Can I compress a password-protected PDF?
You need to remove the password first, then compress, then re-protect if needed.
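If you know the password and use Ghostscript, its -sPDFPassword switch can open the encrypted input during compression. The sketch below only builds the argument list; the function name is hypothetical, and it assumes the file written by pdfwrite is unencrypted, so check that against your Ghostscript version and re-protect the result with a separate tool if needed.

```python
def build_gs_command(input_path, output_path, quality="ebook", password=None):
    """Build a Ghostscript argument list, optionally supplying a PDF password."""
    command = [
        "gs", "-sDEVICE=pdfwrite",
        f"-dPDFSETTINGS=/{quality}",
        "-dNOPAUSE", "-dBATCH", "-dQUIET",
    ]
    if password is not None:
        # -sPDFPassword lets Ghostscript decode the encrypted input file.
        command.append(f"-sPDFPassword={password}")
    command += [f"-sOutputFile={output_path}", input_path]
    return command


print(build_gs_command("locked.pdf", "locked-small.pdf", password="secret"))
```

Pass the returned list to subprocess.run(..., check=True) to execute it.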
Why is my compressed file larger than the original?
This can happen with already-optimized PDFs or pure text documents. The compression process adds its own overhead (new cross-reference tables, etc.) that can exceed the savings. If this happens, just use the original.
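In a script, this check is easy to automate: compare the two file sizes and fall back to the original when compression did not help. A minimal sketch with a hypothetical helper name:

```python
import os
import shutil


def keep_smaller(original_path, compressed_path):
    """If compression did not shrink the file, overwrite it with the original.

    Returns True when the compressed file was kept.
    """
    if os.path.getsize(compressed_path) < os.path.getsize(original_path):
        return True
    shutil.copyfile(original_path, compressed_path)
    return False
```

Calling keep_smaller('report.pdf', 'report-compressed.pdf') after compressing guarantees the output path never holds a file larger than the input.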
What's the maximum compression I can achieve?
For image-heavy documents (scanned pages, photos), 80-90% reduction is common. For text-heavy documents, 30-50% is typical. Pure text with no images: 10-30% at best, since only lossless techniques apply.