A bloated PDF is annoying to email, slow to upload, and sometimes outright rejected by portals with strict size limits. Compression fixes this, but not all compression works the same way — understanding the difference between lossless and lossy approaches helps you pick the right setting for the job.
Where PDF File Size Actually Comes From
Before compressing anything, it helps to know what's taking up space:
- Embedded images are usually the biggest contributor, especially scanned pages saved as high-resolution photos.
- Embedded fonts can add meaningful size, particularly if a document embeds entire font files rather than just the characters used.
- Redundant or unused objects, such as duplicate images, unused annotations, or leftover metadata from editing software, add size without contributing anything visible.
Lossless Compression: No Quality Trade-off
Lossless techniques reduce file size without touching the visual quality of anything in the document:
- Removing duplicate embedded resources (the same image referenced multiple times, stored once)
- Subsetting fonts (keeping only the characters actually used instead of the entire font file)
- Stripping unnecessary metadata, thumbnails, and editing history
- Optimizing the internal PDF structure and recompressing the underlying data streams with a lossless codec, so they decode to exactly the same content
These steps alone can meaningfully shrink a PDF, especially one produced by office software that tends to leave extra data behind.
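As a rough illustration of two of the techniques above, the sketch below deduplicates identical resources by content hash and round-trips a data stream through Deflate (the codec behind PDF's FlateDecode filter). It does not parse a real PDF: the `resources` dictionary, its names, and the sample bytes are all made up for the example.

```python
from __future__ import annotations

import hashlib
import zlib


def deduplicate(resources: dict[str, bytes]) -> tuple[dict[str, bytes], dict[str, str]]:
    """Store each distinct payload once and map every name to its content hash."""
    store: dict[str, bytes] = {}
    refs: dict[str, str] = {}
    for name, payload in resources.items():
        digest = hashlib.sha256(payload).hexdigest()
        store.setdefault(digest, payload)  # keep only the first copy of each payload
        refs[name] = digest
    return store, refs


# Hypothetical document: the same logo bytes embedded on two pages.
logo = b"\x89fake-image-bytes" * 500
resources = {
    "page1_logo": logo,
    "page2_logo": logo,
    "page3_chart": b"chart" * 300,
}

store, refs = deduplicate(resources)
size_before = sum(len(p) for p in resources.values())
size_after = sum(len(p) for p in store.values())

# Deflate is lossless: decompressing returns the exact original bytes.
stream = b"BT /F1 12 Tf (Hello) Tj ET\n" * 200
packed = zlib.compress(stream, level=9)
assert zlib.decompress(packed) == stream

print(f"dedupe:  {size_before} -> {size_after} bytes")
print(f"deflate: {len(stream)} -> {len(packed)} bytes")
```

Both steps shrink the data, and neither changes what a reader would see: every reference still resolves to the same bytes, and the stream decodes to its original content.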
Lossy Compression: Trading Quality for Size
When lossless techniques aren't enough — commonly true for PDFs made of scanned images — lossy compression re-encodes embedded images at a lower quality or resolution:
- Reducing image resolution (downsampling), since most documents don't need print-resolution (300 DPI) images for on-screen viewing
- Increasing JPEG compression on embedded photos, trading some visual fidelity for smaller size
- Converting to more efficient image formats where supported
The key is that this trade-off is tunable. A quality setting aimed at "readable on screen" can shrink a scanned PDF dramatically with little noticeable quality loss, while a setting aimed at "smallest possible file" will usually show visible artifacts, such as blurred text edges and blocky images.
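To see why downsampling is so effective, it helps to work through the arithmetic: pixel count grows with the square of the resolution, so halving the DPI cuts the pixel data to a quarter. The sketch below assumes a US Letter page (8.5 x 11 inches) scanned as uncompressed 24-bit RGB. The function names are illustrative, and real savings after JPEG encoding will differ from these raw figures.

```python
def pixel_dimensions(width_in: float, height_in: float, dpi: int) -> tuple[int, int]:
    """Pixel width and height of a page scanned at the given resolution."""
    return round(width_in * dpi), round(height_in * dpi)


def raw_rgb_bytes(width_in: float, height_in: float, dpi: int) -> int:
    """Uncompressed size of the scan at 3 bytes per pixel (24-bit RGB)."""
    width_px, height_px = pixel_dimensions(width_in, height_in, dpi)
    return width_px * height_px * 3


# Assumed page size: US Letter, 8.5 x 11 inches.
print_res = raw_rgb_bytes(8.5, 11, 300)   # print resolution
screen_res = raw_rgb_bytes(8.5, 11, 150)  # half the resolution

print(f"300 DPI: {print_res:,} bytes")
print(f"150 DPI: {screen_res:,} bytes")
print(f"ratio:   {print_res / screen_res:.1f}x")
```

Going from 300 to 150 DPI here reduces the raw pixel data by a factor of four, before any change to the JPEG quality setting.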
Practical Guidance
- Start with a moderate compression setting rather than the most aggressive one, and check the output before committing to a smaller target.
- For text documents, lossless compression is usually sufficient — you rarely need to touch image quality at all.
- For scanned documents, some quality loss is normal and expected — the goal is finding the setting where the document is still perfectly legible.
- Compress the final version, not each draft. If you're going to edit or merge a document further, do compression as the last step so you're not repeatedly re-compressing already-compressed images.
Quick Workflow
- Upload your PDF to a compression tool.
- Choose a compression level (often labeled something like "low," "recommended," or "extreme").
- Compare the output size against your target (e.g., under 5 MB for an email attachment).
- Open the compressed file and check a few pages, particularly any with images or fine print, before sending it.
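If you script this workflow rather than using a web tool, the selection logic might look like the sketch below: try the levels from mildest to most aggressive and stop at the first one that meets the size target. The `compress` callable is a placeholder for whatever tool you actually use, `fake_compress` only simulates shrinking by truncating bytes, and treating 5 MB as 5 * 1024 * 1024 bytes is an assumption.

```python
from __future__ import annotations

from typing import Callable, Optional

# Level labels borrowed from the workflow above, ordered mildest first.
LEVELS = ["low", "recommended", "extreme"]


def pick_level(
    original: bytes,
    compress: Callable[[bytes, str], bytes],
    target_bytes: int,
) -> Optional[tuple[str, bytes]]:
    """Return the mildest level whose output fits the target, or None if none fits."""
    for level in LEVELS:
        candidate = compress(original, level)
        if len(candidate) <= target_bytes:
            return level, candidate
    return None


def fake_compress(data: bytes, level: str) -> bytes:
    """Stand-in compressor (NOT real compression): keeps a fixed fraction of the bytes."""
    keep = {"low": 0.8, "recommended": 0.4, "extreme": 0.1}[level]
    return data[: int(len(data) * keep)]


original = bytes(12 * 1024 * 1024)  # pretend 12 MB input
target = 5 * 1024 * 1024            # assumed 5 MB email limit

result = pick_level(original, fake_compress, target)
if result is None:
    print("No level met the target; consider splitting the document.")
else:
    level, output = result
    print(f"Use '{level}': {len(output):,} bytes")
```

Stopping at the first level that fits keeps as much quality as the size limit allows. The visual check in the last step still has to be done by eye.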