PDF/A is a subset of the PDF format standardized by ISO for long-term document archival. A PDF/A file contains everything needed to display it exactly the same way indefinitely: all fonts, color profiles, and metadata are embedded, and no features that require external resources or runtime behavior are allowed. This guide covers what distinguishes each PDF/A version, what the standard technically prohibits, and how to create and validate compliant files.
What PDF/A actually means
PDF/A stands for "PDF for Archiving." The standard is published as ISO 19005. This guide covers parts 1 to 3 (PDF/A-1, -2, -3), each released as a separate ISO document; a fourth part, PDF/A-4, based on PDF 2.0, followed in 2020.
The design goal is reproducibility: a PDF/A file opened in a PDF viewer in 2076 should look identical to how it looked in 2026, regardless of what fonts, color profiles, or software versions are installed on the system. To achieve this, the standard removes any feature that creates external dependencies.
What PDF/A is not
PDF/A is not a compression format. Converting to PDF/A does not automatically make files smaller. PDF/A is also not the same as PDF/X (which is optimized for print production) or PDF/UA (which is optimized for accessibility, though PDF/A-a conformance includes accessibility tagging).
PDF/A versions compared
The three parts covered here are separate ISO standards, built on two base PDF versions:
| Version | Based on | Released | Key additions over previous |
|---|---|---|---|
| PDF/A-1 | PDF 1.4 | 2005 | First version; no transparency, LZW, or embedded files |
| PDF/A-2 | PDF 1.7 | 2011 | Allows transparency, JPEG 2000, PDF/A file attachments |
| PDF/A-3 | PDF 1.7 | 2012 | Allows any file type as attachment (XML, CSV, HTML) |
Each version also has conformance levels:
| Level | What it requires | Availability |
|---|---|---|
| /b (basic) | Embedded fonts + ICC profiles. No structural tagging required | PDF/A-1, -2, -3 |
| /a (accessible) | Everything in /b plus logical reading order and tagged content | PDF/A-1, -2, -3 |
| /u (unicode) | Everything in /b plus Unicode text mappings for all text | PDF/A-2, -3 only |
When to use which:
- PDF/A-1b: maximum compatibility with older PDF processors. Required by German courts for legal filings (ZPO), and by many government archive systems that were implemented before 2011.
- PDF/A-2b: the best default for new documents. Allows transparency (which CSS effects such as opacity and box shadows produce in HTML-generated PDFs), JPEG 2000 for smaller image files, and embedded PDF/A attachments.
- PDF/A-3b: use when the PDF needs to carry machine-readable structured data, such as ZUGFeRD/Factur-X XML invoices embedded alongside the visual document.
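The rule of thumb above can be condensed into a small decision helper. This is a minimal sketch of this guide's recommendations only, not something defined by the standard; the function name and both parameters are hypothetical.

```python
def recommend_pdfa_flavour(
    needs_arbitrary_attachments: bool = False,
    target_predates_pdfa2: bool = False,
) -> str:
    """Return a PDF/A flavour string following this guide's rule of thumb."""
    if needs_arbitrary_attachments and target_predates_pdfa2:
        # PDF/A-1 allows no embedded files at all, so the two needs conflict.
        raise ValueError("PDF/A-1 cannot carry attachments; requirements conflict")
    if needs_arbitrary_attachments:
        return "3b"  # e.g. ZUGFeRD/Factur-X XML next to the visual invoice
    if target_predates_pdfa2:
        return "1b"  # maximum compatibility with older processors
    return "2b"  # default for new documents


print(recommend_pdfa_flavour())  # 2b
print(recommend_pdfa_flavour(needs_arbitrary_attachments=True))  # 3b
```

The returned strings match the flavour names used with VeraPDF later in this guide, so the result can be passed straight to a validation step.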
Technical restrictions in PDF/A
PDF/A compliance is defined by what the file is not allowed to contain. The restrictions vary by version:
| Feature | PDF/A-1 | PDF/A-2 | PDF/A-3 |
|---|---|---|---|
| JavaScript | Forbidden | Forbidden | Forbidden |
| Encryption | Forbidden | Forbidden | Forbidden |
| External fonts (not embedded) | Forbidden | Forbidden | Forbidden |
| Audio/video content | Forbidden | Forbidden | Forbidden |
| Transparency layers | Forbidden | Allowed | Allowed |
| LZW compression | Forbidden | Forbidden | Forbidden |
| JPEG 2000 | Forbidden | Allowed | Allowed |
| Embedded non-PDF files | Forbidden | Forbidden (PDF/A only) | Allowed (any type) |
| XMP metadata declaration | Required | Required | Required |
| ICC color profiles | Required | Required | Required |
Every font used to render text must be embedded in the file. Embedding only a subset of a font is allowed as long as the subset contains all glyphs used in the document.
XMP metadata is mandatory: the file must include a block declaring its PDF/A version and conformance level, otherwise validators will report it as non-compliant regardless of content.
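To make the declaration concrete: the claim lives in the XMP packet as two properties, part and conformance, in the PDF/A identification namespace (conventionally prefixed pdfaid). The sketch below reads that claim from XMP text with a regular expression. It assumes the conventional pdfaid prefix, the helper names are hypothetical, and it only reports what a file claims; it does not validate anything.

```python
import re
from typing import Optional, Tuple


def _xmp_property(xmp: str, name: str) -> Optional[str]:
    """Find pdfaid:<name> in element form (<pdfaid:part>2</...>) or attribute form (pdfaid:part="2")."""
    pattern = rf'pdfaid:{name}\s*(?:=\s*["\']|>)\s*([^"\'<\s]+)'
    match = re.search(pattern, xmp)
    return match.group(1) if match else None


def read_pdfa_claim(xmp: str) -> Optional[Tuple[int, str]]:
    """Return (part, conformance) as declared in the XMP text, or None if absent."""
    part = _xmp_property(xmp, "part")
    conformance = _xmp_property(xmp, "conformance")
    if part is None or conformance is None or not part.isdigit():
        return None
    return int(part), conformance.upper()


sample = """<rdf:Description rdf:about=""
    xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
  <pdfaid:part>2</pdfaid:part>
  <pdfaid:conformance>B</pdfaid:conformance>
</rdf:Description>"""

print(read_pdfa_claim(sample))  # (2, 'B')
```

A file that declares PDF/A-2b here can still fail validation for other reasons, so this kind of check is useful for quick triage rather than as a substitute for a validator.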
Who requires PDF/A
PDF/A compliance is mandated in specific regulatory and industry contexts:
| Context | Version required | Regulation |
|---|---|---|
| German court filings | PDF/A-1 | ZPO Section 130a, ERVV |
| EU qualified electronic signatures | PDF/A-1 or -2 | eIDAS Regulation (EU 910/2014) |
| German federal archives | PDF/A-1b or -2b | BArchG, KGSt recommendations |
| ZUGFeRD/Factur-X e-invoices | PDF/A-3 | ZUGFeRD 2.0 specification |
| US federal agencies (e-discovery) | PDF/A-1 | Agency-specific records management policies |
| Swiss administrative proceedings | PDF/A-1 | VwVG, Justizdirektionen guidance |
Outside regulated contexts, PDF/A is a good default for any document meant to be retained for more than a few years: legal contracts, official certificates, government-issued documents, and financial reports.
How to create PDF/A files with Ghostscript
Ghostscript is the most widely used open-source tool for creating PDF/A files. It can convert most existing PDFs to PDF/A-1b or PDF/A-2b.
Convert to PDF/A-1b
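gs \
  -dPDFA=1 \
  -dBATCH \
  -dNOPAUSE \
  -sDEVICE=pdfwrite \
  -sProcessColorModel=DeviceRGB \
  -dCompatibilityLevel=1.4 \
  -dPDFACompatibilityPolicy=1 \
  -sOutputFile=output-pdfa1b.pdf \
  input.pdf
-dPDFA=1 sets the target to PDF/A-1. -dPDFACompatibilityPolicy=1 tells Ghostscript to drop or convert features that PDF/A does not allow, so the output stays compliant. With the default policy (0), Ghostscript keeps the offending feature and writes a file that is not PDF/A compliant; policy 2 aborts with an error instead. Ghostscript's documentation additionally describes passing a PDFA_def.ps file that points at an ICC profile so the output carries an output intent; if validation reports a missing output intent, that file is the place to look.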
Convert to PDF/A-2b
gs \
  -dPDFA=2 \
  -dBATCH \
  -dNOPAUSE \
  -sDEVICE=pdfwrite \
  -sProcessColorModel=DeviceRGB \
  -dCompatibilityLevel=1.7 \
  -dPDFACompatibilityPolicy=1 \
  -sOutputFile=output-pdfa2b.pdf \
  input.pdf
Compared with the PDF/A-1b command, only -dPDFA=2 and -dCompatibilityLevel=1.7 change (plus the output file name).
Image quality during conversion
Ghostscript may downsample and re-encode images during conversion, depending on the -dPDFSETTINGS preset. To preserve image quality:
gs \
  -dPDFA=2 \
  -dBATCH -dNOPAUSE \
  -sDEVICE=pdfwrite \
  -sProcessColorModel=DeviceRGB \
  -dCompatibilityLevel=1.7 \
  -dPDFACompatibilityPolicy=1 \
  -dPDFSETTINGS=/prepress \
  -sOutputFile=output-high-quality.pdf \
  input.pdf
-dPDFSETTINGS=/prepress keeps images at up to 300 DPI and recompresses them less aggressively than the lower-quality presets. The output file will be larger, but image quality is better preserved.
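For reference, Ghostscript's documentation associates each -dPDFSETTINGS preset with a target image resolution: 72 DPI for /screen, 150 for /ebook, and 300 for /printer and /prepress. The sketch below picks the smallest preset that meets a minimum resolution. The function name is hypothetical, and the presets also differ in other settings (color handling, font embedding) that this lookup ignores.

```python
# Target image resolutions per preset, as listed in Ghostscript's documentation.
PRESET_DPI = {"screen": 72, "ebook": 150, "printer": 300, "prepress": 300}


def pick_pdfsettings(min_dpi: int) -> str:
    """Return the -dPDFSETTINGS flag for the first preset with at least min_dpi."""
    for name, dpi in PRESET_DPI.items():  # dicts preserve insertion order
        if dpi >= min_dpi:
            return f"-dPDFSETTINGS=/{name}"
    raise ValueError(f"no preset keeps images at {min_dpi} DPI or more")


print(pick_pdfsettings(150))  # -dPDFSETTINGS=/ebook
print(pick_pdfsettings(200))  # -dPDFSETTINGS=/printer
```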
How to create PDF/A with Python
For automated pipelines, you can call Ghostscript from Python or use pikepdf to inspect and manipulate the file. Ghostscript remains the most reliable conversion engine.
import subprocess
def convert_to_pdfa(
    input_path: str,
    output_path: str,
    version: int = 2,
    quality: str = "prepress",
) -> None:
    """Convert a PDF to PDF/A using Ghostscript."""
    # PDF/A-1 is based on PDF 1.4; PDF/A-2 and PDF/A-3 are based on PDF 1.7
    compatibility = "1.4" if version == 1 else "1.7"
    subprocess.run(
        [
            "gs",
            f"-dPDFA={version}",
            "-dBATCH",
            "-dNOPAUSE",
            "-sDEVICE=pdfwrite",
            "-sProcessColorModel=DeviceRGB",
            f"-dCompatibilityLevel={compatibility}",
            "-dPDFACompatibilityPolicy=1",
            f"-dPDFSETTINGS=/{quality}",
            f"-sOutputFile={output_path}",
            input_path,
        ],
        check=True,  # raise CalledProcessError if Ghostscript exits non-zero
    )
# Convert to PDF/A-2b, high quality
convert_to_pdfa("invoice.pdf", "invoice-pdfa2b.pdf", version=2)
Install Ghostscript with brew install ghostscript (macOS) or apt install ghostscript (Debian/Ubuntu).
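Because the conversion shells out to an external binary, it can help to check that Ghostscript is installed before starting a job. The following is a minimal sketch using the standard library's shutil.which; the function name is hypothetical, and the Windows executable names (gswin64c, gswin32c) are the commonly used console builds rather than something this guide depends on.

```python
import shutil
from typing import Callable, Optional, Sequence

# "gs" is the usual name on macOS and Linux; the others are common on Windows.
CANDIDATES = ("gs", "gswin64c", "gswin32c")


def find_ghostscript(
    candidates: Sequence[str] = CANDIDATES,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the path of the first Ghostscript executable found on PATH, else None."""
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None


gs_path = find_ghostscript()
print(gs_path or "Ghostscript not found on PATH")
```

The which parameter exists only so the lookup can be replaced in tests; in normal use the default is enough.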
For batch conversion:
from pathlib import Path
input_dir = Path("pdfs/")
output_dir = Path("pdfs-pdfa/")
output_dir.mkdir(exist_ok=True)
for pdf_file in input_dir.glob("*.pdf"):
    output_path = output_dir / pdf_file.name
    convert_to_pdfa(str(pdf_file), str(output_path), version=2)
    print(f"Converted {pdf_file.name}")
Validating PDF/A compliance with VeraPDF
VeraPDF is the open-source reference validator for PDF/A, maintained by the PDF Association and the Open Preservation Foundation. It is the validator referenced by the ISO working group that produced the standard.
Install VeraPDF
# macOS with Homebrew
brew install --cask verapdf
# Or download from verapdf.org/releases and run the installer
Validate a file
# Check PDF/A-1b compliance
verapdf --flavour 1b document.pdf
# Check PDF/A-2b compliance
verapdf --flavour 2b document.pdf
# Output as machine-readable JSON
verapdf --flavour 2b --format json document.pdf
A compliant file produces:
PASS document.pdf
A non-compliant file lists every violation with its rule number from the ISO specification:
FAIL document.pdf
1:6.1 - Encryption is not permitted
6.2.11.3 - Font is not embedded: Helvetica
Validate in Python
import subprocess
import json
def validate_pdfa(pdf_path: str, flavour: str = "2b") -> dict:
    """Run VeraPDF on one file and return its JSON report as a dict."""
    # check=True is omitted so a non-zero exit status does not raise
    # before the report on stdout has been parsed
    result = subprocess.run(
        ["verapdf", "--flavour", flavour, "--format", "json", pdf_path],
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
report = validate_pdfa("invoice-pdfa2b.pdf")
is_compliant = report["report"]["jobs"][0]["validationResult"]["isCompliant"]
print(f"PDF/A compliant: {is_compliant}")
Run validation as part of your build or document generation pipeline to catch compliance regressions before files reach an archive or regulatory system.
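One way to wire this into a pipeline is a gate that turns validation results into a process exit code. The sketch below is a hedged illustration: the names find_noncompliant and gate are hypothetical, and the compliance check is passed in as a callable so the gate itself makes no assumptions about how VeraPDF is invoked or how its report is structured.

```python
import sys
from pathlib import Path
from typing import Callable, Iterable, List


def find_noncompliant(
    pdf_paths: Iterable[Path],
    is_compliant: Callable[[Path], bool],
) -> List[Path]:
    """Return the files that the supplied check reports as non-compliant."""
    return [path for path in pdf_paths if not is_compliant(path)]


def gate(pdf_dir: str, is_compliant: Callable[[Path], bool]) -> int:
    """Return a process exit code: 0 if every PDF in pdf_dir passes, 1 otherwise."""
    failures = find_noncompliant(sorted(Path(pdf_dir).glob("*.pdf")), is_compliant)
    for path in failures:
        print(f"NOT PDF/A: {path}", file=sys.stderr)
    return 1 if failures else 0
```

In a CI job the result would typically be passed to sys.exit, with is_compliant wrapping the validate_pdfa function shown above. Note that an empty directory passes the gate, which may or may not be the desired behavior.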
PDF/A and Playwright-generated PDFs
Playwright (and by extension any tool that uses Chromium's PDF engine, including PDF4.dev) generates standard PDF output, not PDF/A. Chromium does not embed ICC color profiles or write the required XMP conformance metadata.
This means Playwright-generated PDFs need post-processing to become PDF/A compliant. The workflow is:
- Generate the PDF with Playwright (via the PDF4.dev API or directly)
- Post-process with Ghostscript to convert to PDF/A-2b
- Validate with VeraPDF
import { execSync } from "child_process";
import fs from "fs";
async function generatePdfA(templateId: string, data: object): Promise<Buffer> {
  // Step 1: generate standard PDF via the API
  const response = await fetch("https://pdf4.dev/api/v1/render", {
    method: "POST",
    headers: {
      Authorization: "Bearer p4_live_your_key",
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ template_id: templateId, data }),
  });
  // Fail early instead of handing an error response body to Ghostscript
  if (!response.ok) {
    throw new Error(`Render request failed with status ${response.status}`);
  }
  const pdfBuffer = Buffer.from(await response.arrayBuffer());
  fs.writeFileSync("/tmp/standard.pdf", pdfBuffer);
  // Step 2: convert to PDF/A-2b
  execSync(
    "gs -dPDFA=2 -dBATCH -dNOPAUSE -sDEVICE=pdfwrite " +
      "-sProcessColorModel=DeviceRGB -dCompatibilityLevel=1.7 " +
      "-dPDFACompatibilityPolicy=1 -dPDFSETTINGS=/prepress " +
      "-sOutputFile=/tmp/pdfa.pdf /tmp/standard.pdf"
  );
  return fs.readFileSync("/tmp/pdfa.pdf");
}
For HTML templates, keeping the HTML clean helps: avoid CSS that creates complex transparency stacks (multiple overlapping opacity layers). Ghostscript may have to flatten them, which can affect fidelity.
PDF/A in the document generation context
If you generate PDFs from HTML templates (invoices, certificates, reports), PDF/A compliance is often a downstream requirement rather than a generation-time concern. The typical workflow:
- Generate the PDF from an HTML template with accurate, printable layout
- Post-process with Ghostscript if the output must meet a regulatory archive requirement
- Validate with VeraPDF before storage
This separation keeps the generation pipeline focused on layout and data, and the compliance step as a separate, auditable process.
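The separation described above can be sketched as a function that takes the three steps as interchangeable callables. This is an illustrative outline, not the pipeline of any particular product: archive_document and its parameters are hypothetical, and the demo uses stand-in steps that only write bytes.

```python
import tempfile
from pathlib import Path
from typing import Callable


def archive_document(
    render: Callable[[Path], None],         # writes a standard PDF to the given path
    convert: Callable[[Path, Path], None],  # standard PDF -> PDF/A
    validate: Callable[[Path], bool],       # True if the file is compliant
    work_dir: Path,
    name: str,
) -> Path:
    """Run generate -> convert -> validate and return the archived file's path."""
    standard = work_dir / f"{name}.pdf"
    archived = work_dir / f"{name}-pdfa.pdf"
    render(standard)
    convert(standard, archived)
    if not validate(archived):
        raise ValueError(f"{archived} failed PDF/A validation; not storing it")
    return archived


# Demo with stand-in steps (hypothetical): each fake just writes bytes.
def fake_render(path: Path) -> None:
    path.write_bytes(b"%PDF-1.7 standard")


def fake_convert(src: Path, dst: Path) -> None:
    dst.write_bytes(src.read_bytes() + b" converted")


with tempfile.TemporaryDirectory() as tmp:
    result = archive_document(fake_render, fake_convert, lambda p: True, Path(tmp), "invoice")
    print(result.name)  # invoice-pdfa.pdf
```

Keeping each step behind a plain callable means the Ghostscript and VeraPDF wrappers from earlier sections can be swapped in without changing the control flow, and the validation failure path stays in one place.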
For documents that do not require archival compliance (internal reports, previews, draft contracts), standard PDF output is fine. See how we render PDFs in under 300ms for the rendering pipeline behind PDF4.dev's generation.
If you generate documents that need to meet PDF/A requirements (legal filings, government submissions, e-invoices), add VeraPDF validation as a CI step to catch non-compliance before documents reach the archive.
Summary
PDF/A is an ISO standard for long-term archival that embeds all dependencies (fonts, color profiles) and forbids features requiring external resources (JavaScript, encryption, external files). PDF/A-2b is the best default for most new documents; PDF/A-1b is required by older regulatory systems; PDF/A-3b is the right choice when embedding machine-readable data alongside the visual PDF.
To create PDF/A files, convert with Ghostscript (-dPDFA=2). To validate, use VeraPDF. Playwright-based generators (including PDF4.dev) produce standard PDF and require a Ghostscript post-processing step for PDF/A compliance.