
PDF/A explained: versions, requirements, and how to create archival files

PDF/A is the ISO 19005 standard for long-term PDF archival. Covers PDF/A-1, -2, and -3 differences, technical restrictions, and how to create and validate compliant files.

benoitded · April 12, 2026 · 9 min read

PDF/A is a subset of the PDF format standardized by ISO for long-term document archival. A PDF/A file contains everything needed to display it exactly the same way indefinitely: all fonts, color profiles, and metadata are embedded, and no features that require external resources or runtime behavior are allowed. This guide covers what distinguishes each PDF/A version, what the standard technically prohibits, and how to create and validate compliant files.

What PDF/A actually means

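PDF/A stands for "PDF for Archiving." The standard is published as ISO 19005. This guide covers its three most widely deployed parts (PDF/A-1, -2, and -3), each released as a separate ISO document. A fourth part, PDF/A-4, based on PDF 2.0, was published in 2020 and is not covered here.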

The design goal is reproducibility: a PDF/A file opened in a PDF viewer in 2076 should look identical to how it looked in 2026, regardless of what fonts, color profiles, or software versions are installed on the system. To achieve this, the standard removes any feature that creates external dependencies.

What PDF/A is not

PDF/A is not a compression format. Converting to PDF/A does not automatically make files smaller. PDF/A is also not the same as PDF/X (which is optimized for print production) or PDF/UA (which is optimized for accessibility, though PDF/A-a conformance includes accessibility tagging).

PDF/A versions compared

PDF/A-1, -2, and -3 are separate ISO standards, each tied to a base PDF version:

| Version | Based on | Released | Key additions over previous |
|---------|----------|----------|-----------------------------|
| PDF/A-1 | PDF 1.4 | 2005 | First version; no transparency, LZW, or embedded files |
| PDF/A-2 | PDF 1.7 | 2011 | Allows transparency, JPEG 2000, PDF/A file attachments |
| PDF/A-3 | PDF 1.7 | 2012 | Allows any file type as attachment (XML, CSV, HTML) |

Each version also has conformance levels:

| Level | What it requires | Availability |
|-------|------------------|--------------|
| /b (basic) | Embedded fonts + ICC profiles. No structural tagging required | PDF/A-1, -2, -3 |
| /a (accessible) | Everything in /b plus logical reading order and tagged content | PDF/A-1, -2, -3 |
| /u (unicode) | Everything in /b plus Unicode text mappings for all text | PDF/A-2, -3 only |
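
Because not every level exists for every part, it can help to reject impossible combinations before handing a flavour string to a converter or validator. The following is a minimal sketch that encodes the availability column above; the names `LEVELS_BY_PART` and `is_valid_flavour` are illustrative and not part of any library.

```python
# Conformance levels available for each PDF/A part, per the table above.
LEVELS_BY_PART = {
    "1": {"a", "b"},
    "2": {"a", "b", "u"},
    "3": {"a", "b", "u"},
}


def is_valid_flavour(flavour: str) -> bool:
    """Return True for combinations such as "2u", False for ones such as "1u"."""
    if len(flavour) != 2:
        return False
    part, level = flavour[0], flavour[1].lower()
    # Unknown parts map to an empty set, so they are rejected as well.
    return level in LEVELS_BY_PART.get(part, set())
```

For example, `is_valid_flavour("1u")` is rejected because the /u level was only introduced with PDF/A-2.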

When to use which:

  • PDF/A-1b: maximum compatibility with older PDF processors. Required by German courts for legal filings (ZPO), and by many government archive systems that were implemented before 2011.
  • PDF/A-2b: the best default for new documents. Allows transparency (for example, the kind produced by CSS box shadows and opacity in HTML-to-PDF output), JPEG 2000 for smaller image files, and embedded PDF/A attachments.
  • PDF/A-3b: use when the PDF needs to carry machine-readable structured data, such as ZUGFeRD/Factur-X XML invoices embedded alongside the visual document.
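
The guidance above can be sketched as a small decision helper. This is only an illustration of the three rules in the list: the function name and both parameters are hypothetical, and a real project should let the receiving archive or regulator dictate the version.

```python
def choose_pdfa_flavour(needs_data_attachments: bool, legacy_consumer: bool) -> str:
    """Pick a basic-conformance flavour following the three rules above."""
    # Arbitrary attachments (e.g. invoice XML) are only possible in PDF/A-3,
    # so this requirement wins even if the consumer is an older system.
    if needs_data_attachments:
        return "3b"
    # Systems built before PDF/A-2 existed may only accept PDF/A-1.
    if legacy_consumer:
        return "1b"
    # Otherwise use the default recommended for new documents.
    return "2b"
```

The returned string uses the same short form ("1b", "2b", "3b") that the validation commands later in this guide pass to `--flavour`.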

Technical restrictions in PDF/A

PDF/A compliance is defined by what the file is not allowed to contain. The restrictions vary by version:

| Feature | PDF/A-1 | PDF/A-2 | PDF/A-3 |
|---------|---------|---------|---------|
| JavaScript | Forbidden | Forbidden | Forbidden |
| Encryption | Forbidden | Forbidden | Forbidden |
| External fonts (not embedded) | Forbidden | Forbidden | Forbidden |
| Audio/video content | Forbidden | Forbidden | Forbidden |
| Transparency layers | Forbidden | Allowed | Allowed |
| LZW compression | Forbidden | Forbidden | Forbidden |
| JPEG 2000 | Forbidden | Allowed | Allowed |
| Embedded files | Forbidden | Allowed (PDF/A files only) | Allowed (any type) |
| XMP metadata declaration | Required | Required | Required |
| ICC color profiles | Required | Required | Required |

Every font used in the document must be embedded. Subsetting (embedding only part of a font) is allowed as long as the subset contains all glyphs used in the document.

XMP metadata is mandatory: the file must include a block declaring its PDF/A version and conformance level, otherwise validators will report it as non-compliant regardless of content.
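
The declaration lives in the XMP packet as two properties in the `pdfaid` namespace: `pdfaid:part` (1, 2, or 3) and `pdfaid:conformance` (A, B, or U). As a rough sketch, the declared flavour can be read straight from the file bytes. This assumes the XMP packet is stored unfiltered and UTF-8 encoded; the function name is illustrative. Note that it only reports what the file claims, which is not the same as being compliant.

```python
import re
from typing import Optional

# The properties may be serialized as XML attributes (pdfaid:part="2")
# or as elements (<pdfaid:part>2</pdfaid:part>), so both forms are matched.
_PART = re.compile(rb"""pdfaid:part(?:=["']|>)\s*(\d)""")
_CONFORMANCE = re.compile(rb"""pdfaid:conformance(?:=["']|>)\s*([A-Za-z])""")


def declared_pdfa_flavour(pdf_bytes: bytes) -> Optional[str]:
    """Return the declared flavour such as "2b", or None if undeclared."""
    part = _PART.search(pdf_bytes)
    conformance = _CONFORMANCE.search(pdf_bytes)
    if part is None or conformance is None:
        return None
    return (part.group(1) + conformance.group(1).lower()).decode("ascii")
```

A quick check like this is useful for routing files, but a full validator is still needed to confirm that the content matches the declaration.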

Who requires PDF/A

PDF/A compliance is mandated in specific regulatory and industry contexts:

| Context | Version required | Regulation |
|---------|------------------|------------|
| German court filings | PDF/A-1 | ZPO Section 130a, ERVV |
| EU qualified electronic signatures | PDF/A-1 or -2 | eIDAS Regulation (EU 910/2014) |
| German federal archives | PDF/A-1b or -2b | BArchG, KGSt recommendations |
| ZUGFeRD/Factur-X e-invoices | PDF/A-3 | ZUGFeRD 2.0 specification |
| US federal agencies (e-discovery) | PDF/A-1 | Agency-specific records management policies |
| Swiss administrative proceedings | PDF/A-1 | VwVG, Justizdirektionen guidance |

Outside regulated contexts, PDF/A is a good default for any document meant to be retained for more than a few years: legal contracts, official certificates, government-issued documents, and financial reports.

How to create PDF/A files with Ghostscript

Ghostscript is a widely used open-source tool for creating PDF/A files. It can convert most existing PDFs to PDF/A-1b or PDF/A-2b, though the result should always be checked with a validator.

Convert to PDF/A-1b

gs \
  -dPDFA=1 \
  -dBATCH \
  -dNOPAUSE \
  -sDEVICE=pdfwrite \
  -sProcessColorModel=DeviceRGB \
  -dCompatibilityLevel=1.4 \
  -dPDFACompatibilityPolicy=1 \
  -sOutputFile=output-pdfa1b.pdf \
  input.pdf

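-dPDFA=1 sets the target to PDF/A-1. -dPDFACompatibilityPolicy=1 tells Ghostscript to drop or convert features that PDF/A does not allow, so the output stays compliant. With the default policy (0), Ghostscript keeps the offending feature, prints a warning, and writes a file that is not PDF/A; policy 2 aborts with an error instead. Ghostscript's own documentation also calls for -sColorConversionStrategy and a PDF/A definition file (PDFA_def.ps) that points at an ICC profile, so treat this command as a starting point and validate what it produces.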

Convert to PDF/A-2b

gs \
  -dPDFA=2 \
  -dBATCH \
  -dNOPAUSE \
  -sDEVICE=pdfwrite \
  -sProcessColorModel=DeviceRGB \
  -dCompatibilityLevel=1.7 \
  -dPDFACompatibilityPolicy=1 \
  -sOutputFile=output-pdfa2b.pdf \
  input.pdf

The only differences from the PDF/A-1b command are -dPDFA=2 and -dCompatibilityLevel=1.7.

Image quality during conversion

Ghostscript resamples images during conversion. To preserve image quality:

gs \
  -dPDFA=2 \
  -dBATCH -dNOPAUSE \
  -sDEVICE=pdfwrite \
  -sProcessColorModel=DeviceRGB \
  -dCompatibilityLevel=1.7 \
  -dPDFACompatibilityPolicy=1 \
  -dPDFSETTINGS=/prepress \
  -sOutputFile=output-high-quality.pdf \
  input.pdf

-dPDFSETTINGS=/prepress keeps color and grayscale images at roughly 300 DPI and recompresses them less aggressively than the lower-quality presets such as /screen or /ebook. The output file will be larger, but image quality is maintained.

How to create PDF/A with Python

For automated pipelines, you can call Ghostscript from Python or use pikepdf to inspect and manipulate the file. Ghostscript remains the most reliable conversion engine.

import subprocess
 
def convert_to_pdfa(
    input_path: str,
    output_path: str,
    version: int = 2,
    quality: str = "prepress",
) -> None:
    """Convert a PDF to PDF/A using Ghostscript."""
    compatibility = "1.4" if version == 1 else "1.7"
 
    subprocess.run(
        [
            "gs",
            f"-dPDFA={version}",
            "-dBATCH",
            "-dNOPAUSE",
            "-sDEVICE=pdfwrite",
            "-sProcessColorModel=DeviceRGB",
            f"-dCompatibilityLevel={compatibility}",
            "-dPDFACompatibilityPolicy=1",
            f"-dPDFSETTINGS=/{quality}",
            f"-sOutputFile={output_path}",
            input_path,
        ],
        check=True,
    )
 
 
# Convert to PDF/A-2b, high quality
convert_to_pdfa("invoice.pdf", "invoice-pdfa2b.pdf", version=2)

Install Ghostscript with brew install ghostscript (macOS) or apt install ghostscript (Debian/Ubuntu).
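
Since the conversion function shells out to `gs`, a pipeline can fail late if the binary is missing. Below is a small pre-flight sketch using only the standard library. The helper name is illustrative, and the Windows console names `gswin64c` and `gswin32c` are included on the assumption that the script may also run there.

```python
import shutil
from typing import Iterable, Optional


def find_ghostscript(
    candidates: Iterable[str] = ("gs", "gswin64c", "gswin32c"),
) -> Optional[str]:
    """Return the full path of the first Ghostscript binary found on PATH."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    # Nothing found: the caller decides whether to raise or skip conversion.
    return None
```

Calling `find_ghostscript()` once at startup and raising a clear error when it returns `None` tends to be easier to debug than a `FileNotFoundError` from `subprocess.run` in the middle of a batch.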

For batch conversion:

from pathlib import Path
 
input_dir = Path("pdfs/")
output_dir = Path("pdfs-pdfa/")
output_dir.mkdir(exist_ok=True)
 
for pdf_file in input_dir.glob("*.pdf"):
    output_path = output_dir / pdf_file.name
    convert_to_pdfa(str(pdf_file), str(output_path), version=2)
    print(f"Converted {pdf_file.name}")

Validating PDF/A compliance with VeraPDF

VeraPDF is the open-source reference validator for PDF/A, maintained by the PDF Association and the Open Preservation Foundation. It is the validator referenced by the ISO working group that produced the standard.

Install VeraPDF

# macOS with Homebrew
brew install --cask verapdf
 
# Or download from verapdf.org/releases and run the installer

Validate a file

# Check PDF/A-1b compliance
verapdf --flavour 1b document.pdf
 
# Check PDF/A-2b compliance
verapdf --flavour 2b document.pdf
 
# Output as machine-readable JSON
verapdf --flavour 2b --format json document.pdf

With plain-text output (--format text), a compliant file produces a line such as:

PASS document.pdf

A non-compliant file lists every violation with its rule number from the ISO specification:

FAIL document.pdf
  1:6.1 - Encryption is not permitted
  6.2.11.3 - Font is not embedded: Helvetica
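
If you only have this plain-text output to work with, it can be turned into a structure that a script can act on. The sketch below assumes exactly the layout shown above (a `PASS` or `FAIL` line per file, followed by indented violation lines); real validator output may carry extra fields, so adjust the parsing to what your installed version prints. The function name is illustrative.

```python
from typing import Dict


def parse_text_report(output: str) -> Dict[str, dict]:
    """Map each file name to its compliance status and list of violations."""
    results: Dict[str, dict] = {}
    current = None
    for line in output.splitlines():
        if line.startswith(("PASS ", "FAIL ")):
            # A status line starts a new entry: "<STATUS> <file name>".
            status, _, path = line.partition(" ")
            current = {"compliant": status == "PASS", "violations": []}
            results[path.strip()] = current
        elif line.strip() and current is not None:
            # Any other non-empty line belongs to the most recent file.
            current["violations"].append(line.strip())
    return results
```

For anything beyond a quick script, the machine-readable output formats are a sturdier basis than parsing text.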

Validate in Python

import subprocess
import json
 
def validate_pdfa(pdf_path: str, flavour: str = "2b") -> dict:
    result = subprocess.run(
        ["verapdf", "--flavour", flavour, "--format", "json", pdf_path],
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
 
report = validate_pdfa("invoice-pdfa2b.pdf")
is_compliant = report["report"]["jobs"][0]["validationResult"]["isCompliant"]
print(f"PDF/A compliant: {is_compliant}")

Run validation as part of your build or document generation pipeline to catch compliance regressions before files reach an archive or regulatory system.

PDF/A and Playwright-generated PDFs

Playwright (and by extension any tool that uses Chromium's PDF engine, including PDF4.dev) generates standard PDF output, not PDF/A. Chromium does not embed ICC color profiles or write the required XMP conformance metadata.

This means Playwright-generated PDFs need post-processing to become PDF/A compliant. The workflow is:

  1. Generate the PDF with Playwright (via the PDF4.dev API or directly)
  2. Post-process with Ghostscript to convert to PDF/A-2b
  3. Validate with VeraPDF

import { execSync } from "child_process";
import fs from "fs";
 
async function generatePdfA(templateId: string, data: object): Promise<Buffer> {
  // Step 1: generate standard PDF via the API
  const response = await fetch("https://pdf4.dev/api/v1/render", {
    method: "POST",
    headers: {
      Authorization: "Bearer p4_live_your_key",
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ template_id: templateId, data }),
  });
 
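  // Fail early instead of writing an error response to disk as if it were a PDF
  if (!response.ok) {
    throw new Error(`PDF render failed with HTTP ${response.status}`);
  }

  const pdfBuffer = Buffer.from(await response.arrayBuffer());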
  fs.writeFileSync("/tmp/standard.pdf", pdfBuffer);
 
  // Step 2: convert to PDF/A-2b
  execSync(
    "gs -dPDFA=2 -dBATCH -dNOPAUSE -sDEVICE=pdfwrite " +
      "-sProcessColorModel=DeviceRGB -dCompatibilityLevel=1.7 " +
      "-dPDFACompatibilityPolicy=1 -dPDFSETTINGS=/prepress " +
      "-sOutputFile=/tmp/pdfa.pdf /tmp/standard.pdf"
  );
 
  return fs.readFileSync("/tmp/pdfa.pdf");
}

For HTML templates, keeping the HTML clean helps: avoid CSS properties that create complex transparency stacks (multiple overlapping opacity layers), which force Ghostscript to flatten and can affect fidelity.

PDF/A in the document generation context

If you generate PDFs from HTML templates (invoices, certificates, reports), PDF/A compliance is often a downstream requirement rather than a generation-time concern. The typical workflow:

  1. Generate the PDF from an HTML template with accurate, printable layout
  2. Post-process with Ghostscript if the output must meet a regulatory archive requirement
  3. Validate with VeraPDF before storage

This separation keeps the generation pipeline focused on layout and data, and the compliance step as a separate, auditable process.
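
One way to keep that separation visible in code is to pass each stage in as a function, so the compliance step can be swapped or audited without touching rendering. This is a sketch under that assumption: `archive_document` and its parameters are hypothetical names, and the callables would wrap whatever renderer, converter, and validator you actually use.

```python
from pathlib import Path
from typing import Callable


def archive_document(
    render: Callable[[Path], None],
    convert: Callable[[Path, Path], None],
    validate: Callable[[Path], bool],
    work_dir: Path,
) -> Path:
    """Run render -> convert -> validate and return the archival file path."""
    standard = work_dir / "standard.pdf"
    archival = work_dir / "archival.pdf"

    render(standard)             # step 1: write the standard PDF
    convert(standard, archival)  # step 2: produce the PDF/A candidate
    if not validate(archival):   # step 3: refuse to hand over a failing file
        raise ValueError(f"{archival} failed PDF/A validation")
    return archival
```

Raising on a failed validation means a non-compliant file never reaches storage silently; the caller can catch the error and log or retry as needed.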

For documents that do not require archival compliance (internal reports, previews, draft contracts), standard PDF output is fine. See how we render PDFs in under 300ms for the rendering pipeline behind PDF4.dev's generation.

If you generate documents that need to meet PDF/A requirements (legal filings, government submissions, e-invoices), add VeraPDF validation as a CI step to catch non-compliance before documents reach the archive.

Summary

PDF/A is an ISO standard for long-term archival that embeds all dependencies (fonts, color profiles) and forbids features that depend on external resources or runtime behavior (JavaScript, encryption, externally referenced content). PDF/A-2b is the best default for most new documents; PDF/A-1b is required by older regulatory systems; PDF/A-3b is the right choice when embedding machine-readable data alongside the visual PDF.

To create PDF/A files, convert with Ghostscript (-dPDFA=2). To validate, use VeraPDF. Playwright-based generators (including PDF4.dev) produce standard PDF and require a Ghostscript post-processing step for PDF/A compliance.
