What is the best Python library for generating PDFs from HTML?

WeasyPrint is the most popular pure-Python option. Playwright with headless Chromium gives more accurate CSS rendering. For production workloads, a REST API like PDF4.dev removes the need to manage browser dependencies.

Can I generate PDFs in Python without installing system dependencies?

WeasyPrint requires Pango and Cairo system libraries. Playwright requires a Chromium binary. A REST API is the only fully dependency-free option: just send an HTTP request.

How do I generate PDFs in Flask?

Render an HTML template with Jinja2, convert it with WeasyPrint or a PDF API, then return the PDF bytes as a Flask response with content-type application/pdf.

How do I generate PDFs in FastAPI?

Use an async HTTP client like httpx to call a PDF API, or run WeasyPrint in a thread pool via asyncio.to_thread() to avoid blocking the event loop.

Is WeasyPrint good for production use?

WeasyPrint works well for moderate volume (hundreds of PDFs per day). At higher scale, it becomes a bottleneck because it runs synchronously, has no concurrency built in, and requires system libraries (Pango, Cairo) that complicate Docker images.

How do I use Jinja2 templates to generate PDF invoices in Python?

Render a Jinja2 HTML template to a string with render_template() (Flask) or Environment(loader=FileSystemLoader("templates/")).get_template("invoice.html").render(**data) (standalone), then pass the HTML string to your PDF library or API.

What is the difference between WeasyPrint and pdfkit?

pdfkit is a Python wrapper around wkhtmltopdf, a command-line binary built on an old Qt WebKit engine whose repository was archived in 2023. WeasyPrint is an actively maintained Python library. For new projects, use WeasyPrint or Playwright instead of pdfkit.

How fast is Python PDF generation in production?

WeasyPrint typically takes 300-800ms per page for a simple document. Playwright with a warm browser pool runs at 200-500ms. A managed PDF API can return results in under 300ms with no cold-start penalty.

Can I generate PDFs in a Python serverless function (AWS Lambda, Google Cloud Functions)?

Yes, but it is complex. WeasyPrint requires Cairo and Pango system libs in the Lambda layer. Playwright requires a bundled Chromium (~300MB). A PDF REST API is the simplest approach for serverless environments.

Does WeasyPrint support CSS Grid and Flexbox?

WeasyPrint has partial CSS Grid support (improved in v60+) and good Flexbox support. Complex Grid layouts may render differently from a browser. Playwright uses full Chromium and supports all modern CSS.

Generate PDFs from HTML in Python: WeasyPrint, Playwright, and PDF APIs compared

Generate PDFs from HTML in Python using WeasyPrint, Playwright, or a REST API. Covers Flask, FastAPI, dynamic templates, fonts, and production tips.

benoitded, March 17, 2026 (11 min read)

Generating PDFs from HTML in Python is a solved problem, but the solution you pick at prototype time often becomes a production bottleneck. This guide covers three approaches: WeasyPrint (pure Python), Playwright (headless Chromium), and a REST API. You will see working code for each, plus a clear breakdown of when to switch.

Approach comparison

Before diving into code, here is a direct comparison of the three main options for Python PDF generation.

Approach	CSS accuracy	Install complexity	Async support	Docker size	Best for
WeasyPrint	Good (CSS 2.1, partial CSS3)	Medium (needs Pango/Cairo)	No (blocks event loop)	~150MB extra	Most Python projects
Playwright	Excellent (full Chromium)	High (Chromium binary ~300MB)	Yes (async API)	~300MB extra	Pixel-perfect, complex CSS
PDF REST API	Excellent (hosted Chromium)	None (HTTP client only)	Yes	0	Production, serverless, teams
pdfkit/wkhtmltopdf	Poor (deprecated engine)	High (C binary)	No	~200MB extra	Legacy projects only
ReportLab	n/a (not HTML-based)	Low	No	~15MB	Pure programmatic docs

Note: pdfkit wraps wkhtmltopdf, which stopped active development in 2023. Avoid it for new projects.

Option 1: WeasyPrint

WeasyPrint is a Python library that converts HTML and CSS to PDF. It relies on the Pango text layout engine; versions before 53 also drew through the Cairo graphics library, while newer releases write the PDF themselves. It supports CSS 2.1 plus most commonly used CSS3 properties.

Installation

pip install weasyprint

macOS (Homebrew):

brew install pango

Debian/Ubuntu:

apt-get install libpango-1.0-0 libcairo2 libgdk-pixbuf2.0-0 libffi-dev

Basic conversion

from weasyprint import HTML
 
html_string = """
<!DOCTYPE html>
<html>
<head>
  <style>
    body { font-family: Arial, sans-serif; margin: 20mm; }
    h1 { color: #111; font-size: 24px; }
    .total { font-size: 18px; font-weight: bold; }
  </style>
</head>
<body>
  <h1>Invoice #001</h1>
  <p>Client: Acme Corp</p>
  <p class="total">Total: $1,500.00</p>
</body>
</html>
"""
 
pdf_bytes = HTML(string=html_string).write_pdf()
 
# Save to file
with open("invoice.pdf", "wb") as f:
    f.write(pdf_bytes)

Dynamic templates with Jinja2

In practice, you almost always need dynamic data. Jinja2 is the standard Python templating engine for this.

from jinja2 import Environment, FileSystemLoader
from weasyprint import HTML
 
# Load template from file
env = Environment(loader=FileSystemLoader("templates/"))
template = env.get_template("invoice.html")
 
# Render with data
data = {
    "invoice_number": "INV-0042",
    "client_name": "Acme Corp",
    "items": [
        {"description": "Consulting", "qty": 10, "unit_price": 150.00},
        {"description": "Setup fee", "qty": 1, "unit_price": 200.00},
    ],
    "total": 1700.00,
}
 
html_string = template.render(**data)
pdf_bytes = HTML(string=html_string).write_pdf()

templates/invoice.html:

<!DOCTYPE html>
<html>
<head>
  <style>
    body { font-family: Arial, sans-serif; margin: 20mm; color: #333; }
    table { width: 100%; border-collapse: collapse; }
    th, td { padding: 8px 12px; border-bottom: 1px solid #eee; text-align: left; }
    .total-row { font-weight: bold; font-size: 16px; }
  </style>
</head>
<body>
  <h1>Invoice {{ invoice_number }}</h1>
  <p>Client: {{ client_name }}</p>
  <table>
    <thead>
      <tr><th>Description</th><th>Qty</th><th>Unit Price</th><th>Subtotal</th></tr>
    </thead>
    <tbody>
      {% for item in items %}
      <tr>
        <td>{{ item.description }}</td>
        <td>{{ item.qty }}</td>
        <td>${{ "%.2f"|format(item.unit_price) }}</td>
        <td>${{ "%.2f"|format(item.qty * item.unit_price) }}</td>
      </tr>
      {% endfor %}
    </tbody>
    <tfoot>
      <tr class="total-row"><td colspan="3">Total</td><td>${{ "%.2f"|format(total) }}</td></tr>
    </tfoot>
  </table>
</body>
</html>
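
The data dictionary above hard-codes the total next to the line items, so the two can drift apart. A minimal sketch of deriving the subtotals and the total from the items is shown below. The helper name build_invoice_context and the choice of Decimal with half-up rounding are assumptions for illustration, not part of the original example.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def build_invoice_context(invoice_number: str, client_name: str, items: list[dict]) -> dict:
    """Return a template context with per-line subtotals and a derived total."""
    lines = []
    total = Decimal("0")
    for item in items:
        # Go through str() so floats like 150.0 do not carry binary noise.
        unit_price = Decimal(str(item["unit_price"]))
        subtotal = (unit_price * item["qty"]).quantize(CENT, rounding=ROUND_HALF_UP)
        lines.append({**item, "unit_price": unit_price, "subtotal": subtotal})
        total += subtotal
    return {
        "invoice_number": invoice_number,
        "client_name": client_name,
        "items": lines,
        "total": total,
    }


context = build_invoice_context(
    "INV-0042",
    "Acme Corp",
    [
        {"description": "Consulting", "qty": 10, "unit_price": 150.00},
        {"description": "Setup fee", "qty": 1, "unit_price": 200.00},
    ],
)
```

The resulting context has the same keys as the hand-written dictionary, so it can be passed to template.render(**context) unchanged.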

Flask endpoint

from flask import Flask, make_response
from jinja2 import Environment, FileSystemLoader
from weasyprint import HTML
 
app = Flask(__name__)
env = Environment(loader=FileSystemLoader("templates/"))
 
@app.route("/invoice/<invoice_id>.pdf")
def generate_invoice(invoice_id):
    # Fetch data (replace with your DB query)
    data = {
        "invoice_number": invoice_id,
        "client_name": "Acme Corp",
        "total": 1700.00,
        "items": [],
    }
 
    template = env.get_template("invoice.html")
    html_string = template.render(**data)
    pdf_bytes = HTML(string=html_string).write_pdf()
 
    response = make_response(pdf_bytes)
    response.headers["Content-Type"] = "application/pdf"
    response.headers["Content-Disposition"] = f"inline; filename=invoice-{invoice_id}.pdf"
    return response
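
The endpoint above interpolates invoice_id straight into the Content-Disposition header. Since that value comes from the URL, it may be worth normalizing it first. The sketch below is one possible approach; the helper names and the allowed character set are assumptions, not a Flask feature.

```python
import re


def safe_pdf_filename(invoice_id: str) -> str:
    """Build a header-safe filename from an untrusted identifier."""
    # Keep letters, digits, dot, dash and underscore; replace everything else.
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "_", invoice_id).strip("._") or "document"
    return f"invoice-{cleaned[:64]}.pdf"


def content_disposition(invoice_id: str) -> str:
    """Return a quoted Content-Disposition value for inline display."""
    return f'inline; filename="{safe_pdf_filename(invoice_id)}"'
```

In the Flask view, the header assignment would then read response.headers["Content-Disposition"] = content_disposition(invoice_id).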

FastAPI endpoint

WeasyPrint is synchronous and will block the FastAPI event loop if called directly. Use asyncio.to_thread() to run it in a thread pool.

import asyncio
from fastapi import FastAPI
from fastapi.responses import Response
from jinja2 import Environment, FileSystemLoader
from weasyprint import HTML
 
app = FastAPI()
env = Environment(loader=FileSystemLoader("templates/"))
 
@app.get("/invoice/{invoice_id}.pdf")
async def generate_invoice(invoice_id: str):
    data = {"invoice_number": invoice_id, "client_name": "Acme Corp", "total": 1700.00, "items": []}
    template = env.get_template("invoice.html")
    html_string = template.render(**data)
 
    # Run WeasyPrint in a thread pool to avoid blocking
    pdf_bytes = await asyncio.to_thread(lambda: HTML(string=html_string).write_pdf())
 
    return Response(
        content=pdf_bytes,
        media_type="application/pdf",
        headers={"Content-Disposition": f"inline; filename=invoice-{invoice_id}.pdf"},
    )

Option 2: Playwright

Playwright for Python is the Python binding for Microsoft's browser automation library. It drives a real Chromium browser, so it renders modern CSS the way Chrome does, including CSS Grid, Flexbox, custom properties, and @font-face.

Installation

pip install playwright
playwright install chromium

Basic PDF generation

import asyncio
from playwright.async_api import async_playwright
 
async def html_to_pdf(html_string: str) -> bytes:
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        await page.set_content(html_string, wait_until="networkidle")
        pdf_bytes = await page.pdf(
            format="A4",
            margin={"top": "20mm", "bottom": "20mm", "left": "15mm", "right": "15mm"},
            print_background=True,
        )
        await browser.close()
        return pdf_bytes
 
# Usage
html = "<html><body><h1>Hello PDF</h1></body></html>"
pdf = asyncio.run(html_to_pdf(html))

Singleton browser pattern for production

Launching a new browser for every request adds 200-400ms of startup time. Keep a single browser instance open across requests.

from playwright.async_api import async_playwright, Browser
from contextlib import asynccontextmanager
from fastapi import FastAPI
from fastapi.responses import Response
 
browser: Browser | None = None
 
@asynccontextmanager
async def lifespan(app: FastAPI):
    global browser
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        yield
        await browser.close()
 
app = FastAPI(lifespan=lifespan)
 
@app.get("/invoice/{invoice_id}.pdf")
async def generate_invoice(invoice_id: str):
    html = f"<html><body><h1>Invoice {invoice_id}</h1></body></html>"
    page = await browser.new_page()
    try:
        await page.set_content(html, wait_until="load")
        pdf_bytes = await page.pdf(format="A4", print_background=True)
    finally:
        # Always close the page, even if rendering fails, so pages do not leak
        await page.close()
    return Response(content=pdf_bytes, media_type="application/pdf")

This pattern brings PDF generation down to ~150-300ms per request on a warm browser.

When the DIY approach starts to hurt

Both WeasyPrint and Playwright work well for low-to-medium volume. At production scale, several pain points appear.

WeasyPrint pain points

No concurrency. WeasyPrint is a synchronous Python library with no built-in worker pool. At 10+ concurrent requests, you need a process pool or task queue (Celery, RQ), which adds infrastructure.
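
As a rough illustration of the process-pool option, the sketch below fans a batch of HTML strings out to worker processes with the standard library. The function names and the injectable render and executor_factory parameters are assumptions made so the sketch can be exercised without WeasyPrint installed; a task queue such as Celery would replace this entirely.

```python
from concurrent.futures import Executor, ProcessPoolExecutor
from typing import Callable


def render_with_weasyprint(html_string: str) -> bytes:
    """Render one document; WeasyPrint is imported inside the worker."""
    from weasyprint import HTML  # requires WeasyPrint and its system libraries

    return HTML(string=html_string).write_pdf()


def render_batch(
    html_docs: list[str],
    render: Callable[[str], bytes] = render_with_weasyprint,
    executor_factory: Callable[..., Executor] = ProcessPoolExecutor,
    max_workers: int = 4,
) -> list[bytes]:
    """Render documents in parallel and return the PDFs in input order."""
    with executor_factory(max_workers=max_workers) as pool:
        # map() preserves input order even when workers finish out of order.
        return list(pool.map(render, html_docs))
```

With the defaults, render_batch(["<h1>One</h1>", "<h1>Two</h1>"]) should be called from under an if __name__ == "__main__": guard, because some platforms start worker processes by re-importing the main module.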

System dependencies. Pango is a C library, and WeasyPrint versions before 53 also need Cairo. Every deployment target (Docker, Lambda, CI) needs these installed. Debugging "libpango not found" in production takes time away from building features.

CSS compatibility gaps. WeasyPrint does not support JavaScript, CSS animations, or some newer Grid/Flexbox features. Complex designs that work in a browser may render differently.

Playwright pain points

Docker image size. A minimal Python image with Playwright and Chromium weighs ~550MB. Cold starts on AWS Lambda or Google Cloud Run can take 5-10 seconds.

Concurrency management. Managing a browser pool, page lifecycle, and graceful shutdown under load requires careful code. Memory leaks from unclosed pages are a common production issue.
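
One way to keep page lifecycle under control is to cap the number of open pages with a semaphore and close each page in a finally block. The sketch below assumes a browser object with the new_page(), set_content(), pdf(), and close() methods used earlier in this guide; the PdfRenderer class and its max_pages default are hypothetical.

```python
import asyncio


class PdfRenderer:
    """Cap concurrent pages on one shared browser and always close them."""

    def __init__(self, browser, max_pages: int = 4):
        self._browser = browser  # e.g. the Browser returned by p.chromium.launch()
        self._slots = asyncio.Semaphore(max_pages)

    async def render(self, html: str) -> bytes:
        async with self._slots:  # waits here while max_pages renders are in flight
            page = await self._browser.new_page()
            try:
                await page.set_content(html, wait_until="load")
                return await page.pdf(format="A4", print_background=True)
            finally:
                await page.close()  # runs even if rendering raises
```

A single PdfRenderer can be created in the FastAPI lifespan handler next to the browser and shared by all request handlers.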

Serverless limitations. AWS Lambda has a 250MB unzipped deployment package limit, and layers count toward it. Chromium alone exceeds that, so you need either a custom container image or a specially compressed Chromium build, both of which are non-trivial to maintain.

The break-even point

The table below summarizes when the operational overhead of self-hosted PDF generation stops being worth it.

Signal	Self-hosted (WeasyPrint/Playwright)	REST API (PDF4.dev)
PDFs per day	Under 500	Any volume
Team size	Solo or 1-2 devs	Any
Deployment target	Traditional server	Serverless, Lambda, edge
Docker size budget	No constraint	Size-constrained
CSS complexity	Simple documents	Complex HTML/CSS
Multi-language (fonts, RTL)	Needs custom setup	Handled by API
SLA requirement	DIY monitoring	Managed

Option 3: PDF REST API (PDF4.dev)

PDF4.dev is an HTML-to-PDF REST API. You send an HTTP request with your HTML and data, and get a PDF back. The rendering is done by a managed Chromium instance with no infrastructure on your side.

Installation

No system dependencies. Just an HTTP client.

pip install httpx  # or use requests

Generate a PDF with raw HTML

import httpx
 
API_KEY = "p4_live_your_api_key"
 
def generate_pdf(html: str, data: dict | None = None) -> bytes:
    response = httpx.post(
        "https://pdf4.dev/api/v1/render",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"html": html, "data": data or {}},
    )
    response.raise_for_status()
    return response.content
 
# Example
html = """
<html>
<head>
  <style>
    body { font-family: Inter, sans-serif; margin: 20mm; }
    h1 { color: #111; }
  </style>
</head>
<body>
  <h1>Invoice {{invoice_number}}</h1>
  <p>Client: {{client_name}}</p>
  <p>Total: {{total}}</p>
</body>
</html>
"""
 
pdf_bytes = generate_pdf(html, {
    "invoice_number": "INV-0042",
    "client_name": "Acme Corp",
    "total": "$1,700.00",
})
 
with open("invoice.pdf", "wb") as f:
    f.write(pdf_bytes)

The API uses Handlebars syntax ({{variable}}) for templating, with built-in helpers for formatting dates, numbers, and currencies.

Use a saved template

PDF4.dev lets you save HTML templates in the dashboard and reference them by slug. This separates template design from application code.

import httpx
 
API_KEY = "p4_live_your_api_key"
 
def render_template(template_id: str, data: dict) -> bytes:
    response = httpx.post(
        "https://pdf4.dev/api/v1/render",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"template_id": template_id, "data": data},
    )
    response.raise_for_status()
    return response.content
 
pdf_bytes = render_template("invoice", {
    "invoice_number": "INV-0042",
    "client_name": "Acme Corp",
    "items": [
        {"description": "Consulting", "qty": 10, "unit_price": 150},
        {"description": "Setup fee", "qty": 1, "unit_price": 200},
    ],
    "total": 1700,
})

Async FastAPI with httpx

import httpx
from fastapi import FastAPI
from fastapi.responses import Response
 
app = FastAPI()
API_KEY = "p4_live_your_api_key"
 
@app.get("/invoice/{invoice_id}.pdf")
async def generate_invoice(invoice_id: str):
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://pdf4.dev/api/v1/render",
            headers={"Authorization": f"Bearer {API_KEY}"},
            json={
                "template_id": "invoice",
                "data": {
                    "invoice_number": invoice_id,
                    "client_name": "Acme Corp",
                    "total": 1700.00,
                },
            },
            timeout=30.0,
        )
        response.raise_for_status()
 
    return Response(
        content=response.content,
        media_type="application/pdf",
        headers={"Content-Disposition": f"inline; filename=invoice-{invoice_id}.pdf"},
    )

No browser to manage, no Pango/Cairo, no Docker image bloat.
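
A remote renderer does add one failure mode the local libraries lack: transient network or server errors. The sketch below retries with exponential backoff around any callable that returns a status code and a body. The set of retryable status codes and the send callable are assumptions for illustration, not documented PDF4.dev behavior.

```python
import random
import time
from typing import Callable

# Assumed to be transient; adjust to what the API actually returns.
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


def post_with_retries(
    send: Callable[[], tuple[int, bytes]],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Call send() until it succeeds, fails permanently, or attempts run out."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        status, body = send()
        if status == 200:
            return body
        if status not in RETRYABLE_STATUS or attempt == attempts:
            raise RuntimeError(f"PDF request failed with status {status}")
        # Exponential backoff with a little jitter: about 0.5s, 1s, 2s, ...
        sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.1))
    raise RuntimeError("retry loop exited unexpectedly")
```

With httpx, send could be a small function that posts to the render endpoint shown above and returns (response.status_code, response.content) instead of calling raise_for_status().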

Choosing the right approach for your project

Use these criteria to pick the right tool.

Use WeasyPrint if:

You want a pure-Python solution with no external HTTP calls
Your documents use standard CSS (tables, simple layouts, no Grid/complex Flexbox)
You can afford to add Pango/Cairo to your Docker image
Volume is under a few hundred PDFs per day

Use Playwright if:

You need pixel-perfect CSS rendering (full Grid, CSS custom properties, JS-rendered content)
You are already using Playwright for end-to-end testing
You are comfortable managing a browser pool

Use a PDF API if:

You are deploying to serverless (Lambda, Cloud Functions, Vercel)
You want zero system dependencies in your Docker image
You need reliable concurrency without building a worker pool
You want designers to edit PDF templates in a UI without changing Python code

Font handling in Python PDF generation

Fonts are a frequent source of rendering inconsistencies between development and production.

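WeasyPrint uses Pango for text rendering. To use a custom font, declare it with @font-face and share one FontConfiguration between the stylesheet and write_pdf(). Point src at an absolute file:// URL, or use a relative URL and pass base_url so WeasyPrint can resolve it:

from weasyprint import HTML, CSS
from weasyprint.text.fonts import FontConfiguration  # module path in WeasyPrint 53+
 
font_config = FontConfiguration()
html = "<html><body><p>Hello</p></body></html>"
css = CSS(string="""
  @font-face {
    font-family: 'Inter';
    src: url('file:///path/to/Inter-Regular.ttf');
  }
  body { font-family: 'Inter', sans-serif; }
""", font_config=font_config)
 
# Pass the same font_config so the @font-face rule is applied
pdf = HTML(string=html).write_pdf(stylesheets=[css], font_config=font_config)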

Playwright loads fonts the same way a browser does. page.set_content() takes no base URL argument, so reference fonts in @font-face with absolute URLs. Google Fonts work as long as you wait for networkidle when setting content.

PDF4.dev supports Google Fonts via the google_fonts_url field in the format options, and supports @font-face with any URL.

PDF format options: A4, letter, custom sizes

All three approaches support A4, Letter, and custom page sizes.

Format	WeasyPrint	Playwright	PDF4.dev (format.preset)
A4 portrait	`@page { size: A4; }`	`format="A4"`	`"a4"`
A4 landscape	`@page { size: A4 landscape; }`	`format="A4", landscape=True`	`"a4-landscape"`
Letter	`@page { size: letter; }`	`format="Letter"`	`"letter"`
Custom	`@page { size: 150mm 200mm; }`	`width="150mm", height="200mm"`	`preset="custom", width="150mm", height="200mm"`

For custom margins in WeasyPrint:

@page {
  size: A4;
  margin: 20mm 15mm 20mm 15mm;
}
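
Page size and margins can also be set per request by generating the @page rule in Python and passing it to WeasyPrint as an extra stylesheet. The sketch below assumes hypothetical helper names (page_rule, render_with_page) and a small preset table; only the CSS it emits comes from the table above.

```python
PAGE_SIZES = {"a4": "A4", "letter": "letter"}


def page_rule(size: str = "a4", landscape: bool = False, margin: str = "20mm 15mm") -> str:
    """Return an @page rule.

    `size` is a preset name or a custom "width height" pair. `landscape`
    only applies to presets; custom sizes already fix their orientation.
    """
    preset = PAGE_SIZES.get(size.lower())
    if preset is None:
        css_size = size  # custom size, passed through as written
    else:
        css_size = f"{preset} landscape" if landscape else preset
    return f"@page {{ size: {css_size}; margin: {margin}; }}"


def render_with_page(html_string: str, **page_options) -> bytes:
    """Render HTML with a generated @page stylesheet."""
    from weasyprint import CSS, HTML  # imported lazily; requires WeasyPrint

    css = CSS(string=page_rule(**page_options))
    return HTML(string=html_string).write_pdf(stylesheets=[css])
```

For example, render_with_page(html, size="letter", landscape=True) applies a landscape Letter page without touching the template's own CSS.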

Production checklist

Before deploying PDF generation to production, verify these points regardless of which approach you use.

Checklist item	WeasyPrint	Playwright	PDF API
System libs installed in Docker	Pango, Cairo required	Chromium required	None needed
Concurrency handled	Process pool or task queue	Browser page pool	Handled by API
Timeouts configured	`asyncio.wait_for()` around the thread call	`timeout` on `set_content()`, `asyncio.wait_for()` around `page.pdf()`	HTTP client timeout (30s)
Error handling for malformed HTML	Try/except around write_pdf()	Try/except around page.pdf()	Check HTTP status code
Logging render duration	`time.perf_counter()` around call	`time.perf_counter()` around call	Check response headers
Fonts available in container	Font files copied in Docker	System fonts or @font-face	Embedded or Google Fonts
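
The error-handling and duration-logging rows can share one small wrapper regardless of backend. The sketch below is a minimal version under stated assumptions: render_logged and PdfRenderError are hypothetical names, and render is any callable that turns an HTML string into PDF bytes.

```python
import logging
import time
from typing import Callable

logger = logging.getLogger("pdf")


class PdfRenderError(Exception):
    """Raised when the underlying renderer fails."""


def render_logged(render: Callable[[str], bytes], html: str, label: str = "pdf") -> bytes:
    """Run a renderer, log its duration, and wrap failures in one error type."""
    start = time.perf_counter()
    try:
        pdf_bytes = render(html)
    except Exception as exc:
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.exception("%s render failed after %.0f ms", label, elapsed_ms)
        raise PdfRenderError(str(exc)) from exc
    elapsed_ms = (time.perf_counter() - start) * 1000
    logger.info("%s rendered %d bytes in %.0f ms", label, len(pdf_bytes), elapsed_ms)
    return pdf_bytes
```

A view can then catch PdfRenderError in one place and return an HTTP 500 or a fallback response, whichever backend is in use.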

Summary

Generating PDFs from HTML in Python has three solid paths. WeasyPrint is the quickest to add to an existing Python project. Playwright gives the most accurate rendering. A REST API eliminates all system dependencies and is the simplest option for serverless environments.

For most Flask or FastAPI projects, start with WeasyPrint. If CSS accuracy becomes a problem or you need to scale beyond a few hundred PDFs per day, switching to a managed API is a small change as long as rendering sits behind a single function.
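
A sketch of how that switch can stay small: keep every backend behind the same "HTML in, bytes out" signature and pass the chosen one to the code that builds documents. The Renderer alias and the function names are assumptions; the endpoint URL and payload mirror the API example earlier in this guide.

```python
from typing import Callable

Renderer = Callable[[str], bytes]  # HTML string in, PDF bytes out


def weasyprint_renderer(html: str) -> bytes:
    """Local backend; requires WeasyPrint and its system libraries."""
    from weasyprint import HTML

    return HTML(string=html).write_pdf()


def make_api_renderer(api_key: str, url: str = "https://pdf4.dev/api/v1/render") -> Renderer:
    """Remote backend; requires httpx and network access when called."""

    def render(html: str) -> bytes:
        import httpx

        response = httpx.post(
            url,
            headers={"Authorization": f"Bearer {api_key}"},
            json={"html": html},
            timeout=30.0,
        )
        response.raise_for_status()
        return response.content

    return render


def invoice_pdf(invoice_number: str, render: Renderer) -> bytes:
    """Application code depends only on the Renderer signature."""
    html = f"<html><body><h1>Invoice {invoice_number}</h1></body></html>"
    return render(html)
```

Moving from invoice_pdf("INV-0042", weasyprint_renderer) to invoice_pdf("INV-0042", make_api_renderer(API_KEY)) then touches one argument rather than every call site.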

You can try PDF generation with the HTML to PDF converter or sign up at PDF4.dev to use the full API with saved templates, Handlebars variables, and a visual editor.
