HTML to PDF benchmark Q3 2026: refreshed numbers, new contenders

Refreshed HTML to PDF benchmark for Q3 2026: Playwright, Puppeteer, chrome-headless-shell, WeasyPrint, Gotenberg, PDF4.dev. Cold start, warm pool, RAM.

Six months after the Q1 2026 HTML to PDF benchmark, we re-ran the same fixture with refreshed runners and two new contenders. The headline: Playwright with a warm browser pool still leads on steady-state latency at roughly 250ms end-to-end per PDF, Chromium got a few percent faster on page.pdf() between version 142 and 148, Gotenberg 8.30 added HTTP/2 streaming that pays off across regions, and chrome-headless-shell is now the leanest-RAM Chromium option for cold-start sensitive workloads.

If you are choosing a stack today and have not read the parent article, start there. This one is the delta.


Why a Q3 refresh

PDF rendering is a moving target. Chromium ships every four weeks. Playwright and Puppeteer chase. WeasyPrint and Gotenberg ship on their own cadence. None of that is loud. The release notes call out the API surface; the rendering hot path improves quietly at the bottom of the fold.

Re-running the same benchmark every six months catches those quiet improvements. It also catches regressions, which happen too. In Q3 2026 the picture is mostly small wins, plus two structural changes worth a closer look: the maturation of chrome-headless-shell as a smaller-footprint Chromium and the new HTTP/2 streaming path in Gotenberg.

The Q1 article is the methodology document. Treat this article as an addendum, not a replacement.


The unchanged methodology

The fixture, the warm-up procedure, and the hardware are identical to the Q1 article.

Recap. Same 10-page A4 invoice template, same Handlebars data set, same MacBook Pro M-series, same Node v22.22.0. Each runner gets a 1000-iteration warm-up, then a 1000-iteration measurement loop. We report median, p95, and peak RSS during a 50-concurrent-renders burst. The full methodology lives in the parent article.

The only deliberate change is the runner lineup, which we expanded from four to six. The fixture HTML is byte-identical to Q1, so any delta you see comes from the runner, not the input.

We re-ran the Q1 lineup on the new versions too. That gives us a clean Q1-versus-Q3 delta per runner without comparing across templates or hardware.
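
The measurement loop from the recap above can be sketched roughly as follows. This is a minimal illustration, not the benchmark repo's actual harness: the names `percentile`, `summarize`, and `benchmark` are hypothetical, and the nearest-rank percentile method is an assumption, since the article does not say which method it uses.

```typescript
// Nearest-rank percentile over an ascending-sorted array (method is an assumption).
export function percentile(sortedMs: number[], p: number): number {
  const rank = Math.ceil((p * sortedMs.length) / 100);
  return sortedMs[Math.max(0, rank - 1)];
}

// Reduce raw per-render timings to the two latency figures the article reports.
export function summarize(samplesMs: number[]): { median: number; p95: number } {
  if (samplesMs.length === 0) throw new Error('no samples');
  const sorted = [...samplesMs].sort((a, b) => a - b);
  return { median: percentile(sorted, 50), p95: percentile(sorted, 95) };
}

// Warm up first, then time each render individually.
// `render` is whatever runner is under test; the defaults mirror the 1000/1000 split.
export async function benchmark(
  render: () => Promise<unknown>,
  warmup = 1000,
  iterations = 1000,
): Promise<{ median: number; p95: number }> {
  for (let i = 0; i < warmup; i++) await render(); // warm-up results are discarded
  const samples: number[] = [];
  for (let i = 0; i < iterations; i++) {
    const t0 = performance.now();
    await render();
    samples.push(performance.now() - t0);
  }
  return summarize(samples);
}
```

With a render function in hand, a run would look something like `await benchmark(() => renderPdf(html))`.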


The contenders, Q3 2026 lineup

| Runner | Version | Engine | New since Q1? |
| --- | --- | --- | --- |
| Playwright | 1.58 | Bundled Chromium 145 | No, version bump only |
| Puppeteer | Recent line | System Chrome 148 | No, version bump only |
| chrome-headless-shell | 145 | Standalone headless binary | Yes, added |
| WeasyPrint | 64 | Pure Python, no browser | No, version bump only |
| Gotenberg | 8.30 | Chromium service over HTTP/2 | Yes, HTTP/2 streaming added |
| PDF4.dev | Current | Managed Playwright pool | No, included for completeness |

Two runners are new since Q1. chrome-headless-shell matured into a real option as Chrome 132 removed the old headless mode from the main browser and the standalone shell took over that role. See the chrome-headless-shell deep dive for the migration story. Gotenberg 8.30 shipped HTTP/2 streaming for PDF responses, which is a structural change we want measured directly.


Refreshed numbers, Q3 vs Q1 delta

Numbers below come from the same fixture re-run on the new versions. Treat single-digit percentages as within measurement noise. The point is the shape, not the third decimal.

| Runner | Q1 warm median | Q3 warm median | Delta | Q1 peak RSS (50 concurrent) | Q3 peak RSS (50 concurrent) |
| --- | --- | --- | --- | --- | --- |
| Playwright (warm pool) | 13ms complex | approximately 12ms complex | within noise | ~1.3 GB | ~1.2 GB |
| Puppeteer (warm pool) | 58ms complex | approximately 55ms complex | small improvement | ~1.4 GB | ~1.3 GB |
| chrome-headless-shell | not measured in Q1 | approximately 14ms complex | first measurement | not measured in Q1 | ~0.9 GB |
| WeasyPrint 64 (cold only) | 629ms complex | approximately 540ms complex | meaningful, table path | flat ~250 MB | flat ~240 MB |
| Gotenberg 8.30 | not measured in Q1 | approximately 220ms LAN, 180ms WAN TTFB with HTTP/2 | first measurement | not measured in Q1 | ~1.2 GB |
| PDF4.dev | 30ms LAN end to end | 30ms LAN end to end | unchanged | n/a (hosted) | n/a (hosted) |

A few rows are reported as unchanged on purpose. Playwright warm-pool latency in our fixture is essentially flat. That is not a finding we are hiding; it is the honest result. Chromium's perf work between 142 and 148 helped page.pdf() itself a touch, but the warm-path cost is dominated by page lifecycle, which did not change.

WeasyPrint moved the most. WeasyPrint 64 release notes call out table rendering improvements, and the complex fixture is table-heavy. Simple-document latency moved very little.
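
The rule of thumb stated before the table (single-digit percentage changes count as noise) can be written down as a small helper. This is a sketch under that one assumption; `classifyDelta` and the 10% default threshold are illustrative names and values, not part of the benchmark repo.

```typescript
export type DeltaVerdict = 'within noise' | 'faster' | 'slower';

// Compare a Q1 and a Q3 latency for the same runner and fixture.
// Anything below `noisePct` in absolute terms is treated as measurement noise.
export function classifyDelta(
  q1Ms: number,
  q3Ms: number,
  noisePct = 10,
): { pct: number; verdict: DeltaVerdict } {
  const pct = ((q3Ms - q1Ms) / q1Ms) * 100; // negative means Q3 is faster
  if (Math.abs(pct) < noisePct) return { pct, verdict: 'within noise' };
  return { pct, verdict: pct < 0 ? 'faster' : 'slower' };
}
```

Applied to the table, Playwright's 13ms to 12ms is about -7.7% and lands within noise, while WeasyPrint's 629ms to 540ms is about -14% and counts as a real improvement.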


Steady-state latency: warm pool wins, still

The warm-pool ranking is unchanged: Playwright leads, chrome-headless-shell is now a close second, Puppeteer follows, then everyone else.

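```typescript
import { chromium, type Browser } from 'playwright';

// One shared browser per process, launched lazily and reused across renders.
let browser: Browser | null = null;

async function getBrowser(): Promise<Browser> {
  // Relaunch if the browser was never started or has disconnected.
  if (!browser || !browser.isConnected()) {
    browser = await chromium.launch({ headless: true });
  }
  return browser;
}

export async function renderPdf(html: string): Promise<Buffer> {
  const b = await getBrowser();
  // A fresh page per render keeps documents isolated from each other.
  const page = await b.newPage();
  try {
    await page.setContent(html, { waitUntil: 'load' });
    const pdf = await page.pdf({ format: 'A4', printBackground: true });
    return Buffer.from(pdf);
  } finally {
    // Always close the page so its memory goes back to the browser.
    await page.close();
  }
}
```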

The warm-pool pattern is what makes the single-digit-millisecond numbers possible. Launch a cold browser for every render and you pay 40-200ms of browser launch per call. The Q1 article covers this in detail; the Q3 numbers do not move that conclusion.

What did shift between Q1 and Q3 is the ceiling at which Playwright stays flat. We pushed the burst test to 100 concurrent renders. Q1 numbers degraded past 60. Q3 numbers held to 80 before the same degradation pattern appeared. We attribute this to Chromium's parallel printing path improvements in the 142-148 window; the Chromium release notes describe related work in the print pipeline.
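
If you want to stay below the concurrency ceiling described above, one common approach is to cap the number of in-flight renders and queue the rest. The limiter below is a minimal sketch of that idea; `createLimiter` and its methods are hypothetical names, and the right cap for your hardware is something to measure rather than copy from our numbers.

```typescript
// Hypothetical helper: caps how many tasks run at once; extra callers wait in FIFO order.
export function createLimiter(maxConcurrent: number) {
  let active = 0;
  const waiting: Array<() => void> = [];

  async function run<T>(task: () => Promise<T>): Promise<T> {
    if (active < maxConcurrent) {
      active++; // a slot is free, take it immediately
    } else {
      // No free slot: wait until a finishing task hands its slot over.
      await new Promise<void>((resolve) => waiting.push(resolve));
    }
    try {
      return await task();
    } finally {
      const next = waiting.shift();
      if (next) {
        next(); // pass the slot straight to the next waiter; `active` stays the same
      } else {
        active--; // nobody is waiting, so free the slot
      }
    }
  }

  return {
    run,
    activeCount: () => active,
    pendingCount: () => waiting.length,
  };
}
```

Wrapping the render function from the warm-pool example would then look like `const limiter = createLimiter(60); await limiter.run(() => renderPdf(html));`. Handing the slot directly to the next waiter, instead of decrementing and re-incrementing, avoids a window in which a new caller could slip past the cap.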


Cold start: chrome-headless-shell makes inroads

chrome-headless-shell is the surprise of the Q3 run on cold start.

The shell binary is smaller than full Chromium because it omits the UI, the extension subsystem, and several platform integrations. The chrome-headless-shell repo describes it as purpose-built for headless automation, with fewer system dependencies. In our fixture, cold start was approximately 25-30% faster than full Chromium for the simple document. The complex document delta was smaller, because the render itself dominates once HTML gets non-trivial.

Two caveats before you migrate.

First, Playwright does not expose chrome-headless-shell as a named channel. You have to point executablePath at a manually downloaded binary, and the Playwright docs caution that the bundled Chromium is the version Playwright is tested with. Use Puppeteer with headless: 'shell' if you want the lean path without going off-piste.

Second, rendering parity is not guaranteed. If your templates depend on subtle print CSS, custom fonts, or complex layouts, run a regression diff against full Chromium before switching. The chrome-headless-shell deep dive covers the decision in detail.


RAM under load: Chromium-based runners still hog memory

Peak resident set size during a 50-concurrent-renders burst, same fixture:

| Runner | Peak RSS at 50 concurrent | Curve shape |
| --- | --- | --- |
| chrome-headless-shell | ~0.9 GB | stairstep |
| Playwright warm pool | ~1.2 GB | stairstep |
| Puppeteer warm pool | ~1.3 GB | stairstep |
| Gotenberg 8.30 | ~1.2 GB | stairstep |
| WeasyPrint 64 | ~240 MB | flat |

WeasyPrint is the outlier. Because every render is its own Python process, RAM stays flat across the burst. The trade-off is cold-start latency on every call. For high-volume use cases where you want to bound RAM strictly, WeasyPrint is the easiest answer; for latency-bound workloads, the Chromium-based runners win.

The stairstep shape on Chromium-based runners is what you would expect. Pages release memory back to the browser only on page.close(), so peak RSS climbs in steps as the burst opens new pages and plateaus once the pool reaches its ceiling.
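
Tracking a peak during a burst comes down to sampling a memory reading on a timer and keeping the maximum. The sketch below shows that shape; `createPeakTracker` and `withPeak` are hypothetical names, and the reader function is deliberately injected. Note the assumption: `process.memoryUsage().rss` only covers the Node process itself, so for Chromium-based runners the reader would have to sum RSS across the browser's process tree, which this sketch leaves to the caller.

```typescript
// Keeps the highest value seen from an injected reader (for example an RSS probe, in bytes).
export function createPeakTracker(read: () => number) {
  let peak = 0;
  return {
    sample(): void {
      const value = read();
      if (value > peak) peak = value;
    },
    peakValue: (): number => peak,
  };
}

// Runs `work` while sampling `read` every `intervalMs`, then reports the peak.
export async function withPeak<T>(
  read: () => number,
  intervalMs: number,
  work: () => Promise<T>,
): Promise<{ value: T; peak: number }> {
  const tracker = createPeakTracker(read);
  tracker.sample(); // baseline before the burst starts
  const timer = setInterval(() => tracker.sample(), intervalMs);
  try {
    const value = await work();
    tracker.sample(); // one last reading after the burst finishes
    return { value, peak: tracker.peakValue() };
  } finally {
    clearInterval(timer);
  }
}
```

A coarse sampling interval can miss short spikes, so the reported peak is a lower bound on the true one.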


Gotenberg's HTTP/2 streaming change: when it matters

Gotenberg 8.30 added HTTP/2 streaming for PDF responses. The Gotenberg release notes describe this as a TTFB optimization for callers that fetch large PDFs over the network. Render time on the server is unchanged. What changes is how soon the first byte arrives at the client.

In our test fixture, the practical effect splits in two.

LAN scenario: caller in the same network as Gotenberg. The TTFB improvement was small (low double-digit milliseconds). At LAN latency, the render dominates and streaming or buffering looks the same.

WAN scenario: caller in a different region. We simulated cross-region by adding 80ms of round-trip latency between caller and Gotenberg. Here the streaming path helped meaningfully: TTFB dropped from approximately 260ms to 180ms because the first PDF bytes started flushing before rendering completed.

If you run Gotenberg in a single region and call it from anywhere, the upgrade is worth it. If you co-locate Gotenberg with the caller, the win is marginal.
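
To check whether streaming helps in your own setup, it is enough to record when the first and last bytes of a response arrive. The recorder below is a generic sketch and makes no assumption about Gotenberg's API: `createTtfbRecorder` and `timeStream` are hypothetical names, and the response body is assumed to be any async-iterable stream of byte chunks from an HTTP/2-capable client.

```typescript
// Records time to first byte and total transfer time, relative to creation.
// The clock is injectable so the logic can be exercised without a network.
export function createTtfbRecorder(now: () => number = () => performance.now()) {
  const startedAt = now();
  let firstByteAt: number | null = null;
  let lastByteAt = startedAt;
  let bytes = 0;
  return {
    onChunk(chunk: Uint8Array): void {
      const t = now();
      if (firstByteAt === null && chunk.length > 0) firstByteAt = t;
      lastByteAt = t;
      bytes += chunk.length;
    },
    result(): { ttfbMs: number | null; totalMs: number; bytes: number } {
      return {
        ttfbMs: firstByteAt === null ? null : firstByteAt - startedAt,
        totalMs: lastByteAt - startedAt,
        bytes,
      };
    },
  };
}

// Create the recorder right before sending the request, then drain the body through it.
export async function timeStream(body: AsyncIterable<Uint8Array>) {
  const recorder = createTtfbRecorder();
  for await (const chunk of body) recorder.onChunk(chunk);
  return recorder.result();
}
```

With a buffered response, `ttfbMs` sits close to `totalMs` because nothing arrives until rendering is done. With streaming, the gap between the two widens, and that gap is the effect measured in the WAN scenario.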


PDF4.dev numbers (transparent disclosure)

PDF4.dev runs managed Playwright with warm browser pools. Same engine as bare Playwright, same warm-pool pattern, same Chromium build. The difference is operational: someone else manages the pool, the crashes, the Docker image, and the ARM64 builds.

End-to-end p95 in our Q3 measurement:

  • LAN (same Railway region as the API): approximately 30ms
  • WAN (cross-region, ~80ms RTT): approximately 80ms

Compare like-for-like. Bare Playwright benchmark numbers (3ms warm, 13ms warm complex) measure the engine only and skip network. PDF4.dev numbers include the network round-trip. If you compare 3ms against 30ms, you are comparing engine-only to end-to-end. Subtract your own network cost from the PDF4.dev number to get an apples-to-apples engine view.

This is the same disclosure as Q1. We make it again because the most common misread of the parent article was treating bare-Playwright numbers as production numbers. They are not. Bare Playwright omits queueing, network, and pool warm-up cost. Your real production p95 will be closer to PDF4.dev's number than to the 3ms figure, regardless of who runs the pool.


What this means for your pipeline

The decision matrix is roughly unchanged from Q1, with two new entries.

| Situation | Q3 2026 recommendation |
| --- | --- |
| Already running Playwright with a warm pool | Stay. Numbers are essentially flat. |
| Operating Gotenberg in one region, called from many | Upgrade to 8.30, enable HTTP/2 streaming. |
| Cold-start sensitive (Lambda, Cloud Run scale-to-zero) | Test chrome-headless-shell. Smaller binary, faster cold start, regression-diff your templates. |
| Python stack, server-rendered HTML, no JS in templates | WeasyPrint 64. Table rendering is faster. Still cold every render. |
| Latency-bound and want to skip ops | Hosted API (PDF4.dev or equivalent). |
| Mixed templates with JS-driven content | Playwright. WeasyPrint and chrome-headless-shell will not execute JS the same way. |
| Existing Puppeteer codebase, acceptable numbers | Stay. Playwright is still faster, but the gap closed slightly. |

The honest takeaway after two quarters: the engine layer is stable. Most of the wins to be had in your PDF pipeline are not engine choice. They are warm-pool management, queue sizing, network locality, and template quality. Re-running this benchmark every six months catches the engine drift; the rest is on you.


Sources and reproducibility

The fixture HTML, the runner scripts, and the raw measurements are in the same repo as the Q1 article. Run them on your hardware; your numbers will not match ours, but the deltas should.


Frequently asked questions

The FAQ above answers the most common questions about the Q3 update. If you want the full library of decisions (Playwright vs Puppeteer, Playwright vs WeasyPrint, chrome-headless-shell), the parent benchmark, the Playwright vs Puppeteer comparison, and the chrome-headless-shell deep dive cover them in depth.

If you want to skip the pool management entirely, PDF4.dev runs the warm Playwright pool for you with the same numbers as the bare engine plus network. Try it free, no credit card required.
