Is my Playwright server vulnerable to CVE-2026-44439?

Only if you use the PlaywrightCapture Python wrapper at version 1.39.5 or earlier. The CVE is scoped to that specific library, which adds page-capture orchestration on top of Playwright. The bug is that PlaywrightCapture did not block navigations to loopback, link-local, RFC1918, or `file://` targets when the page issued a JavaScript redirect (`window.location.href = ...`) or rendered an iframe pointing at an internal URL. Upgrade to 1.39.6 or later and leave `only_global_lookup=True` (the new default) in place. However, the same attack pattern affects any HTML-to-PDF pipeline built on raw Playwright, Puppeteer, or headless Chrome that does not intercept and filter requests at the browser level. PlaywrightCapture is one affected library among many.

Does this affect Puppeteer too?

Yes, structurally. Puppeteer is a JavaScript Chrome controller with the same surface as Playwright: by default, a page is allowed to navigate to any URL the underlying Chromium binary can reach, including `file://`, `127.0.0.1`, `169.254.169.254`, and any RFC1918 address routable from the host. Puppeteer has no built-in SSRF filter. Operators must add their own using `page.setRequestInterception(true)` plus a request handler that aborts disallowed targets, or block egress at the network namespace. CVE-2026-44439 is a Python-wrapper finding, but the underlying browser primitive is identical in Puppeteer and any other headless-Chrome controller.
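
A minimal sketch of such a handler follows. It is written against a small structural type, `InterceptedRequest`, so it runs without Puppeteer installed; the type only mirrors the `url()`, `abort()`, and `continue()` methods of Puppeteer's request object. The name `filterRequest` and the pattern list are assumptions made for illustration, and the check covers literal addresses only, so hostnames would still need DNS validation.

```typescript
// Minimal shape of the request object handed to a Puppeteer "request" listener.
interface InterceptedRequest {
  url(): string;
  abort(): Promise<void>;
  continue(): Promise<void>;
}

// Literal-address patterns only; hostnames still need a DNS check elsewhere.
const BLOCKED_HOSTS: RegExp[] = [
  /^localhost$/,
  /^0\./,
  /^127\./,
  /^10\./,
  /^192\.168\./,
  /^169\.254\./,
  /^172\.(1[6-9]|2\d|3[01])\./,
  /^\[/, // any bracketed IPv6 literal: denied wholesale in this sketch
];

// Hypothetical helper: returns the decision so it can be logged or tested.
async function filterRequest(
  req: InterceptedRequest,
): Promise<"abort" | "continue"> {
  let url: URL;
  try {
    url = new URL(req.url());
  } catch {
    await req.abort(); // unparseable URLs are rejected
    return "abort";
  }
  // Scheme allowlist: anything other than http(s) is rejected, file:// included.
  const schemeOk = url.protocol === "http:" || url.protocol === "https:";
  const hostBlocked = BLOCKED_HOSTS.some((re) => re.test(url.hostname));
  if (!schemeOk || hostBlocked) {
    await req.abort();
    return "abort";
  }
  await req.continue();
  return "continue";
}
```

With Puppeteer, this would be wired up by calling `page.setRequestInterception(true)` and registering `filterRequest` as the `page.on("request", ...)` listener.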

Why does the renderer even resolve private IPs?

Because Chromium is a general-purpose web browser. It was designed to fetch any URL the operating system's network stack can reach, including private IPs on the host's LAN, link-local addresses for IPv6 autoconfiguration, and `file://` paths for local-disk access. None of those resolutions are bugs in Chromium: they are documented features for the browser's primary use case. The mismatch is that an HTML-to-PDF service running Chromium on a server inherits all of those reachable destinations, even though the server's network position is fundamentally different from a desktop user's. The only safe assumption for an HTML-to-PDF service is that the browser must be explicitly constrained, not implicitly trusted.

Can I just block 169.254.169.254 in my firewall?

It is the right first step and the wrong only step. Blocking the metadata address at the host iptables level (a single `-A OUTPUT -d 169.254.169.254 -j DROP` rule) stops the most obvious credential-theft path. Because GCP's `metadata.google.internal` and Azure's instance metadata service are served from the same address, the rule covers those as well. But it leaves every other internal destination open: loopback services, internal HTTP admin endpoints, RFC1918 ranges, and `file://` URIs. A complete mitigation rejects loopback, all RFC1918 networks, `169.254.0.0/16`, and the IPv6 link-local range `fe80::/10`, both in the renderer's request-interception handler and at the container egress.
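
The IPv4 part of that blocklist can be checked numerically rather than with string prefixes. The sketch below is one way to do it; the names `BLOCKED_V4_CIDRS`, `inCidr`, and `isBlockedIPv4` are illustrative, and the `0.0.0.0/8` and `127.0.0.0/8` entries are added here on the assumption that loopback and the unspecified address should be rejected too. IPv6 ranges such as `fe80::/10` are not covered.

```typescript
import { isIPv4 } from "node:net";

// RFC1918 and link-local ranges, plus loopback and 0.0.0.0/8 (assumed).
const BLOCKED_V4_CIDRS = [
  "0.0.0.0/8",
  "10.0.0.0/8",
  "127.0.0.0/8",
  "169.254.0.0/16",
  "172.16.0.0/12",
  "192.168.0.0/16",
];

// Convert dotted-quad text to an unsigned 32-bit integer.
function ipv4ToInt(ip: string): number {
  return ip
    .split(".")
    .reduce((acc, octet) => ((acc << 8) | Number(octet)) >>> 0, 0);
}

// True when ip falls inside the CIDR block, compared under the prefix mask.
function inCidr(ip: string, cidr: string): boolean {
  const [base, bitsText] = cidr.split("/");
  const bits = Number(bitsText);
  const mask = bits === 0 ? 0 : (0xffffffff << (32 - bits)) >>> 0;
  return ((ipv4ToInt(ip) & mask) >>> 0) === ((ipv4ToInt(base) & mask) >>> 0);
}

function isBlockedIPv4(ip: string): boolean {
  if (!isIPv4(ip)) throw new TypeError(`not an IPv4 literal: ${ip}`);
  return BLOCKED_V4_CIDRS.some((cidr) => inCidr(ip, cidr));
}
```

Comparing under the mask avoids the off-by-one mistakes that prefix regexes invite for ranges like `172.16.0.0/12`, where the boundary does not fall on an octet.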

How do I prevent DNS rebinding attacks on my PDF service?

Resolve the hostname once at intercept time, validate the resolved IP against your blocklist, and pin that IP for the duration of the fetch. The classic DNS-rebinding bypass is: an attacker controls a hostname whose first A-record lookup returns a public IP (which passes your blocklist) and whose second lookup, seconds later, returns a private IP (which the fetch then hits). Defeating it requires that the IP your check validates is the same IP your fetch uses. In Playwright, the practical pattern is to do the DNS resolution in your `page.route()` handler, reject if any resolved address is private, and rewrite the request URL to a pinned IP literal before passing it through. Egress filters on the network namespace remain the second layer.

Are managed PDF APIs like PDF4.dev exposed?

A well-built managed API should not be, but the question is worth asking your vendor. The minimum bar is: every Chromium fetch is intercepted at the browser level, `file://` and `data:` (for top-level navigation) are denied by default, RFC1918 and link-local destinations are rejected, the renderer runs in a container with whitelist-only egress, and DNS resolution is pinned at intercept time to defeat rebinding. PDF4.dev implements all five. Many self-hosted setups and some smaller hosted services implement none, which is exactly the gap CVE-2026-44439 highlights. If you cannot get a written answer from a vendor on how they handle SSRF in their renderer, treat their service as exposed.

What is the safest way to render user-submitted HTML to PDF?

Treat the renderer as a hostile network endpoint and isolate accordingly. The hardened pattern: run Chromium in a network namespace whose egress allowlist contains only the public IPs of explicitly trusted fonts, CDNs, and any APIs the templates actually need; intercept every request via `page.route()` and reject `file://`, `data:` top-level navigation, loopback, link-local, RFC1918, and the cloud metadata endpoints; resolve hostnames once at intercept time and reject any hostname whose A-record set contains a private IP; cap the total fetch count per render to bound resource use; log every blocked request with the originating template id so you can detect probing. Each layer catches a different bypass; together they take the entire SSRF class off the table.
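
The last two items, the fetch cap and the blocked-request log, are small enough to sketch. The `RenderGuard` class below is hypothetical: it assumes some other component has already produced an allow or deny verdict for each URL, and it only counts fetches and records refusals against a template id.

```typescript
interface BlockedEntry {
  templateId: string;
  url: string;
  reason: string;
}

// Hypothetical per-render guard: caps fetches and records blocked requests.
class RenderGuard {
  readonly blocked: BlockedEntry[] = [];
  private fetches = 0;
  private readonly templateId: string;
  private readonly maxFetches: number;

  constructor(templateId: string, maxFetches: number) {
    this.templateId = templateId;
    this.maxFetches = maxFetches;
  }

  // Returns true if the request may proceed; records a log entry otherwise.
  admit(url: string, policyVerdict: "allow" | "deny"): boolean {
    if (policyVerdict === "deny") {
      this.record(url, "policy");
      return false;
    }
    if (this.fetches >= this.maxFetches) {
      this.record(url, "fetch-budget-exceeded");
      return false;
    }
    this.fetches += 1;
    return true;
  }

  private record(url: string, reason: string): void {
    this.blocked.push({ templateId: this.templateId, url, reason });
  }
}
```

One guard instance per render keeps the budget scoped to a single document, and the `blocked` array can be flushed to whatever logging pipeline the service already uses.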

Does the file:// scheme need to be disabled even with network filters?

Yes. Network-layer filters operate on IP packets and do nothing about `file://` URIs, which never leave the renderer process. `file:///etc/passwd` is read directly by Chromium from the local filesystem, no socket involved. Defeating local-file SSRF requires explicit handling at the request-interception layer: in Playwright, your `page.route('**/*')` handler must check `request.url().startsWith('file://')` and abort the request. Additionally, the Chromium command-line flag `--disable-file-system-access` and a tight container filesystem (read-only root, minimal mounts) reduce the blast radius if any future bug bypasses the request-interception layer.


CVE-2026-44439: SSRF in HTML-to-PDF is an underrated whole class

CVE-2026-44439 lets attacker HTML reach private IPs and file:// URLs during page capture. Every HTML-to-PDF API has this exposure. Detect, mitigate, harden.

Axel | May 25, 2026 | 12 min read

CVE-2026-44439 is a server-side request forgery in PlaywrightCapture, a Python wrapper that orchestrates headless-browser page capture on top of Playwright. The bug lets attacker-controlled HTML pivot the renderer to file:// paths, loopback services, link-local cloud metadata endpoints, and RFC1918 private networks during page rendering. It is fixed in PlaywrightCapture 1.39.6 via a new only_global_lookup default. The bigger story is that this same attack pattern reaches every HTML-to-PDF service that accepts user HTML and renders it server-side. Treat the CVE as a wake-up call about a whole class, not a single library.

What CVE-2026-44439 actually does

CVE-2026-44439 is a server-side request forgery (SSRF) primitive in PlaywrightCapture, the Python orchestration library that wraps Playwright to "safely" capture web pages. The GitLab advisory classifies it as CWE-918 (SSRF), and the DailyCVE summary lists Medium severity. The fix shipped in version 1.39.6 with a new only_global_lookup flag, defaulting to True, that filters resolved IPs to public-routable addresses only.

The attack mechanism is direct. PlaywrightCapture loads a target URL, waits for the page to settle, and serializes the rendered DOM. Before 1.39.6, the library did not interpose on in-page navigations once rendering started. An attacker-controlled page could ship JavaScript that redirected the renderer to a forbidden target after the initial check had passed.

A minimal proof-of-concept payload:

<!DOCTYPE html>
<html>
<head>
  <title>Looks innocent</title>
</head>
<body>
  <h1>Hello world</h1>
  <script>
    // Pivot the renderer to AWS instance metadata
    window.location.href =
      'http://169.254.169.254/latest/meta-data/iam/security-credentials/';
  </script>
</body>
</html>

When PlaywrightCapture renders this page, the JavaScript redirect fires inside the Chromium tab. On an EC2 host that still accepts IMDSv1 requests, the renderer fetches the metadata endpoint, the response body is rendered as text, and the captured output contains the name of the IAM role attached to the instance. A second submission that appends that role name to the path returns the role's temporary credentials. The attacker submits two URLs and receives PDFs (or HTML snapshots, in PlaywrightCapture's case) containing the temporary AWS credentials of the host running the capture service.

The same primitive works for file://:

<script>
  window.location.href = 'file:///etc/passwd';
</script>

The same payload generalizes to iframes, which need no JavaScript at all:

<iframe src="http://10.0.0.42:8080/admin/health" style="width:100%;height:600px"></iframe>

In each case, the attacker submits HTML, the renderer reaches a destination it was never meant to reach, and the response body comes back to the attacker through the rendered output.

If you operate any HTML-to-PDF service built on Playwright, Puppeteer, or raw headless Chromium and you do NOT explicitly intercept and filter requests at the browser level, this attack pattern works against you today. The PlaywrightCapture CVE is a single library's CVE; the attack pattern is library-agnostic. The fix in 1.39.6 is a defense for one wrapper; the underlying primitive (browsers fetch any URL by default) is unchanged in every other wrapper.

Why this is an HTML-to-PDF whole-class problem

Every HTML-to-PDF API has the same architectural property: it accepts attacker-controlled HTML and feeds it to a full browser. A full browser, by design, fetches from any URL its host network stack can reach. Without explicit filtering, the attack surface is every destination that network stack can reach, plus the local filesystem through file://.

The blast radius for an unguarded HTML-to-PDF renderer is roughly the same across vendors:

Attack vector	Example target	What an attacker reads
Cloud metadata endpoint (IPv4)	`http://169.254.169.254/latest/meta-data/iam/security-credentials/`	AWS temporary IAM credentials
Cloud metadata endpoint (GCP)	`http://metadata.google.internal/computeMetadata/v1/` (Metadata-Flavor header required)	GCP service-account tokens
Cloud metadata endpoint (Azure)	`http://169.254.169.254/metadata/instance?api-version=2021-02-01` (Metadata header required)	Azure managed identity tokens
Loopback service	`http://127.0.0.1:8080/admin`, `http://localhost:6379`	Internal admin UIs, Redis, Memcached, debug ports
RFC1918 private range	`http://10.0.0.42`, `http://172.16.0.5`, `http://192.168.1.100`	Internal services on the VPC
`file://` scheme	`file:///etc/passwd`, `file:///app/.env`	Local files readable by the renderer process
IPv6 link-local	`http://[fe80::1]/`	Adjacent IPv6 hosts on the link
Loopback aliases (IPv4 unspecified address, IPv6 loopback)	`http://0.0.0.0:port`, `http://[::1]:port`	Loopback bypass for naive blocklists

The AWS Instance Metadata Service documentation is explicit that IMDSv1 is unauthenticated, and that hardening guidance specifically calls out SSRF as the typical exploit path. The OWASP SSRF Prevention Cheat Sheet lists the same attack vectors and the same blocklists, written long before this specific CVE landed. RFC 1918 (datatracker.ietf.org/doc/html/rfc1918) defines the private address space that needs to be in any sane blocklist.
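
The loopback-alias row is a reminder that one address can be spelled several ways. A rough sketch of why blocklist checks should run on the parsed hostname rather than the raw string: the WHATWG URL parser, as implemented in Node.js, rewrites decimal, hex, and short-form IPv4 hosts into dotted-quad form. The helper names canonicalHost and isLoopbackOrUnspecified are made up for this example.

```typescript
import { isIP } from "node:net";

// Hypothetical helper: canonical host text for blocklist checks.
// For http(s) URLs the parser normalizes numeric IPv4 spellings.
function canonicalHost(rawUrl: string): string {
  const { hostname } = new URL(rawUrl);
  // IPv6 literals come back bracketed; strip the brackets for net.isIP().
  return hostname.startsWith("[") ? hostname.slice(1, -1) : hostname;
}

// Covers the loopback and unspecified addresses only, not RFC1918 ranges.
function isLoopbackOrUnspecified(host: string): boolean {
  if (isIP(host) === 4) return host.startsWith("127.") || host.startsWith("0.");
  if (isIP(host) === 6) return host === "::1" || host === "::";
  return host === "localhost";
}
```

For instance, canonicalHost("http://2130706433/") yields "127.0.0.1", so a check against the canonical form catches a spelling that a raw-string prefix match would miss. IPv4-mapped IPv6 literals are not handled in this sketch and would need their own rule.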

The structural property is the part that matters: in every one of those rows, the renderer process is the one issuing the fetch. From the perspective of the destination service, the request looks like it came from the trusted server, with the trusted server's IAM role, on the trusted server's internal network. SSRF turns the renderer into a confused deputy.

How to test if your HTML-to-PDF pipeline is vulnerable

The fastest way to know is to try the attack against your own service in a controlled environment. Three test payloads cover the redirect-style, iframe-style, and file:// variants.

Test 1: JavaScript redirect to AWS metadata (most common probe).

<!DOCTYPE html>
<html>
<body>
  <p>Render started.</p>
  <script>
    setTimeout(function () {
      window.location.href =
        'http://169.254.169.254/latest/meta-data/';
    }, 200);
  </script>
</body>
</html>

Submit that HTML to your /render endpoint and inspect the resulting PDF. A vulnerable pipeline returns a PDF containing the metadata service's directory listing (ami-id, hostname, iam/, etc.). A safe pipeline returns either a PDF showing "Render started." with no metadata content, an explicit error referring to a blocked destination, or a timeout. If you receive the metadata listing, the pipeline is exposed. Note that outside AWS the same address either does not answer or serves a different metadata API (GCP and Azure use other paths and require a header), so this test can produce a false negative even when the pipeline is unsafe; use the loopback variant below as a second check.

Test 2: Iframe to loopback (works on any host with a loopback service).

<!DOCTYPE html>
<html>
<body>
  <p>Render started.</p>
  <iframe
    src="http://127.0.0.1/"
    width="600"
    height="400"
  ></iframe>
</body>
</html>

Submit and inspect. If the iframe area in the PDF shows a response from any service listening on 127.0.0.1 (even an "It works!" default page, an Nginx welcome screen, or a connection-refused error rendered by Chromium with the destination IP visible), the pipeline allows loopback fetches and is exposed.

Test 3: file:// access to a known file.

<iframe src="file:///etc/hostname" width="400" height="100"></iframe>

If the rendered iframe contains the hostname of the renderer container or VM, local-file SSRF is exploitable. The exact contents that come back depend on Chromium's handling of text/plain in iframes, but anything other than a blank iframe or an error referencing a blocked scheme is a finding.

Run these against staging, not production, and only if you operate the service. Probing third-party HTML-to-PDF endpoints without permission is out of scope for this guide and likely violates the vendor's terms of service.
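
If Test 1 becomes part of a staging regression suite, the pass/fail decision can be automated. The sketch below assumes the PDF text has already been extracted by some other tool and only classifies that text; the function name classifyMetadataProbe, the verdict labels, and the choice of markers are illustrative, based on the directory entries described above.

```typescript
type ProbeVerdict = "exposed" | "no-leak-observed" | "inconclusive";

// Entries the AWS metadata directory listing is described as containing.
const METADATA_MARKERS = ["ami-id", "iam/"];

// Classifies text extracted from the PDF returned for the Test 1 payload.
function classifyMetadataProbe(extractedText: string): ProbeVerdict {
  if (METADATA_MARKERS.every((m) => extractedText.includes(m))) {
    return "exposed";
  }
  // The payload's own paragraph survived and no listing replaced it.
  if (extractedText.includes("Render started.")) {
    return "no-leak-observed";
  }
  // Blank page, error page, or timeout output: needs a human look.
  return "inconclusive";
}
```

A "no-leak-observed" verdict on a non-AWS host says little by itself, for the reason given under Test 1, so the loopback and file:// probes should be evaluated separately.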

The defense-in-depth playbook

No single mitigation catches every variant. Layer them, and assume each layer will eventually be bypassed.

Layer	What it catches	What it misses
Chromium request interception via `page.route()`	All in-browser fetches, including redirects and iframes, before the socket opens. Catches `file://`, `data:` top-level, RFC1918 if the handler validates IPs.	Misses if the interception handler trusts hostnames without resolving them (DNS rebinding bypasses it).
Network-layer egress filter (iptables, nftables, namespace)	Any fetch that escapes Chromium request interception. Stops the renderer process from opening sockets to RFC1918 and link-local destinations at the kernel.	Misses `file://` access entirely (no socket involved). Misses fetches to public IPs that the renderer should not be reaching.
DNS resolver hardening	DNS rebinding attacks. Resolve once at intercept time, validate the IP set, pin the resolution for the lifetime of the fetch.	Misses anything that bypasses your resolver (Chromium has its own DNS in some configurations; pin via `--host-resolver-rules` or run the renderer in a namespace with a controlled resolver).
Container isolation (network namespace, seccomp, read-only fs)	Lateral movement after an initial fetch succeeds. Limits what credentials the renderer process has to begin with.	Misses the first read of any destination already on the allowlist. Defense-in-depth, not a primary control.
Disabling `file://` and risky schemes at the browser	Local-file SSRF specifically.	Misses every network-based SSRF. Needs the other layers.

The Playwright pattern for the first layer, request interception, is documented in Playwright's network handling guide. The minimal handler looks like this:

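import { chromium } from "playwright";
import { isIP } from "net";
import dns from "dns/promises";

const BLOCKED_SCHEMES = ["file:", "data:"];
// Regexes over canonical lowercase IP text: private, loopback, link-local
const PRIVATE_CIDRS = [
  /^0\./,
  /^10\./,
  /^172\.(1[6-9]|2\d|3[01])\./,
  /^192\.168\./,
  /^127\./,
  /^169\.254\./,
  /^::1?$/,
  /^fe[89ab][0-9a-f]:/, // fe80::/10 link-local
  /^f[cd][0-9a-f]{2}:/, // fc00::/7 unique local
  /^::ffff:/, // IPv4-mapped IPv6, rejected wholesale
];

async function isPrivate(hostname: string): Promise<boolean> {
  // URL.hostname keeps the brackets around IPv6 literals; strip them
  const host = hostname.replace(/^\[|\]$/g, "");
  // If it's already a literal IP, check directly
  if (isIP(host)) return PRIVATE_CIDRS.some((re) => re.test(host));
  // Otherwise resolve A and AAAA records and check every one
  const lookups = await Promise.all([
    dns.resolve4(host).catch(() => [] as string[]),
    dns.resolve6(host).catch(() => [] as string[]),
  ]);
  const records = lookups.flat();
  // Fail closed: a name we cannot resolve (e.g. "localhost", which
  // Chromium resolves on its own) is treated as private
  if (records.length === 0) return true;
  return records.some((ip) => PRIVATE_CIDRS.some((re) => re.test(ip)));
}

const browser = await chromium.launch();
const page = await browser.newPage();

await page.route("**/*", async (route) => {
  const url = new URL(route.request().url());
  if (BLOCKED_SCHEMES.includes(url.protocol)) {
    return route.abort("blockedbyclient");
  }
  if (await isPrivate(url.hostname)) {
    return route.abort("blockedbyclient");
  }
  return route.continue();
});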

That handler is the floor, not the ceiling. It does not pin the resolved IP, so a determined attacker with DNS rebinding can still slip through. The next section covers the pin.

Network-layer egress filtering is the second layer. A renderer container running with a dedicated network namespace and an iptables egress allowlist is materially harder to attack than the same renderer with default routing. The basic pattern, applied at container startup:

# Drop all egress by default
iptables -P OUTPUT DROP

# Allow loopback only for the renderer's own internal IPC
iptables -A OUTPUT -o lo -j ACCEPT

# Explicitly drop the cloud metadata endpoint and private ranges as a
# belt-and-braces rule. These come before the allow rules because
# iptables stops at the first matching rule.
iptables -A OUTPUT -d 169.254.169.254 -j DROP
iptables -A OUTPUT -d 169.254.0.0/16 -j DROP
iptables -A OUTPUT -d 10.0.0.0/8 -j DROP
iptables -A OUTPUT -d 172.16.0.0/12 -j DROP
iptables -A OUTPUT -d 192.168.0.0/16 -j DROP

# Allow only the explicit destinations the renderer needs
# (DNS resolver, font CDN, telemetry endpoint, ...)
iptables -A OUTPUT -d 8.8.8.8 -p udp --dport 53 -j ACCEPT
iptables -A OUTPUT -d <fonts CDN IPs> -p tcp --dport 443 -j ACCEPT

The allowlist at the end of the script is the part that takes work to get right: the renderer needs to fetch fonts, possibly external images, and sometimes external CSS. Each of those endpoints needs an explicit hole in the egress filter. The drop rules ahead of it are insurance for the case where the allowlist is broader than intended; they only help if they are evaluated before the ACCEPT rules, which is why they sit above them. These rules cover IPv4 only, so IPv6 egress needs the equivalent ip6tables policy.
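
Rule order decides whether a drop rule does anything, because iptables walks a chain top to bottom and stops at the first terminating match. The toy model below illustrates that behavior for destination-only rules; it is a simplification written for this article (no ports, protocols, or interfaces), and the names Rule and evaluate are invented here.

```typescript
type Verdict = "ACCEPT" | "DROP";

interface Rule {
  cidr: string;
  verdict: Verdict;
}

// Dotted-quad text to unsigned 32-bit integer.
function toInt(ip: string): number {
  return ip.split(".").reduce((acc, o) => ((acc << 8) | Number(o)) >>> 0, 0);
}

// True when dest falls inside the CIDR block.
function matches(dest: string, cidr: string): boolean {
  const [base, bitsText] = cidr.split("/");
  const bits = Number(bitsText);
  const mask = bits === 0 ? 0 : (0xffffffff << (32 - bits)) >>> 0;
  return ((toInt(dest) & mask) >>> 0) === ((toInt(base) & mask) >>> 0);
}

// First matching rule wins; the chain policy applies when nothing matches.
function evaluate(chain: Rule[], policy: Verdict, dest: string): Verdict {
  for (const rule of chain) {
    if (matches(dest, rule.cidr)) return rule.verdict;
  }
  return policy;
}

// An allow rule that is broader than intended, placed before the drop:
const dropLast: Rule[] = [
  { cidr: "0.0.0.0/0", verdict: "ACCEPT" },
  { cidr: "10.0.0.0/8", verdict: "DROP" },
];

// The same two rules with the drop evaluated first:
const dropFirst: Rule[] = [
  { cidr: "10.0.0.0/8", verdict: "DROP" },
  { cidr: "0.0.0.0/0", verdict: "ACCEPT" },
];
```

In this model, traffic to 10.0.0.42 is accepted by dropLast and dropped by dropFirst, which is the reason explicit drops belong ahead of any allow rule.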

DNS rebinding is the bypass to watch

Request interception based on hostname only is not enough. An attacker who controls a domain can serve different DNS responses to consecutive queries: the first response returns a public IP that passes your interception check, the second response (seconds later, when Chromium actually fetches) returns 10.0.0.42. The blocklist never matched because the IP it saw was different from the IP that got fetched.

The defense is to make the IP your blocklist validates the same IP the fetch uses. Three ways to do it:

Resolve in the interception handler and rewrite the URL. Inside page.route(), resolve the hostname yourself, validate every resolved A and AAAA record against the private-address blocklist, and rewrite the request URL to use the resolved IP literal. The fetch then connects to a fixed IP, not a hostname Chromium is free to re-resolve.

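await page.route("**/*", async (route) => {
  const req = route.request();
  const url = new URL(req.url());

  if (BLOCKED_SCHEMES.includes(url.protocol)) {
    return route.abort("blockedbyclient");
  }

  // Resolve once, validate, and pin. IP-literal hosts skip DNS.
  const host = url.hostname.replace(/^\[|\]$/g, "");
  const ips = isIP(host)
    ? [host]
    : (
        await Promise.all([
          dns.resolve4(host).catch(() => [] as string[]),
          dns.resolve6(host).catch(() => [] as string[]),
        ])
      ).flat();
  // Fail closed when nothing resolves or any record is private
  const anyPrivate = ips.some((ip) => PRIVATE_CIDRS.some((re) => re.test(ip)));
  if (ips.length === 0 || anyPrivate) {
    return route.abort("blockedbyclient");
  }

  // Pin the request to the first validated public IP, keeping the port
  const literal = isIP(ips[0]) === 6 ? `[${ips[0]}]` : ips[0];
  const port = url.port ? `:${url.port}` : "";
  const pinned = `${url.protocol}//${literal}${port}${url.pathname}${url.search}`;
  return route.continue({
    url: pinned,
    headers: { ...req.headers(), host: url.host },
  });
});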

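The Host header preservation matters: many target services route based on the original hostname, so the request still needs to look right at the application layer even though it connects to a pinned IP. For https:// URLs the rewrite has a cost: Chromium validates the server certificate against the host in the URL, which is now an IP literal, so most HTTPS fetches will fail certificate validation. Pinning HTTPS traffic is more practical at the proxy or resolver layer described next.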

Run the renderer behind a forward proxy that pins DNS. A small proxy in front of Chromium (squid, or a custom Node proxy) does the resolve-and-pin once, and Chromium fetches through it. Chromium never sees the original hostname for the network layer.

Use Chromium's --host-resolver-rules flag to override DNS resolution for specific hostnames inside the renderer. Useful for tests, less useful for general SSRF defense because it requires knowing the hostname set in advance.
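
For the case where the hostname set is known in advance, the flag value can be generated from a table of validated pins. The sketch below builds the argument string using the comma-separated MAP rule form that the flag accepts; the helper name hostResolverRulesArg, the IPv4-only restriction, and the example hostnames and documentation-range addresses in the usage note are assumptions for illustration.

```typescript
import { isIPv4 } from "node:net";

// Builds a --host-resolver-rules argument from already-validated pins.
// Each entry becomes one "MAP <host> <ip>" rule; rules are comma-separated.
function hostResolverRulesArg(pins: Record<string, string>): string {
  const rules = Object.entries(pins).map(([host, ip]) => {
    if (!isIPv4(ip)) {
      throw new TypeError(`expected an IPv4 literal for ${host}, got ${ip}`);
    }
    // Reject anything that could smuggle extra rules into the flag value.
    if (!/^[a-z0-9.-]+$/i.test(host)) {
      throw new TypeError(`unexpected hostname: ${host}`);
    }
    return `MAP ${host} ${ip}`;
  });
  return `--host-resolver-rules=${rules.join(", ")}`;
}
```

The resulting string would be passed in the args array of chromium.launch(), for example hostResolverRulesArg({ "fonts.example.com": "203.0.113.10" }). The pinned addresses still need to be validated against the private-address blocklist before they go into the table.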

All three patterns share the same invariant: the IP your check validates is the IP your fetch uses. Anything else leaves a window open.
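
The invariant is easier to see in a simulation than in a live browser. The sketch below uses a fake resolver that answers with a public address on the first lookup and a private one afterwards, then compares a check-then-fetch flow with a resolve-once-and-pin flow. Every name here (rebindingResolver, checkThenFetch, resolveOnceAndPin) is invented for the example, the addresses are documentation and private-range placeholders, and no real DNS or network call is made.

```typescript
type Resolver = (host: string) => Promise<string[]>;

// Simulated attacker-controlled DNS: public answer first, private afterwards.
function rebindingResolver(): Resolver {
  let calls = 0;
  return async () => (calls++ === 0 ? ["203.0.113.10"] : ["10.0.0.42"]);
}

// Deliberately small IPv4-only check, enough for the simulation.
const isPrivateIp = (ip: string): boolean =>
  /^(10\.|127\.|192\.168\.|169\.254\.|172\.(1[6-9]|2\d|3[01])\.)/.test(ip);

// Vulnerable: validates one lookup, then the "fetch" performs its own lookup.
async function checkThenFetch(host: string, resolve: Resolver): Promise<string> {
  const checked = await resolve(host);
  if (checked.some(isPrivateIp)) throw new Error("blocked");
  const [connectedIp] = await resolve(host); // second lookup, attacker's choice
  return connectedIp;
}

// Pinned: the validated address is the address used for the connection.
async function resolveOnceAndPin(host: string, resolve: Resolver): Promise<string> {
  const checked = await resolve(host);
  if (checked.length === 0 || checked.some(isPrivateIp)) {
    throw new Error("blocked");
  }
  return checked[0];
}
```

Against the same rebinding resolver, checkThenFetch ends up connected to 10.0.0.42 while resolveOnceAndPin stays on the public address it validated.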

How PDF4.dev handles this

PDF4.dev intercepts every Chromium request via page.route(), rejects file:// and data: top-level navigations, blocks all RFC1918 ranges plus 169.254.0.0/16 and fe80::/10 at the interception layer, runs each renderer container in a network namespace with a whitelist-only egress, and resolves hostnames once at intercept time with the resolved IP pinned for the fetch. Managed APIs in this category should ship these defaults; raw Playwright, Puppeteer, and self-hosted Gotenberg do not, and the operator is responsible for adding them.

Vendor due diligence question: ask your HTML-to-PDF provider whether their renderer blocks file://, RFC1918, and cloud metadata endpoints by default, and how they handle DNS rebinding. A clear written answer takes them ten minutes. If they cannot answer, assume the answer is no.

Frequently asked questions

The FAQs above (mirrored in this article's structured data) cover the most common follow-up questions: Playwright versus Puppeteer scope, why browsers resolve private IPs at all, the limits of blocking 169.254.169.254 alone, defeating DNS rebinding, evaluating managed PDF APIs, the hardened-render pattern, and whether file:// needs an explicit block even with network filters.

The durable takeaway for developers shipping HTML-to-PDF pipelines: every renderer is a full browser, every browser fetches from any URL by default, and SSRF in this category is a whole-class problem that needs a layered fix. CVE-2026-44439 is the latest single point on a long curve; the architectural response is the same one the OWASP cheat sheet has recommended for years, applied at the request-interception layer that Playwright, Puppeteer, and raw CDP all expose.
