Do PDF4.dev tools upload my files to a server?

No. 22 of the 24 tools run entirely in your browser using JavaScript. The PDF never leaves your device. The two exceptions are HTML to PDF and Webpage to PDF, which require server-side rendering via Playwright because they need a headless browser engine.

What JavaScript libraries does PDF4.dev use for client-side PDF processing?

Two libraries handle nearly all client-side work. pdf-lib (by Andrew Dillon) manipulates PDF structure directly, including merging, splitting, rotating, adding watermarks, and embedding images. pdfjs-dist (Mozilla's PDF.js) renders PDF pages to canvas for compression, image conversion, and thumbnail generation.

How does client-side PDF compression work without a server?

The compression pipeline renders each PDF page to a canvas element using pdfjs-dist, converts the canvas to a JPEG blob at a configurable quality level (0.35 to 0.85), then embeds the JPEG images into a new PDF using pdf-lib. The result is a smaller file because raster JPEG compression replaces the original vector and image data.

What is the difference between pdf-lib and pdfjs-dist?

pdf-lib reads and writes PDF structure (pages, fonts, images, metadata) without rendering anything visually. pdfjs-dist renders PDF pages to a canvas for display and extracts their text. pdf-lib is for manipulation (merge, split, rotate, watermark). pdfjs-dist is for rendering (compression, image export, thumbnails). Some operations like compression use both.

Can browser-based PDF tools handle large files?

Yes, within browser memory limits. A 100-page PDF typically uses 50-200MB of RAM during processing depending on the operation. pdf-lib is efficient because it works with the PDF binary structure directly, not pixel data. Compression is the most memory-intensive operation because it renders each page to a canvas. Modern browsers handle files up to several hundred pages without issues.

Why not use WebAssembly for PDF processing?

pdf-lib is pure JavaScript and already fast enough for document manipulation (merging 10 files takes under a second). Adding a WASM dependency would increase the bundle size without meaningful performance gains for the operations we support. For rendering, pdfjs-dist already uses optimized JavaScript with Web Workers for parallel processing.

How does the localStorage rate limiting work?

Free users get 3 tool uses per rolling 7-day window. The count and reset timestamp are stored in localStorage under the key pdf4_tool_usage. When the window expires, the count resets. Signed-in users have no limit. This approach requires zero server calls for usage tracking.

Which PDF tools require a server?

Only two. HTML to PDF needs Playwright (headless Chromium) because browsers expose no JavaScript API that renders arbitrary HTML to a PDF buffer. Webpage to PDF needs the same engine plus network access to fetch the target URL. The other 22 tools run entirely in the browser.

Developer Guides

Why we built PDF tools that run in your browser

22 of PDF4.dev's 24 free tools process files entirely client-side using pdf-lib and pdfjs-dist: no upload, no server, no privacy risk. Here is how the architecture works.

benoitded · April 6, 2026 · 10 min read

PDF4.dev has 24 free PDF tools: compress, merge, split, rotate, watermark, convert to PNG, extract text, and 17 others. 22 of them process files entirely in your browser. The PDF never leaves your device.

This article explains the architecture behind that decision, the two JavaScript libraries that make it possible, and the trade-offs we accepted.

Why client-side?

The conventional approach for online PDF tools is straightforward: the user uploads a file, the server processes it, and returns the result. Smallpdf, iLovePDF, and most competitors work this way. It is simple to build and gives you full control over the processing environment.

The problem is the upload.

Concern	Server-side	Client-side
Privacy	File leaves your device	File stays local
Latency	Upload + process + download	Process only
Server cost	Scales with file size and traffic	Zero
Offline support	No	Possible (with Service Worker)
Large files	Limited by upload timeout	Limited by browser RAM
Complex rendering	Full control	Constrained by browser APIs

For a tool that compresses a 5MB PDF, the upload alone takes 2-5 seconds on a typical connection. The actual compression takes under a second. Client-side processing removes the dominant bottleneck.

The privacy argument is even stronger for business documents. Invoices, contracts, and HR files contain sensitive data. "We don't store your files" is a trust claim. "Your file never leaves your browser" is a verifiable architectural guarantee.

Two libraries, two jobs

Almost all client-side PDF processing in PDF4.dev uses two open-source libraries (password protection also relies on a third, pdf-encrypt-lite). They solve different problems and complement each other.

pdf-lib: PDF structure manipulation

pdf-lib (by Andrew Dillon) reads and writes the PDF binary format directly. It can add pages, remove pages, embed images, draw text, modify metadata, and save the result as a new Uint8Array. It does not render anything visually.

Think of pdf-lib as a PDF file editor. It works with the document's internal structure: page trees, content streams, font dictionaries, and image objects.

Used by: merge, split, reorder, rotate, watermark, add page numbers, flatten, delete pages, extract pages, crop, resize, edit metadata, protect, unlock, repair, image-to-PDF

pdfjs-dist: PDF rendering

pdfjs-dist (Mozilla's PDF.js, packaged for npm) renders PDF pages to a <canvas> element. It handles font loading, text extraction, and page layout. It runs a Web Worker for parallel page parsing.

Think of pdfjs-dist as a PDF viewer engine. It turns PDF bytes into pixels.

Used by: compress (render to canvas, then re-embed as JPEG), PDF to PNG, PDF to JPG, PDF to grayscale, PDF to text, thumbnail generation for reorder/delete/extract tools

When both are needed

Some operations require both libraries working together. Compression is the clearest example:

1. pdfjs-dist renders each page to a <canvas> at a target DPI
2. The canvas is exported as a JPEG blob at a configurable quality
3. pdf-lib creates a new PDF and embeds each JPEG as a full-page image

The original PDF might contain vector graphics, embedded fonts, and high-resolution images. The compressed version replaces all of that with a single rasterized JPEG per page. This loses editability and selectable text but can reduce file size by 30-90% depending on the compression level and the content.

How compression works under the hood

Compression is the most complex client-side operation. Here is the pipeline:

import { PDFDocument } from "pdf-lib";

export async function compressPdf(
  file: File,
  levelOrOpts: CompressionLevel | CompressOptions = "medium"
): Promise<Uint8Array> {
  // 1. Parse options. resolveOptions is an assumed helper (not shown) that maps
  //    a level name or an options object to { quality, dpi, grayscale }.
  const opts = resolveOptions(levelOrOpts);
  const quality = opts.quality;  // 0.85 (light), 0.6 (medium), 0.35 (strong)
  const scale = dpiToScale(opts.dpi || 150);

  // 2. Load PDF with pdfjs-dist
  const pdf = await loadPdf(file);

  // 3. Create new empty PDF with pdf-lib
  const newDoc = await PDFDocument.create();

  for (let i = 1; i <= pdf.numPages; i++) {
    const page = await pdf.getPage(i);

    // 4. Render to canvas at target DPI
    const viewport = page.getViewport({ scale });
    const canvas = document.createElement("canvas");
    canvas.width = Math.floor(viewport.width);
    canvas.height = Math.floor(viewport.height);
    const ctx = canvas.getContext("2d");
    if (!ctx) throw new Error("2D canvas context unavailable");
    await page.render({ canvasContext: ctx, viewport }).promise;

    // 5. Optional grayscale conversion (luma weights for R, G, B)
    if (opts.grayscale) {
      const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);
      for (let j = 0; j < imageData.data.length; j += 4) {
        const gray = 0.299 * imageData.data[j]
                   + 0.587 * imageData.data[j + 1]
                   + 0.114 * imageData.data[j + 2];
        imageData.data[j] = imageData.data[j + 1] = imageData.data[j + 2] = gray;
      }
      ctx.putImageData(imageData, 0, 0);
    }

    // 6. Convert canvas to JPEG blob (toBlob passes null if encoding fails)
    const blob = await new Promise<Blob>((resolve, reject) =>
      canvas.toBlob(
        (b) => (b ? resolve(b) : reject(new Error("JPEG export failed"))),
        "image/jpeg",
        quality
      )
    );

    // 7. Embed JPEG in new PDF at original page dimensions (viewport at scale 1, in points)
    const { width: originalWidth, height: originalHeight } = page.getViewport({ scale: 1 });
    const jpegBytes = new Uint8Array(await blob.arrayBuffer());
    const jpegImage = await newDoc.embedJpg(jpegBytes);
    const newPage = newDoc.addPage([originalWidth, originalHeight]);
    newPage.drawImage(jpegImage, { x: 0, y: 0, width: originalWidth, height: originalHeight });
  }

  return newDoc.save();
}

The three compression levels map to JPEG quality values:

Level	JPEG quality	Typical size reduction	Visual quality
Light	0.85	30-50%	Nearly indistinguishable
Medium	0.60	50-70%	Minor artifacts on close inspection
Strong	0.35	70-90%	Visible artifacts, fine for previews

DPI also affects the output. Lower DPI means fewer pixels, which means smaller files but lower resolution. The default is 150 DPI, a scale factor of about 2.08 (150 / 72) relative to the PDF's native 72 units per inch, which keeps text readable and image quality reasonable.
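As a minimal sketch of the option handling described above: the quality values come from the table, while the shape of the options object, the preset DPI and grayscale defaults, and the resolveOptions name are assumptions made for illustration. The real helpers in the codebase may differ.

```typescript
type CompressionLevel = "light" | "medium" | "strong";

// Assumed shape of the options object; the article does not show its definition.
interface CompressOptions {
  quality: number;    // JPEG quality handed to canvas.toBlob, between 0 and 1
  dpi: number;        // target render resolution
  grayscale: boolean; // convert pixels to gray before encoding
}

// Quality values are taken from the table; dpi and grayscale defaults are assumptions.
const PRESETS: Record<CompressionLevel, CompressOptions> = {
  light: { quality: 0.85, dpi: 150, grayscale: false },
  medium: { quality: 0.6, dpi: 150, grayscale: false },
  strong: { quality: 0.35, dpi: 150, grayscale: false },
};

// PDF user space has 72 units per inch, so the render scale is dpi / 72.
function dpiToScale(dpi: number): number {
  return dpi / 72;
}

// Accept either a level name or a partial options object layered over "medium".
function resolveOptions(input: CompressionLevel | Partial<CompressOptions>): CompressOptions {
  if (typeof input === "string") return PRESETS[input];
  return { ...PRESETS.medium, ...input };
}
```

Layering partial options over a preset keeps call sites short: a caller who only wants a lower DPI passes that one field and inherits the medium quality.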

How merging works

Merging is one of the simplest operations because pdf-lib handles it natively:

import { PDFDocument } from "pdf-lib";

export async function mergePdfs(
  files: File[],
  options?: MergeOptions
): Promise<Uint8Array> {
  const merged = await PDFDocument.create();
 
  for (const file of files) {
    const bytes = await file.arrayBuffer();
    const source = await PDFDocument.load(bytes);
    const indices = source.getPageIndices();
    const copiedPages = await merged.copyPages(source, indices);
    for (const page of copiedPages) {
      merged.addPage(page);
    }
  }
 
  return merged.save();
}

PDFDocument.copyPages() handles the hard part: resolving cross-references between pages, fonts, and images. Objects shared by several pages of the same source, such as a font, are copied once per copyPages() call rather than once per page.

PDF4.dev also supports optional bookmarks (PDF outlines) when merging. Each source file becomes a bookmark entry pointing to its first page. This uses pdf-lib's low-level context API to build the outline dictionary manually, because pdf-lib does not have a high-level bookmark API.
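The page arithmetic behind those bookmarks can be sketched without touching pdf-lib: each source file's bookmark points at the running page offset in the merged document. The function and type names below are hypothetical, and stripping the .pdf extension for the title is an assumption. Writing the outline dictionary itself is not shown.

```typescript
interface BookmarkTarget {
  title: string;
  pageIndex: number; // zero-based index of the file's first page in the merged PDF
}

// Hypothetical helper: derive one bookmark target per non-empty source file.
function bookmarkTargets(sources: { name: string; pageCount: number }[]): BookmarkTarget[] {
  const targets: BookmarkTarget[] = [];
  let offset = 0; // pages already placed in the merged document
  for (const source of sources) {
    if (source.pageCount > 0) {
      targets.push({ title: source.name.replace(/\.pdf$/i, ""), pageIndex: offset });
    }
    offset += source.pageCount;
  }
  return targets;
}

const example = bookmarkTargets([
  { name: "invoice.pdf", pageCount: 3 },
  { name: "contract.PDF", pageCount: 5 },
  { name: "annex.pdf", pageCount: 2 },
]);
// example targets pages 0, 3, and 8
```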

How rotation works

Rotation is the simplest operation in the codebase. The entire implementation is 17 lines:

import { PDFDocument, degrees } from "pdf-lib";

export async function rotatePdf(
  file: File,
  rotation: 90 | 180 | 270
): Promise<Uint8Array> {
  const bytes = await file.arrayBuffer();
  const doc = await PDFDocument.load(bytes);
  const pages = doc.getPages();
  for (const page of pages) {
    // Add to the existing rotation and keep the result below 360 degrees
    page.setRotation(degrees((page.getRotation().angle + rotation) % 360));
  }
  return doc.save();
}

No canvas rendering, no pixel manipulation. pdf-lib modifies the /Rotate entry in each page's dictionary. The rotation is additive: if a page is already rotated 90 degrees, rotating it another 90 gives 180. The output file size is nearly identical to the input because the page content is unchanged.

How watermarking works

Watermarking draws text directly onto each page using pdf-lib's drawing API:

page.drawText(text, {
  x,
  y,
  size: fontSize,
  font,
  color: rgb(color.r, color.g, color.b),
  opacity,
  rotate: degrees(effectiveRotation),
});

The tricky part is positioning. A centered, rotated watermark requires trigonometry to calculate where the text origin should be so the visual center lands at the page center:

x = width / 2 - (textWidth / 2) * Math.cos(Math.abs(rad));
y = height / 2 + (textWidth / 2) * Math.sin(Math.abs(rad));

The watermark is a vector text overlay, not a rasterized image. This means it adds only a few hundred bytes to the file size regardless of page dimensions.
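The positioning step can be written as a small pure function. This is a sketch under two assumptions: positive angles rotate counterclockwise around the text origin, and only the text width is centered (text height is ignored, as in the formulas above). The function name is hypothetical. For a negative angle, the sine term flips sign, which reproduces the Math.abs form shown above.

```typescript
interface Point {
  x: number;
  y: number;
}

// Returns the text origin (left end of the baseline) so that the midpoint of the
// baseline lands on the page center after rotating by angleDeg.
function centeredTextOrigin(
  pageWidth: number,
  pageHeight: number,
  textWidth: number,
  angleDeg: number
): Point {
  const rad = (angleDeg * Math.PI) / 180;
  // The baseline midpoint sits textWidth / 2 along the rotated direction (cos, sin),
  // so step back by that vector from the page center.
  return {
    x: pageWidth / 2 - (textWidth / 2) * Math.cos(rad),
    y: pageHeight / 2 - (textWidth / 2) * Math.sin(rad),
  };
}

const flat = centeredTextOrigin(600, 800, 200, 0);      // { x: 200, y: 400 }
const diagonal = centeredTextOrigin(600, 800, 200, -45); // origin above and left of center
```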

Usage tracking without a server

Free users get 3 tool uses per rolling 7-day window. We track this in localStorage, not on the server:

const FREE_LIMIT = 3;
const RESET_INTERVAL = 7 * 24 * 60 * 60 * 1000; // 7 days
 
export function checkUsage(): { allowed: boolean; remaining: number } {
  if (typeof window === "undefined") return { allowed: true, remaining: FREE_LIMIT };
  const raw = localStorage.getItem("pdf4_tool_usage");
  const data = raw ? JSON.parse(raw) : { count: 0, resetAt: Date.now() + RESET_INTERVAL };
 
  if (Date.now() > data.resetAt) {
    return { allowed: true, remaining: FREE_LIMIT };
  }
  return {
    allowed: data.count < FREE_LIMIT,
    remaining: Math.max(0, FREE_LIMIT - data.count),
  };
}

This is trivially bypassable (clear localStorage, open incognito). That is intentional. The limit exists to nudge frequent users toward creating a free account, not to enforce a paywall. Anyone determined to use the tools for free can do so. The tools exist primarily as traffic drivers for the API product, not as a revenue source.
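The write side of this counter is not shown above. A minimal sketch of what it could look like follows, assuming the window starts at the first recorded use and that a corrupted entry is simply reset. The recordUsage name and the KeyValueStore interface are hypothetical. The interface mirrors the two localStorage methods used, so localStorage can be passed in directly in the browser.

```typescript
// Minimal subset of the Web Storage API, so the function also runs outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface UsageRecord {
  count: number;
  resetAt: number; // epoch milliseconds at which the window expires
}

const USAGE_KEY = "pdf4_tool_usage";
const RESET_INTERVAL = 7 * 24 * 60 * 60 * 1000; // 7 days

// Hypothetical counterpart to checkUsage: call it after a tool run succeeds.
function recordUsage(store: KeyValueStore, now: number = Date.now()): UsageRecord {
  let data: UsageRecord | null = null;
  try {
    const raw = store.getItem(USAGE_KEY);
    data = raw ? (JSON.parse(raw) as UsageRecord) : null;
  } catch {
    data = null; // unreadable entry: treat as no history
  }
  const invalid =
    !data || typeof data.count !== "number" || typeof data.resetAt !== "number";
  if (invalid || now > data!.resetAt) {
    data = { count: 0, resetAt: now + RESET_INTERVAL }; // start a new window
  }
  const next: UsageRecord = { count: data!.count + 1, resetAt: data!.resetAt };
  store.setItem(USAGE_KEY, JSON.stringify(next));
  return next;
}
```

Passing the store and the clock as parameters keeps the function testable without a browser. In the page itself the call would be recordUsage(localStorage).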

The two server-side exceptions

Two tools require server-side processing:

HTML to PDF converts raw HTML code into a PDF. This needs Playwright (headless Chromium) because browsers do not expose a JavaScript API to print arbitrary HTML content to PDF. The browser's window.print() opens a print dialog but cannot return a buffer programmatically.

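Webpage to PDF converts a URL into a PDF. This needs the server to fetch the target page and render it. The same-origin policy stops client-side JavaScript from reading arbitrary external pages: a cross-origin fetch() fails unless the target site opts in through CORS headers, and a page loaded in a cross-origin iframe cannot be inspected or drawn to a canvas.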

Both tools hit a POST endpoint on the PDF4.dev server. The HTML/URL is sent, Playwright renders it, and the PDF buffer is returned. These are the only tools where the content leaves the user's browser.

What we gave up

Client-side processing has real limitations:

No OCR: optical character recognition requires ML models that we consider too large for a browser download (50-200MB). For now, server-side OCR with Tesseract or cloud APIs is the practical option for us.
No Word/Excel conversion: faithfully converting .docx or .xlsx requires a full office layout engine such as LibreOffice, which is not practical to ship to the browser.
No advanced compression: server-side tools like Ghostscript can recompress individual image streams inside a PDF without rasterizing the entire page. Our canvas-based approach loses vector quality.
Memory limits: a 500-page PDF with high-resolution images can exceed browser memory limits. Server-side processing has more headroom.
No font subsetting of existing content: pdf-lib does not re-subset fonts that are already embedded in a source PDF. Server-side tools can subset fonts to save kilobytes.
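To put a rough number on the memory item in the list above, the raw pixel buffer for one rendered page can be estimated from the page size and the target DPI. This is a back-of-the-envelope sketch: it counts only the RGBA canvas buffer, not the browser's own overhead or the parsed PDF, and the function name is hypothetical.

```typescript
// Bytes held by one canvas after rendering a page of the given size (in PDF points).
function canvasBytes(widthPt: number, heightPt: number, dpi: number): number {
  const scale = dpi / 72; // PDF user space is 72 units per inch
  const widthPx = Math.floor(widthPt * scale);
  const heightPx = Math.floor(heightPt * scale);
  return widthPx * heightPx * 4; // RGBA, one byte per channel
}

// A4 is about 595.28 x 841.89 points: roughly 8.7 MB of raw pixels at 150 DPI.
const a4At150 = canvasBytes(595.28, 841.89, 150);
```

Because the cost grows with the square of the DPI, doubling the resolution quadruples the per-page buffer, which is why rendering-based tools reach browser limits well before structure-only tools do.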

These trade-offs are acceptable for the tools we offer. For operations that need server muscle (OCR, format conversion), we will add server-side tools in a future phase with clear "processed on server" labels.

The tool stack at a glance

Operation	Library	What happens
Merge	pdf-lib	Copy pages between documents
Split	pdf-lib	Copy page subset to new document
Rotate	pdf-lib	Modify `/Rotate` page dictionary entry
Reorder	pdf-lib + pdfjs	Thumbnails (pdfjs), then copy in new order (pdf-lib)
Compress	pdfjs + pdf-lib	Render to canvas (pdfjs), re-embed as JPEG (pdf-lib)
Watermark	pdf-lib	Draw text with opacity and rotation
Page numbers	pdf-lib	Draw text at calculated positions
PDF to PNG/JPG	pdfjs	Render to canvas, export as blob
PDF to text	pdfjs	`page.getTextContent()` extracts text items
Image to PDF	pdf-lib	Embed JPEG/PNG images as pages
Protect	pdf-encrypt-lite	RC4 encryption (third library)
Metadata	pdf-lib	Modify document info dictionary

All of these run in the main thread except pdfjs-dist's page parsing, which uses a Web Worker for parallel processing.
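For the PDF to text row, the items returned by page.getTextContent() still have to be joined into a string. A minimal sketch of that step follows, assuming each text item carries a str field and an optional hasEOL flag marking the end of a line. The interface below is a structural subset written for this example, not the library's own type, and marked-content entries without a str field are assumed to be filtered out beforehand.

```typescript
// Structural subset of a pdfjs-dist text item, declared locally for this sketch.
interface TextItemLike {
  str: string;
  hasEOL?: boolean;
}

// Concatenate text items, inserting a newline wherever an item ends a line.
function joinTextItems(items: TextItemLike[]): string {
  let out = "";
  for (const item of items) {
    out += item.str;
    if (item.hasEOL) out += "\n";
  }
  return out.trim();
}

const sample = joinTextItems([
  { str: "Hello" },
  { str: " " },
  { str: "world", hasEOL: true },
  { str: "Next line" },
]);
// sample is "Hello world\nNext line"
```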
