If you are choosing between pdf-lib and pdf.js, the answer is usually "both, for different jobs". pdf-lib writes PDFs: it creates pages, merges files, splits, fills forms, and stamps text or images. pdf.js reads PDFs: it parses a document and renders it to a canvas or extracts its text. They are not competitors. The mistake is reaching for pdf.js to edit a file (it cannot save) or pdf-lib to show a preview (it cannot paint pixels).
This guide maps each task to the right library, with real code for the two most common jobs: merging with pdf-lib, and reading text plus rendering with pdf.js. It ends with a note on the one thing neither library does: turning HTML into a polished PDF.
pdf-lib vs pdf.js: which does what?
The single most useful distinction is direction of data flow. pdf-lib goes from your code to a PDF (write). pdf.js goes from a PDF to pixels or text (read). The table below maps concrete tasks to the library built for them.
| Task | pdf-lib | pdf.js |
|---|---|---|
| Create a new PDF from scratch | Yes | No |
| Merge, split, reorder pages | Yes | No |
| Fill or flatten AcroForm fields | Yes | No |
| Draw text, images, shapes | Yes | No |
| Set or edit metadata | Yes | Read only |
| Render a page to a canvas | No | Yes |
| Extract text content | No | Yes |
| Read page count and dimensions | Yes | Yes |
| Show a viewer or thumbnail | No | Yes |
| Runs in the browser | Yes | Yes |
| Runs in Node.js | Yes | Yes (canvas needs node-canvas) |
| Native dependencies | None | None (worker file ships separately) |
| Renders HTML to PDF | No | No |
The pattern in most real apps: pdf.js draws the on-screen preview and pulls text for search, while pdf-lib produces the file the user downloads. They share zero code and never fight over the same job.
pdf-lib is a pure-JavaScript writer maintained by the community. pdf.js is Mozilla's PDF engine, the same one that renders PDFs inside Firefox. Different authors, different goals, both permissively licensed (pdf-lib under MIT, pdf.js under Apache 2.0) and free of native dependencies.
What is pdf-lib and when should you use it?
pdf-lib is a JavaScript library that creates and modifies PDF documents in both the browser and Node.js, with no native dependencies. Use it whenever the output is a PDF file: a generated invoice, a merged report, a filled form, a watermarked contract. It reads existing PDFs into an editable object model, lets you add or copy pages, draw content, and serialize the result back to bytes.
Reach for pdf-lib when your task is any of these:
- Merge several PDFs into one, or split one into many.
- Fill an AcroForm template and flatten it so fields are no longer editable.
- Stamp a watermark, page number, or signature image onto every page.
- Edit document metadata (title, author) or copy pages between files.
The core type is PDFDocument. You either PDFDocument.create() a blank one or PDFDocument.load(bytes) an existing file, mutate it, then call save() to get a Uint8Array you can write to disk or stream to the browser.
How do you merge PDFs with pdf-lib?
Merging is the textbook pdf-lib job: load each source, copy its pages into a fresh document, and save. copyPages resolves to page objects you append with addPage. The same shape works in the browser and on the server; only the input/output plumbing changes.
```typescript
import { PDFDocument } from "pdf-lib"
import { readFile, writeFile } from "node:fs/promises"

// Copy every page of each input file, in order, into one new document.
async function mergePdfs(paths: string[], out: string) {
  const merged = await PDFDocument.create()
  for (const path of paths) {
    const bytes = await readFile(path)
    const doc = await PDFDocument.load(bytes)
    // copyPages clones the source pages into `merged`; addPage appends them
    const pages = await merged.copyPages(doc, doc.getPageIndices())
    for (const page of pages) merged.addPage(page)
  }
  const result = await merged.save() // Uint8Array
  await writeFile(out, result)
}

await mergePdfs(["a.pdf", "b.pdf"], "merged.pdf")
```

The honest caveat: pdf-lib copies the page objects verbatim. If two source files embed the same font, the bytes are not deduplicated, so a merge of 50 invoices carries 50 copies of it and the output is about as large as the inputs combined, sometimes larger. Run the output through a compress step if size matters. Need the no-code version of the snippet above? Our free Merge PDF tool runs this exact pdf-lib flow in your browser.
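Splitting is the merge loop run in reverse, and the only part that needs care is deciding which page indices go into which output file. The sketch below covers just that step; `chunkPageIndices` is a hypothetical helper name, not something pdf-lib provides, and it assumes you want fixed-size consecutive chunks.

```typescript
// Hypothetical helper (not part of pdf-lib): group zero-based page indices
// into consecutive chunks, one chunk per output file.
function chunkPageIndices(pageCount: number, pagesPerFile: number): number[][] {
  if (!Number.isInteger(pageCount) || pageCount < 0) {
    throw new RangeError("pageCount must be a non-negative integer")
  }
  if (!Number.isInteger(pagesPerFile) || pagesPerFile < 1) {
    throw new RangeError("pagesPerFile must be a positive integer")
  }
  const chunks: number[][] = []
  for (let start = 0; start < pageCount; start += pagesPerFile) {
    const end = Math.min(start + pagesPerFile, pageCount)
    const chunk: number[] = []
    for (let i = start; i < end; i++) chunk.push(i)
    chunks.push(chunk)
  }
  return chunks
}
```

Each chunk can then be passed as the index array to `copyPages(source, chunk)` on a fresh `PDFDocument.create()`, followed by `addPage` and `save()` exactly as in the merge function.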
How do you draw text or fill a form with pdf-lib?
pdf-lib draws with primitives, not HTML. You position text by coordinate, embed a font, and pass an explicit point size. For forms, you grab a field by name and set its value. There is no layout engine: drawText can break long text at a maxWidth, but alignment, pagination, and tables are your responsibility.
```typescript
import { PDFDocument, StandardFonts, rgb } from "pdf-lib"

const doc = await PDFDocument.create()
const page = doc.addPage([595, 842]) // A4 in points
const font = await doc.embedFont(StandardFonts.Helvetica)

// Coordinates are in points, measured from the bottom-left corner of the page
page.drawText("Invoice INV-001", {
  x: 50,
  y: 780,
  size: 18,
  font,
  color: rgb(0.1, 0.1, 0.1),
})

const bytes = await doc.save()
```

The friction shows up the moment you want anything that looks designed: a table, a two-column layout, a logo aligned to a heading. You end up writing a mini layout engine in coordinates. That is the boundary where an HTML-based renderer wins, covered below.
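To make "a mini layout engine in coordinates" concrete, here is a minimal sketch of the first piece most people end up writing: a greedy word wrap that also tells you how many lines a paragraph takes, so you can position whatever comes next. `wrapText` and `Measure` are hypothetical names. The measuring function is passed in so the sketch stays free of imports; with pdf-lib it would typically be `(s) => font.widthOfTextAtSize(s, size)`.

```typescript
// Returns the rendered width of a string in points.
// Assumption: the caller supplies this, e.g. from an embedded pdf-lib font.
type Measure = (text: string) => number

// Hypothetical helper: greedy word wrap. A single word wider than maxWidth
// is kept whole on its own line rather than split mid-word.
function wrapText(text: string, maxWidth: number, measure: Measure): string[] {
  const lines: string[] = []
  let current = ""
  for (const word of text.split(/\s+/).filter(Boolean)) {
    const candidate = current ? `${current} ${word}` : word
    if (current && measure(candidate) > maxWidth) {
      lines.push(current) // the current line is full, start a new one
      current = word
    } else {
      current = candidate
    }
  }
  if (current) lines.push(current)
  return lines
}
```

Drawing the result is then a loop that calls `drawText` once per line and subtracts a line height from `y` each time. Alignment, page breaks, and table cells all build on the same measure-then-place pattern, which is why the code grows quickly.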
pdf-lib's standard fonts are Latin-only (WinAnsi). To render accented characters beyond that, emoji, or non-Latin scripts, embed a TrueType font with @pdf-lib/fontkit and registerFontkit. Skipping this throws "WinAnsi cannot encode" at draw time.
What is pdf.js and when should you use it?
pdf.js is Mozilla's PDF parser and renderer for JavaScript. It reads a PDF and turns it into something you can show or search: a canvas drawing, a text layer, an outline, metadata. It is a reader, not an editor: apart from saving form values and annotation edits through saveDocument(), it does not create, merge, or restructure files. Use it whenever the source is a PDF and the output is pixels or text.
Reach for pdf.js when your task is any of these:
- Display a PDF in a custom viewer or generate page thumbnails.
- Extract text for search, indexing, or copy-paste.
- Read structured metadata, the page count, or annotations.
- Render a specific page to an image for a preview card.
pdf.js ships its heavy parsing logic in a separate web worker file (pdf.worker), so the main thread stays responsive. You point GlobalWorkerOptions.workerSrc at that file once, then call getDocument.
How do you extract text from a PDF with pdf.js?
Text extraction with pdf.js means loading the document, then walking each page's getTextContent(), which resolves to an object whose items array holds the text runs with their strings and positions. Join the str fields to reconstruct the page text. There is no single "get all text" call; you iterate pages yourself.
```typescript
import * as pdfjsLib from "pdfjs-dist"

// point the worker at the file you serve from your bundle/CDN
pdfjsLib.GlobalWorkerOptions.workerSrc = "/pdf.worker.min.mjs"

async function extractText(data: ArrayBuffer): Promise<string> {
  const pdf = await pdfjsLib.getDocument({ data }).promise
  let text = ""
  // pdf.js page numbers start at 1
  for (let n = 1; n <= pdf.numPages; n++) {
    const page = await pdf.getPage(n)
    const content = await page.getTextContent()
    // only text items carry `str`; other item kinds contribute nothing
    text += content.items.map((i) => ("str" in i ? i.str : "")).join(" ") + "\n"
  }
  return text
}
```

The caveat: pdf.js extracts the text that exists in the file. A scanned PDF is just images, so getTextContent() comes back with an empty items array. For scans you need OCR (Tesseract or a cloud service) on top of the rendered page. If you only need plain text from a normal PDF, our PDF to text tool runs this pdf.js path client-side.
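Joining every str with a space loses line structure. Since each text item also carries a position, one way to recover lines is to bucket items by their baseline. The sketch below is a hypothetical helper, not a pdf.js API: it assumes upright, single-column text and reads only `str` and the `transform` matrix (index 4 is x, index 5 is y, with y growing upward).

```typescript
// Minimal shape of the fields this sketch reads from a pdf.js text item.
interface PositionedText {
  str: string
  transform: number[] // 6-number matrix; [4] is x, [5] is y
}

// Hypothetical helper: items whose baselines lie within `tolerance` units
// share a line; lines run top to bottom, items left to right.
function itemsToLines(items: PositionedText[], tolerance = 2): string[] {
  const lines: { y: number; parts: PositionedText[] }[] = []
  for (const item of items) {
    if (!item.str.trim()) continue // skip whitespace-only items
    const y = item.transform[5] ?? 0
    const line = lines.find((l) => Math.abs(l.y - y) <= tolerance)
    if (line) line.parts.push(item)
    else lines.push({ y, parts: [item] })
  }
  return lines
    .sort((a, b) => b.y - a.y) // larger y is higher on the page
    .map((l) =>
      l.parts
        .sort((a, b) => (a.transform[4] ?? 0) - (b.transform[4] ?? 0))
        .map((p) => p.str)
        .join(" ")
    )
}
```

Rotated text and multi-column layouts need more than a y-coordinate bucket, so treat this as a starting point for invoices and simple reports rather than a general reading-order algorithm.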
How do you render a PDF page to a canvas with pdf.js?
Rendering is the job pdf.js was built for. You get a page, build a viewport at the scale you want, size a canvas to match, and call page.render() with the canvas context. The same call powers Firefox's built-in viewer.
```typescript
import * as pdfjsLib from "pdfjs-dist"

pdfjsLib.GlobalWorkerOptions.workerSrc = "/pdf.worker.min.mjs"

async function renderFirstPage(data: ArrayBuffer, canvas: HTMLCanvasElement) {
  const pdf = await pdfjsLib.getDocument({ data }).promise
  const page = await pdf.getPage(1)
  const viewport = page.getViewport({ scale: 1.5 })
  // size the canvas to the viewport before rendering
  canvas.width = viewport.width
  canvas.height = viewport.height
  const context = canvas.getContext("2d")!
  await page.render({ canvas, canvasContext: context, viewport }).promise
}
```

Render quality scales with the scale factor: 1.5 to 2 looks crisp on high-DPI screens, higher values cost memory. In Node.js there is no DOM canvas, so you back the render with the canvas (node-canvas) package and write the result to a PNG buffer. That extra dependency is the main reason server-side rasterizing with pdf.js is fiddlier than in the browser.
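The memory cost of a higher scale is easy to estimate before you render. This is a back-of-the-envelope sketch, assuming the canvas is sized directly from the viewport as in the function above (one canvas pixel per PDF point at scale 1) and stored as 4-byte RGBA pixels; `estimateCanvas` is a hypothetical helper and real browser overhead will differ.

```typescript
// Result of sizing a canvas for one rendered page.
interface CanvasBudget {
  width: number // canvas width in pixels
  height: number // canvas height in pixels
  bytes: number // approximate RGBA backing-store size
}

// Hypothetical helper: page size in points times scale gives canvas pixels.
function estimateCanvas(widthPt: number, heightPt: number, scale: number): CanvasBudget {
  const width = Math.floor(widthPt * scale)
  const height = Math.floor(heightPt * scale)
  return { width, height, bytes: width * height * 4 } // 4 bytes per RGBA pixel
}
```

For an A4 page (595 by 842 points) at scale 2 this gives a 1190 by 1684 canvas of roughly 8 MB, and because both dimensions grow with the scale, doubling it quadruples the memory. That matters most when rendering many thumbnails at once.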
How do you turn HTML into a PDF? (neither library does this)
Here is the gap that trips people up: neither pdf-lib nor pdf.js renders HTML and CSS to a PDF. pdf-lib draws primitives by coordinate, and pdf.js only reads existing PDFs. If your real goal is "I have an HTML invoice or report and I want a pixel-accurate PDF", you need a browser engine, not either of these libraries.
The browser-engine option is a headless Chromium controlled by Playwright or Puppeteer: it lays out your HTML with the same engine as Chrome, then prints to PDF. That is accurate, but you own the Chromium binary, the memory, the cold starts, and the font installation. PDF4.dev is the hosted version of that pipeline: you POST HTML (or a saved template id plus {{variables}}) and get a PDF back, no Chromium to manage.
```bash
curl -X POST https://pdf4.dev/api/v1/render \
  -H "Authorization: Bearer p4_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "html": "<h1>Hello</h1><p>Invoice INV-001</p>",
    "data": {},
    "delivery": "url"
  }'
```

A common stack uses all three: PDF4.dev renders the HTML invoice into a polished PDF, pdf-lib merges a batch of those invoices into one file or stamps a page number, and pdf.js draws the preview in your dashboard. Each tool does the one job it is built for.
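If the rest of your pipeline is TypeScript, the curl call above translates to a fetch request roughly as follows. This sketch only builds the request: the endpoint, headers, and body fields are copied from the curl example, `buildRenderRequest` and `RenderRequest` are hypothetical names, and nothing is assumed here about the shape of the response.

```typescript
// Endpoint taken from the curl example.
const RENDER_URL = "https://pdf4.dev/api/v1/render"

// Plain description of the HTTP call, independent of any fetch typings.
interface RenderRequest {
  url: string
  method: "POST"
  headers: Record<string, string>
  body: string
}

// Hypothetical helper: assemble the request without sending anything.
function buildRenderRequest(html: string, apiKey: string): RenderRequest {
  return {
    url: RENDER_URL,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    // same fields as the curl payload
    body: JSON.stringify({ html, data: {}, delivery: "url" }),
  }
}
```

Sending it is a matter of passing `method`, `headers`, and `body` to `fetch(url, ...)`. Keeping construction separate from sending makes the payload easy to unit test without network access.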
Which option should you choose?
Pick by the direction of your data, not by which library is "better". Below is the short decision by scenario.
| Your scenario | Use |
|---|---|
| Merge, split, or reorder existing PDFs | pdf-lib |
| Fill or flatten a form template | pdf-lib |
| Stamp watermarks, page numbers, signatures | pdf-lib |
| Show a PDF viewer or thumbnails | pdf.js |
| Extract text or search inside PDFs | pdf.js |
| Render a page to PNG or canvas | pdf.js |
| Turn an HTML or CSS document into a PDF | PDF4.dev (or self-hosted Playwright) |
| Generate a designed invoice or report | PDF4.dev (or self-hosted Playwright) |
- Choose pdf-lib when the output is a PDF file and you are manipulating structure: pages, forms, stamps, metadata. It runs everywhere with zero native dependencies.
- Choose pdf.js when the input is a PDF and the output is pixels or text: previews, thumbnails, extraction, search. It is Mozilla's engine, battle-tested in Firefox.
- Choose PDF4.dev when the source is HTML and you want a designed PDF without running headless Chromium yourself. Then drop back to pdf-lib or pdf.js for any post-processing.
The fastest way to know which library you need: ask "am I writing a PDF or reading one?". Writing means pdf-lib. Reading or showing means pdf.js. Starting from HTML means a browser renderer like PDF4.dev. Most production apps end up using two of the three.
For a wider look at the write side, see pdf-lib vs jsPDF vs PDFKit. For the two tasks above in depth, read how to merge PDF files and how to extract text from a PDF.