What is PDF conversion?

PDF conversion is the process of transforming a PDF file into a different format (such as JPG, PNG, Word, or plain text) or converting a different format (such as HTML or an image) into PDF. Different conversion directions require different tools and techniques.

What is the best free PDF converter?

For browser-based, privacy-safe conversion, PDF4.dev's free tools at pdf4.dev/tools process files entirely in your browser without uploading them to any server. For command-line use, Poppler's pdftoppm (for PDF to images) and LibreOffice (for PDF to Word) are reliable open-source options.

Is it safe to convert PDFs online?

Only if the tool processes files locally in your browser. The FBI warned in March 2025 about criminals using fake online converters to deliver malware. PDF4.dev tools run entirely client-side, using JavaScript and WebAssembly in the browser, so your files never leave your device.

How do I convert a PDF to an editable Word document?

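Use LibreOffice (free, open-source) with the command: libreoffice --headless --infilter="writer_pdf_import" --convert-to docx file.pdf. The --infilter option makes LibreOffice open the PDF in Writer; without it the file opens in Draw, which cannot export DOCX. For programmatic use, a commercial service such as the Adobe PDF Services API or Microsoft Graph API generally gives better text extraction and layout preservation.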

What is the best resolution (DPI) for PDF to image conversion?

150 DPI is the best default for screen use and sharing. Use 300 DPI for professional printing. Going from 150 to 300 DPI quadruples the pixel count, and roughly quadruples the file size, because both width and height double.

Can I convert a password-protected PDF?

Not without the password. You must unlock the PDF first using a tool like pdf4.dev/tools/unlock-pdf, then convert it. Most conversion tools (including Poppler's pdftoppm) will silently fail or produce a blank output when given an encrypted PDF.

What is the difference between PDF to JPG and PDF to PNG?

JPG uses lossy compression (smaller files, slight quality loss), while PNG uses lossless compression (larger files, pixel-perfect quality). JPG is better for photos and sharing. PNG is better for text, diagrams, and anything requiring transparency.

How do I convert HTML to PDF programmatically?

The most reliable method is to use a headless browser (Chromium via Playwright or Puppeteer) to render the HTML and export it as PDF. For production use, an API like pdf4.dev handles the browser infrastructure, scaling, and PDF format options for you.

Can I convert multiple PDFs at once (batch conversion)?

Yes. Command-line tools (pdftoppm, LibreOffice) support batch conversion via shell loops. For programmatic batch conversion, send parallel HTTP requests to a PDF API: most rate limits allow 10-50 concurrent requests.

Does PDF conversion preserve text and fonts?

It depends on the conversion direction. Converting PDF to image (JPG, PNG) always produces raster output with no selectable text. Converting PDF to Word attempts to extract text and approximate layout, but complex formatting (columns, tables, custom fonts) often requires manual cleanup.


Complete guide to PDF conversion: every format, every method (2026)

PDF conversion explained: convert PDF to JPG, PNG, Word, HTML and more. Covers free browser tools, Python, Node.js, and command-line methods. Updated 2026.

benoitded · March 12, 2026 · 14 min read

PDF conversion is the process of transforming a PDF file into another format (JPG, PNG, Word, HTML) or converting another format into PDF. It is one of the most common document processing tasks, with over 290 billion PDFs created annually and constant demand to exchange content between formats. This guide covers every major conversion direction with working code examples, recommended tools, and practical parameters for each use case.

PDF conversion: direction and format overview

The conversion direction determines which tools and techniques apply. There is no universal "convert PDF" tool because the underlying requirements differ completely.

Conversion	Common use cases	Key constraint
PDF to JPG	Thumbnails, email, social media	Raster output, no text layer
PDF to PNG	Diagrams, transparency, archives	Larger files than JPG
PDF to Word (DOCX)	Editing existing documents	Layout fidelity varies
PDF to plain text	Data extraction, indexing	Loses all formatting
HTML to PDF	Document generation, invoices	Requires headless browser
Image to PDF	Packaging files, scanning	Rasterized PDF (no text layer)
PDF to PDF/A	Long-term archiving	ISO 19005 compliance

The most common use cases are PDF to image (for previewing and sharing) and HTML to PDF (for programmatically generating documents). We cover both in depth below.

How to convert PDF to JPG

Converting a PDF to JPEG produces one image file per page. JPG is the right format when file size matters: email attachments, CMS thumbnails, social media previews.

Browser tool (no upload required)

The fastest approach for one-off conversions:

1. Open the PDF to JPG tool
2. Drop your PDF onto the upload zone
3. Choose quality: Low (60%), High (85%), or Maximum (100%)
4. Download each page as a JPG

The tool runs entirely in your browser using pdfjs-dist: your files are never uploaded to any server. This matters in 2026: the FBI issued a warning in March 2025 about criminals distributing malware through fake online converters.

Python with pdf2image

pdf2image wraps Poppler and is the most widely used Python library for PDF-to-image conversion:

pip install pdf2image
# Requires Poppler: brew install poppler (macOS) or apt install poppler-utils (Ubuntu)

from pdf2image import convert_from_path
 
images = convert_from_path("document.pdf", dpi=150, fmt="jpeg")
 
for i, img in enumerate(images):
    img.save(f"page_{i + 1}.jpg", "JPEG", quality=85)
 
print(f"Converted {len(images)} pages")

Node.js with pdfjs-dist

PDF.js (Mozilla's open-source PDF engine) works in Node.js via the pdfjs-dist package:

npm install pdfjs-dist canvas

import { getDocument } from "pdfjs-dist/legacy/build/pdf.mjs";
import { createCanvas } from "canvas";
import fs from "fs";
 
async function pdfToJpg(pdfPath, dpi = 150, quality = 85) {
  const scale = dpi / 72; // PDF uses 72 points per inch (ISO 32000)
  const data = new Uint8Array(fs.readFileSync(pdfPath));
  const pdf = await getDocument({ data }).promise;
 
  for (let i = 1; i <= pdf.numPages; i++) {
    const page = await pdf.getPage(i);
    const viewport = page.getViewport({ scale });
    const canvas = createCanvas(viewport.width, viewport.height);
    await page.render({ canvasContext: canvas.getContext("2d"), viewport }).promise;
    fs.writeFileSync(`page_${i}.jpg`, canvas.toBuffer("image/jpeg", { quality: quality / 100 }));
  }
}
 
await pdfToJpg("document.pdf");

Command line with Poppler (pdftoppm)

# All pages, 150 DPI, JPEG at quality 85
pdftoppm -jpeg -jpegopt quality=85 -r 150 document.pdf page
 
# Output: page-1.jpg, page-2.jpg, ...
# (page numbers are zero-padded when the PDF has 10 or more pages, e.g. page-01.jpg)
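
When the pdftoppm call has to be made from a script, it helps to build the argument list in one place. The following is a minimal Python sketch, not a definitive wrapper: the function names `build_pdftoppm_args` and `convert` are made up for illustration, and only the flags already shown above (`-jpeg`, `-png`, `-jpegopt`, `-r`) are used.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdftoppm_args(pdf_path, out_prefix, dpi=150, fmt="jpeg", quality=85):
    """Return the argv list for a pdftoppm call (nothing is executed here)."""
    if fmt not in ("jpeg", "png"):
        raise ValueError(f"unsupported format: {fmt}")
    args = ["pdftoppm", f"-{fmt}", "-r", str(dpi)]
    if fmt == "jpeg":
        # -jpegopt only applies to JPEG output
        args += ["-jpegopt", f"quality={quality}"]
    return args + [str(pdf_path), str(out_prefix)]


def convert(pdf_path, out_dir="."):
    """Run pdftoppm if it is on PATH; raise a clear error otherwise."""
    if shutil.which("pdftoppm") is None:
        raise RuntimeError("pdftoppm not found; install Poppler first")
    prefix = Path(out_dir) / Path(pdf_path).stem
    subprocess.run(build_pdftoppm_args(pdf_path, prefix), check=True)
```

Passing the arguments as a list (rather than a shell string) avoids quoting problems with file names that contain spaces.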

JPEG quality and DPI reference

JPEG quality settings are not standardized across tools, but the underlying trade-off is consistent: higher quality means larger files. At 85%, the quality difference from 100% is nearly invisible to the human eye for most documents, but the file is roughly 3-5x smaller.

DPI	A4 output size	Best for
72	595 x 842 px	Web thumbnails (small)
150	1240 x 1754 px	General use, screen, presentations
300	2480 x 3508 px	Professional print output
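
The pixel sizes in this table follow directly from the page size and the DPI. As a small worked example (the helper name `pixels_at_dpi` is illustrative, and A4 is taken as 210 x 297 mm):

```python
A4_MM = (210, 297)  # ISO 216 A4 size in millimetres
MM_PER_INCH = 25.4


def pixels_at_dpi(dpi, size_mm=A4_MM):
    """Pixel dimensions of a page rendered at the given DPI."""
    return tuple(round(mm / MM_PER_INCH * dpi) for mm in size_mm)


for dpi in (72, 150, 300):
    w, h = pixels_at_dpi(dpi)
    print(f"{dpi} DPI: {w} x {h} px ({w * h / 1e6:.1f} megapixels)")
```

Doubling the DPI doubles both dimensions, so the pixel count grows fourfold: about 2.2 megapixels at 150 DPI versus about 8.7 at 300 DPI.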

See the full walkthrough in our PDF to JPG guide.

How to convert PDF to PNG

PNG conversion follows the same process as JPG but uses lossless compression. This produces larger files but preserves every pixel exactly, with support for transparency (alpha channel). Use PNG when converting text-heavy PDFs, technical diagrams, or any content you plan to edit further.

When to choose PNG over JPG

	JPG	PNG
Compression	Lossy	Lossless
Transparency	Not supported	Supported
Text legibility	Slight artifacts at low quality	Pixel-perfect
File size (A4 at 150 DPI)	~400-800 KB	~1-3 MB

Choose JPG for photos, gradients, and social media sharing. Choose PNG for diagrams, screenshots, text documents, and anything requiring a transparent background.

Python

from pdf2image import convert_from_path
 
images = convert_from_path("document.pdf", dpi=150, fmt="png")
 
for i, img in enumerate(images):
    img.save(f"page_{i + 1}.png", "PNG")

Command line

# PNG output via pdftoppm
pdftoppm -png -r 150 document.pdf page
# Output: page-1.png, page-2.png, ...

For the full PNG conversion guide including browser tool, see how to convert PDF to PNG.

How to convert PDF to Word (DOCX)

Converting PDF to an editable Word document is the most technically challenging PDF conversion direction. PDFs store content as positioned elements (text runs, images, paths), not as structured paragraphs. Extracting this into Word's OOXML format requires heuristic reconstruction of layout, paragraphs, tables, and fonts.

Limitation: no tool perfectly converts all PDFs. Complex layouts with columns, tables, custom fonts, and embedded images typically require manual cleanup after conversion.
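
To make the "heuristic reconstruction" concrete, here is a deliberately simplified Python sketch of one such heuristic: grouping positioned text runs into lines. The `Run` record and the `group_into_lines` function are hypothetical, and the sketch assumes y is measured downward from the top of the page. Real converters layer many more rules on top (columns, tables, font changes).

```python
from collections import namedtuple

# Hypothetical record for one positioned text run; real extractors expose similar fields
Run = namedtuple("Run", "x y text")


def group_into_lines(runs, y_tolerance=2.0):
    """Cluster runs whose vertical positions are within y_tolerance, then read left to right."""
    lines = []
    for run in sorted(runs, key=lambda r: r.y):
        if lines and abs(lines[-1][-1].y - run.y) <= y_tolerance:
            lines[-1].append(run)  # same visual line as the previous run
        else:
            lines.append([run])  # vertical gap: start a new line
    return [" ".join(r.text for r in sorted(line, key=lambda r: r.x)) for line in lines]


runs = [
    Run(120, 100.4, "World"),
    Run(72, 100.0, "Hello"),
    Run(72, 114.0, "Second"),
    Run(110, 114.5, "line"),
]
print(group_into_lines(runs))
```

Even this small step depends on a tolerance value that no single setting gets right for every document, which is why output quality varies so much between PDFs.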

LibreOffice (free, open-source)

LibreOffice is the most capable free option for PDF to DOCX conversion:

# Install LibreOffice
# macOS: brew install --cask libreoffice
# Ubuntu: apt install libreoffice
 
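# Convert PDF to DOCX
# --infilter forces the Writer PDF import filter; without it LibreOffice opens
# PDFs in Draw, which cannot export DOCX
libreoffice --headless --infilter="writer_pdf_import" --convert-to docx --outdir ./output document.pdf

Batch conversion of a folder:

libreoffice --headless --infilter="writer_pdf_import" --convert-to docx --outdir ./output *.pdf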

Python with python-docx and pdfplumber

For text extraction without layout reconstruction, pdfplumber provides reliable character-level text extraction:

pip install pdfplumber python-docx

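import pdfplumber
from docx import Document

def pdf_to_docx(pdf_path, output_path):
    """Copy each page's plain text into a DOCX file, one PDF page per Word page."""
    doc = Document()

    with pdfplumber.open(pdf_path) as pdf:
        for index, page in enumerate(pdf.pages):
            if index > 0:
                doc.add_page_break()  # break between pages, not after the last one
            text = page.extract_text()  # may be None or empty for image-only pages
            if text:
                doc.add_paragraph(text)

    doc.save(output_path)

pdf_to_docx("document.pdf", "output.docx")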

This approach extracts raw text without formatting. For layout-preserving conversion, LibreOffice or commercial APIs (Adobe PDF Services, Microsoft Graph) produce significantly better results.

Microsoft Graph API (programmatic, layout-aware)

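For applications that need high-fidelity PDF to Word conversion in production, the Microsoft Graph API can convert files stored in OneDrive through its content endpoint. Check the current Graph documentation before building on this: the documented target formats for the format parameter are pdf, html, glb, and jpg, so docx output may not be available for your tenant.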

// Requires Microsoft 365 tenant and Azure app registration
const response = await fetch(
  `https://graph.microsoft.com/v1.0/me/drive/items/${itemId}/content?format=docx`,
  { headers: { Authorization: `Bearer ${accessToken}` } }
);
const docxBuffer = await response.arrayBuffer();

How to convert HTML to PDF

HTML to PDF is the reverse direction: generating a PDF document from web content. This is the standard method for programmatic document generation (invoices, reports, certificates, contracts).

Why HTML and CSS are the best PDF template language

HTML and CSS give you complete control over page layout, typography, and visual design. Headless Chromium renders HTML with the same CSS engine as Chrome, so the PDF matches what you see in the browser's print preview. The alternative approaches (LaTeX, PDF libraries like ReportLab or jsPDF) require learning tool-specific markup or APIs and lack the ecosystem of web design tools.

Method 1: PDF4.dev API (production-ready)

PDF4.dev is an HTML-to-PDF API that handles the headless browser infrastructure for you. Create a template in the dashboard with Handlebars variables, then call the API with your data:

const response = await fetch("https://pdf4.dev/api/v1/render", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.PDF4_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    template_id: "invoice",
    data: {
      company_name: "Acme Corp",
      invoice_number: "INV-2026-001",
      total: "$1,500.00",
    },
  }),
});
 
const pdfBuffer = await response.arrayBuffer();

You can also pass raw HTML directly without a template:

const response = await fetch("https://pdf4.dev/api/v1/render", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.PDF4_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    html: "<h1>Hello World</h1><p>This is a PDF generated from HTML.</p>",
    format: { preset: "a4" },
  }),
});
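
For Python callers, the same request can be sketched with the standard library only. This mirrors the endpoint, headers, and payload fields from the JavaScript examples above. The function names are illustrative, and the sketch assumes, as those examples do, that the response body is the PDF itself.

```python
import json
import urllib.request

RENDER_URL = "https://pdf4.dev/api/v1/render"  # endpoint as used in the examples above


def build_render_request(api_key, payload):
    """Build (but do not send) a POST request for the render endpoint."""
    return urllib.request.Request(
        RENDER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def render_pdf(api_key, payload, timeout=30):
    """Send the request and return the response bytes; urlopen raises HTTPError on 4xx/5xx."""
    with urllib.request.urlopen(build_render_request(api_key, payload), timeout=timeout) as resp:
        return resp.read()
```

A call such as `render_pdf(key, {"html": "<h1>Hello World</h1>", "format": {"preset": "a4"}})` would then return bytes that can be written to a .pdf file.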

Try the free HTML to PDF converter to generate a PDF from any HTML snippet without an API key.

Method 2: Playwright (self-hosted)

Playwright is Microsoft's browser automation library. It supports PDF generation directly via page.pdf(), which is available only when driving Chromium:

npm install playwright
npx playwright install chromium

import { chromium } from "playwright";
 
async function htmlToPdf(html, outputPath) {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  await page.setContent(html, { waitUntil: "load" });
  await page.pdf({
    path: outputPath,
    format: "A4",
    margin: { top: "20mm", bottom: "20mm", left: "15mm", right: "15mm" },
    printBackground: true,
  });
  await browser.close();
}
 
await htmlToPdf("<h1>Invoice</h1>", "invoice.pdf");

The self-hosting trade-off: Playwright works well for development and low-volume use. At scale, you face Docker image bloat (Chromium adds ~300 MB), concurrency limits (each render needs a browser page), cold start issues in serverless environments, and occasional browser crashes under memory pressure. An API like PDF4.dev uses the same Chromium engine but manages the infrastructure layer for you.

For a detailed comparison and production considerations, see our Node.js PDF generation guide.

CSS for PDF: page control

When generating PDFs from HTML, use CSS @page and @media print to control pagination:

@page {
  size: A4;
  margin: 20mm 15mm;
}
 
/* Force a page break before an element */
.page-break {
  page-break-before: always;
}
 
/* Prevent breaks inside an element */
.keep-together {
  page-break-inside: avoid;
}
 
/* Repeat table headers across pages */
thead {
  display: table-header-group;
}

How to convert images to PDF

Converting images (JPG, PNG, TIFF) to PDF packages them into a single document for sharing, archiving, or printing. The resulting PDF is a rasterized document (images only, no text layer).

Browser tool

The Image to PDF converter accepts JPG, PNG, and WebP files, allows reordering, and outputs a single merged PDF without any server upload.

Python with Pillow and reportlab

pip install pillow reportlab

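from PIL import Image
from reportlab.lib.pagesizes import A4
from reportlab.platypus import SimpleDocTemplate, Image as RLImage

def images_to_pdf(image_paths, output_path, margin=20):
    # Set margins explicitly: SimpleDocTemplate defaults to 72 pt (1 inch) per side
    doc = SimpleDocTemplate(
        output_path, pagesize=A4,
        leftMargin=margin, rightMargin=margin,
        topMargin=margin, bottomMargin=margin,
    )
    # The default frame adds 6 pt of padding on each side, so leave room for it
    max_w, max_h = doc.width - 12, doc.height - 12
    story = []

    for img_path in image_paths:
        with Image.open(img_path) as img:
            img_w, img_h = img.size
        # Fit inside the frame in both dimensions; an oversized image raises LayoutError
        scale = min(max_w / img_w, max_h / img_h)
        story.append(RLImage(img_path, width=img_w * scale, height=img_h * scale))

    doc.build(story)

images_to_pdf(["scan_1.jpg", "scan_2.jpg", "scan_3.jpg"], "scans.pdf")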

Command line with ImageMagick

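# Combine multiple images into one PDF
# (ImageMagick 7 uses the "magick" command; "convert" is the legacy ImageMagick 6 name)
convert scan_1.jpg scan_2.jpg scan_3.jpg output.pdf
 
# Set the resolution to 300 DPI, which determines the resulting page size
convert -density 300 scan_1.jpg scan_2.jpg output.pdf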

For the detailed guide including file reordering and multi-format support, see how to convert images to PDF.

Batch PDF conversion

For converting multiple files at once, command-line tools are the most efficient approach.

Batch PDF to JPG (shell)

# Convert all PDFs in the current directory to JPG pages
for pdf in *.pdf; do
  name="${pdf%.pdf}"
  pdftoppm -jpeg -jpegopt quality=85 -r 150 "$pdf" "${name}_page"
done

Batch HTML to PDF (API)

When generating multiple PDFs (invoices, reports, certificates), send parallel requests to the PDF4.dev API. Check your plan's rate limit before choosing a concurrency level; limits in the range of 10-50 concurrent requests are typical for PDF APIs:

const documents = [
  { template_id: "invoice", data: { invoice_number: "INV-001", total: "$100" } },
  { template_id: "invoice", data: { invoice_number: "INV-002", total: "$200" } },
  { template_id: "invoice", data: { invoice_number: "INV-003", total: "$300" } },
];
 
const results = await Promise.all(
  documents.map((doc) =>
    fetch("https://pdf4.dev/api/v1/render", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${process.env.PDF4_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(doc),
    }).then((r) => r.arrayBuffer())
  )
);
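
Promise.all fires every request at once, which can exceed a rate limit on larger batches. One common way to cap concurrency is a semaphore. The sketch below shows the pattern in Python with asyncio. `render_all` and `fake_render` are hypothetical names, and `fake_render` merely stands in for a real HTTP call.

```python
import asyncio


async def render_all(documents, render_one, max_concurrent=10):
    """Run render_one(doc) for every document, at most max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(doc):
        async with semaphore:
            return await render_one(doc)

    # return_exceptions=True keeps one failed render from discarding the rest
    return await asyncio.gather(*(guarded(d) for d in documents), return_exceptions=True)


async def fake_render(doc):
    """Stand-in for a real HTTP call; hypothetical, for demonstration only."""
    await asyncio.sleep(0.01)
    return f"pdf-for-{doc['data']['invoice_number']}".encode()


docs = [{"template_id": "invoice", "data": {"invoice_number": f"INV-{n:03d}"}} for n in range(1, 4)]
results = asyncio.run(render_all(docs, fake_render, max_concurrent=2))
print(results)
```

Results come back in the same order as the input list, and any failed render appears as an exception object in its slot rather than aborting the whole batch.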

PDF conversion format comparison

Format	Quality	File size	Text selectable	Best for
JPG	Lossy (adjustable)	Small	No	Photos, sharing, thumbnails
PNG	Lossless	Medium	No	Diagrams, text, transparency
TIFF	Lossless	Large	No	Archival, professional scanning
DOCX	Variable	Small	Yes (extracted)	Editing existing documents
Plain text	N/A	Tiny	Yes	Data extraction, indexing
PDF/A	Lossless	Medium	Yes	Long-term archiving (ISO 19005)
HTML	N/A	Variable	Yes	Web display, templating

Choosing the right tool

The right conversion tool depends on your volume, environment, and privacy requirements.

Scenario	Recommended tool	Why
One-off conversion, privacy required	PDF4.dev browser tools	Client-side, no upload
Scripting and automation	pdftoppm / LibreOffice CLI	Free, reliable, batch-capable
Production application (PDF generation)	PDF4.dev API	Managed browser, scales automatically
Python data pipeline	pdf2image + pdfplumber	Pythonic, well-maintained
Node.js application	pdfjs-dist or PDF4.dev SDK	Same engine as browsers
Maximum Word fidelity	LibreOffice or Microsoft Graph	Best layout reconstruction

For document generation (HTML to PDF), using an API is almost always the better choice over self-hosting Playwright in production. The infrastructure overhead (Docker, concurrency, browser crashes, memory) compounds quickly as volume grows.

Privacy and security considerations

PDF conversion tools have become a significant security vector. The FBI issued a warning in March 2025 about malware distributed via fake online converters, which target people searching for free file conversion tools. Malwarebytes subsequently published a list of converter domains involved in distributing malware.

For sensitive documents (contracts, financial records, medical files), use:

Local tools: command-line tools (pdftoppm, LibreOffice) running on your own machine
Client-side browser tools: tools that run in the browser with JavaScript and WebAssembly so files never leave your device. All PDF4.dev tools work this way.
Trusted API providers: for programmatic use, verify that the API does not retain PDF content after processing

Avoid uploading sensitive documents to unknown online converters. When in doubt, run a quick scan of any downloaded conversion software before installing it.

PDF conversion performance benchmarks

Method	10-page PDF to JPG	Notes
pdftoppm (CLI)	~0.5-2s	Fastest, C-based Poppler
pdf2image (Python)	~1-3s	Wraps pdftoppm
pdfjs-dist (Node.js)	~3-8s	JavaScript rendering
Browser tool	~5-15s	Includes UI overhead
PDF4.dev API (HTML to PDF)	~200-500ms	Warm browser pool

These are approximate ranges for a typical text-heavy A4 document on modern hardware. Actual performance varies significantly by document complexity, image content, and available CPU.
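
To reproduce figures like these on your own documents, a small timing harness is enough. This is a minimal sketch (the name `time_call` is illustrative); it reports the median so that a single slow run, such as a cold disk cache, does not skew the result.

```python
import statistics
import time


def time_call(fn, repeats=5):
    """Return the median wall-clock time in seconds over several runs of fn()."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)
```

For example, with pdf2image installed, `time_call(lambda: convert_from_path("document.pdf", dpi=150))` would time the Python conversion shown earlier.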

Common PDF conversion errors

"Error: no such file or directory" (Poppler not installed)

pdftoppm requires Poppler to be installed. On macOS: brew install poppler. On Ubuntu: apt install poppler-utils.

Blank output pages

Usually caused by an encrypted PDF. Run the file through PDF unlock first, then retry the conversion.
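
A quick way to check for encryption before converting is to look for the /Encrypt entry that encrypted PDFs carry in their trailer. The Python sketch below is only a byte-level heuristic under that assumption: the function name is illustrative, and a file that merely contains the text "/Encrypt" elsewhere would be a false positive. A PDF library gives a definitive answer.

```python
def looks_encrypted(pdf_path):
    """Heuristic: encrypted PDFs reference an /Encrypt dictionary in the trailer."""
    with open(pdf_path, "rb") as f:
        data = f.read()
    return b"/Encrypt" in data
```

If this returns True, unlock the file first; if it returns False and pages are still blank, the cause lies elsewhere.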

Blurry images

Low DPI setting. A PDF has no native resolution; rendering at 72 DPI simply maps one PDF point to one pixel, which is too coarse for most uses. Increase to 150 DPI for general use or 300 DPI for print quality.

Missing fonts in Word output

LibreOffice substitutes fonts not installed on the system. For consistent output, install the required fonts before running the conversion, or accept the substitution for body text.

Page breaks in wrong places (HTML to PDF)

Add CSS page-break-inside: avoid to elements that should not be split across pages (tables, cards, section blocks). Use page-break-before: always to force a new page before an element.

Summary

PDF conversion covers two fundamentally different problems: extracting content from existing PDFs (to JPG, PNG, or Word) and generating new PDFs from structured content (HTML to PDF).

For extraction, Poppler's pdftoppm is the fastest and most reliable command-line tool. For generation, headless Chromium (via Playwright or an API) is the standard approach. For one-off conversions where privacy matters, browser-based tools that process files locally are the safest and most convenient option.

The free tools on PDF4.dev handle the most common cases entirely in your browser:

PDF to JPG — export each page as JPEG
PDF to PNG — export each page as PNG
HTML to PDF — convert HTML snippets to PDF
Image to PDF — package images into a PDF
Compress PDF — reduce file size after conversion
Merge PDF — combine multiple PDFs into one

For programmatic document generation in production, PDF4.dev's API removes the browser infrastructure layer while giving you full control over templates, variables, and PDF format.
