You can connect ChatGPT to a PDF API in three ways: a Custom GPT with an OpenAPI Action, the experimental Model Context Protocol (MCP) server connection, or the Assistants API with function calling. Each path has its own setup, auth model, and tradeoffs. The pattern shared by all three is the same: ChatGPT decides when to call the API, posts the data, and returns a signed download URL to the user because ChatGPT cannot render PDF files inline.
This guide walks through all three paths end to end, with the exact OpenAPI schema for the Custom GPT action, the function definition for the Assistants API, and the MCP setup. It explains why the delivery: "url" mode is the unlock for every approach, how to handle authentication, and what to do when the model picks the wrong template.
The three integration paths at a glance
Each path lives at a different point on the trade-off curve between setup effort, end-user experience, and flexibility.
| Path | Setup time | Best for | Auth model | Status |
|---|---|---|---|---|
| Custom GPT Action | 15 minutes | Sharing with ChatGPT users | Bearer token in GPT config | Stable |
| MCP server | 5 minutes | Power users with MCP-enabled clients | OAuth or Bearer in client config | Experimental |
| Assistants API | 1 to 4 hours | Production apps embedding ChatGPT | Server-side API key | Stable |
The Custom GPT path is the most common because it requires zero code and ships a working integration to anyone with a ChatGPT Plus subscription. The Assistants API path is the most flexible because you control the entire conversation loop. The MCP path is the most exciting because it shares the same server with Claude, Cursor, and any other MCP-capable client.
Path 1: Custom GPT with an OpenAPI Action
A Custom GPT is a configured version of ChatGPT with a system prompt, knowledge files, and one or more Actions. An Action is an OpenAPI 3.1 schema that ChatGPT reads, understands, and calls when it decides the user's request needs an external tool. For PDF generation, the entire integration is one POST endpoint.
Step 1: Create the Custom GPT
Go to chatgpt.com, click your profile, then "My GPTs", then "Create a GPT". The editor opens with a Configure tab. Fill in the basics: name ("PDF Generator"), description ("Generates branded PDFs from natural language"), and an instructions block that tells the model how to behave. The instructions block is the most important part of the GPT.
You are a PDF generation assistant. When the user asks for a PDF (an
invoice, certificate, receipt, report, letter), call the renderPdf
action with the appropriate template_id and data. After the call
returns, present the download URL as a clickable link in markdown:
"[Download your PDF](url)". Never show the raw JSON response.
Available templates:
- invoice: requires customer_name, invoice_number, items[], total
- receipt: requires customer_name, amount, date
- certificate: requires recipient_name, course_name, issue_date
If the user does not specify a template, ask which one they want.
If a required field is missing, ask the user for it before calling
the action.
Step 2: Add the OpenAPI Action
Scroll to "Actions" and click "Create new action". Paste the OpenAPI 3.1 schema below. ChatGPT validates it on save and shows you the available operations.
openapi: 3.1.0
info:
title: PDF4.dev API
description: Generate PDFs from saved templates with dynamic data.
version: 1.0.0
servers:
- url: https://pdf4.dev/api/v1
paths:
/render:
post:
operationId: renderPdf
summary: Render a PDF from a saved template
description: >-
Renders a PDF document from a saved template and dynamic data.
Returns a signed download URL that expires after 24 hours.
Use this whenever the user asks for an invoice, receipt,
certificate, or any other branded document.
requestBody:
required: true
content:
application/json:
schema:
type: object
required:
- template_id
- data
- delivery
properties:
template_id:
type: string
description: The slug of the saved template (invoice, receipt, certificate).
data:
type: object
description: Key-value pairs that fill the template variables.
additionalProperties: true
delivery:
type: string
enum: [url]
description: Always use "url" so ChatGPT receives a download link.
responses:
'200':
description: PDF rendered successfully.
content:
application/json:
schema:
type: object
properties:
url:
type: string
description: Signed download URL for the rendered PDF.
expires_at:
type: string
format: date-time
size_bytes:
type: integer
duration_ms:
type: integer
components:
securitySchemes:
bearerAuth:
type: http
scheme: bearer
security:
  - bearerAuth: []
Two details matter here. The delivery parameter is constrained to the literal string url via the enum keyword. This forces ChatGPT to send delivery: "url" on every call, which keeps the response small (a JSON object with a URL, not a 4 MB base64 string). The description fields are not decoration: the model reads them to decide when and how to call the action, so write them like prompts.
Step 3: Add authentication
In the same editor, click the Authentication button next to the Actions panel. Choose "API Key", auth type "Bearer", and paste your p4_live_... token. Save. The token is stored encrypted on OpenAI's servers and sent in the Authorization header of every action call. End users never see it and cannot extract it.
The API key in a Custom GPT is shared across all users of the GPT. If you publish the GPT publicly, every conversation runs against your account and counts against your quota. Cap your PDF API key with a render-only scope and a usage limit before sharing.
Step 4: Test the action
Save the GPT, then in the preview pane on the right, type a request: "Make me an invoice for Acme Corp, invoice number INV-001, two items: consulting 10 hours at $150 and a software license at $500." ChatGPT recognizes the action, builds the payload, calls the API, and replies with a markdown link to the rendered PDF.
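Behind the scenes, ChatGPT turns that sentence into a renderPdf call. The request body it sends looks roughly like the following; the line-item structure and the field names inside data are illustrative and depend on how your invoice template defines its variables.

```json
{
  "template_id": "invoice",
  "data": {
    "customer_name": "Acme Corp",
    "invoice_number": "INV-001",
    "items": [
      { "description": "Consulting, 10 hours @ $150", "amount": 1500 },
      { "description": "Software license", "amount": 500 }
    ],
    "total": 2000
  },
  "delivery": "url"
}
```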
If the call fails, ChatGPT shows the raw error and tries to recover. The most common failure is a missing template variable, which is why the system prompt asks the model to validate fields before calling the action. The second most common failure is wrong auth, which surfaces as a 401 in the action log.
Path 2: MCP server connection
The Model Context Protocol (MCP) is an open standard from Anthropic for connecting AI clients to external tools, resources, and prompts. As of 2026, ChatGPT supports remote MCP servers in the deep research workflow and is rolling out broader developer mode access. PDF4.dev already publishes an MCP server with 14 tools, including render_pdf.
Why MCP is interesting for ChatGPT users
Custom GPT Actions are tied to a single GPT and a single OpenAPI schema. MCP servers expose a richer protocol: tools, resources (read-only documents), and prompts (parameterized templates the client can invoke). The same MCP server works in Claude Desktop, Cursor, Windsurf, ChatGPT (where supported), and any future MCP client without modification.
For PDF generation, this means you configure the connection once and every MCP-enabled tool can render PDFs through it. No more copying OpenAPI schemas between clients.
Connecting ChatGPT to the PDF4.dev MCP server
The connection lives in ChatGPT's developer settings. The exact UI changes as the feature rolls out, but the configuration shape is stable:
{
"name": "pdf4dev",
"type": "http",
"url": "https://pdf4.dev/api/mcp",
"auth": {
"type": "bearer",
"token": "p4_live_YOUR_KEY"
}
}
Once connected, ChatGPT discovers the available tools automatically. Type a request like "Generate a PDF receipt for $42.00 paid by Jane Doe today" and ChatGPT picks the render_pdf tool, builds the parameters from your message, and shows the resulting download URL.
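You never write the protocol messages yourself, but for the curious: when ChatGPT invokes the tool, the MCP client sends a JSON-RPC tools/call request shaped roughly like this. The render_pdf argument names are an assumption based on the REST endpoint shown earlier; the values are illustrative.

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "render_pdf",
    "arguments": {
      "template_id": "receipt",
      "data": { "customer_name": "Jane Doe", "amount": "$42.00", "date": "2026-02-14" }
    }
  }
}
```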
MCP support in ChatGPT is rolling out gradually and the developer mode UI changes frequently. The PDF4.dev MCP endpoint at https://pdf4.dev/api/mcp is the same one Claude Desktop, Cursor, and Windsurf use, so any progress you make on one client transfers to the others. See the MCP protocol explained article for the full background.
What MCP unlocks beyond a Custom GPT
MCP servers can expose three things: tools (callable functions), resources (read-only documents the model can read into context), and prompts (parameterized prompt templates). PDF4.dev's MCP server publishes:
- 14 tools: render_pdf, preview_template, list_templates, get_template, create_template, update_template, delete_template, list_components, get_component, create_component, update_component, delete_component, list_logs, get_info
- 4 resources: quickstart, Handlebars helpers reference, format presets, components syntax
- 3 prompts: generate-invoice, create-template-from-description, debug-render-error
A Custom GPT can do render_pdf and list_templates. The full MCP server lets the model debug a failed render by reading the resources and following the debug-render-error prompt. That is a much richer agent experience.
Path 3: Assistants API with function calling
The Assistants API is OpenAI's developer-facing platform for building stateful AI applications. You define an assistant with a system prompt and a list of functions (tools), then create threads and runs against it. Function calling works the same way it does in the Chat Completions API: the model emits a JSON payload, your code runs the function, and you submit the result back.
This is the right path when you want to embed ChatGPT-like behavior in your own product instead of pointing users at chatgpt.com.
Defining the function
import OpenAI from 'openai';
const openai = new OpenAI();
const assistant = await openai.beta.assistants.create({
name: 'PDF Generator',
model: 'gpt-4.1',
instructions: `You generate branded PDFs from user requests. When the
user asks for a document, call the render_pdf function with the right
template_id and data. After the call returns, present the URL as a
clickable link.`,
tools: [
{
type: 'function',
function: {
name: 'render_pdf',
description: 'Generate a PDF from a saved template. Returns a signed download URL.',
parameters: {
type: 'object',
properties: {
template_id: {
type: 'string',
description: 'Slug of the template (invoice, receipt, certificate).',
},
data: {
type: 'object',
description: 'Key-value pairs filling the template variables.',
additionalProperties: true,
},
},
required: ['template_id', 'data'],
},
},
},
],
});
Handling the function call
When you create a run and the assistant decides to call render_pdf, the run pauses with status requires_action. Your code reads the tool call, posts to PDF4.dev, and submits the URL back as the tool output.
async function handleRun(threadId: string, runId: string) {
let run = await openai.beta.threads.runs.retrieve(threadId, runId);
while (run.status === 'requires_action') {
const calls = run.required_action!.submit_tool_outputs.tool_calls;
const outputs = [];
for (const call of calls) {
if (call.function.name === 'render_pdf') {
const args = JSON.parse(call.function.arguments);
const r = await fetch('https://pdf4.dev/api/v1/render', {
method: 'POST',
headers: {
Authorization: `Bearer ${process.env.PDF4_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({ ...args, delivery: 'url' }),
});
const result = await r.json();
// Critical: return ONLY the URL, not the full PDF binary
outputs.push({
tool_call_id: call.id,
output: JSON.stringify({ url: result.url, expires_at: result.expires_at }),
});
}
}
run = await openai.beta.threads.runs.submitToolOutputs(threadId, runId, {
tool_outputs: outputs,
});
while (run.status === 'in_progress' || run.status === 'queued') {
await new Promise((r) => setTimeout(r, 500));
run = await openai.beta.threads.runs.retrieve(threadId, runId);
}
}
return run;
}
The critical line is output: JSON.stringify({ url: result.url, expires_at: result.expires_at }). Do not include the full PDF binary or even the full API response. The output goes back into the model's context, and a 4 MB base64 string would consume more than a million tokens. Pass the URL string and let your downstream code or the user fetch the actual file.
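For context, here is a minimal sketch of the full round trip around handleRun: create a thread, post the user's message, start a run, wait for it to pause for tool output, and read the reply. It assumes the openai client, assistant, and handleRun from the snippets above; the user message is just an example.

```typescript
// Minimal round trip: thread -> message -> run -> tool handling -> reply.
const thread = await openai.beta.threads.create();

await openai.beta.threads.messages.create(thread.id, {
  role: 'user',
  content: 'Make me a receipt for Jane Doe, $42.00, dated today.',
});

let run = await openai.beta.threads.runs.create(thread.id, {
  assistant_id: assistant.id,
});

// Poll until the run either finishes or pauses for tool output.
while (run.status === 'queued' || run.status === 'in_progress') {
  await new Promise((r) => setTimeout(r, 500));
  run = await openai.beta.threads.runs.retrieve(thread.id, run.id);
}

// If the model decided to call render_pdf, this resolves the tool call.
run = await handleRun(thread.id, run.id);

// The newest message (index 0) contains the markdown download link.
const messages = await openai.beta.threads.messages.list(thread.id);
console.log(messages.data[0].content);
```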
Why delivery: "url" is the unlock for every path
ChatGPT, regardless of integration path, has the same fundamental limitation: it cannot render binary PDF files in its message stream. It can show text, code blocks, markdown, and images. A PDF is none of those.
The pattern that works in every path is the same. Your PDF API generates the file, stores it in temporary storage, signs a URL with a short expiry (24 hours is typical), and returns only the URL string to ChatGPT. The model then formats the URL as a clickable markdown link. The user clicks, the browser downloads the PDF. Total tokens consumed: roughly 50.
Without delivery: "url":
- Default mode: returns the binary in the HTTP body. Useless for ChatGPT, which sees raw bytes as garbage.
delivery: "base64": returns a base64 string in JSON. A 1 MB PDF becomes a 1.3 MB base64 string and consumes ~325,000 tokens. Most assistants truncate it.
With delivery: "url":
- Returns { "url": "...", "expires_at": "...", "size_bytes": 42511, "duration_ms": 287 }. About 200 characters total, ~50 tokens. The URL is a signed token, so it works without the API key.
PDF4.dev's signed URLs are HMAC-SHA256 tokens keyed on a server secret, valid for 24 hours, served by a public endpoint with no auth required. The user clicks, the file downloads, the URL expires. Clean and stateless.
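PDF4.dev does this signing for you, but if you are rolling your own delivery layer, the technique is a few lines of standard crypto. The sketch below is a generic illustration, not PDF4.dev's actual scheme; the hostname, query parameters, and TTL are placeholders.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

const SECRET = process.env.URL_SIGNING_SECRET!; // server-side only, never sent to clients

// Produce a download URL that proves itself: anyone holding it can fetch the
// file until `expires`, and nobody can forge one without the secret.
function signDownloadUrl(fileId: string, ttlSeconds = 24 * 60 * 60): string {
  const expires = Math.floor(Date.now() / 1000) + ttlSeconds;
  const sig = createHmac('sha256', SECRET).update(`${fileId}.${expires}`).digest('hex');
  return `https://files.example.com/download/${fileId}?expires=${expires}&sig=${sig}`;
}

// The public download endpoint recomputes the HMAC and checks the expiry.
function verifySignature(fileId: string, expires: number, sig: string): boolean {
  if (expires < Math.floor(Date.now() / 1000)) return false; // link expired
  const expected = createHmac('sha256', SECRET).update(`${fileId}.${expires}`).digest('hex');
  return expected.length === sig.length &&
    timingSafeEqual(Buffer.from(expected), Buffer.from(sig));
}
```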
Common gotchas
These five issues account for most failed integrations.
The model picks the wrong template. Solution: make the operation description more specific, list available templates in the system prompt, and add a "validate before calling" instruction. Custom GPTs in particular need explicit instructions because the model has no built-in knowledge of your template library.
Auth fails on first call. Solution: test the API endpoint with curl before adding it to ChatGPT. A 401 from inside a Custom GPT is harder to debug than a 401 from your terminal.
Rate limits hit during testing. Solution: cap your PDF API key with a low rate limit during development, then raise it for production. Hitting your monthly quota in a debugging session because the model called the action 50 times is a common newbie mistake.
ChatGPT hallucinates fields. If the model invents a field that does not exist in your template, the API returns a 400 with a useful error. Make sure your error responses are descriptive (PDF4.dev returns { error: { type, code, message } }), and the model will read the message and retry with the correct field.
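As a concrete example, a missing-variable failure might come back as a body like this. The type, code, and message values here are hypothetical; only the { error: { type, code, message } } shape is fixed.

```json
{
  "error": {
    "type": "validation_error",
    "code": "missing_variable",
    "message": "Template 'invoice' requires 'invoice_number', but it was not provided in data."
  }
}
```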
The user cannot find the link. ChatGPT sometimes summarizes the action result instead of showing the link. Add an instruction in the system prompt: "Always show the download URL as a markdown link, never as plain text and never summarized as 'I generated your PDF'."
Which path should you pick?
Use this decision shortcut.
- You want the easiest setup and to share the integration with anyone: Custom GPT with an Action.
- You want to embed ChatGPT in your own product: Assistants API with function calling.
- You want to share the same server across Claude, Cursor, ChatGPT, and other MCP clients: MCP server.
- You are building a power-user tool for yourself: MCP server (it composes with everything else).
For most teams, the answer is "Custom GPT first, MCP second, Assistants API only if you are building a product on top of ChatGPT."
FAQ
Can ChatGPT generate PDF files directly?
ChatGPT does not have a built-in PDF generation tool. It can produce text, code, and analysis, but creating a PDF requires calling an external API. The three official paths are Custom GPT Actions (OpenAPI-based), MCP server connections (experimental), and the Assistants API with function calling.
Does ChatGPT support MCP?
MCP support in ChatGPT is rolling out gradually starting in 2025-2026. As of writing, deep research mode supports remote MCP servers and a developer mode is in preview. Custom GPT Actions remain the most stable production path for connecting ChatGPT to external APIs.
How do I show a generated PDF inside ChatGPT?
ChatGPT does not render PDF files inline in chat messages. The pattern is to return a download URL from your API and let ChatGPT display it as a link the user clicks. Use a signed URL with a 24-hour expiry so the link works without exposing your API key.
Where do I store the API key in a Custom GPT?
In the Custom GPT editor, add an Authentication block with type "API Key", auth type "Bearer", and paste your token. The key is stored encrypted on OpenAI's servers and sent in the Authorization header of every action call. End users never see the key.
Can a Custom GPT call multiple endpoints?
Yes. The OpenAPI schema you upload to a Custom GPT can define any number of paths and operations. ChatGPT picks the right endpoint based on the user's request and the operation descriptions in the schema, so write clear summaries.
What happens if my PDF is larger than 4 MB in the Assistants API?
Function call results in the Assistants API are capped by the model's context window. A 4 MB PDF becomes roughly a 5.3 MB base64 string, on the order of 1.3 million tokens, far more than any current context window can hold. Use the URL delivery mode of your PDF API and return only the URL string from the function.
Is there a rate limit on Custom GPT actions?
OpenAI rate-limits Custom GPT actions per user. The limits are not publicly documented but are generous enough for typical use. Your own PDF API rate limits apply on top, so configure both with the expected usage in mind.
Wire ChatGPT to PDF4.dev
PDF4.dev publishes an OpenAPI spec at https://pdf4.dev/api/v1/openapi.json and an MCP server at https://pdf4.dev/api/mcp. Both work with ChatGPT today (Custom GPT Actions are stable; MCP is rolling out). The /ai-integration page has copy-paste configurations for ChatGPT, Claude Desktop, Cursor, Windsurf, VS Code, and the rest.