Document Conversion API for AI Agents: No Subscriptions, No API Keys

Most document conversion APIs are priced for humans. You sign up, pick a plan—500 conversions/month for $29, or 2,000 for $79—and hope your actual usage maps onto one of those tiers. If you’re an AI agent doing burst processing (a hundred conversions one afternoon, nothing for two weeks), you’re either overpaying most of the time or hitting rate limits at the worst moment.

The x402 micropayment model removes that mismatch. You pay per call, around $0.008 for most conversions. Don’t call it, don’t pay. No account required, no API key to rotate, no monthly commitment.

Here’s how PicoPayd’s document conversion services work in practice.

What’s Supported

DOCX → PDF — The most common conversion. Preserves heading structure, table formatting, inline images, and page breaks. Output is print-ready PDF with optional metadata stripping.

HTML → PDF — Provide a raw HTML string or a publicly accessible URL. The service renders it headlessly (Puppeteer on Cloudflare Workers) and returns a PDF. CSS is respected, including external stylesheets if you’re providing a URL. Useful for generating reports, receipts, or archiving web content.

PDF → Markdown — Extracts structured text from PDFs into Markdown format. Handles multi-column layouts, headers, and tables reasonably well. Not perfect for complex layouts, but good enough for most document ingestion pipelines.

Markdown → DOCX/PDF — Useful for agents that generate structured output in Markdown and need to deliver it as a business document.

Plain text extraction — Any document format to plain text. The boring but useful one for agents building RAG pipelines.

Pricing and Economics

Operation	Price per call
DOCX → PDF	$0.008
HTML → PDF	$0.01
PDF → Markdown	$0.008
Markdown → DOCX	$0.007

Compare this to subscription alternatives:

CloudConvert: Starts at $9.99/month for 250 conversions (~$0.04/call)
Zamzar API: $25/month for 100 conversions ($0.25/call)
Adobe PDF Services: $0.05/call (pay-as-you-go)

At $0.008/call, PicoPayd is roughly 5-30x cheaper per call than these options. And because there’s no subscription overhead, an agent doing 10 conversions/month isn’t paying for the other 240 on a 250-conversion plan.
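To make the comparison concrete, here is a small sketch of the arithmetic using the plan figures quoted above. The Plan type and both function names are made up for illustration, and the model ignores overage fees and free tiers, so treat the results as rough.

```typescript
// A flat monthly plan, using the figures quoted in the comparison above.
interface Plan {
  name: string
  monthlyFee: number // USD per month
  includedCalls: number // conversions covered by the fee
}

// What each conversion really costs when the fee is spread over actual usage.
function effectiveCostPerCall(plan: Plan, callsPerMonth: number): number {
  if (callsPerMonth <= 0 || callsPerMonth > plan.includedCalls) {
    throw new RangeError('callsPerMonth must be between 1 and the plan quota')
  }
  return plan.monthlyFee / callsPerMonth
}

// Monthly volume at which per-call spending reaches the flat fee.
function breakEvenCalls(plan: Plan, pricePerCall: number): number {
  return Math.ceil(plan.monthlyFee / pricePerCall)
}

const cloudConvert: Plan = { name: 'CloudConvert', monthlyFee: 9.99, includedCalls: 250 }
```

With the CloudConvert figures, 10 conversions in a month works out to about $1.00 per call, and per-call pricing at $0.008 only reaches the $9.99 fee at around 1,249 calls, well past the 250-call quota.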

Making a Conversion Call

Using the x402 TypeScript SDK:

import { createClient } from '@x402/client'

const client = createClient({
  facilitatorUrl: 'https://x402.org/facilitator',
  wallet: agentWallet, // Your agent's USDC wallet
})

// HTML to PDF
const pdfResponse = await client.fetch(
  'https://api.picopayd.codefission.co.uk/convert/html-to-pdf',
  {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      html: '<html><body><h1>Quarterly Report</h1><p>...</p></body></html>',
      options: {
        format: 'A4',
        margins: { top: '20mm', bottom: '20mm', left: '15mm', right: '15mm' },
        includeHeader: false,
      }
    })
  }
)

const { pdf, metadata } = await pdfResponse.json()
// pdf: base64-encoded PDF
// metadata: { pages: 3, fileSize: 48291, processingTime: 420 }

The SDK handles the 402 payment challenge automatically. Your code doesn’t need any payment-specific logic—it just makes a fetch call and receives the response.
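Since the pdf field comes back base64-encoded, your agent will usually want raw bytes before storing or forwarding the file. A minimal sketch, assuming the field is standard base64 as the comment in the example suggests; decodePdf is a hypothetical helper, not part of the SDK.

```typescript
// Decode a base64 string into bytes and sanity-check that it is a PDF.
function decodePdf(base64Pdf: string): Uint8Array {
  const binary = atob(base64Pdf) // one character per decoded byte
  // Every PDF file begins with the ASCII marker "%PDF-".
  if (binary.slice(0, 5) !== '%PDF-') {
    throw new Error('decoded payload does not look like a PDF')
  }
  const bytes = new Uint8Array(binary.length)
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i)
  }
  return bytes
}
```

The returned Uint8Array can be passed to whatever storage call your runtime offers, for example a file write in Node or an object-store upload.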

For plain curl without the SDK:

# First request returns 402 with payment instructions
curl -X POST https://api.picopayd.codefission.co.uk/convert/html-to-pdf \
  -H "Content-Type: application/json" \
  -d '{"html": "<h1>Test</h1>"}'

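# Returns: 402 Payment Required
# X-Payment-Required: {"amount":"0.01","asset":"USDC","network":"base","recipient":"0x..."}
# Pay as instructed, then repeat the request with proof of payment to receive the PDF
# (this retry is the step the SDK automates)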
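If you handle the 402 yourself, it is worth validating the challenge before paying anything. The sketch below parses a header shaped like the example above and checks it against a spending cap. The field names come from that example; the function names and the budget check are illustrative assumptions, and building the actual payment proof is not covered here.

```typescript
// Shape of the payment challenge, taken from the example header above.
interface PaymentChallenge {
  amount: string
  asset: string
  network: string
  recipient: string
}

// Parse and validate the header value from a 402 response.
function parsePaymentChallenge(headerValue: string | null): PaymentChallenge {
  if (!headerValue) throw new Error('402 response carried no payment header')
  const parsed: unknown = JSON.parse(headerValue)
  if (typeof parsed !== 'object' || parsed === null) {
    throw new Error('payment header is not a JSON object')
  }
  const record = parsed as Record<string, unknown>
  for (const key of ['amount', 'asset', 'network', 'recipient']) {
    if (typeof record[key] !== 'string') throw new Error(`missing field: ${key}`)
  }
  return {
    amount: record.amount as string,
    asset: record.asset as string,
    network: record.network as string,
    recipient: record.recipient as string,
  }
}

// Refuse to pay more than the agent budgeted for a single call.
function withinBudget(challenge: PaymentChallenge, maxUsd: number): boolean {
  const amount = Number(challenge.amount)
  return challenge.asset === 'USDC' && Number.isFinite(amount) && amount <= maxUsd
}
```

A cap like this keeps an autonomous agent from silently paying a price that changed since you last looked at the pricing table.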

How the HTML→PDF Rendering Works

This is worth explaining because it affects what you can and can’t reliably convert.

The service uses Puppeteer in headless Chrome mode, running on Cloudflare Workers. When you provide a URL, it fetches and renders the page fully—executing JavaScript, loading external CSS, waiting for fonts. When you provide raw HTML, it renders in an isolated context without network access (so external resources won’t load unless you inline them or provide a base URL).

What this means practically:

Modern CSS works: Flexbox, Grid, custom properties, @media print rules are all respected
JavaScript-rendered content: If you’re converting a URL, dynamic content renders. If you’re converting raw HTML, client-side-rendered content won’t be populated unless you pre-render it server-side
External fonts: These load when you convert a URL; raw HTML should use system fonts or inline its @font-face declarations
Page breaks: Apply page-break-before: always (or its modern equivalent, break-before: page) to sections that should start on a new page

The rendering happens in roughly 400-800ms depending on page complexity. Heavily dynamic pages with lots of external resources can take up to 2 seconds.
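For raw HTML, the constraints above boil down to sending one self-contained document: styles inlined, system fonts, explicit page breaks. Here is a rough sketch of a builder that does this. The function names and the specific CSS are my own choices, not something the service requires.

```typescript
// Escape text that will be placed inside HTML markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
}

// Build a document with no external resources: inline CSS, system fonts,
// and a page break before every section after the first.
function buildPrintableHtml(title: string, sections: string[]): string {
  const css = [
    'body { font-family: Arial, Helvetica, sans-serif; }',
    'section + section { page-break-before: always; break-before: page; }',
    '@media print { .no-print { display: none; } }',
  ].join('\n')
  // Section bodies are trusted HTML fragments; only the title is escaped.
  const body = sections.map((html) => `<section>${html}</section>`).join('\n')
  return [
    '<!DOCTYPE html>',
    '<html><head><meta charset="utf-8">',
    `<title>${escapeHtml(title)}</title>`,
    `<style>${css}</style>`,
    `</head><body>${body}</body></html>`,
  ].join('\n')
}
```

The resulting string can go straight into the html field of the request body shown earlier.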

Integrating with AI Agent Frameworks

If you’re building agents with LangChain, CrewAI, or direct tool-calling patterns, document conversion slots in cleanly as a tool definition.

LangChain example:

import { DynamicStructuredTool } from '@langchain/core/tools'
import { z } from 'zod'

const htmlToPdfTool = new DynamicStructuredTool({
  name: 'html_to_pdf',
  description: 'Convert HTML content or a URL to a PDF document',
  schema: z.object({
    html: z.string().optional().describe('Raw HTML string to convert'),
    url: z.string().url().optional().describe('URL of page to convert to PDF'),
    format: z.enum(['A4', 'Letter', 'Legal']).default('A4'),
  }),
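  func: async ({ html, url, format }) => {
    // The schema marks both fields optional, so require one of them here
    if (!html && !url) {
      return 'Error: provide either html or url'
    }
    // x402Client is the client returned by createClient() (named client in the earlier example)
    const response = await x402Client.fetch(
      'https://api.picopayd.codefission.co.uk/convert/html-to-pdf',
      {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ html, url, options: { format } })
      }
    )
    if (!response.ok) {
      return `Conversion failed with status ${response.status}`
    }
    // The response also carries the base64 pdf; store it here if needed, the agent only sees the summary
    const { metadata } = await response.json()
    return `PDF generated: ${metadata.pages} pages, ${(metadata.fileSize / 1024).toFixed(1)}KB`
  }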
})

The agent can then call this tool when it needs to produce a document artifact—generating a report, archiving a webpage, or converting research output to a shareable format.

Limitations Worth Knowing

I’d rather tell you where this doesn’t work well than have you find out in production.

Large documents time out. PDFs over ~50 pages or HTML over ~5MB will hit Workers’ CPU time limits and fail rather than finish slowly. For large jobs, split the input into chunks and convert each one separately.
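One way to do the splitting for HTML is to group sections greedily so that each request stays under a byte budget. This is a sketch under my own assumptions: chunkBySize is a hypothetical helper, and the ~5MB figure is approximate, so leave yourself some headroom.

```typescript
// Approximate ceiling from the limitation above; not an exact service limit.
const MAX_BYTES = 5 * 1024 * 1024

// Group parts in order so that each group's UTF-8 size stays within maxBytes.
function chunkBySize(parts: string[], maxBytes: number): string[][] {
  const encoder = new TextEncoder()
  const chunks: string[][] = []
  let current: string[] = []
  let currentBytes = 0
  for (const part of parts) {
    const size = encoder.encode(part).length
    if (size > maxBytes) throw new Error('a single part exceeds the size limit')
    // Start a new chunk when adding this part would overflow the current one.
    if (currentBytes + size > maxBytes && current.length > 0) {
      chunks.push(current)
      current = []
      currentBytes = 0
    }
    current.push(part)
    currentBytes += size
  }
  if (current.length > 0) chunks.push(current)
  return chunks
}
```

Keep in mind that each chunk is a separate paid call, so a document split into four chunks costs four conversions.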

Password-protected PDFs can’t be converted without decryption. The service doesn’t handle these.
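You can avoid paying for a call that is bound to fail by screening files first. Encrypted PDFs declare an /Encrypt entry in their trailer, so a byte search for that marker is a cheap heuristic. This is a sketch, not a full PDF parser: looksEncrypted is a hypothetical helper and it can give a false positive if the same bytes happen to appear elsewhere in the file.

```typescript
// Heuristic check: does the file contain the "/Encrypt" marker anywhere?
function looksEncrypted(pdfBytes: Uint8Array): boolean {
  const needle = Array.from('/Encrypt', (ch) => ch.charCodeAt(0))
  outer: for (let i = 0; i + needle.length <= pdfBytes.length; i++) {
    for (let j = 0; j < needle.length; j++) {
      if (pdfBytes[i + j] !== needle[j]) continue outer
    }
    return true // all marker bytes matched at offset i
  }
  return false
}
```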

Complex scientific documents with equations (MathML, LaTeX) don’t render well. The HTML renderer handles MathJax if it’s loaded, but it’s not reliable.

Scanned PDFs (images with no text layer) pass through the PDF-to-Markdown conversion with no extracted text. You’d need an OCR service first—which is on our roadmap.
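Because a scanned PDF comes back as empty or nearly empty Markdown rather than an error, it helps to check the output before feeding it into a pipeline. A minimal sketch: needsOcr and the 50-characters-per-page threshold are my own guesses, and it assumes you know the page count from somewhere.

```typescript
// Flag conversions whose text yield is too low to be a real text layer.
function needsOcr(markdown: string, pageCount: number, minCharsPerPage = 50): boolean {
  const visible = markdown.replace(/\s+/g, '') // ignore whitespace-only output
  if (pageCount <= 0) return visible.length === 0
  return visible.length / pageCount < minCharsPerPage
}
```

Tune the threshold on your own documents; cover pages and image-heavy slides can legitimately contain very little text.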

These are real limitations, not edge cases. If your agent is processing large research PDFs with mathematical notation, this service isn’t the right tool. For the common cases—converting generated HTML reports, ingesting business documents, archiving web content—it works cleanly.