ParseJet Documentation
ParseJet extracts text from any file or URL. One API call handles PDF, DOCX, YouTube, web pages, images, audio, video, and 25+ more formats.
Quick Start
Get your first parse result in under 60 seconds. No signup required.
Try it instantly
Paste any URL into ParseJet. No API key is needed for your first 3 requests per day.
Get your API key
Sign in with Google or GitHub to get a free API key. Free tier includes 300 requests per month.
Use the result
Every response returns the same JSON structure regardless of input format:
Authentication
ParseJet offers three levels of access. You can start using the API immediately without any authentication.
Tip: You don't need an API key to get started. Just send requests directly: the first 3 per day are free with no signup.
Core Concepts
Supported formats
ParseJet auto-detects the format from the file extension or URL pattern. You don't need to specify the format: just send the file or URL to /v1/parse/auto and ParseJet handles the rest.
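A minimal sketch of sending a URL to /v1/parse/auto, using only the Python standard library. The host in BASE_URL is a placeholder, and the use of POST with a form-encoded body is an assumption inferred from the reference entry that describes url as a form field. The function name build_auto_request is illustrative and not part of any SDK. The request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: placeholder host. Substitute the real ParseJet API base URL.
BASE_URL = "https://api.parsejet.example"


def build_auto_request(target_url: str) -> Request:
    """Build (but do not send) a POST to /v1/parse/auto with a `url` form field."""
    body = urlencode({"url": target_url}).encode("ascii")
    return Request(
        f"{BASE_URL}/v1/parse/auto",
        data=body,
        method="POST",  # assumption: form fields imply a POST request
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


req = build_auto_request("https://example.com/report.pdf")
print(req.get_method(), req.full_url)
```

Passing req to urllib.request.urlopen would send it. No format argument appears anywhere in the request, which is the point of the auto endpoint.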
Credits
Each API request consumes credits based on the complexity of the format being parsed. Simple text files cost 1 credit, while YouTube transcripts cost 5. Your monthly credit allowance depends on your plan.
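To budget a batch of requests, the per-format costs quoted in this documentation can be summed up front. The sketch below is only arithmetic over those quoted numbers. The format keys ("text", "pdf", and so on) are illustrative labels, not API values, and formats whose cost is not stated in this document are left out.

```python
# Credit costs quoted in this documentation. The keys are illustrative labels.
CREDIT_COST = {
    "text": 1,
    "office": 2,   # DOCX, PPTX, XLSX, CSV
    "pdf": 3,
    "webpage": 3,
    "youtube": 5,
}


def estimate_credits(jobs: dict) -> int:
    """Return total credits for a mapping of format label -> request count."""
    return sum(CREDIT_COST[fmt] * count for fmt, count in jobs.items())


# 10 PDFs, 4 YouTube transcripts, 20 text files: 10*3 + 4*5 + 20*1
total = estimate_credits({"pdf": 10, "youtube": 4, "text": 20})
print(total)  # 70
```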
Output format
By default, ParseJet returns raw extracted text. Add ?output_format=markdown to any request to get post-processed output with detected headings, lists, tables, and code blocks.
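As a rough illustration, the output_format query parameter can be appended to an endpoint URL without disturbing parameters that are already present. The host below is a placeholder, and with_output_format is a hypothetical helper name.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_output_format(endpoint_url: str, output_format: str = "markdown") -> str:
    """Return endpoint_url with output_format set, keeping other query params."""
    parts = urlsplit(endpoint_url)
    query = dict(parse_qsl(parts.query))
    query["output_format"] = output_format
    return urlunsplit(parts._replace(query=urlencode(query)))


# Assumption: placeholder host.
url = with_output_format("https://api.parsejet.example/v1/parse/auto")
print(url)
```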
Guide
Parse a PDF
Extract text from any PDF file, including scanned documents and multi-page reports.
Upload a PDF file
Convert to Markdown
Add output_format=markdown to preserve document structure:
Credit cost: 3 credits per PDF. Supports files up to your plan's file size limit (10 MB to 200 MB, depending on plan).
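The upload could look roughly like the following sketch, which builds a multipart request by hand with the standard library. The file field name and the /v1/parse/auto path come from the API reference in this document. The host is a placeholder, POST is an assumption, and build_pdf_upload is a hypothetical helper. The request is built but not sent.

```python
import uuid
from urllib.request import Request

# Assumption: placeholder host. Substitute the real ParseJet API base URL.
BASE_URL = "https://api.parsejet.example"


def build_pdf_upload(filename: str, pdf_bytes: bytes, markdown: bool = False) -> Request:
    """Build (but do not send) a multipart POST with one PDF in the `file` field."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")

    url = f"{BASE_URL}/v1/parse/auto"
    if markdown:
        url += "?output_format=markdown"  # ask for structured Markdown output

    return Request(
        url,
        data=head + pdf_bytes + tail,
        method="POST",  # assumption: uploads use POST
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


# Stand-in bytes; in practice read them with open(path, "rb").read().
req = build_pdf_upload("report.pdf", b"%PDF-1.7 stand-in bytes", markdown=True)
print(req.full_url)
```

The helper assumes a filename without double quotes. A client library such as requests would handle multipart encoding and escaping for you.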
Guide
YouTube Transcripts
Get the full transcript of any YouTube video. Supports auto-generated captions in 100+ languages.
Get a transcript
Specify language
Use the language parameter for non-English videos:
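One way this might look, assuming language is sent as a form field next to url and takes a short language code such as "es". Neither detail is confirmed by this page. The host is a placeholder, VIDEO_ID stands in for a real video ID, and build_transcript_request is a hypothetical helper that builds the request without sending it.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: placeholder host. Substitute the real ParseJet API base URL.
BASE_URL = "https://api.parsejet.example"


def build_transcript_request(video_url: str, language: Optional[str] = None) -> Request:
    """Build (but do not send) a POST to /v1/parse/youtube."""
    fields = {"url": video_url}
    if language:
        # Assumption: `language` travels as a form field with a short code.
        fields["language"] = language
    return Request(
        f"{BASE_URL}/v1/parse/youtube",
        data=urlencode(fields).encode("ascii"),
        method="POST",
    )


req = build_transcript_request("https://www.youtube.com/watch?v=VIDEO_ID", language="es")
print(req.data.decode("ascii"))
```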
Or use auto-detect
The /v1/parse/auto/url endpoint automatically detects YouTube URLs:
Credit cost: 5 credits per YouTube video. Metadata includes video_id, channel, and duration.
Guide
Web Scraping
Extract the main content from any web page. ParseJet automatically removes navigation, ads, sidebars, and boilerplate.
Credit cost: 3 credits per web page. Returns clean text with title and source URL in metadata.
Guide
Office Documents
Parse Word (DOCX), Excel (XLSX), PowerPoint (PPTX), and CSV files. Just upload the file and ParseJet detects the format automatically.
Credit cost: 2 credits per document. Supported: DOCX, PPTX, XLSX, CSV.
API Reference
Response Format
All endpoints return the same JSON structure:
/v1/parse/auto
The recommended endpoint. Auto-detects format from file extension or URL type. Accepts file (multipart) or url (form field), not both.
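The "file or url, not both" rule can be enforced on the client before any request is made. The sketch below is one possible guard. auto_payload is a hypothetical helper, and only the field names file and url come from this reference.

```python
from typing import Optional


def auto_payload(file_path: Optional[str] = None, url: Optional[str] = None) -> dict:
    """Return the single form field to send to /v1/parse/auto.

    Raises ValueError unless exactly one of file_path or url is given.
    """
    if (file_path is None) == (url is None):
        raise ValueError("provide exactly one of file_path or url")
    if file_path is not None:
        return {"file": file_path}
    return {"url": url}


print(auto_payload(url="https://example.com/article"))
```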
/v1/parse/auto/url
Parse any URL. Automatically distinguishes YouTube from regular web pages.
/v1/parse/auto/file
Parse any uploaded file. Detects the format from the file extension and falls back to content-based detection.
/v1/parse/webpage
Extract main content from a web page. Removes navigation, ads, and boilerplate.
/v1/parse/youtube
Extract transcript from a YouTube video. Metadata includes video_id, channel, and duration.
/v1/parse/audio
Parse audio files. Supports MP3, WAV, M4A, OGG, FLAC, WebM. Max 25MB.
/v1/parse/video
Extract audio from video for transcription. Supports MP4, MKV, AVI, MOV, WebM.
/v1/parse/epub
Parse an EPUB ebook. Extracts text organized by chapter.
/v1/parse/feed
Parse an RSS or Atom feed. OPML is also supported via /v1/parse/opml.
/v1/parse/image
Analyze an image. Supports JPG, PNG, GIF, BMP, WebP, TIFF. Max 20MB.
/v1/parse/image/ocr
Extract text from an image via OCR.
SDKs
Official SDKs
TypeScript / JavaScript
Python
AI Agents
MCP Server
Use ParseJet as an MCP (Model Context Protocol) server with Claude Code, Cursor, or any MCP-compatible AI agent.
Install
Claude Code
Add to your project's .claude/settings.json:
Cursor
Go to Settings → MCP Servers and add a new server:
Claude.ai (Remote)
For Claude.ai web, use the remote HTTP endpoint; no local install is needed:
Go to Claude.ai → Settings → Integrations → Add MCP Server, then enter the URL above.
Available tools
Rate Limits & Pricing
ParseJet uses a credit-based system. Each request consumes credits based on the format complexity.
Response headers include X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, and Retry-After on 429 responses.
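A client can use these headers to decide how long to pause after a 429. The sketch below assumes Retry-After carries a number of seconds (HTTP also allows a date there, which this sketch does not parse) and falls back to a fixed one-second delay otherwise. seconds_to_wait is a hypothetical helper, and the fallback value is arbitrary.

```python
from typing import Mapping, Optional


def seconds_to_wait(status: int, headers: Mapping[str, str]) -> Optional[float]:
    """Return how long to pause before retrying, or None if no retry is needed."""
    if status != 429:
        return None
    try:
        # Assumption: Retry-After is expressed in seconds.
        return max(0.0, float(headers.get("Retry-After", "")))
    except ValueError:
        return 1.0  # header missing or not numeric: use an arbitrary fallback


print(seconds_to_wait(429, {"Retry-After": "12"}))  # 12.0
print(seconds_to_wait(200, {}))                     # None
```

With a plain dict the header lookup is case-sensitive. The headers object returned by urllib or http.client matches names case-insensitively.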
Error Codes
All errors return JSON with error and message fields.
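Because every error body carries the same two fields, a client can turn any failed response into one exception type. The sketch below is one way to do that. ParseJetError and error_from_body are hypothetical names, and the sample error value "rate_limited" is invented for illustration, not a documented code.

```python
import json


class ParseJetError(Exception):
    """Client-side wrapper around an error response (hypothetical name)."""

    def __init__(self, status: int, error: str, message: str):
        super().__init__(f"{status} {error}: {message}")
        self.status = status
        self.error = error
        self.message = message


def error_from_body(status: int, body: bytes) -> ParseJetError:
    """Build an exception from the `error` and `message` fields of a JSON body."""
    try:
        payload = json.loads(body)
    except ValueError:  # not valid JSON (or not valid UTF-8)
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    return ParseJetError(
        status,
        str(payload.get("error", "unknown")),
        str(payload.get("message", "")),
    )


# Illustrative body only; the error value shown is made up.
err = error_from_body(429, b'{"error": "rate_limited", "message": "Slow down"}')
print(err)  # 429 rate_limited: Slow down
```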