The Data Extraction API accepts PDFs, images, and Office documents. The API automatically detects the file type from the content; you don’t need to specify it explicitly.
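To illustrate what content-based detection means in general, here is a minimal sketch that identifies a few of the formats below from their well-known leading bytes ("magic numbers"). It is not the API's actual detection logic, which is not described here; the function name `sniff_mime_type` and the selection of signatures are assumptions made for this example.

```python
from typing import Optional

# Well-known leading byte signatures for a few common formats.
# Illustrative subset only, not the API's real detection table.
_SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"II*\x00", "image/tiff"),  # little-endian TIFF
    (b"MM\x00*", "image/tiff"),  # big-endian TIFF
    (b"BM", "image/bmp"),
]


def sniff_mime_type(data: bytes) -> Optional[str]:
    """Guess a MIME type from the first bytes of a file (hypothetical helper)."""
    for magic, mime in _SIGNATURES:
        if data.startswith(magic):
            return mime
    # WebP is a RIFF container with "WEBP" at byte offsets 8-11.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    return None
```

Because detection of this kind looks at the bytes rather than the filename, a PDF saved with a wrong extension is still recognized as a PDF.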
PDFs
| Extension | MIME type |
| --- | --- |
| PDF | application/pdf |
Images
| Extension | MIME type |
| --- | --- |
| PNG | image/png |
| JPG / JPEG | image/jpeg |
| TIFF | image/tiff |
| BMP | image/bmp |
| GIF | image/gif |
| WEBP | image/webp |
| SVG | image/svg+xml |
| HEIC | image/heic |
| TGA | image/x-tga |
| EPS | image/postscript |
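Although the API detects the type for you, a client may still want to filter out obviously unsupported files before uploading. The sketch below encodes the PDF and image types listed above as a lookup table; the names `SUPPORTED_MIME_TYPES` and `expected_mime_type` are hypothetical and not part of the API.

```python
from pathlib import Path
from typing import Optional

# Extension-to-MIME mapping copied from the PDF and image tables above.
# Office document types are not included in this sketch.
SUPPORTED_MIME_TYPES = {
    ".pdf": "application/pdf",
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".tiff": "image/tiff",
    ".bmp": "image/bmp",
    ".gif": "image/gif",
    ".webp": "image/webp",
    ".svg": "image/svg+xml",
    ".heic": "image/heic",
    ".tga": "image/x-tga",
    ".eps": "image/postscript",
}


def expected_mime_type(filename: str) -> Optional[str]:
    """Return the listed MIME type for a filename, or None if it is not listed."""
    return SUPPORTED_MIME_TYPES.get(Path(filename).suffix.lower())
```

For example, `expected_mime_type("scan.JPEG")` returns `"image/jpeg"`, while a `.txt` file returns `None` and can be skipped before any request is made.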
Office documents
The API supports Word documents, spreadsheets, and presentations across both modern Open XML formats and legacy binary formats.