This guide explains how to use the PDF-to-PDF/A API to convert PDF documents to ISO-compliant PDF/A format for long-term preservation and archival purposes.

PDF/A

PDF/A is a document format intended for long-term preservation. The PDF/A conversion API supports converting source files into all PDF/A versions and conformance levels:

  • PDF/A-1a, PDF/A-1b
  • PDF/A-2a, PDF/A-2u, PDF/A-2b
  • PDF/A-3a, PDF/A-3u, PDF/A-3b
  • PDF/A-4, PDF/A-4e, PDF/A-4f

For more information on the long-term preservation of documents, refer to our demo video or our complete guide to PDF/A.

Configuring PDF/A conversion

PDF/A documents are intended for long-term preservation, and their structure differs from that of regular PDF documents. To ensure compliance with your chosen conformance level, the conversion process may change the document’s content or appearance, for example by adding, editing, or removing document structure elements or by embedding fonts.

In some cases, direct conversion isn’t possible. The PDF/A conversion API then uses other techniques such as vectorization and rasterization:

  • Vectorization means that if some document elements cannot be used directly in the PDF/A output, they’re embedded in the output document as vector-based graphic elements. This technique is typically used for fonts and paths.
  • Rasterization means that if some document content cannot be used directly in the PDF/A output, it’s embedded in the output document as raster images.

Both approaches result in the loss of fonts and text information because the text is converted into shapes (vectorization) or raster images (rasterization). Text information can later be recovered using optical character recognition (OCR).

To control whether these techniques may be used, set the vectorization and rasterization options in the output configuration. The following request enables both by setting them to true.

curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer your_api_key_here" \
  -o result.pdf \
  --fail \
  -F document=@document.pdf \
  -F instructions='{
    "parts": [
      {
        "file": "document"
      }
    ],
    "output": {
      "type": "pdfa",
      "conformance": "pdfa-2a",
      "vectorization": true,
      "rasterization": true
    }
  }'
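
When the conformance level or the two fallback options need to vary between requests, it can help to generate the instructions JSON rather than edit it by hand. The following is a minimal sketch, not part of the API: build_instructions is a hypothetical helper name, only "pdfa-2a" is confirmed by the request above (other identifiers such as "pdfa-1b" are assumed to follow the same pattern), and passing false is assumed to disable the corresponding technique.

```shell
#!/bin/sh
# Hypothetical helper (not part of the API): prints the instructions JSON
# for a given conformance level and vectorization/rasterization flags.
# Usage: build_instructions CONFORMANCE [VECTORIZATION] [RASTERIZATION]
build_instructions() {
  conformance="$1"
  vectorization="${2:-true}"   # defaults mirror the request above
  rasterization="${3:-true}"

  # Reject anything other than JSON booleans so the payload stays valid.
  case "$vectorization:$rasterization" in
    true:true|true:false|false:true|false:false) ;;
    *) echo "vectorization and rasterization must be true or false" >&2
       return 1 ;;
  esac

  printf '{"parts":[{"file":"document"}],"output":{"type":"pdfa","conformance":"%s","vectorization":%s,"rasterization":%s}}' \
    "$conformance" "$vectorization" "$rasterization"
}

# Example call, commented out because it needs a real API key and document.pdf.
# Here rasterization stays enabled while vectorization is (assumed to be) disabled:
# curl -X POST https://api.nutrient.io/build \
#   -H "Authorization: Bearer your_api_key_here" \
#   -o result.pdf \
#   --fail \
#   -F document=@document.pdf \
#   -F instructions="$(build_instructions pdfa-2a false true)"
```

Because both techniques discard text information, a setup like this makes it easier to try a conversion with one or both disabled first and enable them only if the API reports that direct conversion isn’t possible.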