Editing PDF metadata with Nutrient Python SDK
PDF metadata — title, author, creation date, and custom properties — is crucial for document management, search functionality, and compliance. Being able to programmatically edit this metadata enables organizations to standardize document properties across large repositories and maintain consistent, searchable records.
How Nutrient helps you achieve this
Nutrient Python SDK handles PDF metadata manipulation. With the SDK, you don’t need to worry about:
- Parsing PDF internal structures
- Managing XMP schemas
- Handling metadata encoding
- Complex XML manipulation
Instead, Nutrient provides an API that handles all the complexity behind the scenes, letting you focus on your business logic.
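To give a sense of the manual work the list above refers to, here is a minimal sketch that changes a title directly in an XMP packet using only Python's standard library. It does not use the Nutrient SDK. The packet is a hand-written sample for illustration, not output captured from the SDK, and the helper name `set_xmp_title` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespaces that XMP uses for RDF containers and Dublin Core properties.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}
# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hand-written sample packet, for illustration only (not SDK output).
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>
        <rdf:Alt>
          <rdf:li xml:lang="x-default">Old title</rdf:li>
        </rdf:Alt>
      </dc:title>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""


def set_xmp_title(xmp_xml: str, new_title: str) -> str:
    """Replace the x-default dc:title in an XMP packet (hypothetical helper)."""
    root = ET.fromstring(xmp_xml)
    # dc:title is a language-alternative array: rdf:Alt holding rdf:li items.
    for li in root.findall(".//dc:title/rdf:Alt/rdf:li", NS):
        if li.get(XML_LANG) == "x-default":
            li.text = new_title
    return ET.tostring(root, encoding="unicode")


updated = set_xmp_title(SAMPLE_XMP, "New title value")
```

Even this small case has to deal with namespaces, the `rdf:Alt` container, and language tags. A complete manual solution would also have to read and write the packet inside the PDF file itself, which this sketch does not attempt.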
Complete implementation
Below is a complete working example that demonstrates editing PDF metadata, presented step by step. The first step sets up the Python application. The import statements bring in the classes the example needs from the Nutrient SDK:

    from nutrient_sdk import Document
    from nutrient_sdk import PdfEditor
    from nutrient_sdk import NutrientException

This line opens the PDF file. The context manager syntax ensures the document is automatically closed when you’re done, preventing resource leaks:

    def main():
        try:
            with Document.open("input.pdf") as document:

Here, you create a PdfEditor instance that will enable you to manipulate the document. This editor provides all the methods needed for metadata manipulation:

                editor = PdfEditor.edit(document)

This block updates the metadata fields. The SDK provides access to standard PDF metadata properties like author, title, subject, and keywords:

                editor.metadata.set_author("New author value")
                editor.metadata.set_title("New title value")

This block exports the XMP metadata as XML. This is useful for inspection, backup, or integration with other systems:

                xmp_metadata = editor.metadata.get_xmp()
                with open("output_metadata.xml", "w", encoding="utf-8") as f:
                    f.write(xmp_metadata)

This block saves the PDF with the updated properties and closes the editor. The try-except block handles potential errors using NutrientException:

                editor.save_as("output.pdf")
                editor.close()
                print("Successfully edited metadata")
        except NutrientException as e:
            print(f"Error: {e}")

    if __name__ == "__main__":
        main()

Conclusion
The metadata editing logic consists of four steps:
- Open the document.
- Create an editor.
- Modify metadata properties.
- Save the result.
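The four steps above can be sketched as one reusable function. This is a sketch under stated assumptions, not the SDK's recommended pattern: it uses only the calls shown in this guide, the name `update_metadata` and its parameters are hypothetical, and it assumes `editor.close()` is safe to call after a failed edit or save.

```python
def update_metadata(input_path, output_path, author, title):
    """Apply the four steps to one file (hypothetical helper)."""
    # Imported lazily so this module can be loaded without the SDK installed.
    from nutrient_sdk import Document, PdfEditor

    # Step 1: open the document; the context manager closes it afterward.
    with Document.open(input_path) as document:
        # Step 2: create an editor for the open document.
        editor = PdfEditor.edit(document)
        try:
            # Step 3: modify metadata properties.
            editor.metadata.set_author(author)
            editor.metadata.set_title(title)
            # Step 4: save the result to a new file.
            editor.save_as(output_path)
        finally:
            # Assumption: close() may be called even if a step above raised.
            editor.close()
```

Moving the close call into `finally` means the editor is released even when saving fails. A caller could loop over a folder of PDFs and wrap each call in `try`/`except NutrientException`, as the complete example does.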
Nutrient handles PDF internal structures and XMP schema management so you don’t need to parse PDF internals or manipulate XML directly.
You can download this ready-to-use sample package that’s fully configured to help you get started with the Python SDK.