Extract Metadata from PDFs on iOS
PDFs can contain metadata both in a document’s information dictionary and in an XMP stream. With PSPDFKit, you can access these two sources of metadata programmatically using PDFMetadata and XMPMetadata, respectively. This guide covers extracting metadata. To modify metadata, please see our separate guide on editing metadata.
Dictionary-Based Metadata
Use PDFMetadata to work with the dictionary-based metadata in a PDF.
All values specified in the PDF Info Dictionary are represented by the following types:
Swift | Objective-C |
---|---|
String | NSString |
Int, Float, Double, Bool | NSNumber |
Date | NSDate |
Array<Any> | NSArray<id> |
Dictionary<String, Any> | NSDictionary<NSString*, id> |
ℹ️ Note: The Any and id types above can include any of the types mentioned.
These types can be combined in any way you see fit and will be converted into the proper PDF types.
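Because a value read from the info dictionary can be any of the types in the table, calling code usually has to check what it received before using it. The following is a minimal sketch of that check, assuming the value arrives as an untyped `Any?`. The helper `describeMetadataValue` is hypothetical and not part of PSPDFKit; it only uses Foundation types.

```swift
import Foundation

// Hypothetical helper (not a PSPDFKit API): inspects an untyped metadata
// value and reports which of the supported Swift types it holds.
func describeMetadataValue(_ value: Any?) -> String {
    guard let value = value else { return "<missing>" }
    switch value {
    case let string as String:
        return "String: \(string)"
    case let date as Date:
        return "Date: \(date)"
    // Values bridged from NSNumber can satisfy more than one numeric case
    // (an NSNumber holding 1 also casts to Bool), so this order is a
    // judgment call rather than a rule.
    case let flag as Bool:
        return "Bool: \(flag)"
    case let integer as Int:
        return "Int: \(integer)"
    case let number as Double:
        return "Double: \(number)"
    case let array as [Any]:
        // Arrays may nest any supported type, so recurse per element.
        let items = array.map { describeMetadataValue($0) }
        return "Array: [" + items.joined(separator: ", ") + "]"
    case let dictionary as [String: Any]:
        // Sort keys so the output is stable.
        let entries = dictionary.keys.sorted().map { key in
            "\(key) = \(describeMetadataValue(dictionary[key]))"
        }
        return "Dictionary: {" + entries.joined(separator: ", ") + "}"
    default:
        return "Unsupported: \(type(of: value))"
    }
}
```

In an app, the argument would typically be the result of a call such as `pdfMetadata?.object(forInfoDictionaryKey: .author)`.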
By default, the dictionary metadata may contain the following info keys:
- Author
- CreationDate
- Creator
- Keywords
- ModDate
- Producer
- Title
You can, of course, add other key-value pairs to the metadata, as long as the values use the supported types.
To get an entry of the metadata dictionary (e.g. the Author), you can use the following code snippet:
let document = ...
let pdfMetadata = PDFMetadata(document: document)
let author = pdfMetadata?.object(forInfoDictionaryKey: .author)

PSPDFDocument *document = ...
PSPDFDocumentPDFMetadata *pdfMetadata = [[PSPDFDocumentPDFMetadata alloc] initWithDocument:document];
NSString *author = [pdfMetadata objectForInfoDictionaryKey:PSPDFMetadataAuthorKey];
XMP Metadata
Use XMPMetadata to work with the metadata stream containing XMP data.
Each key in the XMP metadata stream has to have a namespace set. You can define your own namespace or use one of the already existing ones. PSPDFKit exposes two constants for common namespaces:
- PSPDFXMPPDFNamespace / PSPDFXMPPDFNamespacePrefix: the XMP PDF namespace created by Adobe (§3.1)
- PSPDFXMPDCNamespace / PSPDFXMPDCNamespacePrefix: the Dublin Core namespace
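To illustrate what a namespace and its prefix mean for a key, here is a small sketch that pairs each namespace URI with its conventional prefix and builds the prefixed property name. The `XMPNamespaceInfo` type is hypothetical and not part of PSPDFKit. The URIs are the well-known Dublin Core and Adobe PDF namespace identifiers; that the PSPDFKit constants hold exactly these strings is an assumption.

```swift
// Hypothetical value type (not a PSPDFKit API) pairing a namespace URI
// with its conventional prefix.
struct XMPNamespaceInfo {
    let uri: String
    let prefix: String

    // Builds the prefixed property name, e.g. "dc:format".
    func qualifiedName(for key: String) -> String {
        return "\(prefix):\(key)"
    }
}

// Well-known namespace URIs; assumed (not confirmed) to match the values
// behind PSPDFXMPDCNamespace and PSPDFXMPPDFNamespace.
let dublinCore = XMPNamespaceInfo(uri: "http://purl.org/dc/elements/1.1/", prefix: "dc")
let adobePDF = XMPNamespaceInfo(uri: "http://ns.adobe.com/pdf/1.3/", prefix: "pdf")

print(dublinCore.qualifiedName(for: "format"))   // dc:format
print(adobePDF.qualifiedName(for: "Keywords"))   // pdf:Keywords
```

The same key name can exist in several namespaces, which is why an XMP lookup takes both the key and the namespace.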
Use the following code snippet to get an object from the XMP metadata:
let xmpMetadata = XMPMetadata(document: document)
let documentFormat = xmpMetadata?.string(forXMPKey: "format", namespace: PSPDFXMPDCNamespace)
PSPDFDocumentXMPMetadata *xmpMetadata = [[PSPDFDocumentXMPMetadata alloc] initWithDocument:document];
NSString *documentFormat = [xmpMetadata stringForXMPKey:@"format" namespace:PSPDFXMPDCNamespace];