Extract the text position from PDFs on iOS

Nutrient’s `TextParser` API exposes helpers and data structures for working with text. These include the location of a text element at three levels of granularity: glyph, word, and text block.

To get a general overview of the available text APIs, check out the parsing guide.

All main Nutrient text classes expose a `frame` property that you can use to query the location of a text element on a PDF page.

Property	Description
`Glyph.frame`	Location of a single character (glyph, quad) on the PDF page.
`Word.frame`	Location of a single word (multiple glyphs) on the PDF page.
`TextBlock.frame`	Location of a text block (e.g. a column of text) on the PDF page.

These properties return frames in the normalized PDF coordinate space. To learn more about coordinate spaces and how to convert between them, see the Coordinate Space Conversions guide.
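As a rough illustration of why conversion matters: PDF page coordinates conventionally place the origin at the bottom-left corner, while UIKit views place it at the top-left. The sketch below flips a frame vertically and scales it, assuming an unrotated page with no crop-box offset and a view that displays the whole page. `viewRect(forPDFRect:pageSize:viewSize:)` is a hypothetical helper written for this example and is not part of the Nutrient API. In an app, prefer the conversion helpers described in the Coordinate Space Conversions guide.

```swift
import Foundation

/// Hypothetical helper (not a Nutrient API): maps a rect from a bottom-left-origin
/// PDF page space to a top-left-origin view space that displays the full page.
/// Assumes no page rotation and no crop-box offset.
func viewRect(forPDFRect pdfRect: CGRect, pageSize: CGSize, viewSize: CGSize) -> CGRect {
    let scaleX = viewSize.width / pageSize.width
    let scaleY = viewSize.height / pageSize.height
    // Flip the y-axis: the rect's top edge in PDF space becomes its y origin in view space.
    let flippedY = pageSize.height - pdfRect.maxY
    return CGRect(
        x: pdfRect.minX * scaleX,
        y: flippedY * scaleY,
        width: pdfRect.width * scaleX,
        height: pdfRect.height * scaleY
    )
}

// A made-up word frame on a 612 x 792 pt page, shown in a view twice as large.
let wordFrame = CGRect(x: 72, y: 700, width: 100, height: 20)
let converted = viewRect(
    forPDFRect: wordFrame,
    pageSize: CGSize(width: 612, height: 792),
    viewSize: CGSize(width: 1224, height: 1584)
)
print(converted)
```

The helper deliberately ignores rotation and cropping, which is why it should be treated as an explanation of the idea and not as a replacement for the SDK's own conversion functions.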

The following example prints the position of every word on the first page of a document:

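let document = ...

// Page indexes are zero-based, so 0 is the first page.
guard let parser = document.textParserForPage(at: 0) else {
    print("Parsing failed.")
    return
}

// Each word exposes its text and its frame in normalized PDF coordinates.
for word in parser.words {
    print("The location of \(word.stringValue) is \(word.frame)")
}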
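
Once word frames are available, a common next step is to locate a specific term, for example to highlight it. The following sketch shows one way this could be done. To stay self-contained, it uses a hypothetical `WordBox` struct as a stand-in for Nutrient's `Word`, mirroring only the `stringValue` and `frame` properties, and the sample frames are made up for illustration.

```swift
import Foundation

/// Hypothetical stand-in for `Word`, mirroring only `stringValue` and `frame`.
struct WordBox {
    let stringValue: String
    let frame: CGRect
}

/// Returns the frames of all words that match `term`, ignoring case.
func frames(of term: String, in words: [WordBox]) -> [CGRect] {
    words
        .filter { $0.stringValue.caseInsensitiveCompare(term) == .orderedSame }
        .map(\.frame)
}

/// Returns the smallest rect enclosing all given frames, or nil for an empty list.
func boundingBox(of frames: [CGRect]) -> CGRect? {
    guard let first = frames.first else { return nil }
    return frames.dropFirst().reduce(first) { $0.union($1) }
}

// Made-up sample data in PDF page coordinates.
let sampleWords = [
    WordBox(stringValue: "Invoice", frame: CGRect(x: 72, y: 700, width: 60, height: 14)),
    WordBox(stringValue: "total", frame: CGRect(x: 140, y: 700, width: 40, height: 14)),
    WordBox(stringValue: "Total", frame: CGRect(x: 72, y: 100, width: 42, height: 14)),
]

let matches = frames(of: "total", in: sampleWords)
let enclosing = boundingBox(of: matches)
print(matches.count, enclosing as Any)
```

With the real API, the same filtering could be applied to `parser.words`, since each word provides `stringValue` and `frame` as shown in the example above.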
