Class TextParser

A class to query the text on a page. Text lines end with CRLF (\r\n).
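
Because lines end with CRLF, text returned by the parser can be split into individual lines on that sequence. The following is a minimal sketch in plain C#. The `CrlfLines` class and `SplitLines` method are hypothetical names used for illustration and are not part of PSPDFKit.

```csharp
using System;

// Hypothetical helper, not part of PSPDFKit.
public static class CrlfLines
{
    // Splits text on CRLF and drops empty entries, including the one
    // that a trailing "\r\n" would otherwise produce. Blank lines in
    // the input are dropped as well.
    public static string[] SplitLines(string text)
    {
        return text.Split(new[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

Use `StringSplitOptions.None` instead if blank lines need to be preserved.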

Inheritance
System.Object
TextParser
Namespace: PSPDFKit.Pdf
Assembly: PSPDFKit.dll
Syntax
public sealed class TextParser

Methods

GetGlyphsAsync()

Gets the glyphs on the page in reading order.

Declaration
public IAsyncOperation<IList<Glyph>> GetGlyphsAsync()
Returns
Type Description
Windows.Foundation.IAsyncOperation<System.Collections.Generic.IList<Glyph>>

A list of Glyphs.

GetTextAsync()

Gets the text on the page.

Declaration
public IAsyncOperation<IList<TextBlock>> GetTextAsync()
Returns
Type Description
Windows.Foundation.IAsyncOperation<System.Collections.Generic.IList<TextBlock>>

A list of TextBlocks.

GetTextBlocksForRectsAsync(IEnumerable<Rect>)

Gets a TextBlock for each Rect on the page.

Declaration
public IAsyncOperation<IList<TextBlock>> GetTextBlocksForRectsAsync(IEnumerable<Rect> rects)
Parameters
Type Name Description
System.Collections.Generic.IEnumerable<Rect> rects

The rects that bound the text on the page.

Returns
Type Description
Windows.Foundation.IAsyncOperation<System.Collections.Generic.IList<TextBlock>>

GetTextForRectsAsync(IEnumerable<Rect>)

Gets the text bounded by the union of the supplied Rects.

Declaration
public IAsyncOperation<string> GetTextForRectsAsync(IEnumerable<Rect> rects)
Parameters
Type Name Description
System.Collections.Generic.IEnumerable<Rect> rects

The rects that bound the text on the page.

Returns
Type Description
Windows.Foundation.IAsyncOperation<System.String>
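
A call to this method might look like the sketch below. It assumes that `Rect` is `Windows.Foundation.Rect` and that a `TextParser` instance for the page has already been obtained elsewhere. The `RegionTextExample` class, the `ReadRegionsAsync` method, and the coordinates are made up for illustration.

```csharp
using System;                      // needed to await IAsyncOperation<T>
using System.Collections.Generic;
using System.Threading.Tasks;
using Windows.Foundation;          // Rect (assumed to be the Rect type used here)
using PSPDFKit.Pdf;                // TextParser

// Hypothetical example class, not part of PSPDFKit.
public static class RegionTextExample
{
    // Returns the text bounded by the union of two regions of the page.
    public static async Task<string> ReadRegionsAsync(TextParser parser)
    {
        // Placeholder coordinates: x, y, width, height.
        var rects = new List<Rect>
        {
            new Rect(50, 50, 300, 40),
            new Rect(50, 120, 300, 40),
        };

        return await parser.GetTextForRectsAsync(rects);
    }
}
```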

WordsFromGlyphs(IList<Glyph>, Boolean)

Calculates Words from a list of Glyphs.

Declaration
public static IList<Word> WordsFromGlyphs(IList<Glyph> glyphs, bool trimPunctuation)
Parameters
Type Name Description
System.Collections.Generic.IList<Glyph> glyphs

The glyphs to convert to Words.

System.Boolean trimPunctuation

Whether to trim leading and trailing punctuation glyphs.

Returns
Type Description
System.Collections.Generic.IList<Word>

A list of Words.
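
As a rough sketch of how this static method combines with `GetGlyphsAsync()`, the example below fetches the glyphs of a page and groups them into words. It assumes a `TextParser` instance has already been obtained elsewhere. The `WordCountExample` class and `CountWordsAsync` method are hypothetical names.

```csharp
using System;                      // needed to await IAsyncOperation<T>
using System.Threading.Tasks;
using PSPDFKit.Pdf;                // TextParser

// Hypothetical example class, not part of PSPDFKit.
public static class WordCountExample
{
    // Counts the words on the page that the given parser belongs to.
    public static async Task<int> CountWordsAsync(TextParser parser)
    {
        // Glyphs arrive in reading order.
        var glyphs = await parser.GetGlyphsAsync();

        // Group glyphs into words, trimming leading and trailing punctuation.
        var words = TextParser.WordsFromGlyphs(glyphs, true);

        return words.Count;
    }
}
```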