JavaScript PDF to text: Extract text from PDF in JavaScript | Nutrient

Extracting text from a PDF can be a complex task, so we offer several abstractions to make this simpler. In a PDF, text usually consists of glyphs that are absolutely positioned. Nutrient heuristically splits these glyphs up into words and blocks of text. Our user interface leverages this information to allow users to select and annotate text. You can read more about this in our text selection guide.

Use textLinesForPageIndex to extract the text lines of a PDF page, identified by its zero-based page index:

const lines = await instance.textLinesForPageIndex(0);
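The single-page call can be extended to a whole document. The following is a minimal sketch, not an official recipe: `extractAllText` is a hypothetical helper name, and it assumes that `instance` is an already loaded viewer instance exposing `totalPageCount`, and that each returned text line carries its text in a `contents` property.

```javascript
// Hypothetical helper: collects the text of every page, one string per page.
// Assumptions: `instance.totalPageCount` gives the number of pages, and each
// text line object exposes its text as `line.contents`.
async function extractAllText(instance) {
  const pages = [];
  for (let pageIndex = 0; pageIndex < instance.totalPageCount; pageIndex++) {
    const lines = await instance.textLinesForPageIndex(pageIndex);
    const contents = [];
    // forEach is available on plain arrays and on list-like collections alike.
    lines.forEach((line) => {
      contents.push(line.contents);
    });
    // Join the lines of one page with newlines to keep the line structure.
    pages.push(contents.join("\n"));
  }
  return pages;
}
```

Pages are fetched sequentially here to keep the sketch simple. Calling `extractAllText(instance)` resolves to an array in which the entry at position `i` holds the text of page index `i`.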

For Server-based deployment, use the /pages/:page_index/text endpoint to fetch all text contained in a page.