---
title: "JavaScript PDF parser library | Nutrient SDK"
canonical_url: "https://www.nutrient.io/guides/web/extraction/parse-content/"
md_url: "https://www.nutrient.io/guides/web/extraction/parse-content.md"
last_updated: "2026-06-08T16:48:26.972Z"
description: "Discover how to parse text, annotations, and digital signatures in documents using Nutrient Web SDK for customizable data processing solutions."
---

# JavaScript PDF parser library

Documents can contain a variety of different data in many formats: text, annotations, digital signatures, etc. With Nutrient Web SDK, you can parse that data separately and process it according to your needs.

Nutrient Web SDK’s [API](https://www.nutrient.io/api/web/) provides methods for accessing the different types of content in a document.

## Page information

You can retrieve basic information about a specific page, such as its dimensions, rotation, and label. [`Instance#pageInfoForIndex`](https://www.nutrient.io/api/web/NutrientViewer.Instance.html#pageInfoForIndex) returns that information in a [`NutrientViewer.PageInfo`](https://www.nutrient.io/api/web/NutrientViewer.PageInfo.html) object:

```js

const {
	width,
	height,
	index,
	label,
	rotation
} = instance.pageInfoForIndex(0);

```
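
To gather the same information for every page, you can loop over the page indices. The following is a minimal sketch: `collectPageInfo` is a hypothetical helper name, it assumes `instance.totalPageCount` holds the number of pages, and it guards against a missing result in case no page information is available for an index.

```js
// Hypothetical helper: collects basic information for every page.
// Assumes `instance` is a loaded Nutrient Web SDK instance.
function collectPageInfo(instance) {
	const pages = [];
	for (let pageIndex = 0; pageIndex < instance.totalPageCount; pageIndex++) {
		const pageInfo = instance.pageInfoForIndex(pageIndex);
		// Skip indices for which no page information is available.
		if (pageInfo) {
			const { index, label, width, height, rotation } = pageInfo;
			pages.push({ index, label, width, height, rotation });
		}
	}
	return pages;
}
```

Calling `collectPageInfo(instance)` returns a plain array that can be logged or serialized with `JSON.stringify`.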

## Page text

To retrieve the text of a page, use [`Instance#textLinesForPageIndex`](https://www.nutrient.io/api/web/NutrientViewer.Instance.html#textLinesForPageIndex), which returns a `Promise` resolving to a [`NutrientViewer.Immutable.List`](https://www.nutrient.io/api/web/NutrientViewer.Immutable.List.html) of [`NutrientViewer.TextLine`](https://www.nutrient.io/api/web/NutrientViewer.TextLine.html) objects. You can then traverse the list to parse the content of each line:

```js

// Retrieve and log text lines for page 0.
const textLines = await instance.textLinesForPageIndex(0);
textLines.forEach((textLine, textLineIndex) => {
	console.log(`Content for text line ${textLineIndex}`);
	console.log(`Text: ${textLine.contents}`);
	console.log(`Id: ${textLine.id}`);
	console.log(`Page index: ${textLine.pageIndex}`);
	console.log(`Bounding box: ${JSON.stringify(textLine.boundingBox.toJS())}`);
});

```
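
The same call can be repeated for each page to build the text of the whole document. This is a sketch rather than a built-in feature: `collectDocumentText` is a hypothetical helper, it assumes `instance.totalPageCount` holds the number of pages, and it joins the lines of each page with newline characters, which may not match the visual layout of the page.

```js
// Hypothetical helper: returns one string per page, built from its text lines.
// Assumes `instance` is a loaded Nutrient Web SDK instance.
async function collectDocumentText(instance) {
	const pages = [];
	for (let pageIndex = 0; pageIndex < instance.totalPageCount; pageIndex++) {
		const textLines = await instance.textLinesForPageIndex(pageIndex);
		const lines = [];
		textLines.forEach((textLine) => {
			lines.push(textLine.contents);
		});
		// Joining with "\n" is an assumption about the desired output format.
		pages.push(lines.join("\n"));
	}
	return pages;
}
```

For example, `(await collectDocumentText(instance)).join("\n\n")` produces a single string with a blank line between pages.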

## Form fields

It’s possible to retrieve detailed information about each form field in a document with [`Instance#getFormFields`](https://www.nutrient.io/api/web/NutrientViewer.Instance.html#getFormFields):

```js

const formFields = await instance.getFormFields();

```

You can check each form field type’s properties in the [corresponding API reference section](https://www.nutrient.io/api/web/NutrientViewer.FormFields.html).
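
As an illustration, the returned list can be traversed to collect the names of the form fields of a given type. In this sketch, `listFormFieldNames` is a hypothetical helper, and the form field class to match is passed in by the caller, for example `NutrientViewer.FormFields.TextFormField`.

```js
// Hypothetical helper: returns the names of all form fields, optionally
// restricted to instances of the given form field class.
async function listFormFieldNames(instance, FormFieldType) {
	const formFields = await instance.getFormFields();
	const names = [];
	formFields.forEach((formField) => {
		if (!FormFieldType || formField instanceof FormFieldType) {
			names.push(formField.name);
		}
	});
	return names;
}
```

Calling `listFormFieldNames(instance)` without a second argument returns the names of all form fields.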

## Form field values

Similarly, form field values can be retrieved with [`Instance#getFormFieldValues`](https://www.nutrient.io/api/web/NutrientViewer.Instance.html#getFormFieldValues):

```js

const values = instance.getFormFieldValues();

```

The returned object includes each form field value indexed by the form field name.
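
Because the result is keyed by form field name, it can be traversed with `Object.entries`. The following sketch uses a hypothetical helper, `describeFormFieldValues`, and serializes each value with `JSON.stringify` since the shape of a value depends on the form field type.

```js
// Hypothetical helper: returns one "name: value" string per form field.
function describeFormFieldValues(instance) {
	const values = instance.getFormFieldValues();
	return Object.entries(values).map(
		([name, value]) => `${name}: ${JSON.stringify(value)}`
	);
}
```

For example, `describeFormFieldValues(instance).forEach(line => console.log(line))` logs every form field value on its own line.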

## Annotation text

Some annotation types can include text as one of their properties:

```js

// Retrieve annotations from page 0.
const annotations = await instance.getAnnotations(0);
// Find the first text annotation on the page, if any.
const textAnnotation = annotations.find(
	annotation => annotation instanceof NutrientViewer.Annotations.TextAnnotation
);
// Log its text; `find` returns `undefined` when no text annotation exists.
if (textAnnotation) {
	console.log(textAnnotation.text);
}

```

Note annotations (`NutrientViewer.Annotations.NoteAnnotation`) can also include text as one of their properties.

## Text under an annotation

Markup annotations can be used to highlight or draw attention to some text in the document. That text isn’t part of the annotation’s properties, but it can be obtained by mapping the annotation’s bounding box to the bounding boxes of the text lines of the page.

Nutrient Web SDK makes that operation easy by providing [`Instance#getMarkupAnnotationText`](https://www.nutrient.io/api/web/NutrientViewer.Instance.html#getMarkupAnnotationText) and [`Instance#getTextFromRects`](https://www.nutrient.io/api/web/NutrientViewer.Instance.html#getTextFromRects):

```js

// Retrieve annotations from page 0.
const annotations = await instance.getAnnotations(0);
// Find the first highlight annotation on the page, if any.
const highlightAnnotation = annotations.find(
	annotation => annotation instanceof NutrientViewer.Annotations.HighlightAnnotation
);
// Log the text behind the highlight; `find` returns `undefined` when none exists.
if (highlightAnnotation) {
	console.log(await instance.getMarkupAnnotationText(highlightAnnotation));
}

```
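
For annotations that aren’t markup annotations, `Instance#getTextFromRects` can be used with the annotation’s bounding box instead. The sketch below assumes that the method accepts a page index and a list of rects, as described in the API reference linked above. `getTextUnderAnnotation` is a hypothetical helper, and the list constructor is passed in by the caller, for example `NutrientViewer.Immutable.List`.

```js
// Hypothetical helper: returns the text covered by an annotation's bounding box.
// `List` is expected to be a list constructor such as NutrientViewer.Immutable.List.
async function getTextUnderAnnotation(instance, annotation, List) {
	// Assumption: getTextFromRects(pageIndex, rects) resolves to a string.
	const rects = List([annotation.boundingBox]);
	return instance.getTextFromRects(annotation.pageIndex, rects);
}
```

A call could look like `await getTextUnderAnnotation(instance, annotation, NutrientViewer.Immutable.List)`.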

## Bookmarks

To extract bookmark information, use Nutrient Web SDK’s [`Instance#getBookmarks`](https://www.nutrient.io/api/web/NutrientViewer.Instance.html#getBookmarks) method:

```js

const bookmarks = await instance.getBookmarks();
bookmarks.forEach(bookmark => {
	console.log(bookmark.toJS());
});

```

## Digital signatures

When your license includes the Digital Signatures component, you can extract digital signature information from any digitally signed document with [`Instance#getSignaturesInfo`](https://www.nutrient.io/api/web/NutrientViewer.Instance.html#getSignaturesInfo), which resolves to a [`NutrientViewer.SignaturesInfo`](https://www.nutrient.io/api/web/NutrientViewer.SignaturesInfo.html) record. This record includes:

- A `signatures` array with individual [`NutrientViewer.SignatureInfo`](https://www.nutrient.io/api/web/NutrientViewer.SignatureInfo.html) data for each signature

- A `status` field with [document validation information](https://www.nutrient.io/api/web/NutrientViewer.html#.DocumentValidationStatus)

```js

const signaturesInfo = await instance.getSignaturesInfo();

```
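
The two fields listed above can be combined into a short summary, for example with `console.log(await summarizeSignatures(instance))`. This is a minimal sketch: `summarizeSignatures` is a hypothetical helper, and it treats a missing `signatures` array as an assumption for documents without signatures.

```js
// Hypothetical helper: summarizes the document's digital signature information.
async function summarizeSignatures(instance) {
	const { status, signatures } = await instance.getSignaturesInfo();
	return {
		// Overall document validation status.
		status,
		// Assumption: `signatures` may be missing when the document isn't signed.
		signatureCount: signatures ? signatures.length : 0,
	};
}
```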
---

## Related pages

- [JavaScript PDF extraction library](/guides/web/extraction.md)
- [Extract pages from PDFs using JavaScript](/guides/web/extraction/page-extraction.md)
- [Extract selected text from PDFs programmatically](/guides/web/features/text-selection.md)
- [Read text from PDFs using JavaScript](/guides/web/extraction/read-text.md)
- [Extract text from PDFs using JavaScript](/guides/web/features/text-extraction.md)

