Coordinate spaces

All spatial data in the JSON response — element bounds, word-level bounds, table cell bounds, and page.width / page.height — shares a single coordinate system per page.

Axes and origin

The origin sits at the top-left corner of the page. The X axis increases to the right and the Y axis increases downward.

[Figure: coordinate space diagram showing the origin at the top-left of the page, X increasing to the right, Y increasing downward, and an element's bounding box labeled with its x, y, width, and height.]

This matches the convention used by most graphics APIs and UI frameworks, so coordinates can be used directly when drawing overlays on a rendered page.

Bounding boxes

Every bounds object has four fields.

| Field | Meaning |
| --- | --- |
| x | Left edge (distance from page left) |
| y | Top edge (distance from page top) |
| width | Horizontal extent |
| height | Vertical extent |

The bottom-right corner of an element is (x + width, y + height). All bounds fall within the page canvas — 0 ≤ x + width ≤ page.width and 0 ≤ y + height ≤ page.height.
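
The corner arithmetic and the page-containment guarantee can be sketched as two small helpers. The names `corners` and `within_page` are illustrative and not part of the API. The check assumes the stricter reading of the guarantee, namely that the left and top edges are non-negative and the right and bottom edges stay inside the page.

```python
def corners(bounds):
    """Return (left, top, right, bottom) for a bounds dict."""
    left, top = bounds["x"], bounds["y"]
    return left, top, left + bounds["width"], top + bounds["height"]


def within_page(bounds, page):
    """Check that a bounds rectangle lies inside the page canvas.

    Assumption: edges may touch the page border but not cross it.
    """
    left, top, right, bottom = corners(bounds)
    return (
        0 <= left <= right <= page["width"]
        and 0 <= top <= bottom <= page["height"]
    )


page = {"width": 1700, "height": 2200}
bounds = {"x": 200, "y": 400, "width": 556, "height": 97}
print(corners(bounds))            # (200, 400, 756, 497)
print(within_page(bounds, page))  # True
```

A check like this can serve as a sanity test when consuming responses, for example to catch bounds that were accidentally scaled twice.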

Word-level bounds (when includeWords is true) and table cell bounds use the same coordinate space as their parent element.

Units

All spatial coordinates are expressed in render-space pixels, regardless of input type.

| Input type | Unit | How it works |
| --- | --- | --- |
| PDF | Pixels | Pages are rendered internally at the extraction DPI. Bounds and page dimensions use that render canvas. |
| Office (DOCX, XLSX, PPTX, etc.) | Pixels | Documents are converted and rendered before extraction. Coordinates use that render canvas. |
| Images (PNG, JPEG, TIFF, etc.) | Pixels | Images are processed at native resolution. Page dimensions equal the image's pixel dimensions. |

The key contract is that bounds, nested word/cell bounds, and page.width/page.height are always in the same coordinate space. You can scale from that page canvas to any display or downstream coordinate system.
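
Because bounds and page dimensions share one space, a resolution-independent form can be derived by dividing by the page size. The following is a minimal sketch of that idea. The helpers `normalize` and `denormalize` are hypothetical names, not API functions.

```python
def normalize(bounds, page):
    """Express bounds as fractions of the page size (values in 0..1)."""
    return {
        "x": bounds["x"] / page["width"],
        "y": bounds["y"] / page["height"],
        "width": bounds["width"] / page["width"],
        "height": bounds["height"] / page["height"],
    }


def denormalize(norm, target_width, target_height):
    """Map fractional bounds onto a target canvas of the given size."""
    return {
        "x": norm["x"] * target_width,
        "y": norm["y"] * target_height,
        "width": norm["width"] * target_width,
        "height": norm["height"] * target_height,
    }


page = {"width": 1700, "height": 2200}
bounds = {"x": 425, "y": 550, "width": 850, "height": 1100}
norm = normalize(bounds, page)
print(norm)
# {'x': 0.25, 'y': 0.25, 'width': 0.5, 'height': 0.5}
print(denormalize(norm, target_width=850, target_height=1100))
# {'x': 212.5, 'y': 275.0, 'width': 425.0, 'height': 550.0}
```

Storing fractional bounds can be convenient when the same annotations are shown at several display sizes, since no page dimensions need to travel with them.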

Mapping to a rendered page

Because element bounds and page dimensions share the same coordinate space, you can transform both with the same scale factor to map coordinates into any target space — a rendered image, a browser canvas, or a UI component.

Scale factor

Compute a single scale factor from the page dimensions and the target dimensions:

scale = target_width / page.width

Then multiply every coordinate by this scale:

target_bounds.x = bounds.x × scale
target_bounds.y = bounds.y × scale
target_bounds.width = bounds.width × scale
target_bounds.height = bounds.height × scale

This works because the API guarantees that elements, words, and cells all sit in the same coordinate system as the page.
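
The formula above fixes the target width, and the target height then follows as page.height × scale. When the target area limits both width and height, one common approach (an assumption here, not something the API prescribes) is to take the smaller of the two ratios so the whole page fits. The names `fit_scale` and `scale_bounds` are illustrative.

```python
def fit_scale(page, max_width, max_height):
    """Largest uniform scale at which the page fits inside the box."""
    return min(max_width / page["width"], max_height / page["height"])


def scale_bounds(bounds, scale):
    """Multiply every field of a bounds dict by the same factor."""
    return {key: bounds[key] * scale for key in ("x", "y", "width", "height")}


page = {"width": 1700, "height": 2200}
bounds = {"x": 200, "y": 400, "width": 556, "height": 97}

# Height is the limiting dimension: 1100 / 2200 < 1000 / 1700.
scale = fit_scale(page, max_width=1000, max_height=1100)
print(scale)
# 0.5
print(page["width"] * scale, page["height"] * scale)
# 850.0 1100.0
print(scale_bounds(bounds, scale))
# {'x': 100.0, 'y': 200.0, 'width': 278.0, 'height': 48.5}
```

Using one uniform factor for both axes keeps the page's aspect ratio, so overlays stay aligned with the rendered content.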

Example: Mapping to a display canvas

Suppose the API response for a US Letter PDF reports page.width = 1700 and page.height = 2200 (render-space pixels, which works out to 200 pixels per inch for an 8.5 × 11 in page). To display the page in an 850-pixel-wide container:

scale = 850 / 1700 = 0.5

An element with bounds: { x: 200, y: 400, width: 556, height: 97 } maps to:

display_x = 200 × 0.5 = 100 px
display_y = 400 × 0.5 = 200 px
display_width = 556 × 0.5 = 278 px
display_height = 97 × 0.5 = 48.5 px

    def to_display_coords(bounds, page, display_width):
        """Map API bounds to display coordinates."""
        scale = display_width / page["width"]
        return {
            "x": bounds["x"] * scale,
            "y": bounds["y"] * scale,
            "width": bounds["width"] * scale,
            "height": bounds["height"] * scale,
        }

    # API returns render-space pixels; display at 850 px wide.
    page = {"width": 1700, "height": 2200}
    bounds = {"x": 200, "y": 400, "width": 556, "height": 97}
    print(to_display_coords(bounds, page, display_width=850))
    # {'x': 100.0, 'y': 200.0, 'width': 278.0, 'height': 48.5}

Example: Images at native resolution

For image inputs, the page dimensions equal the image’s native pixel dimensions. If you display the image at its original size, the bounds map directly with no transformation. If you resize the image, apply the same scale factor approach:

scale = display_width / page.width

Example: Drawing overlays on a browser canvas

When rendering a page in a browser at an arbitrary size, you can use the same approach:

    function drawElementOverlay(ctx, element, page, canvasWidth) {
      const scale = canvasWidth / page.width;
      const { x, y, width, height } = element.bounds;
      ctx.strokeStyle = "rgba(255, 0, 0, 0.5)";
      ctx.lineWidth = 2;
      ctx.strokeRect(x * scale, y * scale, width * scale, height * scale);
    }

Converting between coordinate spaces

You can chain transformations to go between any two coordinate spaces by converting through the API’s page coordinate space as an intermediate step.

For example, to convert from a display position back to API coordinates (useful for hit testing — checking which element a user clicked on):

    def from_display_coords(display_x, display_y, page, display_width):
        """Convert display coordinates back to API coordinates."""
        scale = page["width"] / display_width
        return {
            "x": display_x * scale,
            "y": display_y * scale,
        }

    # User clicks at pixel (100, 200) on an 850 px wide display
    # of a page with API dimensions 1700 × 2200.
    page = {"width": 1700, "height": 2200}
    api_point = from_display_coords(100, 200, page, display_width=850)
    print(api_point)
    # {'x': 200.0, 'y': 400.0}
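
Some targets do not share the top-left origin. PDF default user space, for instance, places the origin at the bottom-left, has Y increasing upward, and measures in points (1/72 inch), so a US Letter page is 612 × 792 points. The sketch below shows how a conversion into such a space might look. It assumes an unrotated page whose page box starts at (0, 0), and `to_pdf_points` is a hypothetical helper, not an API function.

```python
def to_pdf_points(bounds, page, pdf_width_pt, pdf_height_pt):
    """Map top-left-origin pixel bounds to bottom-left-origin PDF points.

    Assumes an unrotated page whose page box starts at (0, 0).
    """
    sx = pdf_width_pt / page["width"]
    sy = pdf_height_pt / page["height"]
    return {
        "x": bounds["x"] * sx,
        # Flip Y: PDF y is the distance from the page bottom to the
        # rectangle's lower edge.
        "y": pdf_height_pt - (bounds["y"] + bounds["height"]) * sy,
        "width": bounds["width"] * sx,
        "height": bounds["height"] * sy,
    }


page = {"width": 1700, "height": 2200}
bounds = {"x": 200, "y": 400, "width": 556, "height": 97}
rect = to_pdf_points(bounds, page, pdf_width_pt=612, pdf_height_pt=792)
print({key: round(value, 2) for key, value in rect.items()})
# {'x': 72.0, 'y': 613.08, 'width': 200.16, 'height': 34.92}
```

Only the Y coordinate needs the flip. Widths and heights are plain scalings because they are extents rather than positions.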

To check if a point falls inside an element’s bounds:

    def contains(bounds, x, y):
        """Check whether a point (in API coordinates) falls inside bounds."""
        return (
            bounds["x"] <= x <= bounds["x"] + bounds["width"]
            and bounds["y"] <= y <= bounds["y"] + bounds["height"]
        )
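
Putting the two steps together, a click handler might convert the display point to API coordinates and then pick an element. The sketch below is one possible approach under stated assumptions: `elements` is a hypothetical list of dicts that each carry an `id` and a `bounds` object, and when several elements overlap (for example a table and one of its cells) the smallest one by area is preferred.

```python
def contains(bounds, x, y):
    """Check whether a point (in API coordinates) falls inside bounds."""
    return (
        bounds["x"] <= x <= bounds["x"] + bounds["width"]
        and bounds["y"] <= y <= bounds["y"] + bounds["height"]
    )


def area(bounds):
    """Area of a bounds rectangle, used to rank overlapping hits."""
    return bounds["width"] * bounds["height"]


def element_at(elements, display_x, display_y, page, display_width):
    """Return the smallest element under a display-space point, or None."""
    # Convert the click from display space back to API page space.
    scale = page["width"] / display_width
    x, y = display_x * scale, display_y * scale
    hits = [e for e in elements if contains(e["bounds"], x, y)]
    return min(hits, key=lambda e: area(e["bounds"]), default=None)


page = {"width": 1700, "height": 2200}
# Hypothetical elements; real responses may carry more fields.
elements = [
    {"id": "table", "bounds": {"x": 100, "y": 300, "width": 1500, "height": 600}},
    {"id": "cell", "bounds": {"x": 200, "y": 400, "width": 556, "height": 97}},
]

hit = element_at(elements, 150, 210, page, display_width=850)
print(hit["id"])  # cell
print(element_at(elements, 10, 10, page, display_width=850))  # None
```

Preferring the smallest hit is a design choice that suits nested layouts. An application that wants the outermost container could use `max` instead.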