Coordinate spaces
All spatial data in the JSON response — element bounds, word-level bounds, table cell bounds, and page.width / page.height — shares a single coordinate system per page.
Axes and origin
The origin sits at the top-left corner of the page. The X axis increases to the right and the Y axis increases downward.
This matches the convention used by most graphics APIs and UI frameworks, so coordinates can be used directly when drawing overlays on a rendered page.
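Some downstream systems place the origin at the bottom-left with Y increasing upward (PDF user space does this by default). A minimal sketch of flipping a box into such a space, keeping the same units as the API response; the helper name `to_bottom_left_origin` is hypothetical and not part of the API:

```python
def to_bottom_left_origin(bounds, page):
    """Flip a top-left-origin box into a bottom-left-origin space (same units).

    Hypothetical helper, not part of the API. In the flipped space, "y" is the
    distance from the page bottom to the box's lower edge.
    """
    return {
        "x": bounds["x"],
        "y": page["height"] - (bounds["y"] + bounds["height"]),
        "width": bounds["width"],
        "height": bounds["height"],
    }


page = {"width": 1700, "height": 2200}
bounds = {"x": 200, "y": 400, "width": 556, "height": 97}
flipped = to_bottom_left_origin(bounds, page)
print(flipped)
# {'x': 200, 'y': 1703, 'width': 556, 'height': 97}
```

Only the Y value changes; converting units (for example, pixels to points) would be a separate scaling step.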
Bounding boxes
Every bounds object has four fields.
| Field | Meaning |
|---|---|
| x | Left edge (distance from page left) |
| y | Top edge (distance from page top) |
| width | Horizontal extent |
| height | Vertical extent |
The bottom-right corner of an element is (x + width, y + height). All bounds fall within the page canvas — 0 ≤ x + width ≤ page.width and 0 ≤ y + height ≤ page.height.
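The corner arithmetic and the page-canvas constraint can be sketched as follows. The helper names `corners` and `within_page` are illustrative assumptions, not part of the API; the check mirrors the inequalities stated above.

```python
def corners(bounds):
    """Return (left, top, right, bottom) for a bounds object."""
    left, top = bounds["x"], bounds["y"]
    return left, top, left + bounds["width"], top + bounds["height"]


def within_page(bounds, page):
    """Check that the box's right and bottom edges stay inside the page canvas."""
    _, _, right, bottom = corners(bounds)
    return 0 <= right <= page["width"] and 0 <= bottom <= page["height"]


page = {"width": 1700, "height": 2200}
bounds = {"x": 200, "y": 400, "width": 556, "height": 97}
print(corners(bounds))            # (200, 400, 756, 497)
print(within_page(bounds, page))  # True
```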
Word-level bounds (when includeWords is true) and table cell bounds use the same coordinate space as their parent element.
Units
All spatial coordinates are expressed in render-space pixels, regardless of input type.
| Input type | Unit | How it works |
|---|---|---|
| PDF | Pixels | Pages are rendered internally at the extraction DPI. Bounds and page dimensions use that render canvas. |
| Office (DOCX, XLSX, PPTX, etc.) | Pixels | Documents are converted and rendered before extraction. Coordinates use that render canvas. |
| Images (PNG, JPEG, TIFF, etc.) | Pixels | Images are processed at native resolution. Page dimensions equal the image’s pixel dimensions. |
The key contract is that bounds, nested word/cell bounds, and page.width/page.height are always in the same coordinate space. You can scale from that page canvas to any display or downstream coordinate system.
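One way to take advantage of this contract is to store bounds as fractions of the page, which makes them independent of any particular render size. The sketch below assumes a hypothetical helper named `to_normalized`; it is not part of the API.

```python
def to_normalized(bounds, page):
    """Express bounds as fractions (0.0 to 1.0) of the page dimensions."""
    return {
        "x": bounds["x"] / page["width"],
        "y": bounds["y"] / page["height"],
        "width": bounds["width"] / page["width"],
        "height": bounds["height"] / page["height"],
    }


page = {"width": 1700, "height": 2200}
bounds = {"x": 170, "y": 220, "width": 850, "height": 1100}
normalized = to_normalized(bounds, page)
print(normalized)
# {'x': 0.1, 'y': 0.1, 'width': 0.5, 'height': 0.5}
```

Because nested word and cell bounds use the same space, the same function should apply to them unchanged. Multiplying the fractions by any target width and height recovers coordinates in that target space.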
Mapping to a rendered page
Because element bounds and page dimensions share the same coordinate space, you can transform both with the same scale factor to map coordinates into any target space — a rendered image, a browser canvas, or a UI component.
Scale factor
Compute a single scale factor from the page dimensions and the target dimensions:
    scale = target_width / page.width

Then multiply every coordinate by this scale:

    target_x = bounds.x × scale
    target_y = bounds.y × scale
    target_width = bounds.width × scale
    target_height = bounds.height × scale

This works because the API guarantees that elements, words, and cells all sit in the same coordinate system as the page.
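The width-based scale assumes the target preserves the page's aspect ratio. When the target is a box with both a maximum width and a maximum height, one common approach is to take the smaller of the two ratios and center the page in the leftover space. The sketch below illustrates that approach; `fit_scale` and `to_target` are hypothetical names, and the centering offsets are a design choice rather than anything the API prescribes.

```python
def fit_scale(page, max_width, max_height):
    """Largest uniform scale at which the whole page fits inside the box."""
    return min(max_width / page["width"], max_height / page["height"])


def to_target(bounds, page, max_width, max_height):
    """Map API bounds into a box, centering the scaled page inside it."""
    scale = fit_scale(page, max_width, max_height)
    # Leftover space is split evenly on both sides (letterboxing).
    offset_x = (max_width - page["width"] * scale) / 2
    offset_y = (max_height - page["height"] * scale) / 2
    return {
        "x": offset_x + bounds["x"] * scale,
        "y": offset_y + bounds["y"] * scale,
        "width": bounds["width"] * scale,
        "height": bounds["height"] * scale,
    }


page = {"width": 1700, "height": 2200}
bounds = {"x": 200, "y": 400, "width": 556, "height": 97}
result = to_target(bounds, page, max_width=1000, max_height=1100)
print(result)
# {'x': 175.0, 'y': 200.0, 'width': 278.0, 'height': 48.5}
```

Here the height is the limiting dimension (scale 0.5), so the 850-pixel-wide page is centered horizontally with a 75-pixel margin on each side.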
Example: Mapping to a display canvas
The API returns a US Letter PDF with page.width = 1700, page.height = 2200 (render-space pixels). Suppose you want to display the page in an 850-pixel-wide container:
    scale = 850 / 1700 = 0.5

An element with bounds: { x: 200, y: 400, width: 556, height: 97 } maps to:

    display_x = 200 × 0.5 = 100 px
    display_y = 400 × 0.5 = 200 px
    display_width = 556 × 0.5 = 278 px
    display_height = 97 × 0.5 = 48.5 px

Python:

    def to_display_coords(bounds, page, display_width):
        """Map API bounds to display coordinates."""
        scale = display_width / page["width"]
        return {
            "x": bounds["x"] * scale,
            "y": bounds["y"] * scale,
            "width": bounds["width"] * scale,
            "height": bounds["height"] * scale,
        }

    # API returns render-space pixels; display at 850 px wide.
    page = {"width": 1700, "height": 2200}
    bounds = {"x": 200, "y": 400, "width": 556, "height": 97}
    print(to_display_coords(bounds, page, display_width=850))
    # {'x': 100.0, 'y': 200.0, 'width': 278.0, 'height': 48.5}

JavaScript:

    function toDisplayCoords(bounds, page, displayWidth) {
      const scale = displayWidth / page.width;
      return {
        x: bounds.x * scale,
        y: bounds.y * scale,
        width: bounds.width * scale,
        height: bounds.height * scale,
      };
    }

    // API returns render-space pixels; display at 850 px wide.
    const page = { width: 1700, height: 2200 };
    const bounds = { x: 200, y: 400, width: 556, height: 97 };
    console.log(toDisplayCoords(bounds, page, 850));
    // { x: 100, y: 200, width: 278, height: 48.5 }

Example: Images at native resolution
For image inputs, the page dimensions equal the image’s native pixel dimensions. If you display the image at its original size, the bounds map directly with no transformation. If you resize the image, apply the same scale factor approach:
    scale = display_width / page.width

Example: Drawing overlays on a browser canvas
When rendering a page in a browser at an arbitrary size, you can use the same approach:
    function drawElementOverlay(ctx, element, page, canvasWidth) {
      const scale = canvasWidth / page.width;
      const { x, y, width, height } = element.bounds;

      ctx.strokeStyle = "rgba(255, 0, 0, 0.5)";
      ctx.lineWidth = 2;
      ctx.strokeRect(x * scale, y * scale, width * scale, height * scale);
    }

Converting between coordinate spaces
You can chain transformations to go between any two coordinate spaces by converting through the API’s page coordinate space as an intermediate step.
For example, to convert from a display position back to API coordinates (useful for hit testing — checking which element a user clicked on):
Python:

    def from_display_coords(display_x, display_y, page, display_width):
        """Convert display coordinates back to API coordinates."""
        scale = page["width"] / display_width
        return {
            "x": display_x * scale,
            "y": display_y * scale,
        }

    # User clicks at pixel (100, 200) on an 850 px wide display
    # of a page with API dimensions 1700 × 2200.
    page = {"width": 1700, "height": 2200}
    api_point = from_display_coords(100, 200, page, display_width=850)
    print(api_point)
    # {'x': 200.0, 'y': 400.0}

JavaScript:

    function fromDisplayCoords(displayX, displayY, page, displayWidth) {
      const scale = page.width / displayWidth;
      return {
        x: displayX * scale,
        y: displayY * scale,
      };
    }

    // User clicks at pixel (100, 200) on an 850 px wide display
    // of a page with API dimensions 1700 × 2200.
    const page = { width: 1700, height: 2200 };
    console.log(fromDisplayCoords(100, 200, page, 850));
    // { x: 200, y: 400 }

To check if a point falls inside an element's bounds:

Python:

    def contains(bounds, x, y):
        """Check whether a point (in API coordinates) falls inside bounds."""
        return (
            bounds["x"] <= x <= bounds["x"] + bounds["width"]
            and bounds["y"] <= y <= bounds["y"] + bounds["height"]
        )

JavaScript:

    function contains(bounds, x, y) {
      return (
        x >= bounds.x &&
        x <= bounds.x + bounds.width &&
        y >= bounds.y &&
        y <= bounds.y + bounds.height
      );
    }
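Putting the two steps together, a hit test converts the click position into API coordinates and then looks for elements whose bounds contain it. The sketch below is one possible arrangement under stated assumptions: `hit_test` is a hypothetical helper, the `id` field on each element is made up for illustration, and preferring the smallest matching element (so a nested box wins over its container) is a design choice rather than API behavior.

```python
def hit_test(elements, display_x, display_y, page, display_width):
    """Return the smallest element containing the clicked point, or None."""
    # Convert the display position back into API coordinates.
    scale = page["width"] / display_width
    x, y = display_x * scale, display_y * scale

    def contains(b):
        return (b["x"] <= x <= b["x"] + b["width"]
                and b["y"] <= y <= b["y"] + b["height"])

    hits = [e for e in elements if contains(e["bounds"])]
    # Prefer the smallest area so nested elements win over their containers.
    return min(
        hits,
        key=lambda e: e["bounds"]["width"] * e["bounds"]["height"],
        default=None,
    )


page = {"width": 1700, "height": 2200}
elements = [  # "id" is an illustrative field, not a documented one
    {"id": "outer", "bounds": {"x": 100, "y": 300, "width": 1000, "height": 600}},
    {"id": "inner", "bounds": {"x": 200, "y": 400, "width": 556, "height": 97}},
]
print(hit_test(elements, 150, 220, page, 850)["id"])  # inner
print(hit_test(elements, 60, 160, page, 850)["id"])   # outer
print(hit_test(elements, 10, 10, page, 850))          # None
```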