Build secure PDF viewers with table extraction functionality
This guide walks you through creating a fullstack PDF viewer application that:
- Uses Nutrient Web SDK for PDF rendering
- Integrates with Nutrient DWS Viewer API for document management
- Implements secure session token authentication
- Supports document upload from URLs or files
- Extracts PDF tables to Excel format using Nutrient DWS Processor API
Nutrient DWS Processor API provides a variety of tools to create efficient document processing workflows in a single API call, and you can try them for free.
Prerequisites
- Nutrient DWS Viewer API key — For document upload, management, and session token generation
- Nutrient DWS Processor API key — For PDF-to-Excel conversion functionality
These are two separate API keys for different services. DWS Viewer API handles document viewing and management, while DWS Processor API provides document processing capabilities such as PDF-to-Excel conversion.
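Because the two keys belong to different services, it can help to fail fast at startup when either one is missing. The following is a minimal sketch rather than part of the original implementation: `findMissingKeys` is a hypothetical helper name, and the only names taken from this guide are the two environment variables.

```javascript
// Hypothetical helper: report which required environment variables are unset.
const REQUIRED_KEYS = [
  "NUTRIENT_DWS_VIEWER_API_KEY",
  "NUTRIENT_DWS_PROCESSOR_API_KEY",
];

function findMissingKeys(env, requiredKeys = REQUIRED_KEYS) {
  // Treat undefined, empty, and whitespace-only values as missing.
  return requiredKeys.filter((name) => !env[name] || env[name].trim() === "");
}

// Example usage at server startup (assumes process.env is already populated):
// const missingKeys = findMissingKeys(process.env);
// if (missingKeys.length > 0) {
//   throw new Error(`Missing environment variables: ${missingKeys.join(", ")}`);
// }
```

Checking both keys in one place keeps the per-endpoint checks as a second line of defense instead of the first place a misconfiguration shows up.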
Reference documentation
If you have difficulty parsing the API reference URL (the page is generated dynamically with JavaScript), download the YAML file from that URL instead.
This guide demonstrates the implementation using React + Vite for the frontend and Express.js for the backend, but the same concepts can be adapted to any web framework or technology stack.
Architecture
PDF viewing data flow:
- Document URL/file → Express server
- Express server → DWS Viewer API (uploads document)
- DWS Viewer API → Express server (returns session token)
- Express server → React app (provides session token)
- React app → Nutrient SDK (renders PDF using session token)
Excel export data flow:
- Express server → Processor API (sends PDF for conversion)
- Processor API → Express server → User (downloads Excel file)
Key components:
- Express server — Handles API key storage, document uploads, session token generation, PDF-to-Excel conversion
- React app — Renders PDFs using Nutrient Web SDK with session tokens, triggers Excel export
- DWS Viewer API — Manages documents and provides secure access using JWT tokens
- Processor API — Converts PDF tables to Excel format using the /build endpoint
Implementation steps
1. Creating a React application
Create a new React + Vite + TypeScript project (or skip to the next step if you have an existing React project):
npm create vite@latest nutrient-pdf-viewer -- --template react-ts
cd nutrient-pdf-viewer
npm install
2. Setting up Nutrient Web SDK
Follow the React Vite getting started guide to set up Nutrient Web SDK in your React + Vite project. This guide covers installing dependencies, configuring Vite, setting up CSS, and TypeScript declarations.
3. Installing server dependencies
Install Express.js and required dependencies:
npm install express cors dotenv multer node-fetch form-data
npm install --save-dev @types/express @types/cors @types/multer concurrently
4. Creating an environment configuration
Create a .env file to store the API keys:
NUTRIENT_DWS_VIEWER_API_KEY=your_viewer_api_key_here
NUTRIENT_DWS_PROCESSOR_API_KEY=your_processor_api_key_here
PORT=3001
5. Implementing an Express server
Create server.js with the ES module syntax:
import express from "express";
import cors from "cors";
import multer from "multer";
import fetch from "node-fetch";
import dotenv from "dotenv";

dotenv.config();

const app = express();
const PORT = process.env.PORT || 3001;
const HOUR_IN_SECONDS = 3600;
const upload = multer();

app.use(
  cors({
    origin: ["http://localhost:5173", "http://localhost:3001"],
    credentials: true,
  }),
);
app.use(express.json());

// Helper function to create a session token for a document.
const createSessionToken = async (documentId, apiKey) => {
  const sessionPayload = {
    allowed_documents: [
      {
        document_id: documentId,
        document_permissions: ["read", "write", "download"],
      },
    ],
    exp: Math.floor(Date.now() / 1000) + HOUR_IN_SECONDS,
  };

  const sessionResponse = await fetch(
    "https://api.nutrient.io/viewer/sessions",
    {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify(sessionPayload),
    },
  );

  if (!sessionResponse.ok) {
    const errorText = await sessionResponse.text();
    throw new Error(
      `Session creation failed: ${sessionResponse.statusText} - ${errorText}`,
    );
  }

  const sessionResult = await sessionResponse.json();
  return sessionResult.jwt;
};

// Health check endpoint.
app.get("/api/health", (req, res) => {
  res.json({ status: "ok", message: "Server is running" });
});

// Upload document from URL endpoint.
app.post("/api/upload-from-url", async (req, res) => {
  try {
    const { url } = req.body;
    const apiKey = process.env.NUTRIENT_DWS_VIEWER_API_KEY;

    if (!apiKey) {
      return res.status(500).json({
        success: false,
        error: "NUTRIENT_DWS_VIEWER_API_KEY environment variable is not set",
      });
    }

    if (!url) {
      return res.status(400).json({
        success: false,
        error: "URL is required",
      });
    }

    // Fetch document from URL.
    const docResponse = await fetch(url);
    if (!docResponse.ok) {
      throw new Error(
        `Failed to fetch document from URL: ${docResponse.statusText}`,
      );
    }

    const docBuffer = await docResponse.buffer();
    const contentType =
      docResponse.headers.get("content-type") || "application/pdf";

    // Upload to DWS Viewer API using binary upload.
    const uploadResponse = await fetch(
      "https://api.nutrient.io/viewer/documents",
      {
        method: "POST",
        headers: {
          Authorization: `Bearer ${apiKey}`,
          "Content-Type": contentType,
          "Content-Length": docBuffer.length.toString(),
        },
        body: docBuffer,
      },
    );

    if (!uploadResponse.ok) {
      const errorText = await uploadResponse.text();
      throw new Error(
        `Upload failed: ${uploadResponse.statusText} - ${errorText}`,
      );
    }

    const uploadResult = await uploadResponse.json();

    // Extract document ID from nested response.
    const documentId =
      uploadResult.data?.document_id ||
      uploadResult.document_id ||
      uploadResult.id;

    if (!documentId) {
      throw new Error("No document ID found in upload response");
    }

    // Create session token using helper function.
    const sessionToken = await createSessionToken(documentId, apiKey);

    res.json({
      success: true,
      documentId: documentId,
      sessionToken: sessionToken,
      title: uploadResult.title || "Document from URL",
    });
  } catch (error) {
    console.error("Error in upload-from-url:", error);
    res.status(500).json({
      success: false,
      error: error.message,
    });
  }
});

// Upload file endpoint.
app.post(
  "/api/upload-and-create-session",
  upload.single("file"),
  async (req, res) => {
    try {
      const apiKey = process.env.NUTRIENT_DWS_VIEWER_API_KEY;

      if (!apiKey) {
        return res.status(500).json({
          success: false,
          error: "NUTRIENT_DWS_VIEWER_API_KEY environment variable is not set",
        });
      }

      if (!req.file) {
        return res.status(400).json({
          success: false,
          error: "No file uploaded",
        });
      }

      // Upload document using binary upload.
      const uploadResponse = await fetch(
        "https://api.nutrient.io/viewer/documents",
        {
          method: "POST",
          headers: {
            Authorization: `Bearer ${apiKey}`,
            "Content-Type": req.file.mimetype,
            "Content-Length": req.file.size.toString(),
          },
          body: req.file.buffer,
        },
      );

      if (!uploadResponse.ok) {
        const errorText = await uploadResponse.text();
        throw new Error(
          `Upload failed: ${uploadResponse.statusText} - ${errorText}`,
        );
      }

      const uploadResult = await uploadResponse.json();

      // Extract document ID from nested response.
      const documentId =
        uploadResult.data?.document_id ||
        uploadResult.document_id ||
        uploadResult.id;

      if (!documentId) {
        throw new Error("No document ID found in upload response");
      }

      // Generate session token using helper function.
      const sessionToken = await createSessionToken(documentId, apiKey);

      res.json({
        success: true,
        documentId: documentId,
        sessionToken: sessionToken,
        title: uploadResult.title || req.file.originalname,
      });
    } catch (error) {
      console.error("Error in upload-and-create-session:", error);
      res.status(500).json({
        success: false,
        error: error.message,
      });
    }
  },
);

// Generate session token for existing document.
app.post("/api/create-session", async (req, res) => {
  try {
    const { documentId } = req.body;
    const apiKey = process.env.NUTRIENT_DWS_VIEWER_API_KEY;

    if (!apiKey) {
      return res.status(500).json({
        success: false,
        error: "NUTRIENT_DWS_VIEWER_API_KEY environment variable is not set",
      });
    }

    if (!documentId) {
      return res.status(400).json({
        success: false,
        error: "Document ID is required",
      });
    }

    // Generate session token using helper function.
    const sessionToken = await createSessionToken(documentId, apiKey);

    res.json({
      success: true,
      sessionToken: sessionToken,
    });
  } catch (error) {
    console.error("Error in create-session:", error);
    res.status(500).json({
      success: false,
      error: error.message,
    });
  }
});

// Convert PDF to Excel endpoint.
app.post("/api/convert-to-excel", async (req, res) => {
  try {
    const { url } = req.body;
    const processorApiKey = process.env.NUTRIENT_DWS_PROCESSOR_API_KEY;

    if (!processorApiKey) {
      return res.status(500).json({
        success: false,
        error: "NUTRIENT_DWS_PROCESSOR_API_KEY environment variable is not set",
      });
    }

    if (!url) {
      return res.status(400).json({
        success: false,
        error: "PDF URL is required",
      });
    }

    // Fetch the PDF document.
    const pdfResponse = await fetch(url);
    if (!pdfResponse.ok) {
      throw new Error(`Failed to fetch PDF: ${pdfResponse.statusText}`);
    }

    const pdfBuffer = await pdfResponse.buffer();

    // Create `FormData` for the conversion request.
    const FormData = (await import("form-data")).default;
    const formData = new FormData();

    // Add the PDF file.
    formData.append("file", pdfBuffer, {
      filename: "document.pdf",
      contentType: "application/pdf",
    });

    // Add instructions for Excel conversion.
    const instructions = {
      parts: [
        {
          file: "file",
        },
      ],
      output: {
        type: "xlsx",
      },
    };

    formData.append("instructions", JSON.stringify(instructions));

    // Make the conversion request.
    const conversionResponse = await fetch("https://api.nutrient.io/build", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${processorApiKey}`,
        ...formData.getHeaders(),
      },
      body: formData,
    });

    if (!conversionResponse.ok) {
      const errorText = await conversionResponse.text();
      throw new Error(
        `PDF to Excel conversion failed: ${conversionResponse.statusText} - ${errorText}`,
      );
    }

    // Get the Excel file as buffer.
    const excelBuffer = await conversionResponse.buffer();

    // Send the Excel file as response.
    res.setHeader(
      "Content-Type",
      "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    );
    res.setHeader(
      "Content-Disposition",
      'attachment; filename="extracted_tables.xlsx"',
    );
    res.setHeader("Content-Length", excelBuffer.length);

    res.send(excelBuffer);
  } catch (error) {
    console.error("Error in convert-to-excel:", error);
    res.status(500).json({
      success: false,
      error: error.message,
    });
  }
});

// Document management endpoints for cleanup.
app.get("/api/documents", async (req, res) => {
  try {
    const apiKey = process.env.NUTRIENT_DWS_VIEWER_API_KEY;

    if (!apiKey) {
      return res.status(500).json({
        success: false,
        error: "NUTRIENT_DWS_VIEWER_API_KEY environment variable is not set",
      });
    }

    const response = await fetch("https://api.nutrient.io/viewer/documents", {
      method: "GET",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
    });

    if (!response.ok) {
      const errorText = await response.text();
      throw new Error(
        `Failed to fetch documents: ${response.statusText} - ${errorText}`,
      );
    }

    const documents = await response.json();

    res.json({
      success: true,
      documents: documents.data || documents,
      total: documents.data?.length || documents.length || 0,
    });
  } catch (error) {
    console.error("Error fetching documents:", error);
    res.status(500).json({
      success: false,
      error: error.message,
    });
  }
});

app.post("/api/cleanup-documents", async (req, res) => {
  try {
    const apiKey = process.env.NUTRIENT_DWS_VIEWER_API_KEY;

    if (!apiKey) {
      return res.status(500).json({
        success: false,
        error: "NUTRIENT_DWS_VIEWER_API_KEY environment variable is not set",
      });
    }

    // Get list of documents.
    const listResponse = await fetch(
      "https://api.nutrient.io/viewer/documents",
      {
        method: "GET",
        headers: {
          Authorization: `Bearer ${apiKey}`,
          "Content-Type": "application/json",
        },
      },
    );

    if (!listResponse.ok) {
      const errorText = await listResponse.text();
      throw new Error(
        `Failed to fetch documents: ${listResponse.statusText} - ${errorText}`,
      );
    }

    const documentsResult = await listResponse.json();
    const documents = documentsResult.data || documentsResult;

    if (documents.length <= 5) {
      return res.json({
        success: true,
        message: `Only ${documents.length} documents found, no cleanup needed`,
        deleted: [],
        remaining: documents.length,
      });
    }

    // Sort by creation date, keep 5 most recent, delete the rest.
    const sortedDocs = documents.sort((a, b) => {
      const dateA = new Date(a.created_at || a.createdAt || a.timestamp || 0);
      const dateB = new Date(b.created_at || b.createdAt || b.timestamp || 0);
      return dateB - dateA;
    });

    const docsToDelete = sortedDocs.slice(5);
    const deleted = [];

    // Delete old documents.
    for (const doc of docsToDelete) {
      try {
        const docId = doc.document_id || doc.id;
        const deleteResponse = await fetch(
          `https://api.nutrient.io/viewer/documents/${docId}`,
          {
            method: "DELETE",
            headers: {
              Authorization: `Bearer ${apiKey}`,
              "Content-Type": "application/json",
            },
          },
        );

        if (deleteResponse.ok) {
          deleted.push(docId);
        }
      } catch (error) {
        console.error("Error deleting document:", error);
      }
    }

    res.json({
      success: true,
      message: `Cleanup completed. Deleted ${deleted.length} documents, kept 5 recent ones.`,
      deleted: deleted,
      remaining: 5,
      totalDeleted: deleted.length,
    });
  } catch (error) {
    console.error("Error in cleanup-documents:", error);
    res.status(500).json({
      success: false,
      error: error.message,
    });
  }
});

app.listen(PORT, () => {
  console.log(`Server running on port ${PORT}`);
  console.log(`Health check: http://localhost:${PORT}/api/health`);
});
6. Implementing the React PDF viewer component
Update src/App.tsx with DWS Viewer API integration:
import { useEffect, useRef, useState } from 'react'
import './App.css'

function App() {
  const containerRef = useRef<HTMLDivElement>(null)
  const [status, setStatus] = useState("Initializing...")
  const [sessionToken, setSessionToken] = useState<string | null>(null)
  const fileInputRef = useRef<HTMLInputElement>(null)

  // Function to upload document from URL and get session token.
  const uploadFromUrl = async (url: string) => {
    try {
      setStatus("Uploading document from URL...")

      const response = await fetch('http://localhost:3001/api/upload-from-url', {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({ url }),
      })

      const result = await response.json()

      if (!result.success) {
        throw new Error(result.error || 'Upload failed')
      }

      setSessionToken(result.sessionToken)
      return result.sessionToken
    } catch (error) {
      console.error('Upload error:', error)
      throw error
    }
  }

  // Function to upload local file and get session token.
  const uploadFile = async (file: File) => {
    try {
      setStatus("Uploading file...")

      const formData = new FormData()
      formData.append('file', file)

      const response = await fetch('http://localhost:3001/api/upload-and-create-session', {
        method: 'POST',
        body: formData,
      })

      const result = await response.json()

      if (!result.success) {
        throw new Error(result.error || 'Upload failed')
      }

      setSessionToken(result.sessionToken)
      return result.sessionToken
    } catch (error) {
      console.error('Upload error:', error)
      throw error
    }
  }

  // Function to load PDF using session token.
  const loadPDFWithSession = async (token: string) => {
    try {
      const container = containerRef.current

      // Verify the container exists before handing it to the SDK.
      if (!container) {
        throw new Error("Container ref is not available")
      }

      // Load SDK using local installation.
      const NutrientViewer = (await import("@nutrient-sdk/viewer")).default

      // Ensure there's only one `NutrientViewer` instance.
      NutrientViewer.unload(container)

      // Verify container has dimensions.
      const rect = container.getBoundingClientRect()
      if (rect.width === 0 || rect.height === 0) {
        throw new Error(`Container has no dimensions: ${rect.width}x${rect.height}. Check your CSS.`)
      }

      setStatus("Loading PDF with session token...")

      // Load PDF using DWS Viewer API session token.
      await NutrientViewer.load({
        container,
        // Use session token instead of document URL for DWS API.
        session: token,
        // `baseUrl`: where SDK should load its assets from.
        baseUrl: `${window.location.protocol}//${window.location.host}/${
          import.meta.env.PUBLIC_URL ?? ""
        }`,
      })

      setStatus("PDF loaded successfully via DWS Viewer API!")

      // Return a cleanup function that unloads this viewer instance.
      return () => {
        NutrientViewer.unload(container)
      }
    } catch (error) {
      console.error("PDF loading failed:", error)
      setStatus(`Error: ${error instanceof Error ? error.message : String(error)}`)
      throw error
    }
  }

  // Handle file selection.
  const handleFileSelect = async (event: React.ChangeEvent<HTMLInputElement>) => {
    const file = event.target.files?.[0]
    if (file) {
      try {
        const token = await uploadFile(file)
        await loadPDFWithSession(token)
      } catch (error) {
        setStatus(`Error: ${error instanceof Error ? error.message : String(error)}`)
      }
    }
  }

  // Function to convert PDF to Excel.
  const convertToExcel = async () => {
    try {
      setStatus("Converting PDF to Excel...")

      // Use the document URL (replace with your actual URL).
      const documentUrl = "YOUR_DOCUMENT_URL_HERE"

      const response = await fetch('http://localhost:3001/api/convert-to-excel', {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({ url: documentUrl }),
      })

      if (!response.ok) {
        const errorData = await response.json().catch(() => ({ error: 'Conversion failed' }))
        throw new Error(errorData.error || 'Failed to convert PDF to Excel')
      }

      // Get the Excel file as blob.
      const excelBlob = await response.blob()

      // Create download link.
      const downloadUrl = window.URL.createObjectURL(excelBlob)
      const link = document.createElement('a')
      link.href = downloadUrl
      link.download = 'extracted_tables.xlsx'
      document.body.appendChild(link)
      link.click()
      document.body.removeChild(link)
      window.URL.revokeObjectURL(downloadUrl)

      setStatus("Excel file downloaded successfully!")

      // Reset status after a few seconds.
      setTimeout(() => {
        setStatus("PDF loaded successfully via DWS Viewer API!")
      }, 3000)
    } catch (error) {
      console.error('Excel conversion error:', error)
      setStatus(`Error converting to Excel: ${error instanceof Error ? error.message : String(error)}`)
    }
  }

  // Function to clean up old documents.
  const cleanupDocuments = async () => {
    try {
      setStatus("Cleaning up old documents...")

      const response = await fetch('http://localhost:3001/api/cleanup-documents', {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
        },
      })

      const result = await response.json()

      if (!result.success) {
        throw new Error(result.error || 'Cleanup failed')
      }

      setStatus(`Cleanup successful: ${result.message}`)

      // Reset status after a few seconds.
      setTimeout(() => {
        setStatus("Ready - documents cleaned up!")
      }, 4000)
    } catch (error) {
      console.error('Cleanup error:', error)
      setStatus(`Error cleaning up documents: ${error instanceof Error ? error.message : String(error)}`)
    }
  }

  // Load document from URL on component mount.
  useEffect(() => {
    let cleanup = () => {}

    const initializePDF = async () => {
      try {
        // Replace with actual document URL when implementing.
        const documentUrl = "YOUR_DOCUMENT_URL_HERE"
        const token = await uploadFromUrl(documentUrl)
        cleanup = await loadPDFWithSession(token)
      } catch (error) {
        console.error("PDF loading failed:", error)
        setStatus(`Error: ${error instanceof Error ? error.message : String(error)}`)
      }
    }

    initializePDF()

    // Call through a wrapper so that React runs the cleanup assigned after the
    // asynchronous load, not the initial no-op captured at mount time.
    return () => cleanup()
  }, [])

  return (
    <div>
      <div style={{
        padding: "10px",
        background: "#f0f0f0",
        borderBottom: "1px solid #ccc",
        fontSize: "14px",
        display: "flex",
        justifyContent: "space-between",
        alignItems: "center"
      }}>
        <span style={{
          color: "#000000",
          backgroundColor: "#ffffff",
          padding: "4px 8px",
          borderRadius: "4px",
          border: "1px solid #ddd",
          fontWeight: "500"
        }}>
          Status: {status}
        </span>
        <div style={{ display: "flex", gap: "10px", alignItems: "center" }}>
          <input
            ref={fileInputRef}
            type="file"
            accept=".pdf,.doc,.docx,.ppt,.pptx,.xls,.xlsx"
            onChange={handleFileSelect}
            style={{ display: "none" }}
          />
          <button
            onClick={() => fileInputRef.current?.click()}
            style={{
              padding: "5px 10px",
              backgroundColor: "#007bff",
              color: "white",
              border: "none",
              borderRadius: "4px",
              cursor: "pointer",
              fontSize: "12px"
            }}
          >
            Upload File
          </button>
          <button
            onClick={convertToExcel}
            style={{
              padding: "5px 10px",
              backgroundColor: "#28a745",
              color: "white",
              border: "none",
              borderRadius: "4px",
              cursor: "pointer",
              fontSize: "12px"
            }}
          >
            Export to Excel
          </button>
          <button
            onClick={cleanupDocuments}
            style={{
              padding: "5px 10px",
              backgroundColor: "#dc3545",
              color: "white",
              border: "none",
              borderRadius: "4px",
              cursor: "pointer",
              fontSize: "12px"
            }}
          >
            Clean Up Documents
          </button>
          {sessionToken && (
            <span style={{
              fontSize: "10px",
              color: "#666",
              maxWidth: "200px",
              overflow: "hidden",
              textOverflow: "ellipsis"
            }}>
              Session: {sessionToken.substring(0, 20)}...
            </span>
          )}
        </div>
      </div>
      <div
        ref={containerRef}
        style={{
          height: "calc(100vh - 60px)",
          width: "100vw",
          background: "#e0e0e0"
        }}
      />
    </div>
  )
}

export default App
7. Updating package scripts
Update package.json scripts for running both the server and the client:
{
  "scripts": {
    "dev": "vite",
    "server": "node server.js",
    "dev:full": "concurrently \"npm run server\" \"npm run dev\"",
    "build": "tsc -b && vite build",
    "lint": "eslint .",
    "preview": "vite preview"
  }
}
8. Running the application
Start both the server and the client:
npm run dev:full
Access the application at:
- Frontend: http://localhost:5173
- Backend: http://localhost:3001
Key implementation notes
DWS Viewer API workflow
- Document upload — The server uploads the document to https://api.nutrient.io/viewer/documents using binary upload. Binary file upload is the most common and straightforward approach for document uploads. DWS Viewer API also supports multipart/form-data for advanced use cases such as attaching XFDF files or specifying custom metadata.
- Document ID extraction — Extract the document ID from the nested response structure (response.data.document_id).
- Session token generation — Create a JSON Web Token (JWT) at https://api.nutrient.io/viewer/sessions with document permissions.
- PDF loading — Use the session token (not a document URL) in NutrientViewer.load({ session: token }).
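The two middle steps of this workflow involve no network calls, so they can be isolated and unit tested. The sketch below assumes the same response shapes and payload fields used in the server code of this guide; `extractDocumentId` and `buildSessionPayload` are hypothetical helper names introduced only for illustration.

```javascript
// Sketch: the pure steps of the workflow, separated from the fetch calls.
const HOUR_IN_SECONDS = 3600;

// Mirrors the fallback chain the server uses when reading the upload response.
function extractDocumentId(uploadResult) {
  return (
    uploadResult?.data?.document_id ||
    uploadResult?.document_id ||
    uploadResult?.id ||
    null
  );
}

// Builds the body sent to the sessions endpoint.
// `nowMs` is injectable so the expiry can be checked deterministically.
function buildSessionPayload(documentId, nowMs = Date.now()) {
  return {
    allowed_documents: [
      {
        document_id: documentId,
        document_permissions: ["read", "write", "download"],
      },
    ],
    // `exp` is expressed in seconds since the Unix epoch.
    exp: Math.floor(nowMs / 1000) + HOUR_IN_SECONDS,
  };
}
```

Returning `null` when no ID is found lets the caller decide how to report the failure, which matches the explicit "No document ID found" check in the server.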
PDF-to-Excel conversion workflow
- PDF fetch — The server fetches a PDF from a URL or uses an uploaded file.
- Form data creation — Then it creates multipart form data with a PDF file and conversion instructions.
- Processor API call — Next, it sends a request to https://api.nutrient.io/build with output.type: "xlsx".
- Excel download — Finally, it returns an Excel file as a binary stream for client download.
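The conversion instructions are plain JSON, so they can be built and inspected before any request is sent. This is a minimal sketch based on the instruction object used in the server code of this guide; `buildExcelInstructions` is a hypothetical helper name, and the assumption is that the part name refers to the multipart field that carries the PDF (both are named `file` in this guide).

```javascript
// Sketch: build the JSON instructions for the /build request.
// `partName` is assumed to match the multipart field name that holds the PDF.
function buildExcelInstructions(partName = "file") {
  return {
    parts: [{ file: partName }],
    output: { type: "xlsx" },
  };
}

// Serialize the instructions the same way the server does before appending
// them to the form data under the "instructions" field.
const instructionsJson = JSON.stringify(buildExcelInstructions());
console.log(instructionsJson);
// prints {"parts":[{"file":"file"}],"output":{"type":"xlsx"}}
```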
Document management
The application includes document cleanup functionality to manage DWS Viewer API document limits:
- Document listing — GET /api/documents fetches all documents from the DWS account.
- Document cleanup — POST /api/cleanup-documents keeps the five most recent documents and deletes older ones.
- Automatic cleanup — Helps avoid document_limit_reached errors by managing storage.
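The selection rule behind the cleanup endpoint (keep the newest documents, delete the rest) can be expressed as a small pure function. The sketch below assumes the same timestamp fields the server falls back through (`created_at`, `createdAt`, `timestamp`); `selectDocumentsToDelete` is a hypothetical name, and unlike the server code it sorts a copy rather than the original array.

```javascript
// Sketch: choose which documents to delete, keeping the `keep` most recent.
function selectDocumentsToDelete(documents, keep = 5) {
  // Documents without a recognizable timestamp sort as oldest (epoch 0).
  const timestampOf = (doc) =>
    new Date(doc.created_at || doc.createdAt || doc.timestamp || 0).getTime();

  // Copy before sorting so the caller's array is not mutated.
  const newestFirst = [...documents].sort(
    (a, b) => timestampOf(b) - timestampOf(a),
  );

  // Everything after the first `keep` entries is a deletion candidate.
  return newestFirst.slice(keep);
}

// Example usage with three documents and keep = 2:
const sampleDocs = [
  { id: "a", created_at: "2024-01-01T00:00:00Z" },
  { id: "b", created_at: "2024-03-01T00:00:00Z" },
  { id: "c", created_at: "2024-02-01T00:00:00Z" },
];
console.log(selectDocumentsToDelete(sampleDocs, 2).map((doc) => doc.id));
// prints [ 'a' ]
```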
Security considerations
- API keys are stored server-side only in environment variables
- Session tokens expire; the server in this guide sets a one-hour lifetime through HOUR_IN_SECONDS
- CORS is configured for local development
- No API keys are exposed to client-side code
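Since the CORS configuration in this guide is hardcoded for local development, a deployment would need a way to supply its own origins. The following is one possible sketch, not something the original server does: `ALLOWED_ORIGINS` is a hypothetical environment variable holding a comma-separated list, and `parseAllowedOrigins` is a hypothetical helper.

```javascript
// Sketch: derive the CORS origin allowlist from a comma-separated string.
// The fallback values are the development origins used in this guide.
const DEV_ORIGINS = ["http://localhost:5173", "http://localhost:3001"];

function parseAllowedOrigins(value, fallback = DEV_ORIGINS) {
  if (!value) return fallback;
  const origins = value
    .split(",")
    .map((origin) => origin.trim())
    .filter((origin) => origin.length > 0);
  // An empty or whitespace-only list falls back to the development origins.
  return origins.length > 0 ? origins : fallback;
}

// Example usage with the cors middleware (ALLOWED_ORIGINS is an assumption):
// app.use(cors({
//   origin: parseAllowedOrigins(process.env.ALLOWED_ORIGINS),
//   credentials: true,
// }));
```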
Error handling
- Comprehensive error messages for debugging
- Status updates throughout the process
- Cleanup of SDK instances
- Container dimension validation
File support
Supported file types:
- PDF (.pdf)
- Word (.doc, .docx)
- PowerPoint (.ppt, .pptx)
- Excel (.xls, .xlsx)
- Other formats supported
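The file input's `accept` attribute only filters the file picker, so a check on the file name before upload can give clearer feedback. This is a minimal sketch under the assumption that only the extensions listed above should pass; `hasSupportedExtension` is a hypothetical helper, and it does not cover the other formats the service may accept.

```javascript
// Sketch: check a file name against the extensions listed in this guide.
const SUPPORTED_EXTENSIONS = [
  ".pdf", ".doc", ".docx", ".ppt", ".pptx", ".xls", ".xlsx",
];

function hasSupportedExtension(fileName) {
  const dotIndex = fileName.lastIndexOf(".");
  // Names without a dot have no extension to compare.
  if (dotIndex === -1) return false;
  // Compare case-insensitively so "Report.PDF" is accepted.
  return SUPPORTED_EXTENSIONS.includes(fileName.slice(dotIndex).toLowerCase());
}
```

In the React component, such a check could run in the file selection handler before the upload starts, setting an error status when it returns `false`.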
Troubleshooting
Common issues
- Container dimension errors — Ensure CSS sets an explicit width/height on the container
- SDK loading failures — Verify assets are copied to the public directory during build
- CORS errors — Configure server CORS for your frontend domain
- Session token failures — Check API key validity and document upload success
- Document ID not found — Handle the nested response structure from the upload endpoint
- Document limit reached — Use the cleanup endpoint to delete old documents from the DWS account
- PDF-to-Excel conversion failures:
  - Check Processor API key validity
  - Ensure the correct instruction format with output.type: "xlsx"
  - Verify PDF URL accessibility
- Server crashes during Excel conversion:
  - Issue — Dynamic FormData import causes the server to crash with “Empty reply from server”
  - Fix — Import FormData at the top level instead of using (await import('form-data')).default
  - Solution — Add import FormData from 'form-data'; to server imports and remove the dynamic import
Debugging steps
- Check the browser console for client-side errors
- Monitor server logs for API responses
- Verify container dimensions in browser dev tools
- Test server endpoints directly with curl, PowerShell Invoke-RestMethod, or API testing tools like Postman
- Validate API key permissions in the Nutrient DWS API dashboard