Converting PDF to HTML
PDF files preserve fixed layout, but web applications often need HTML content. Converting PDF to HTML helps you publish document content in browser-based systems.
Use this workflow when you need to:
- Publish PDF content on websites or intranets
- Index document content for web search
- Integrate document output into HTML-based pipelines
Project setup
Install:
- The core Nutrient Native SDK package
- GdPicture.Resources for HTML conversion support
Prepare the project
Register the SDK license before running conversion operations. For setup details, refer to the getting started with .NET SDK guide.
using GdPicture14;
LicenseManager licence = new LicenseManager();
licence.RegisterKEY("");
Create the document converter
Instantiate GdPictureDocumentConverter:
using GdPictureDocumentConverter converter = new GdPictureDocumentConverter();
Load the PDF document
Load the source PDF file:
converter.LoadFromFile(@"input.pdf");
For available converter methods, refer to the API reference guide.
Configure HTML output
Save the document as HTML with page-layout output:
converter.SaveAsHTML(@"output.html", HtmlLayoutType.PageLayout);
HtmlLayoutType.PageLayout keeps output close to the original page structure.
Handle errors
LoadFromFile and SaveAsHTML each return a GdPictureStatus value. Check these values in your error-handling flow. For status-handling patterns, refer to the handling errors with .NET SDK guide.
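As a minimal sketch of that status check, the steps from this guide can be combined as follows. It assumes a C# 9 or later project using top-level statements, that GdPictureStatus.OK is the success value, and that writing to the console is an acceptable way to report failures. The file names and the empty license key are the placeholders used earlier in this guide.

```csharp
using System;
using GdPicture14;

// Register the license first (the key is left empty here, as in the setup step).
LicenseManager licence = new LicenseManager();
licence.RegisterKEY("");

// The using declaration disposes the converter when execution leaves this scope.
using GdPictureDocumentConverter converter = new GdPictureDocumentConverter();

// Load the source PDF and stop early if loading fails.
GdPictureStatus status = converter.LoadFromFile(@"input.pdf");
if (status != GdPictureStatus.OK)
{
    Console.Error.WriteLine($"Loading input.pdf failed: {status}");
    return;
}

// Save as HTML with page-layout output and check the result the same way.
status = converter.SaveAsHTML(@"output.html", HtmlLayoutType.PageLayout);
if (status != GdPictureStatus.OK)
{
    Console.Error.WriteLine($"Saving output.html failed: {status}");
    return;
}

Console.WriteLine("Conversion finished: output.html");
```

Checking the status after each call keeps a failed load from being reported as a failed save. In a larger application, you might log the status or throw an exception instead of writing to the console.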
Conclusion
This guide showed how to convert a PDF document to HTML for web-based publishing and integration.