GdPicture.NET SDK includes the ability to convert any supported file type into Word, Excel, or PowerPoint. This technology applies a unique hybrid adaptive approach that includes heuristics, mathematics, and machine learning.
Nutrient SDKs are deployed in some of the world’s most popular applications, such as those made by Autodesk, Disney, UBS, Dropbox, IBM, and Lufthansa.
Key Capabilities
- 100+ input file types: PDF, HTML, images, and more
- Convert to MS Office: Word, Excel, or PowerPoint
- Works offline: no internet access required
- Add to any application: web, desktop, and server
- Merge to Office: merge multiple files into an Office file
- Comprehensive PDF-to-Office SDK: seamless conversion of PDF files to Word, Excel, and PowerPoint
Guides for Conversion to Office
- Convert from PDF to Word: How to convert to Word (DOCX) from PDF
- Convert from PDF to Excel: How to convert to Excel (XLSX) from PDF
- Convert from PDF to PowerPoint: How to convert to PowerPoint (PPTX) from PDF
- Convert from HTML to Word: How to convert to Word (DOCX) from HTML
- Convert from RTF to Word: How to convert to Word (DOCX) from RTF
- Convert from Any File to MS Office: How to convert to Word, Excel, or PowerPoint from any supported file type
Our PDF-to-Office SDK ensures high-quality conversion from PDF to Word, Excel, and PowerPoint.
100+ Supported Input File Types
- MS Office (Word, Excel, PowerPoint)
- PDF, PDF/A
- HTML, MHT, MHTML
- Email (MSG, EML)
- Images (raster and vector)
- Text (TXT and RTF) and OpenDocument (ODT)
- CAD (DXF)
- RAW Camera Image Formats (3FR, ARW, BAY, etc.)
For more information, refer to the full list of supported file types.
GdPicture.NET SDK includes the ability to convert any supported file type into Word.
To save a PDF to a Word document (DOCX), use the SaveAsDOCX method of the GdPictureDocumentConverter class. It takes one of the following parameters:
- Stream: A stream object where the current document is saved as a DOCX file. This stream object must be initialized before it's passed to this method, and it should stay open for subsequent use. If the output stream isn’t open for both reading and writing, the method fails and returns the GdPictureStatus.InvalidParameter status.
- FilePath (overload): The file path where the converted file will be saved. If the specified file already exists, it’ll be overwritten. You have to specify a full file path, including the file extension.
Note that the output stream should be open for both reading and writing, and it should be closed or disposed of by the user once processing is complete using the CloseDocument method.
How to Convert PDF to DOCX
1. Create a GdPictureDocumentConverter object.
2. Load the source document by passing its path to the LoadFromFile method. This method accepts all supported file formats, but only a PDF source produces a high-quality DOCX. If the source document isn’t a PDF, first convert it to PDF with GdPictureDocumentConverter.SaveAsPDF and then pass the result to the SaveAsDOCX method. Recommended: Specify the source document format with a member of the DocumentFormat enumeration.
3. Save the PDF file as a DOCX using SaveAsDOCX.
If you use SaveAsDOCX after loading a file that isn’t a PDF, the method creates a DOCX in which each page contains a bitmap image of the input document. For the best results, ensure the input document is a PDF.
The following example converts and saves a PDF document to a DOCX file (it can also be saved as a stream):
C#:

    using GdPictureDocumentConverter converter = new();
    GdPictureStatus status = converter.LoadFromFile("input.pdf");
    if (status != GdPictureStatus.OK)
    {
        throw new Exception(status.ToString());
    }
    status = converter.SaveAsDOCX("output.docx");
    if (status != GdPictureStatus.OK)
    {
        throw new Exception(status.ToString());
    }
    Console.WriteLine("The input document has been converted to a docx file");

VB.NET:

    Using gdpictureDocumentConverter As New GdPictureDocumentConverter()
        Dim status As GdPictureStatus = gdpictureDocumentConverter.LoadFromFile("input.pdf", GdPicture14.DocumentFormat.DocumentFormatPDF)
        If status = GdPictureStatus.OK Then
            gdpictureDocumentConverter.DocxImageQuality = 80
            status = gdpictureDocumentConverter.SaveAsDOCX("output.docx")
            If status = GdPictureStatus.OK Then
                MessageBox.Show("The file has been saved successfully.", "GdPicture")
            Else
                MessageBox.Show("The file has failed to save. Status: " + status.ToString(), "GdPicture")
            End If
        Else
            MessageBox.Show("The file has failed to load. Status: " + status.ToString(), "GdPicture")
        End If
    End Using
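The stream overload described in the parameter list can be used in much the same way as the file path overload. The following is a minimal sketch, assuming SaveAsDOCX accepts a Stream as that list describes. The file names are placeholders, and the FileStream is opened with FileAccess.ReadWrite because the method requires a stream that is open for both reading and writing.

```csharp
using System;
using System.IO;
using GdPicture14;

using GdPictureDocumentConverter converter = new();

// Load the PDF and state its format explicitly, as the guide recommends.
GdPictureStatus status = converter.LoadFromFile("input.pdf", GdPicture14.DocumentFormat.DocumentFormatPDF);
if (status != GdPictureStatus.OK)
{
    throw new Exception(status.ToString());
}

// The output stream must be open for both reading and writing;
// otherwise, the guide states that GdPictureStatus.InvalidParameter is returned.
using (FileStream outputStream = new FileStream("output.docx", FileMode.Create, FileAccess.ReadWrite))
{
    status = converter.SaveAsDOCX(outputStream);
}
if (status != GdPictureStatus.OK)
{
    throw new Exception(status.ToString());
}
Console.WriteLine("The input document has been written to the output stream.");
```

A MemoryStream also fits here when the DOCX is passed on to other code instead of being written to disk.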
GdPicture.NET’s table extraction engine is a native SDK that enables you to recognize tables in an unstructured document or image, parse the information, and export the tables to an external destination like a spreadsheet. It can detect and extract bordered, semi-bordered, and borderless tables in images, scanned PDFs, and digitally born PDFs. As a native SDK, it can be deployed on-premises or embedded in your application, and it works offline, without internet access.
There are two possible approaches to converting PDFs to Excel with GdPicture.NET:
- Convert all contents in a PDF document to Excel.
- Recognize and extract only the tables present in a document to Excel.
Both of these options are explained below.
Converting the Entire PDF Document to Excel
To save all contents of a PDF document to an Excel spreadsheet (XLSX), use the SaveAsXLSX method of the GdPictureDocumentConverter class. It takes one of the following parameters:
- Stream: A stream object where the current document is saved as an XLSX file. This stream object must be initialized before it's passed to this method, and it should stay open for subsequent use. If the output stream isn’t open for both reading and writing, the method fails and returns the GdPictureStatus.InvalidParameter status.
- FilePath (overload): The file path where the converted file will be saved. If the specified file already exists, it’ll be overwritten. You have to specify a full file path, including the file extension.
Note that the output stream should be open for both reading and writing, and it should be closed or disposed of by the user once processing is complete using the CloseDocument method.
Here’s how to convert PDF to XLSX:
1. Create a GdPictureDocumentConverter object.
2. Load the source document by passing its path to the LoadFromFile method. This method accepts all supported file formats. However, only a PDF file can be converted into an XLSX (other input file formats will return GdPictureStatus.NotImplemented). If the source document isn’t a PDF, it can be converted to PDF first with GdPictureDocumentConverter.SaveAsPDF. Recommended: Specify the source document format with a member of the DocumentFormat enumeration.
3. Save the PDF file as an XLSX using SaveAsXLSX.
The following example converts and saves all content in a PDF document to an XLSX file (it can also be saved as a stream):
    using GdPictureDocumentConverter converter = new();
    var status = converter.LoadFromFile("input.pdf");
    if (status != GdPictureStatus.OK)
    {
        throw new Exception(status.ToString());
    }
    status = converter.SaveAsXLSX("output.xlsx");
    if (status != GdPictureStatus.OK)
    {
        throw new Exception(status.ToString());
    }
    Console.WriteLine("The input document has been converted to a xlsx file");
Recognizing and Extracting Table Data from a PDF to an Excel Spreadsheet
The following approach uses the GdPictureOCR.SaveAsXLSX method, which extracts only the tables present in the document.
To identify all bordered, semi-bordered, and borderless tables in a PDF and extract only those tables to an Excel spreadsheet, follow these steps:
1. Create a GdPictureOCR object and a GdPicturePDF object.
2. Select the source document by passing its path to the LoadFromFile method of the GdPicturePDF object.
3. Select the page from which to extract the table data with the SelectPage method of the GdPicturePDF object.
4. Render the selected page to a 300 dots-per-inch (DPI) image with the RenderPageToGdPictureImageEx method of the GdPicturePDF object.
5. Pass the image to the GdPictureOCR object with the SetImage method.
6. Configure the table extraction process with the GdPictureOCR object in the following way:
   - Set the path to the OCR resource folder with the ResourceFolder property. The default language resources are located in GdPicture.NET 14\Redist\OCR. For more information on adding language resources, see the language support guide.
   - With the AddLanguage method, add the language resources that GdPicture.NET uses to recognize text in the image. This method takes a member of the OCRLanguage enumeration.
   For more optional configuration parameters, see the GdPictureOCR class.
7. Run the table extraction process with the RunOCR method of the GdPictureOCR object, and save the result ID in a list.
8. Create a GdPictureOCR.SpreadsheetOptions object and configure the output spreadsheet. By default, tables from the same OCR result are saved in the same sheet. To save each table in a different sheet, set the SeparateTables property of the GdPictureOCR.SpreadsheetOptions object to true. For more optional configuration parameters, see the GdPictureOCR.SpreadsheetOptions class.
9. Save the output in an Excel spreadsheet with the SaveAsXLSX method of the GdPictureOCR object. This method takes the following parameters:
   - The list containing the OCR result ID.
   - The path to the output file.
   - The GdPictureOCR.SpreadsheetOptions object.
10. Release unnecessary resources.
The example below extracts table data from the first page of a document and saves the output in an Excel spreadsheet:
C#:

    using GdPictureOCR gdpictureOCR = new GdPictureOCR();
    using GdPicturePDF gdpicturePDF = new GdPicturePDF();
    // Load the source document.
    gdpicturePDF.LoadFromFile(@"C:\temp\source.pdf");
    // Select the first page.
    gdpicturePDF.SelectPage(1);
    // Render the first page to a 300 DPI image.
    int imageId = gdpicturePDF.RenderPageToGdPictureImageEx(300, true);
    // Pass the image to the `GdPictureOCR` object.
    gdpictureOCR.SetImage(imageId);
    // Configure the table extraction process.
    gdpictureOCR.ResourceFolder = @"C:\GdPicture.NET 14\Redist\OCR";
    gdpictureOCR.AddLanguage(OCRLanguage.English);
    // Run the table extraction process and save the result ID in a list.
    string result = gdpictureOCR.RunOCR();
    List<string> resultsList = new List<string>() { result };
    // Configure the output spreadsheet.
    GdPictureOCR.SpreadsheetOptions spreadsheetOptions = new GdPictureOCR.SpreadsheetOptions()
    {
        SeparateTables = true
    };
    // Save the output in an Excel spreadsheet.
    gdpictureOCR.SaveAsXLSX(resultsList, @"C:\temp\output.xlsx", spreadsheetOptions);
    // Release unnecessary resources.
    gdpictureOCR.ReleaseOCRResults();
    GdPictureDocumentUtilities.DisposeImage(imageId);
    gdpicturePDF.CloseDocument();

VB.NET:

    Using gdpictureOCR As GdPictureOCR = New GdPictureOCR()
        Using gdpicturePDF As GdPicturePDF = New GdPicturePDF()
            ' Load the source document.
            gdpicturePDF.LoadFromFile("C:\temp\source.pdf")
            ' Select the first page.
            gdpicturePDF.SelectPage(1)
            ' Render the first page to a 300 DPI image.
            Dim imageId As Integer = gdpicturePDF.RenderPageToGdPictureImageEx(300, True)
            ' Pass the image to the `GdPictureOCR` object.
            gdpictureOCR.SetImage(imageId)
            ' Configure the table extraction process.
            gdpictureOCR.ResourceFolder = "C:\GdPicture.NET 14\Redist\OCR"
            gdpictureOCR.AddLanguage(OCRLanguage.English)
            ' Run the table extraction process and save the result ID in a list.
            Dim result As String = gdpictureOCR.RunOCR()
            Dim resultsList As List(Of String) = New List(Of String)()
            resultsList.Add(result)
            ' Configure the output spreadsheet.
            Dim spreadsheetOptions As GdPictureOCR.SpreadsheetOptions = New GdPictureOCR.SpreadsheetOptions() With {
                .SeparateTables = True
            }
            ' Save the output in an Excel spreadsheet.
            gdpictureOCR.SaveAsXLSX(resultsList, "C:\temp\output.xlsx", spreadsheetOptions)
            ' Release unnecessary resources.
            gdpictureOCR.ReleaseOCRResults()
            GdPictureDocumentUtilities.DisposeImage(imageId)
            gdpicturePDF.CloseDocument()
        End Using
    End Using
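Because SaveAsXLSX takes a list of OCR result IDs, the same steps can in principle be repeated for every page of a document. The following is a minimal sketch of that idea, not a verified implementation. It assumes that the GdPicturePDF object exposes a GetPageCount method returning the number of pages (this method isn't shown in this guide), and it defers image disposal until after saving to mirror the order used in the single-page example.

```csharp
using System;
using System.Collections.Generic;
using GdPicture14;

using GdPictureOCR gdpictureOCR = new GdPictureOCR();
using GdPicturePDF gdpicturePDF = new GdPicturePDF();

// Load the source document and stop early if loading fails.
GdPictureStatus status = gdpicturePDF.LoadFromFile(@"C:\temp\source.pdf");
if (status != GdPictureStatus.OK)
{
    throw new Exception(status.ToString());
}

// Configure the table extraction process once for all pages.
gdpictureOCR.ResourceFolder = @"C:\GdPicture.NET 14\Redist\OCR";
gdpictureOCR.AddLanguage(OCRLanguage.English);

List<string> resultsList = new List<string>();
List<int> imageIds = new List<int>();

// Assumption: GetPageCount returns the page count; pages are numbered from 1,
// as in the single-page example's SelectPage(1) call.
int pageCount = gdpicturePDF.GetPageCount();
for (int page = 1; page <= pageCount; page++)
{
    gdpicturePDF.SelectPage(page);
    // Render the current page to a 300 DPI image and run the extraction on it.
    int imageId = gdpicturePDF.RenderPageToGdPictureImageEx(300, true);
    imageIds.Add(imageId);
    gdpictureOCR.SetImage(imageId);
    resultsList.Add(gdpictureOCR.RunOCR());
}

// Save the tables from all collected OCR results in one spreadsheet.
GdPictureOCR.SpreadsheetOptions spreadsheetOptions = new GdPictureOCR.SpreadsheetOptions()
{
    SeparateTables = true
};
gdpictureOCR.SaveAsXLSX(resultsList, @"C:\temp\output.xlsx", spreadsheetOptions);

// Release unnecessary resources.
gdpictureOCR.ReleaseOCRResults();
foreach (int id in imageIds)
{
    GdPictureDocumentUtilities.DisposeImage(id);
}
gdpicturePDF.CloseDocument();
```

For long documents, holding every rendered page image until the end may use a lot of memory. Whether an image can be safely disposed right after RunOCR isn't stated in this guide, so check the API reference before changing that order.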
For more information on extracting table data from PDFs, refer to the table extraction guide.
GdPicture.NET SDK includes the ability to convert any supported file type into PowerPoint.
To save a PDF to a PowerPoint presentation (PPTX), use the SaveAsPPTX method of the GdPictureDocumentConverter class. It takes one of the following parameters:
- Stream: A stream object where the current document is saved as a PPTX file. This stream object must be initialized before it's passed to this method, and it should stay open for subsequent use. If the output stream isn’t open for both reading and writing, the method fails and returns the GdPictureStatus.InvalidParameter status.
- FilePath (overload): The file path where the converted file will be saved. If the specified file already exists, it’ll be overwritten. You have to specify a full file path, including the file extension.
Note that the output stream should be open for both reading and writing, and it should be closed or disposed of by the user once processing is complete using the CloseDocument method.
How to Convert PDF to PPTX
1. Create a GdPictureDocumentConverter object.
2. Load the source document by passing its path to the LoadFromFile method. This method accepts all supported file formats. However, only a PDF file can be converted into a PPTX (other input file formats will return GdPictureStatus.NotImplemented). If the source document isn’t a PDF, it can be converted to PDF first with GdPictureDocumentConverter.SaveAsPDF. Recommended: Specify the source document format with a member of the DocumentFormat enumeration.
3. Save the PDF file as a PPTX using SaveAsPPTX.
The following example converts and saves a PDF document to a PPTX file (it can also be saved as a stream):
    using GdPictureDocumentConverter converter = new();
    GdPictureStatus status = converter.LoadFromFile("input.pdf");
    if (status != GdPictureStatus.OK)
    {
        throw new Exception(status.ToString());
    }
    status = converter.SaveAsPPTX("output.pptx");
    if (status != GdPictureStatus.OK)
    {
        throw new Exception(status.ToString());
    }
    Console.WriteLine("The input document has been converted to a pptx file");
GdPicture.NET SDK includes the ability to convert any supported file type into Word.
To save an HTML file to a Word document (DOCX), first use the SaveAsPDF method of the GdPictureDocumentConverter class to convert it to PDF. Then use the SaveAsDOCX method to convert it to a DOCX.
The SaveAsPDF method uses the following parameters:
- Stream: A stream object where the current document is saved as a PDF file. This stream object must be initialized before it's passed to this method, and it should stay open for subsequent use. If the output stream isn’t open for both reading and writing, the method fails and returns the GdPictureStatus.InvalidParameter status.
- FilePath (overload, used instead of Stream): The file path where the converted file will be saved. If the specified file already exists, it’ll be overwritten. You have to specify a full file path, including the file extension.
- Conformance: A member of the PdfConformance enumeration. This specifies the required conformance to the PDF or PDF/A standard of the saved PDF document. You can use the value of PdfConformance.PDF to save the file as a common PDF document.
The SaveAsDOCX method uses the following parameter:
- Stream, or the overload FilePath, with the same meaning as for SaveAsPDF, except that the output is a DOCX file.
Note that the output stream should be open for both reading and writing, and it should be closed or disposed of by the user once processing is complete using the CloseDocument method.
How to Convert HTML to DOCX
1. Create a GdPictureDocumentConverter object.
2. Convert the source HTML file to PDF with GdPictureDocumentConverter.SaveAsPDF(Stream, PdfConformance). Recommended: Specify the source document format with a member of the DocumentFormat enumeration.
3. Load the newly generated PDF file by passing its path to the LoadFromFile method (for a high-quality DOCX, the document loaded at this step must be a PDF).
4. Save the PDF file as a DOCX using SaveAsDOCX.
The following example converts and saves an HTML document to a DOCX file (it can also be saved as a stream):
    using GdPictureDocumentConverter converter = new();
    // Set the text and document properties to be used for the resulting file.
    converter.HtmlPageHeight = 842; // A3 page size
    converter.HtmlPageWidth = 1191; // A3 page size
    converter.HtmlPageMarginTop = 10;
    converter.HtmlPageMarginBottom = 10;
    converter.HtmlPageMarginLeft = 10;
    converter.HtmlPageMarginRight = 10;
    using Stream inputStream = File.Open(@"input.html", System.IO.FileMode.Open);
    using Stream outputStream = new MemoryStream();
    GdPictureStatus status = converter.ConvertToPDF(inputStream, GdPicture14.DocumentFormat.DocumentFormatHTML, outputStream, PdfConformance.PDF1_5);
    if (status != GdPictureStatus.OK)
    {
        throw new Exception(status.ToString());
    }
    status = converter.LoadFromStream(outputStream);
    if (status != GdPictureStatus.OK)
    {
        throw new Exception(status.ToString());
    }
    status = converter.SaveAsDOCX("output.docx");
    if (status != GdPictureStatus.OK)
    {
        throw new Exception(status.ToString());
    }
    Console.WriteLine("The input document has been converted to a docx file");
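The page size values 842 and 1191 in the example match A3 dimensions (297 mm by 420 mm) expressed in points, where one point is 1/72 of an inch. The small helper below is a sketch for deriving such values for other paper sizes. MmToPoints is a hypothetical name, and the assumption that HtmlPageHeight and HtmlPageWidth are measured in points is inferred from the example's "A3 page size" comments rather than stated in this guide.

```csharp
using System;

// A4 is 210 mm x 297 mm; A3 is 297 mm x 420 mm.
Console.WriteLine($"A4 portrait:  {MmToPoints(210)} x {MmToPoints(297)}");
Console.WriteLine($"A3 landscape: {MmToPoints(420)} x {MmToPoints(297)}");

// Convert millimeters to points (1 inch = 25.4 mm = 72 points),
// rounded to the nearest whole point.
static int MmToPoints(double millimeters) => (int)Math.Round(millimeters / 25.4 * 72.0);
```

With these values, an A4 portrait page would use a width of 595 and a height of 842, while the A3 landscape page in the example uses a width of 1191 and a height of 842.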
See Also
DocxImageQuality Property
GdPictureDocumentConverter Class
GdPictureDocumentConverter Members
CloseDocument Method
RasterizationDPI Property
HtmlEmulationType Property
HtmlPageHeight Property
HtmlPageMarginBottom Property
HtmlPageMarginLeft Property
HtmlPageMarginRight Property
HtmlPageMarginTop Property
HtmlPageWidth Property
HtmlPreferCSSPageSize Property
HtmlPreferOnePage Property
Optional HTML Configuration Properties
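Optionally, configure the conversion with the following properties of the GdPictureDocumentConverter object: HtmlEmulationType, HtmlPageHeight, HtmlPageWidth, HtmlPageMarginTop, HtmlPageMarginBottom, HtmlPageMarginLeft, HtmlPageMarginRight, HtmlPreferCSSPageSize, and HtmlPreferOnePage.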
Optional PDF Configuration Properties
Optionally, configure the conversion with the following properties of the GdPictureDocumentConverter object:
- PdfBitonalImageCompression is a member of the PdfCompression enumeration that specifies the compression scheme used for bitonal images in the output PDF file.
- PdfColorImageCompression is a member of the PdfCompression enumeration that specifies the compression scheme used for color images in the output PDF file.
- PdfEnableColorDetection is a Boolean value that specifies whether to use automatic color detection during the conversion that preserves image quality and reduces the output file size.
- PdfEnableLinearization is a Boolean value that specifies whether to linearize the output PDF to enable Fast Web View mode.
- PdfImageQuality is an integer from 0 to 100 that specifies the image quality in the output PDF file.
The example below creates a PDF document from an HTML file with a custom configuration:
C#:

    using GdPictureDocumentConverter gdpictureDocumentConverter = new GdPictureDocumentConverter();
    // Load the source document.
    gdpictureDocumentConverter.LoadFromFile(@"C:\temp\source.html", GdPicture14.DocumentFormat.DocumentFormatHTML);
    // Configure the conversion.
    gdpictureDocumentConverter.PdfColorImageCompression = PdfCompression.PdfCompressionJPEG;
    gdpictureDocumentConverter.PdfImageQuality = 50;
    // Save the output in a new PDF document.
    gdpictureDocumentConverter.SaveAsPDF(@"C:\temp\output.pdf");

VB.NET:

    Using gdpictureDocumentConverter As GdPictureDocumentConverter = New GdPictureDocumentConverter()
        ' Load the source document.
        gdpictureDocumentConverter.LoadFromFile("C:\temp\source.html", GdPicture14.DocumentFormat.DocumentFormatHTML)
        ' Configure the conversion.
        gdpictureDocumentConverter.PdfColorImageCompression = PdfCompression.PdfCompressionJPEG
        gdpictureDocumentConverter.PdfImageQuality = 50
        ' Save the output in a new PDF document.
        gdpictureDocumentConverter.SaveAsPDF("C:\temp\output.pdf")
    End Using
GdPicture.NET SDK includes the ability to convert any supported file type into Word.
To save an RTF document to a Word document (DOCX), first use the SaveAsPDF method of the GdPictureDocumentConverter class to convert it to PDF. Then use the SaveAsDOCX method to convert it to a DOCX.
The SaveAsPDF method uses the following parameter:
- Stream: A stream object where the current document is saved as a PDF file. This stream object must be initialized before it's passed to this method, and it should stay open for subsequent use. If the output stream isn’t open for both reading and writing, the method fails and returns the GdPictureStatus.InvalidParameter status.
- FilePath (overload, used instead of Stream): The file path where the converted file will be saved. If the specified file already exists, it’ll be overwritten. You have to specify a full file path, including the file extension.
The SaveAsDOCX method uses the following parameter:
- Stream, or the overload FilePath, with the same meaning as for SaveAsPDF, except that the output is a DOCX file.
Note that the output stream should be open for both reading and writing, and it should be closed or disposed of by the user once processing is complete using the CloseDocument method.
How to Convert RTF to DOCX
1. Create a GdPictureDocumentConverter object.
2. Convert the source RTF file to PDF with GdPictureDocumentConverter.SaveAsPDF(Stream, PdfConformance). Recommended: Specify the source document format with a member of the DocumentFormat enumeration.
3. Load the newly generated PDF file by passing its path to the LoadFromFile method (for a high-quality DOCX, the document loaded at this step must be a PDF).
4. Save the PDF file as a DOCX using SaveAsDOCX.
The following example converts and saves an RTF document to a DOCX file (it can also be saved as a stream):

    using GdPictureDocumentConverter converter = new();
    using Stream inputStream = File.Open(@"input.rtf", System.IO.FileMode.Open);
    using Stream outputStream = new MemoryStream();
    GdPictureStatus status = converter.ConvertToPDF(inputStream, GdPicture14.DocumentFormat.DocumentFormatRTF, outputStream, PdfConformance.PDF1_5);
    if (status != GdPictureStatus.OK)
    {
        throw new Exception(status.ToString());
    }
    status = converter.LoadFromStream(outputStream);
    if (status != GdPictureStatus.OK)
    {
        throw new Exception(status.ToString());
    }
    status = converter.SaveAsDOCX("output.docx");
    if (status != GdPictureStatus.OK)
    {
        throw new Exception(status.ToString());
    }
    Console.WriteLine("The input document has been converted to a docx file");
Optional PDF Configuration Properties
Optionally, configure the conversion with the following properties of the GdPictureDocumentConverter object:
- PdfBitonalImageCompression is a member of the PdfCompression enumeration that specifies the compression scheme used for bitonal images in the output PDF file.
- PdfColorImageCompression is a member of the PdfCompression enumeration that specifies the compression scheme used for color images in the output PDF file.
- PdfEnableColorDetection is a Boolean value that specifies whether to use automatic color detection during the conversion that preserves image quality and reduces the output file size.
- PdfEnableLinearization is a Boolean value that specifies whether to linearize the output PDF to enable Fast Web View mode.
- PdfImageQuality is an integer from 0 to 100 that specifies the image quality in the output PDF file.
The example below creates a PDF document from an RTF file with a custom configuration:
C#:

    using GdPictureDocumentConverter gdpictureDocumentConverter = new GdPictureDocumentConverter();
    // Load the source document.
    gdpictureDocumentConverter.LoadFromFile(@"C:\temp\source.rtf", GdPicture14.DocumentFormat.DocumentFormatRTF);
    // Configure the conversion.
    gdpictureDocumentConverter.PdfColorImageCompression = PdfCompression.PdfCompressionJPEG;
    gdpictureDocumentConverter.PdfImageQuality = 50;
    // Save the output in a new PDF document.
    gdpictureDocumentConverter.SaveAsPDF(@"C:\temp\output.pdf");

VB.NET:

    Using gdpictureDocumentConverter As GdPictureDocumentConverter = New GdPictureDocumentConverter()
        ' Load the source document.
        gdpictureDocumentConverter.LoadFromFile("C:\temp\source.rtf", GdPicture14.DocumentFormat.DocumentFormatRTF)
        ' Configure the conversion.
        gdpictureDocumentConverter.PdfColorImageCompression = PdfCompression.PdfCompressionJPEG
        gdpictureDocumentConverter.PdfImageQuality = 50
        ' Save the output in a new PDF document.
        gdpictureDocumentConverter.SaveAsPDF("C:\temp\output.pdf")
    End Using
GdPicture.NET supports converting 100+ file types to Word, Excel, or PowerPoint.
Converting PDF to MS Office
To convert PDF files to MS Office, refer to our separate PDF-to-Word, PDF-to-Excel, and PDF-to-PowerPoint guides.
Converting Other File Types to MS Office
To save a file to Word, Excel, or PowerPoint format, first use the SaveAsPDF method of the GdPictureDocumentConverter class to convert it to PDF. Then use the SaveAsDOCX method to convert it to a DOCX, the SaveAsXLSX method to convert it to XLSX, or the SaveAsPPTX method to convert it to PPTX.
The SaveAsPDF method uses the following parameters:
- Stream: A stream object where the current document is saved as a PDF file. This stream object must be initialized before it's passed to this method, and it should stay open for subsequent use. If the output stream isn’t open for both reading and writing, the method fails and returns the GdPictureStatus.InvalidParameter status.
- FilePath (overload, used instead of Stream): The file path where the converted file will be saved. If the specified file already exists, it’ll be overwritten. You have to specify a full file path, including the file extension.
- Conformance: A member of the PdfConformance enumeration. This specifies the required conformance to the PDF or PDF/A standard of the saved PDF document. You can use the value of PdfConformance.PDF to save the file as a common PDF document.
The SaveAsDOCX, SaveAsXLSX, and SaveAsPPTX methods use the following parameter:
- Stream, or the overload FilePath, with the same meaning as for SaveAsPDF, except that the output is a DOCX, XLSX, or PPTX file.
Note that the output stream should be open for both reading and writing, and it should be closed or disposed of by the user once processing is complete using the CloseDocument method.
How to Convert Any File to MS Office
1. Create a GdPictureDocumentConverter object.
2. Convert the source file to PDF with GdPictureDocumentConverter.SaveAsPDF(Stream, PdfConformance). Recommended: Specify the source document format with a member of the DocumentFormat enumeration.
3. Load the newly generated PDF file by passing its path to the LoadFromFile method (for a high-quality result, the document loaded at this step must be a PDF).
4. Save the PDF file as a DOCX using SaveAsDOCX, as an XLSX using SaveAsXLSX, or as a PPTX using SaveAsPPTX.
The following example converts and saves an RTF document to a DOCX file (it can also be saved as a stream):
    using GdPictureDocumentConverter converter = new();
    using Stream inputStream = File.Open(@"input.rtf", System.IO.FileMode.Open);
    using Stream outputStream = new MemoryStream();
    GdPictureStatus status = converter.ConvertToPDF(inputStream, GdPicture14.DocumentFormat.DocumentFormatRTF, outputStream, PdfConformance.PDF1_5);
    if (status != GdPictureStatus.OK)
    {
        throw new Exception(status.ToString());
    }
    status = converter.LoadFromStream(outputStream);
    if (status != GdPictureStatus.OK)
    {
        throw new Exception(status.ToString());
    }
    status = converter.SaveAsDOCX("output.docx");
    if (status != GdPictureStatus.OK)
    {
        throw new Exception(status.ToString());
    }
    Console.WriteLine("The input document has been converted to a docx file");
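Because the last step differs only in which save method is called, the choice can be driven by the output file's extension. The sketch below illustrates this with a hypothetical helper named SaveAsOffice, which isn't part of the SDK. It relies only on the ConvertToPDF, LoadFromStream, SaveAsDOCX, SaveAsXLSX, and SaveAsPPTX calls shown in this guide, and the file names are placeholders.

```csharp
using System;
using System.IO;
using GdPicture14;

using GdPictureDocumentConverter documentConverter = new();
using Stream inputStream = File.Open("input.rtf", FileMode.Open);
using Stream pdfStream = new MemoryStream();

// Step 1: Convert the source file to an intermediate PDF held in memory.
GdPictureStatus status = documentConverter.ConvertToPDF(
    inputStream, GdPicture14.DocumentFormat.DocumentFormatRTF, pdfStream, PdfConformance.PDF1_5);
if (status != GdPictureStatus.OK)
{
    throw new Exception(status.ToString());
}

// Step 2: Load the intermediate PDF.
status = documentConverter.LoadFromStream(pdfStream);
if (status != GdPictureStatus.OK)
{
    throw new Exception(status.ToString());
}

// Step 3: Save to the Office format implied by the output file extension.
status = SaveAsOffice(documentConverter, "output.pptx");
if (status != GdPictureStatus.OK)
{
    throw new Exception(status.ToString());
}

// Hypothetical helper: picks the save method from the output file extension.
static GdPictureStatus SaveAsOffice(GdPictureDocumentConverter source, string outputPath)
{
    string extension = Path.GetExtension(outputPath).ToLowerInvariant();
    return extension switch
    {
        ".docx" => source.SaveAsDOCX(outputPath),
        ".xlsx" => source.SaveAsXLSX(outputPath),
        ".pptx" => source.SaveAsPPTX(outputPath),
        _ => throw new ArgumentException($"Unsupported Office extension: {extension}", nameof(outputPath))
    };
}
```

Throwing on an unknown extension, rather than defaulting to one format, keeps a mistyped file name from silently producing the wrong kind of document.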
Optional File Type Configuration Properties
The following file types have optional configuration properties for greater precision:
Optional PDF Configuration Properties
Optionally, configure the conversion with the following properties of the GdPictureDocumentConverter object:
- PdfBitonalImageCompression is a member of the PdfCompression enumeration that specifies the compression scheme used for bitonal images in the output PDF file.
- PdfColorImageCompression is a member of the PdfCompression enumeration that specifies the compression scheme used for color images in the output PDF file.
- PdfEnableColorDetection is a Boolean value that specifies whether to use automatic color detection during the conversion that preserves image quality and reduces the output file size.
- PdfEnableLinearization is a Boolean value that specifies whether to linearize the output PDF to enable Fast Web View mode.
- PdfImageQuality is an integer from 0 to 100 that specifies the image quality in the output PDF file.
The example below creates a PDF document from an RTF file with a custom configuration:
C#:

    using GdPictureDocumentConverter gdpictureDocumentConverter = new GdPictureDocumentConverter();
    // Load the source document.
    gdpictureDocumentConverter.LoadFromFile(@"C:\temp\source.rtf", GdPicture14.DocumentFormat.DocumentFormatRTF);
    // Configure the conversion.
    gdpictureDocumentConverter.PdfColorImageCompression = PdfCompression.PdfCompressionJPEG;
    gdpictureDocumentConverter.PdfImageQuality = 50;
    // Save the output in a new PDF document.
    gdpictureDocumentConverter.SaveAsPDF(@"C:\temp\output.pdf");

VB.NET:

    Using gdpictureDocumentConverter As GdPictureDocumentConverter = New GdPictureDocumentConverter()
        ' Load the source document.
        gdpictureDocumentConverter.LoadFromFile("C:\temp\source.rtf", GdPicture14.DocumentFormat.DocumentFormatRTF)
        ' Configure the conversion.
        gdpictureDocumentConverter.PdfColorImageCompression = PdfCompression.PdfCompressionJPEG
        gdpictureDocumentConverter.PdfImageQuality = 50
        ' Save the output in a new PDF document.
        gdpictureDocumentConverter.SaveAsPDF("C:\temp\output.pdf")
    End Using