Convert to MS Office from any file in C#

Any File to Office

Nutrient .NET SDK (formerly GdPicture.NET) supports converting 100+ file types to Word, Excel, or PowerPoint.

100+ supported input file types

  • MS Office (Word, Excel, PowerPoint)
  • PDF, PDF/A
  • HTML, MHT, MHTML
  • Email (MSG, EML)
  • Images (raster and vector)
  • Text (TXT and RTF) and OpenDocument (ODT)
  • CAD (DXF)
  • RAW Camera Image Formats (3FR, ARW, BAY, etc.)

For more information, refer to the full list of supported file types.

Converting PDF to MS Office

To convert PDF files to MS Office, refer to our separate PDF-to-Word, PDF-to-Excel, and PDF-to-PowerPoint guides.

Converting other file types to MS Office

To save a file to Word, Excel, or PowerPoint format, first use the SaveAsPDF method of the GdPictureDocumentConverter class to convert it to PDF. Then use the SaveAsDOCX method to convert it to a DOCX, the SaveAsXLSX method to convert it to XLSX, or the SaveAsPPTX method to convert it to PPTX.

The SaveAsPDF method uses the following parameters:

  • Stream, or the overload FilePath — With the Stream overload, a stream object where the current document is saved as a PDF file. This stream object must be initialized before it's passed to this method, and it should stay open for subsequent use. If the output stream isn’t open for both reading and writing, the method fails and returns the GdPictureStatus.InvalidParameter status. With the FilePath overload, the file path where the converted file will be saved. If the specified file already exists, it’ll be overwritten. You have to specify a full file path, including the file extension.
  • Conformance — A member of the PdfConformance enumeration. This specifies the required conformance to the PDF or PDF/A standard of the saved PDF document. You can use the value of PdfConformance.PDF to save the file as a common PDF document.

The SaveAsDOCX, SaveAsXLSX, and SaveAsPPTX methods use the following parameter:

  • Stream, or the overload FilePath

Note that the output stream must be open for both reading and writing. Once processing is complete, release the loaded document with the CloseDocument method, and then close or dispose of the stream yourself.
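As a rough sketch of the stream overload described above, the snippet below saves a DOCX file to a stream opened for both reading and writing. The file names are placeholders, and the single-argument LoadFromFile call and the use of a FileStream as the output are assumptions made for illustration; check them against the API reference for your SDK version.

```csharp
using System;
using System.IO;
using GdPicture14;

using GdPictureDocumentConverter converter = new();

// Assumption: "input.pdf" is a placeholder for a PDF produced by an earlier SaveAsPDF call.
GdPictureStatus status = converter.LoadFromFile("input.pdf");
if (status != GdPictureStatus.OK)
{
    throw new Exception(status.ToString());
}

// Open the output for both reading and writing, as the Stream overload requires.
using (FileStream output = new("output.docx", FileMode.Create, FileAccess.ReadWrite))
{
    status = converter.SaveAsDOCX(output);
    if (status != GdPictureStatus.OK)
    {
        throw new Exception(status.ToString());
    }

    // Release the loaded document before the stream is disposed at the end of this block.
    converter.CloseDocument();
}
```

Opening the stream with FileAccess.ReadWrite is the point of the sketch: a write-only stream is the case the parameter description says returns GdPictureStatus.InvalidParameter.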

How to convert any file to MS Office

  1. Create a GdPictureDocumentConverter object.
  2. Convert the source file to PDF with GdPictureDocumentConverter.SaveAsPDF(Stream, PdfConformance). Recommended: Specify the source document format with a member of the DocumentFormat enumeration.
  3. Load the newly generated PDF file by passing its path to the LoadFromFile method (this method only supports PDF documents).
  4. Save the PDF file as a DOCX using SaveAsDOCX, as an XLSX using SaveAsXLSX, or as a PPTX using SaveAsPPTX.

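The following example converts an RTF document to a DOCX file (the output can also be saved to a stream). It uses the stream-based ConvertToPDF and LoadFromStream methods, so the intermediate PDF stays in memory instead of being written to disk: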

using System;
using System.IO;
using GdPicture14;

using GdPictureDocumentConverter converter = new();
using Stream inputStream = File.Open(@"input.rtf", System.IO.FileMode.Open);
// The intermediate PDF is kept in memory.
using Stream outputStream = new MemoryStream();

// Convert the RTF input to PDF.
GdPictureStatus status = converter.ConvertToPDF(inputStream, GdPicture14.DocumentFormat.DocumentFormatRTF, outputStream, PdfConformance.PDF1_5);
if (status != GdPictureStatus.OK)
{
    throw new Exception(status.ToString());
}

// Rewind the in-memory PDF as a defensive step, in case the converter reads from the current position.
outputStream.Position = 0;

// Load the intermediate PDF into the converter.
status = converter.LoadFromStream(outputStream);
if (status != GdPictureStatus.OK)
{
    throw new Exception(status.ToString());
}

// Save the loaded PDF as a DOCX file.
status = converter.SaveAsDOCX("output.docx");
if (status != GdPictureStatus.OK)
{
    throw new Exception(status.ToString());
}
Console.WriteLine("The input document has been converted to a DOCX file.");
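The same pattern applies to Excel and PowerPoint output. The sketch below picks SaveAsDOCX, SaveAsXLSX, or SaveAsPPTX from the extension of the output path. SaveAsOffice is a hypothetical helper written for this illustration and isn't part of the SDK, the file names are placeholders, and the single-argument LoadFromFile call is an assumption based on step 3 above.

```csharp
using System;
using System.IO;
using GdPicture14;

using GdPictureDocumentConverter converter = new();

// Assumption: "input.pdf" is a placeholder for the intermediate PDF from step 2.
GdPictureStatus status = converter.LoadFromFile("input.pdf");
if (status != GdPictureStatus.OK)
{
    throw new Exception(status.ToString());
}

status = SaveAsOffice(converter, "output.xlsx");
if (status != GdPictureStatus.OK)
{
    throw new Exception(status.ToString());
}

// Hypothetical helper: chooses the save method from the output file extension.
static GdPictureStatus SaveAsOffice(GdPictureDocumentConverter converter, string outputPath)
{
    return Path.GetExtension(outputPath).ToLowerInvariant() switch
    {
        ".docx" => converter.SaveAsDOCX(outputPath),
        ".xlsx" => converter.SaveAsXLSX(outputPath),
        ".pptx" => converter.SaveAsPPTX(outputPath),
        // Reject any extension this sketch doesn't handle.
        _ => GdPictureStatus.InvalidParameter,
    };
}
```

Returning GdPictureStatus.InvalidParameter for an unknown extension is a design choice of this sketch, made so the caller can keep a single status check.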

Optional file type configuration properties

The following file types have optional configuration properties for greater precision:

Optional PDF configuration properties

Optionally, configure the conversion with the following properties of the GdPictureDocumentConverter object:

  • PdfBitonalImageCompression is a member of the PdfCompression enumeration that specifies the compression scheme used for bitonal images in the output PDF file.
  • PdfColorImageCompression is a member of the PdfCompression enumeration that specifies the compression scheme used for color images in the output PDF file.
  • PdfEnableColorDetection is a Boolean value that specifies whether to use automatic color detection during the conversion. Color detection preserves image quality while reducing the output file size.
  • PdfEnableLinearization is a Boolean value that specifies whether to linearize the output PDF to enable Fast Web View mode.
  • PdfImageQuality is an integer from 0 to 100 that specifies the image quality in the output PDF file.

The example below creates a PDF document from an RTF file with a custom configuration:

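using System;
using GdPicture14;

using GdPictureDocumentConverter gdpictureDocumentConverter = new GdPictureDocumentConverter();
// Load the source document.
GdPictureStatus status = gdpictureDocumentConverter.LoadFromFile(@"C:\temp\source.rtf", GdPicture14.DocumentFormat.DocumentFormatRTF);
if (status != GdPictureStatus.OK) throw new Exception(status.ToString());
// Configure the conversion: JPEG compression for color images at quality 50 (range 0 to 100).
gdpictureDocumentConverter.PdfColorImageCompression = PdfCompression.PdfCompressionJPEG;
gdpictureDocumentConverter.PdfImageQuality = 50;
// Save the output in a new PDF document.
status = gdpictureDocumentConverter.SaveAsPDF(@"C:\temp\output.pdf");
if (status != GdPictureStatus.OK) throw new Exception(status.ToString());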
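The Boolean properties from the list above can be set in the same way. The following is a minimal sketch, assuming the properties are plain read/write members as described. The paths and the quality value of 75 are placeholders chosen for illustration.

```csharp
using System;
using GdPicture14;

using GdPictureDocumentConverter converter = new();

// Load the source document (placeholder path).
GdPictureStatus status = converter.LoadFromFile(@"C:\temp\source.rtf", GdPicture14.DocumentFormat.DocumentFormatRTF);
if (status != GdPictureStatus.OK)
{
    throw new Exception(status.ToString());
}

// Let the converter detect color content automatically.
converter.PdfEnableColorDetection = true;
// Linearize the output to enable Fast Web View mode.
converter.PdfEnableLinearization = true;
// Illustrative quality value within the documented 0 to 100 range.
converter.PdfImageQuality = 75;

// Save as a common PDF document (placeholder path).
status = converter.SaveAsPDF(@"C:\temp\output-web.pdf", PdfConformance.PDF);
if (status != GdPictureStatus.OK)
{
    throw new Exception(status.ToString());
}
```

Checking the status after each call keeps a failed load from being reported as a failed save.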