How to convert Office to PDF (Word, Excel, and PPT) using Java

Clavin Fernandes

April 1, 2010


Summary

Explore how our Java PDF library enables seamless conversion of Microsoft Office documents to PDF format using a robust web services interface. This comprehensive guide demonstrates how to convert Word, Excel, PowerPoint, and other Office formats to PDF while maintaining document fidelity and formatting. Learn how our scalable SDK supports cross-platform integration while providing granular control over conversion settings and output quality.

This blog post walks through Java-based sample code for Nutrient Document Converter Services. A .NET version of this post is available here.

Nutrient Document Converter Services is a server-based SDK that enables software developers to convert typical Office files (including MS Word, Excel, PowerPoint, Visio, Publisher, and InfoPath) to PDF format using a robust, scalable web services interface from Java- and .NET-based solutions.

Even though Document Converter Services must run on a Windows-based server, it has been designed to interoperate with clients built on non-Microsoft platforms such as Java. This post describes how to convert documents to PDF format from a Java-based environment.

The full version of the sample code, including pre-generated proxies, is installed alongside each copy of Nutrient Document Converter Services.

The example described below assumes the following:

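The JDK has been installed and configured. Note that wsimport is bundled with JDK 8 and earlier; it was removed from the JDK in Java 11, so later versions need the JAX-WS tooling installed separately.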
Document Converter Services and all prerequisites have been installed in line with the administration guide.
Document Converter Services is running in the default anonymous mode. This isn’t an absolute requirement, but it makes initial experimentation much easier.

The first step is to generate proxy classes for the web service by executing the following command:

wsimport https://localhost:41734/Muhimbi.DocumentConverter.WebService/?wsdl -d src -Xnocompile -p com.muhimbi.ws

Feel free to change the package name and destination directory to something more suitable for your organization.

If Document Converter Services isn’t located on the same system where wsimport is executed, change localhost to the name of the server running the conversion service. You’ll also need to change the host name in the conversion service’s configuration file: open Muhimbi.DocumentConverter.Service.exe.config (a convenient shortcut to the installation folder is located in the Start menu), search for baseAddress, and change the host name. Then restart Document Converter Services to activate the change.
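
When the conversion service runs on another machine, the host name also has to change in the client's WSDL location. The following is a minimal sketch of keeping that value configurable rather than hard-coded. The class name WsdlLocation and the system property name dcs.host are hypothetical, chosen only for illustration; the scheme, port, and path are copied from the wsimport command above.

```java
import java.net.MalformedURLException;
import java.net.URL;

final class WsdlLocation {
    // Port and path as used in the wsimport command in this post.
    private static final int PORT = 41734;
    private static final String PATH = "/Muhimbi.DocumentConverter.WebService/?wsdl";

    private WsdlLocation() {}

    /** Builds the WSDL address for the given host; null or blank falls back to localhost. */
    static String wsdlUrl(String host) {
        String effectiveHost = (host == null || host.trim().isEmpty()) ? "localhost" : host.trim();
        return "https://" + effectiveHost + ":" + PORT + PATH;
    }

    public static void main(String[] args) throws MalformedURLException {
        // "dcs.host" is a hypothetical property name, e.g. java -Ddcs.host=myserver ...
        URL wsdl = new URL(wsdlUrl(System.getProperty("dcs.host")));
        System.out.println(wsdl);
    }
}
```

The resulting URL could be passed to the generated service constructor in place of a fixed constant, so the same build can target a local or a remote conversion server.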

wsimport automatically generates the Java class names. Unfortunately, some of the generated names are rather long and ugly, so you may want to consider renaming some — particularly the exception classes — to something friendlier. This, however, means that if you ever run wsimport again, you’ll need to reapply these changes. For more information, refer to the high-level overview of the object model exposed by the web service.
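
If renaming generated classes feels fragile, one possible alternative is to leave them untouched and translate their exceptions into a short-named type at a single boundary, so that re-running wsimport doesn't undo anything. The sketch below is not part of the shipped sample: ConversionException and FaultTranslator are hypothetical names, and a simple lambda stands in for the real proxy call.

```java
import java.util.concurrent.Callable;

/** Short, application-owned exception type (hypothetical name). */
final class ConversionException extends RuntimeException {
    ConversionException(String message, Throwable cause) {
        super(message, cause);
    }
}

final class FaultTranslator {
    private FaultTranslator() {}

    /**
     * Runs a web service call and rethrows any checked exception, which
     * includes the long-named generated fault types, as ConversionException.
     * The original exception is kept as the cause, so its fault info stays reachable.
     */
    static <T> T call(Callable<T> serviceCall) {
        try {
            return serviceCall.call();
        } catch (RuntimeException e) {
            throw e; // unchecked exceptions pass through unchanged
        } catch (Exception e) {
            throw new ConversionException("Conversion service call failed: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a real proxy call made through the generated classes.
        String result = call(() -> "converted");
        System.out.println(result);
    }
}
```

With this in place, only the code that inspects the cause needs to mention the generated class names; the rest of the application catches ConversionException.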

Once the proxy classes have been created, add the following sample code to your project. Run the code, passing the path of the document to convert as the single command-line argument.

This example sets ConversionSettings.Format to OutputFormat.PDF. As a result, the file is converted to the default PDF format. It’s possible to convert files to other file formats as well by setting this property to a different value. For details, see this blog post.

package com.muhimbi.app;

import com.muhimbi.ws.*;
import java.io.*;
import java.net.URL;
import java.util.List;
import javax.xml.bind.JAXBElement;
import javax.xml.namespace.QName;

public class WsClient {

private final static String DOCUMENTCONVERTERSERVICE_WSDL_LOCATION =
        "https://localhost:41734/Muhimbi.DocumentConverter.WebService/?wsdl";

public static void main(String[] args) {
    try {
      if (args.length != 1) {
        System.out.println("Please specify a single file name on the command line.");
      } else {
        // ** Process command line parameters
        String sourceDocumentPath = args[0];
        File file = new File(sourceDocumentPath);
        String fileName = getFileName(file);
        String fileExt = getFileExtension(file);

        System.out.println("Converting file " + sourceDocumentPath);

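        // ** Initialise Web Service
        // ** The QName namespace must match the targetNamespace declared in the WSDL;
        // ** http://tempuri.org/ is the default for WCF-based services.
        DocumentConverterService_Service dcss = new DocumentConverterService_Service(
            new URL(DOCUMENTCONVERTERSERVICE_WSDL_LOCATION),
            new QName("http://tempuri.org/", "DocumentConverterService"));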
        DocumentConverterService dcs = dcss.getBasicHttpBindingDocumentConverterService();

        // ** Only call conversion if file extension is supported
        if (isFileExtensionSupported(fileExt, dcs)) {
          // ** Read source file from disk
          byte[] fileContent = readFile(sourceDocumentPath);

          // ** Converting the file
          OpenOptions openOptions = getOpenOptions(fileName, fileExt);
          ConversionSettings conversionSettings = getConversionSettings();
          byte[] convertedFile = dcs.convert(fileContent, openOptions, conversionSettings);

          // ** Writing converted file to file system
          String destinationDocumentPath = getPDFDocumentPath(file);
          writeFile(convertedFile, destinationDocumentPath);
          System.out.println("File converted successfully to " + destinationDocumentPath);

        } else {
          System.out.println("The file extension is not supported.");
        }
      }

    } catch (IOException e) {
      System.out.println(e.getMessage());
    } catch (DocumentConverterServiceGetConfigurationWebServiceFaultExceptionFaultFaultMessage e) {
      printException(e.getFaultInfo());
    } catch (DocumentConverterServiceConvertWebServiceFaultExceptionFaultFaultMessage e) {
      printException(e.getFaultInfo());
    }
}

public static OpenOptions getOpenOptions(String fileName, String fileExtension) {
    ObjectFactory objectFactory = new ObjectFactory();
    OpenOptions openOptions = new OpenOptions();
    openOptions.setOriginalFileName(objectFactory.createOpenOptionsOriginalFileName(fileName));
    openOptions.setFileExtension(objectFactory.createOpenOptionsFileExtension(fileExtension));
    return openOptions;
}

public static ConversionSettings getConversionSettings() {
    ConversionSettings conversionSettings = new ConversionSettings();
    conversionSettings.setQuality(ConversionQuality.OPTIMIZE_FOR_PRINT);
    conversionSettings.setRange(ConversionRange.ALL_DOCUMENTS);
    conversionSettings.getFidelity().add("Full");
    conversionSettings.setFormat(OutputFormat.PDF);
    return conversionSettings;
}

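public static String getFileName(File file) {
    String fileName = file.getName();
    int dotIndex = fileName.lastIndexOf('.');
    // ** Names without an extension are returned unchanged
    return dotIndex < 0 ? fileName : fileName.substring(0, dotIndex);
}

public static String getFileExtension(File file) {
    String fileName = file.getName();
    int dotIndex = fileName.lastIndexOf('.');
    // ** Names without an extension yield an empty string
    return dotIndex < 0 ? "" : fileName.substring(dotIndex + 1);
}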

public static String getPDFDocumentPath(File file) {
    String fileName = getFileName(file);
    String folder = file.getParent();
    if (folder == null) {
      folder = new File(file.getAbsolutePath()).getParent();
    }
    return folder + File.separatorChar + fileName + '.' + OutputFormat.PDF.value();
}

public static byte[] readFile(String filepath) throws IOException {
    File file = new File(filepath);
    byte[] bytes = new byte[(int) file.length()];

    // ** try-with-resources closes the stream even if reading fails
    try (InputStream is = new FileInputStream(file)) {
      int offset = 0;
      int numRead;
      while (offset < bytes.length && (numRead = is.read(bytes, offset, bytes.length - offset)) >= 0) {
        offset += numRead;
      }

      if (offset < bytes.length) {
        throw new IOException("Could not completely read file " + file.getName());
      }
    }
    return bytes;
}

public static void writeFile(byte[] fileContent, String filepath) throws IOException {
    // ** try-with-resources closes the stream even if writing fails
    try (OutputStream os = new FileOutputStream(filepath)) {
      os.write(fileContent);
    }
}

public static boolean isFileExtensionSupported(String extension, DocumentConverterService dcs)
    throws DocumentConverterServiceGetConfigurationWebServiceFaultExceptionFaultFaultMessage {
    Configuration configuration = dcs.getConfiguration();
    final JAXBElement<ArrayOfConverterConfiguration> converters = configuration.getConverters();
    final ArrayOfConverterConfiguration ofConverterConfiguration = converters.getValue();
    final List<ConverterConfiguration> cList = ofConverterConfiguration.getConverterConfiguration();

    for (ConverterConfiguration cc : cList) {
      final List<String> supportedExtension = cc.getSupportedFileExtensions().getValue().getString();
      if (supportedExtension.contains(extension)) {
        return true;
      }
    }

    return false;
}

public static void printException(WebServiceFaultException serviceFaultException) {
    System.out.println(serviceFaultException.getExceptionType());
    JAXBElement<ArrayOfstring> element = serviceFaultException.getExceptionDetails();
    ArrayOfstring value = element.getValue();
    for (String msg : value.getString()) {
      System.out.println(msg);
    }
}

}
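
The sample's extension check relies on List.contains, which is case-sensitive, and its file helpers read and write through java.io streams. As a hedged sketch of how those pieces might be tightened, the class below compares extensions without regard to case and uses java.nio.file.Files for whole-file I/O. FileHelpers and its method names are hypothetical, and the list of extensions in main is illustrative only; in the real client the list comes from the service configuration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class FileHelpers {
    private FileHelpers() {}

    /** Returns the extension without the dot, lower-cased; "" when the name has none. */
    static String normalizedExtension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /** Case-insensitive variant of a List.contains check on extensions. */
    static boolean isSupported(String extension, List<String> supportedExtensions) {
        for (String candidate : supportedExtensions) {
            if (candidate.equalsIgnoreCase(extension)) {
                return true;
            }
        }
        return false;
    }

    /** Reads the whole file into memory; the stream is opened and closed internally. */
    static byte[] readFile(String path) throws IOException {
        return Files.readAllBytes(Paths.get(path));
    }

    /** Writes the bytes to the given path, creating or overwriting the file. */
    static void writeFile(byte[] content, String path) throws IOException {
        Files.write(Paths.get(path), content);
    }

    public static void main(String[] args) {
        // Illustrative values only; the service reports the real list.
        List<String> supported = Arrays.asList("doc", "docx", "xls");
        System.out.println(isSupported(normalizedExtension("Report.DOCX"), supported));
    }
}
```

Ignoring case on both sides avoids having to assume how the service spells the extensions it reports, which this post does not state.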

Conclusion

Converting Office documents to PDF using the Nutrient Document Converter Services in a Java environment is both powerful and straightforward. By generating web service proxies with wsimport, configuring your environment, and using the provided sample code, you can efficiently integrate robust document conversion capabilities into your applications without being tied to a Windows-based development stack. Whether you’re building enterprise solutions or automating internal workflows, Document Converter Services offers a reliable, scalable way to support cross-platform document conversion needs. For more advanced scenarios and additional output formats, be sure to explore the rest of our documentation and blog posts.
