Image and PDF compression in C#
Nutrient .NET SDK (formerly GdPicture.NET) enables you to dramatically reduce the file size of PDF documents, with a focus on font optimization, data compression, and image analysis.
PDF optimization chains several compression algorithms one after another to go beyond the limitations of any single compression scheme. It also involves removing unwanted or unused objects from a PDF.
To compress a PDF document, follow these steps:
- Create a GdPicturePDFReducer object.
- Configure the metadata of the resulting PDF document with the following properties of the PDFReducerConfiguration object:
Property name | Description |
---|---|
Author | Specifies the author of the resulting PDF document. |
Producer | Specifies the producer of the resulting PDF document. |
ProducerName | Specifies the name of the producer of the resulting PDF document. |
Title | Specifies the title of the resulting PDF document. |
- Configure the compression process with the following properties of the PDFReducerConfiguration object:
Property name | Description |
---|---|
DownscaleImages | Specifies whether to downscale images. The default value is true. |
DownscaleResolution | Specifies the resolution to downscale images. The default value is 150. |
DownscaleResolutionMRC | Specifies the resolution for downscaling the background layer by the mixed raster content (MRC) engine. The default value is 100. |
EnableCharRepair | Specifies whether to perform character repair during bitonal conversion. The default value is false. |
EnableColorDetection | Specifies whether to perform color detection on images. The default value is true. |
EnableJBIG2 | Specifies whether to use the JBIG2 compression scheme to compress bitonal images. The default value is true. |
EnableJPEG2000 | Specifies whether to use the JPEG2000 compression scheme to compress the images. The default value is true. |
EnableMRC | Specifies whether to use MRC for compressing the content of the source PDF. The default value is false. |
EnableParallelization | Specifies whether to use multiple cores to speed up the process. Threads are dynamically allocated based on the real-time available CPU resources. The default value is true. |
FastWebView | Specifies whether to optimize the PDF for online distribution (linearized PDF). The default value is false. |
ImageQuality | Specifies the quality of the compressed images. The default value is PDFReducerImageQuality.ImageQualityMedium. |
JBIG2PMSThreshold | Specifies the threshold value for the JBIG2 encoder pattern matching and substitution between 0 and 1. Any number lower than 1 may lead to lossy compression. The default value is 0.75. |
MaxBitmapPerPage | Specifies the maximum number of bitmap images per page. |
OutputFormat | A member of the PDFReducerPDFVersion enumeration that specifies the version and the conformance level of the output PDF document. The default value is PDFReducerPDFVersion.PdfVersion15. |
PackDocument | Specifies whether to pack the PDF to reduce its size. The default value is true. |
PackFonts | Specifies whether to pack the PDF fonts to reduce their size. The default value is true. |
PreserveSmoothing | Specifies whether the MRC engine preserves smoothing between different layers. The default value is true. |
RecompressImages | Specifies whether to recompress the images. The default value is true. |
RemoveAnnotations | Specifies whether to remove annotations. The default value is false. |
RemoveBlankPages | Specifies whether to remove blank pages. The default value is false. |
RemoveBookmarks | Specifies whether to remove bookmarks. The default value is false. |
RemoveEmbeddedFiles | Specifies whether to remove embedded files. The default value is false. |
RemoveFormFields | Specifies whether to remove form fields. The default value is false. |
RemoveHyperlinks | Specifies whether to remove hyperlinks. The default value is false. |
RemoveJavaScript | Specifies whether to remove JavaScript. The default value is false. |
RemoveMetadata | Specifies whether to remove metadata. The default value is false. |
RemovePagePieceInfo | Specifies whether to remove the page PieceInfo dictionary used to hold private application data. The default value is true. |
RemovePageThumbnails | Specifies whether to remove page thumbnails. The default value is false. |
UnembedFonts | Specifies whether to remove embedded font data. The default value is false. |
- Run the compression process with the ProcessDocument method. This method takes the path to the source and the output PDF files as its parameters.
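To check how much a given configuration actually saves, one option is to compare the sizes of the source and output files once ProcessDocument has returned. The sketch below relies only on System.IO. The CompressionReport class, the ReductionPercent helper, and the stand-in temporary files are hypothetical and not part of the SDK:

```csharp
using System;
using System.IO;

public static class CompressionReport
{
    // Hypothetical helper (not part of the SDK): percentage by which
    // the output file is smaller than the source file.
    public static double ReductionPercent(string sourcePath, string outputPath)
    {
        long sourceSize = new FileInfo(sourcePath).Length;
        long outputSize = new FileInfo(outputPath).Length;
        if (sourceSize == 0)
        {
            throw new ArgumentException("The source file is empty.", nameof(sourcePath));
        }
        return 100.0 * (sourceSize - outputSize) / sourceSize;
    }

    public static void Main()
    {
        // Stand-in files so the sketch runs without the SDK:
        // a 1,000-byte "source" and a 250-byte "output".
        string source = Path.GetTempFileName();
        string output = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(source, new byte[1000]);
            File.WriteAllBytes(output, new byte[250]);
            Console.WriteLine($"Reduced by {ReductionPercent(source, output):F1}%");
        }
        finally
        {
            File.Delete(source);
            File.Delete(output);
        }
    }
}
```

In a real project, pass the same two paths that were given to ProcessDocument instead of the temporary files.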
General optimization of PDF documents
The example below focuses on general aspects of PDF optimization such as content removal and font optimization:

    using GdPicturePDFReducer gdpicturePDFReducer = new GdPicturePDFReducer();
    // Configure the metadata of the resulting PDF document.
    gdpicturePDFReducer.PDFReducerConfiguration.Author = "Nutrient .NET PDF Reducer SDK";
    gdpicturePDFReducer.PDFReducerConfiguration.Producer = "GdPicture.NET 14";
    gdpicturePDFReducer.PDFReducerConfiguration.ProducerName = "Nutrient";
    gdpicturePDFReducer.PDFReducerConfiguration.Title = "PDF Optimization";

    // Specify the version and the conformance level of the output PDF document.
    gdpicturePDFReducer.PDFReducerConfiguration.OutputFormat = PDFReducerPDFVersion.PdfVersionRetainExisting;

    // Configure the compression process by removing document elements.
    gdpicturePDFReducer.PDFReducerConfiguration.RemoveAnnotations = true;
    gdpicturePDFReducer.PDFReducerConfiguration.RemoveBlankPages = true;
    gdpicturePDFReducer.PDFReducerConfiguration.RemoveBookmarks = true;
    gdpicturePDFReducer.PDFReducerConfiguration.RemoveEmbeddedFiles = true;
    gdpicturePDFReducer.PDFReducerConfiguration.RemoveFormFields = true;
    gdpicturePDFReducer.PDFReducerConfiguration.RemoveHyperlinks = true;
    gdpicturePDFReducer.PDFReducerConfiguration.RemoveJavaScript = true;
    gdpicturePDFReducer.PDFReducerConfiguration.RemoveMetadata = true;
    gdpicturePDFReducer.PDFReducerConfiguration.RemovePageThumbnails = true;

    // Optimize the output file size by packing fonts.
    gdpicturePDFReducer.PDFReducerConfiguration.PackFonts = true;

    // Optimize the output file size by packing the document.
    gdpicturePDFReducer.PDFReducerConfiguration.PackDocument = true;

    // Run the compression process.
    gdpicturePDFReducer.ProcessDocument(@"C:\temp\source.pdf", @"C:\temp\output.pdf");

    Using gdpicturePDFReducer As GdPicturePDFReducer = New GdPicturePDFReducer()
        'Configure the metadata of the resulting PDF document.
        gdpicturePDFReducer.PDFReducerConfiguration.Author = "Nutrient .NET PDF Reducer SDK"
        gdpicturePDFReducer.PDFReducerConfiguration.Producer = "GdPicture.NET 14"
        gdpicturePDFReducer.PDFReducerConfiguration.ProducerName = "Nutrient"
        gdpicturePDFReducer.PDFReducerConfiguration.Title = "PDF Optimization"

        'Specify the version and the conformance level of the output PDF document.
        gdpicturePDFReducer.PDFReducerConfiguration.OutputFormat = PDFReducerPDFVersion.PdfVersionRetainExisting

        'Configure the compression process by removing document elements.
        gdpicturePDFReducer.PDFReducerConfiguration.RemoveAnnotations = True
        gdpicturePDFReducer.PDFReducerConfiguration.RemoveBlankPages = True
        gdpicturePDFReducer.PDFReducerConfiguration.RemoveBookmarks = True
        gdpicturePDFReducer.PDFReducerConfiguration.RemoveEmbeddedFiles = True
        gdpicturePDFReducer.PDFReducerConfiguration.RemoveFormFields = True
        gdpicturePDFReducer.PDFReducerConfiguration.RemoveHyperlinks = True
        gdpicturePDFReducer.PDFReducerConfiguration.RemoveJavaScript = True
        gdpicturePDFReducer.PDFReducerConfiguration.RemoveMetadata = True
        gdpicturePDFReducer.PDFReducerConfiguration.RemovePageThumbnails = True

        'Optimize the output file size by packing fonts.
        gdpicturePDFReducer.PDFReducerConfiguration.PackFonts = True

        'Optimize the output file size by packing the document.
        gdpicturePDFReducer.PDFReducerConfiguration.PackDocument = True

        'Run the compression process.
        gdpicturePDFReducer.ProcessDocument("C:\temp\source.pdf", "C:\temp\output.pdf")
    End Using
Recompressing images
Compress PDF documents by recompressing the images they already contain. For example, lowering unnecessarily high image resolutions can dramatically reduce the file size without noticeably affecting the viewing experience.
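As a rough illustration of why downscaling pays off, note that the pixel count of an image scales with the square of the resolution ratio. The sketch below assumes that resolutions are expressed in DPI and that images already at or below the target are left untouched. Both points are assumptions rather than documented SDK behavior, and the DownscaleEstimate class is hypothetical. Pixel count is only a proxy for file size, since the final size also depends on the compression scheme and the image content.

```csharp
using System;

public static class DownscaleEstimate
{
    // Hypothetical helper (not part of the SDK): fraction of the original
    // pixel count that remains after downscaling from sourceDpi to targetDpi.
    public static double RemainingPixelFraction(double sourceDpi, double targetDpi)
    {
        if (sourceDpi <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(sourceDpi));
        }
        if (targetDpi <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(targetDpi));
        }

        // Assumption: an image at or below the target resolution is not upscaled.
        if (targetDpi >= sourceDpi)
        {
            return 1.0;
        }

        // Width and height both shrink by the same ratio, so the area
        // (pixel count) shrinks by the ratio squared.
        double scale = targetDpi / sourceDpi;
        return scale * scale;
    }

    public static void Main()
    {
        // A 600 DPI scan downscaled to 150 DPI keeps (150 / 600)^2 = 1/16 of its pixels.
        double fraction = RemainingPixelFraction(600, 150);
        Console.WriteLine(fraction == 1.0 / 16.0);
    }
}
```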
The example below shows how to recompress images:

    using GdPicturePDFReducer gdpicturePDFReducer = new GdPicturePDFReducer();
    // Configure the metadata of the resulting PDF document.
    gdpicturePDFReducer.PDFReducerConfiguration.Author = "Nutrient .NET PDF Reducer SDK";
    gdpicturePDFReducer.PDFReducerConfiguration.Producer = "GdPicture.NET 14";
    gdpicturePDFReducer.PDFReducerConfiguration.ProducerName = "Nutrient";
    gdpicturePDFReducer.PDFReducerConfiguration.Title = "Re-Compress Images";

    // Specify the version and the conformance level of the output PDF document.
    gdpicturePDFReducer.PDFReducerConfiguration.OutputFormat = PDFReducerPDFVersion.PdfVersionRetainExisting;

    // Recompress images to obtain a better compression ratio.
    gdpicturePDFReducer.PDFReducerConfiguration.RecompressImages = true;
    gdpicturePDFReducer.PDFReducerConfiguration.ImageQuality = PDFReducerImageQuality.ImageQualityHigh;

    // Reduce the image size by decreasing the image resolution.
    gdpicturePDFReducer.PDFReducerConfiguration.DownscaleImages = true;
    gdpicturePDFReducer.PDFReducerConfiguration.DownscaleResolution = 200;

    // Run the compression process.
    gdpicturePDFReducer.ProcessDocument(@"C:\temp\source.pdf", @"C:\temp\output.pdf");

    Using gdpicturePDFReducer As GdPicturePDFReducer = New GdPicturePDFReducer()
        'Configure the metadata of the resulting PDF document.
        gdpicturePDFReducer.PDFReducerConfiguration.Author = "Nutrient .NET PDF Reducer SDK"
        gdpicturePDFReducer.PDFReducerConfiguration.Producer = "GdPicture.NET 14"
        gdpicturePDFReducer.PDFReducerConfiguration.ProducerName = "Nutrient"
        gdpicturePDFReducer.PDFReducerConfiguration.Title = "Re-Compress Images"

        'Specify the version and the conformance level of the output PDF document.
        gdpicturePDFReducer.PDFReducerConfiguration.OutputFormat = PDFReducerPDFVersion.PdfVersionRetainExisting

        'Recompress images to obtain a better compression ratio.
        gdpicturePDFReducer.PDFReducerConfiguration.RecompressImages = True
        gdpicturePDFReducer.PDFReducerConfiguration.ImageQuality = PDFReducerImageQuality.ImageQualityHigh

        'Reduce the image size by decreasing the image resolution.
        gdpicturePDFReducer.PDFReducerConfiguration.DownscaleImages = True
        gdpicturePDFReducer.PDFReducerConfiguration.DownscaleResolution = 200

        'Run the compression process.
        gdpicturePDFReducer.ProcessDocument("C:\temp\source.pdf", "C:\temp\output.pdf")
    End Using
Controlling image compression
The PDF specification allows for seven compression schemes, all of which can be used to compress images. For example, two popular compression schemes are the following:
- JBIG2 for bitonal images (usually black and white).
- JPEG2000 for 24-bit color and 8-bit grayscale images.
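The pairing above can be summarized as a small lookup keyed on bit depth. This is an illustrative sketch only: the SchemeChooser class and the SuggestScheme helper are hypothetical, and the code does not reflect how the SDK selects a scheme internally.

```csharp
using System;

public static class SchemeChooser
{
    // Hypothetical helper (not part of the SDK): maps an image's bit depth
    // to the compression scheme named for it in the list above.
    public static string SuggestScheme(int bitsPerPixel)
    {
        switch (bitsPerPixel)
        {
            case 1:
                return "JBIG2";     // bitonal images
            case 8:                 // 8-bit grayscale
            case 24:                // 24-bit color
                return "JPEG2000";
            default:
                return "unspecified"; // bit depths the list above does not cover
        }
    }

    public static void Main()
    {
        Console.WriteLine(SuggestScheme(1));  // prints "JBIG2"
        Console.WriteLine(SuggestScheme(24)); // prints "JPEG2000"
    }
}
```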
The example below uses both of these schemes to compress images in a PDF document:

    using GdPicturePDFReducer gdpicturePDFReducer = new GdPicturePDFReducer();
    // Configure the metadata of the resulting PDF document.
    gdpicturePDFReducer.PDFReducerConfiguration.Author = "Nutrient .NET PDF Reducer SDK";
    gdpicturePDFReducer.PDFReducerConfiguration.Producer = "GdPicture.NET 14";
    gdpicturePDFReducer.PDFReducerConfiguration.ProducerName = "Nutrient";
    gdpicturePDFReducer.PDFReducerConfiguration.Title = "Image Compression";

    // Specify the version and the conformance level of the output PDF document.
    gdpicturePDFReducer.PDFReducerConfiguration.OutputFormat = PDFReducerPDFVersion.PdfVersionRetainExisting;

    // Enable automatic color detection.
    gdpicturePDFReducer.PDFReducerConfiguration.EnableColorDetection = true;

    // Repair characters.
    gdpicturePDFReducer.PDFReducerConfiguration.EnableCharRepair = true;

    // Control image compression.
    gdpicturePDFReducer.PDFReducerConfiguration.EnableJPEG2000 = true;
    gdpicturePDFReducer.PDFReducerConfiguration.EnableJBIG2 = true;
    gdpicturePDFReducer.PDFReducerConfiguration.JBIG2PMSThreshold = 0.65f;

    // Run the compression process.
    gdpicturePDFReducer.ProcessDocument(@"C:\temp\source.pdf", @"C:\temp\output.pdf");

    Using gdpicturePDFReducer As GdPicturePDFReducer = New GdPicturePDFReducer()
        'Configure the metadata of the resulting PDF document.
        gdpicturePDFReducer.PDFReducerConfiguration.Author = "Nutrient .NET PDF Reducer SDK"
        gdpicturePDFReducer.PDFReducerConfiguration.Producer = "GdPicture.NET 14"
        gdpicturePDFReducer.PDFReducerConfiguration.ProducerName = "Nutrient"
        gdpicturePDFReducer.PDFReducerConfiguration.Title = "Image Compression"

        'Specify the version and the conformance level of the output PDF document.
        gdpicturePDFReducer.PDFReducerConfiguration.OutputFormat = PDFReducerPDFVersion.PdfVersionRetainExisting

        'Enable automatic color detection.
        gdpicturePDFReducer.PDFReducerConfiguration.EnableColorDetection = True

        'Repair characters.
        gdpicturePDFReducer.PDFReducerConfiguration.EnableCharRepair = True

        'Control image compression.
        gdpicturePDFReducer.PDFReducerConfiguration.EnableJPEG2000 = True
        gdpicturePDFReducer.PDFReducerConfiguration.EnableJBIG2 = True
        gdpicturePDFReducer.PDFReducerConfiguration.JBIG2PMSThreshold = 0.65F

        'Run the compression process.
        gdpicturePDFReducer.ProcessDocument("C:\temp\source.pdf", "C:\temp\output.pdf")
    End Using