Extract images from PDFs in C#
This guide explains how to extract images from PDF documents using C#. Images can be added to a PDF document in the following ways:
- Embedded in the internal structure of the PDF document.
- Added to the PDF document as an image annotation.
Nutrient .NET SDK (formerly GdPicture.NET) currently enables you to extract images embedded in a PDF document. Extracting images from image annotations isn’t supported.
To extract images embedded in a PDF document, follow the steps below:
- Create a GdPicturePDF object and a GdPictureImaging object.
- Select the source document by passing its path to the LoadFromFile method of the GdPicturePDF object.
- Determine the number of pages with the GetPageCount method of the GdPicturePDF object and loop through them, selecting each page with the SelectPage method.
- Determine the number of images on the page with the GetPageImageCount method of the GdPicturePDF object and loop through them.
- Extract the image by passing the index of the image to the ExtractPageImage method of the GdPicturePDF object.
- Save the output in a new image file with the SaveAsPNG method of the GdPictureImaging object.
- Release unnecessary resources by passing the image ID to the ReleaseGdPictureImage method of the GdPictureImaging object.
The example below extracts all embedded images from a PDF document:
using GdPicturePDF gdpicturePDF = new GdPicturePDF();
using GdPictureImaging gdpictureImaging = new GdPictureImaging();
// Select the source document.
gdpicturePDF.LoadFromFile(@"C:\temp\source.pdf");
// Determine the number of pages and loop through them.
int pageCount = gdpicturePDF.GetPageCount();
for (int page = 1; page <= pageCount; page++)
{
    gdpicturePDF.SelectPage(page);
    // Determine the number of images on the page and loop through them.
    int imageCount = gdpicturePDF.GetPageImageCount();
    for (int imageIndex = 0; imageIndex < imageCount; imageIndex++)
    {
        // Extract the image.
        int imageId = gdpicturePDF.ExtractPageImage(imageIndex);
        // Save the output in a new image file.
        gdpictureImaging.SaveAsPNG(imageId, @"C:\temp\page-" + page + "-image-" + imageIndex + ".png");
        // Release unnecessary resources.
        gdpictureImaging.ReleaseGdPictureImage(imageId);
    }
}
Using gdpicturePDF As GdPicturePDF = New GdPicturePDF()
Using gdpictureImaging As GdPictureImaging = New GdPictureImaging()
    ' Select the source document.
    gdpicturePDF.LoadFromFile("C:\temp\source.pdf")
    ' Determine the number of pages and loop through them.
    Dim pageCount As Integer = gdpicturePDF.GetPageCount()
    For page = 1 To pageCount
        gdpicturePDF.SelectPage(page)
        ' Determine the number of images on the page and loop through them.
        Dim imageCount As Integer = gdpicturePDF.GetPageImageCount()
        For imageIndex = 0 To imageCount - 1
            ' Extract the image.
            Dim imageId As Integer = gdpicturePDF.ExtractPageImage(imageIndex)
            ' Save the output in a new image file.
            gdpictureImaging.SaveAsPNG(imageId, "C:\temp\page-" & page & "-image-" & imageIndex & ".png")
            ' Release unnecessary resources.
            gdpictureImaging.ReleaseGdPictureImage(imageId)
        Next
    Next
End Using
End Using
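The examples above ignore the values returned by the SDK calls, so a missing file or a failed extraction goes unnoticed. The C# sketch below shows one way to add status checks to the same loop. It isn't a definitive implementation and rests on several assumptions: the GdPicture14 namespace (this depends on the installed SDK version), LoadFromFile and SaveAsPNG returning a GdPictureStatus value, GetStat reporting the status of the most recent GdPicturePDF call, and the SDK already being referenced and licensed in the project. The sourcePath and outputDir variables are hypothetical names introduced for illustration.

```csharp
using System;
using System.IO;
using GdPicture14; // Assumed namespace; it depends on the installed SDK version.

class ExtractImagesWithChecks
{
    static void Main()
    {
        // Hypothetical paths, used for illustration only.
        string sourcePath = @"C:\temp\source.pdf";
        string outputDir = @"C:\temp";

        using GdPicturePDF gdpicturePDF = new GdPicturePDF();
        using GdPictureImaging gdpictureImaging = new GdPictureImaging();

        // Assumption: LoadFromFile returns a GdPictureStatus value.
        GdPictureStatus status = gdpicturePDF.LoadFromFile(sourcePath);
        if (status != GdPictureStatus.OK)
        {
            Console.WriteLine($"Could not load the document: {status}");
            return;
        }

        int pageCount = gdpicturePDF.GetPageCount();
        for (int page = 1; page <= pageCount; page++)
        {
            gdpicturePDF.SelectPage(page);
            int imageCount = gdpicturePDF.GetPageImageCount();

            // The image index range mirrors the example above.
            for (int imageIndex = 0; imageIndex < imageCount; imageIndex++)
            {
                int imageId = gdpicturePDF.ExtractPageImage(imageIndex);

                // Assumption: GetStat reports the status of the most recent call.
                GdPictureStatus extractStatus = gdpicturePDF.GetStat();
                if (extractStatus != GdPictureStatus.OK)
                {
                    Console.WriteLine($"Page {page}, image {imageIndex}: extraction failed ({extractStatus})");
                    continue;
                }

                string outputPath = Path.Combine(outputDir, $"page-{page}-image-{imageIndex}.png");

                // Assumption: SaveAsPNG returns a GdPictureStatus value.
                status = gdpictureImaging.SaveAsPNG(imageId, outputPath);
                if (status != GdPictureStatus.OK)
                {
                    Console.WriteLine($"Page {page}, image {imageIndex}: saving failed ({status})");
                }

                // Release the image whether or not saving succeeded.
                gdpictureImaging.ReleaseGdPictureImage(imageId);
            }
        }
    }
}
```

Skipping a failed image with continue, instead of stopping the whole run, lets the remaining images on the page and in the document still be extracted.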