Extract PDF attachments with C#

This guide demonstrates how to extract embedded attachments from PDF documents using C# and Nutrient Document Converter Services (DCS). You can extract embedded files of many types, including images (PNG, JPG, GIF), documents (PDF, DOCX, XLSX), and other files attached to a PDF.

Common use cases

PDF attachment extraction is useful for:

  • Digital asset recovery - Extract images and documents from PDF portfolios for separate processing
  • Content auditing - Inventory all embedded files within PDF documents for compliance
  • Batch processing workflows - Automatically extract attachments from multiple PDFs for further analysis
  • Document migration - Recover embedded files when migrating from PDF-based storage systems
  • Legal discovery - Extract all embedded content from PDF documents for legal review processes

Prerequisites

Before extracting PDF attachments, ensure you have:

  • Nutrient Document Converter Services (DCS) installed, licensed, and running
  • .NET Framework or .NET Core development environment
  • The OpenService() and CloseService() helper methods from the DocumentConverterServiceClient sample code, added to your project
  • PDF files containing embedded attachments for testing
  • Write permissions for the target output folder

Input requirements

  • PDF files with embedded attachments (not just visual images on pages)
  • File path accessible to the application
  • PDF files that are not password-protected or corrupted
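
The input requirements above can be partly checked before any bytes are sent to the service. The following is a minimal sketch, not part of DCS: `PdfInputCheck` and `LooksLikePdf` are hypothetical names introduced here. It only confirms that the file exists and begins with the `%PDF-` signature; it does not detect password protection or deeper corruption, and it rejects files that have extra bytes before the signature.

```csharp
using System;
using System.IO;
using System.Text;

static class PdfInputCheck
{
    // Hypothetical helper (not a DCS API): returns true when the file exists
    // and its first bytes are the "%PDF-" signature.
    public static bool LooksLikePdf(string path)
    {
        if (!File.Exists(path))
        {
            return false;
        }

        byte[] signature = Encoding.ASCII.GetBytes("%PDF-");
        byte[] header = new byte[signature.Length];

        // May throw IOException if another process holds an exclusive lock.
        using (FileStream stream = File.OpenRead(path))
        {
            int read = stream.Read(header, 0, header.Length);
            if (read < signature.Length)
            {
                return false; // File is too short to hold the signature.
            }
        }

        for (int i = 0; i < signature.Length; i++)
        {
            if (header[i] != signature[i])
            {
                return false;
            }
        }

        return true;
    }
}
```

A caller might run `PdfInputCheck.LooksLikePdf(sourceFileName)` first and skip the service call when it returns false.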

Output format

  • Original embedded files are extracted with their original filenames and extensions
  • Files are saved to the specified target folder
  • Supported extraction formats include common file types (images, documents, spreadsheets)
  • Unsupported or corrupted embedded files are skipped with console warnings
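
Because files are written under their original names, two attachments that share a name would overwrite each other when saved with `File.WriteAllBytes`, and a name that carries a directory component could land outside the target folder. The sketch below shows one way to guard against both. `OutputNaming`, `GetSafeOutputPath`, the ` (n)` suffix scheme, and the `attachment.bin` fallback name are all assumptions made for illustration, not DCS behavior.

```csharp
using System;
using System.IO;

static class OutputNaming
{
    // Hypothetical helper (not a DCS API): builds an output path inside
    // targetFolder that does not collide with an existing file.
    public static string GetSafeOutputPath(string targetFolder, string embeddedName)
    {
        // Keep only the file name part. Which separators are recognized
        // depends on the platform the code runs on.
        string name = Path.GetFileName(embeddedName ?? string.Empty);
        if (string.IsNullOrWhiteSpace(name))
        {
            name = "attachment.bin"; // Assumed fallback for missing names.
        }

        string stem = Path.GetFileNameWithoutExtension(name);
        string extension = Path.GetExtension(name);
        string candidate = Path.Combine(targetFolder, name);

        // Append " (1)", " (2)", ... until the name is unused.
        int counter = 1;
        while (File.Exists(candidate))
        {
            candidate = Path.Combine(targetFolder, $"{stem} ({counter}){extension}");
            counter++;
        }

        return candidate;
    }
}
```

The returned path can be passed to `File.WriteAllBytes` in place of a plain `Path.Combine(targetFolder, filename)`.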

Sample code

The following example demonstrates how to extract all embedded attachments from a PDF file and save them to a target folder:

// Requires: using System; using System.IO;
/// <summary>
/// Extract attachments from a PDF file.
/// </summary>
/// <param name="ServiceURL">URL endpoint for the PDF Converter service.</param>
/// <param name="sourceFileName">Source filename.</param>
/// <param name="targetFolder">Target folder to receive the extracted files.</param>
static void ExtractAttachmentsFromFile(string ServiceURL, string sourceFileName, string targetFolder)
{
    Console.WriteLine($"Extracting attachments from {sourceFileName}");
    // The service client; assigned once the service has been opened.
    DocumentConverterServiceClient client = null;
    // Create an `OpenOptions` instance with minimum properties.
    OpenOptions openOptions = new OpenOptions();
    openOptions.FileExtension = Path.GetExtension(sourceFileName);
    openOptions.OriginalFileName = Path.GetFileName(sourceFileName);
    try
    {
        // Read the source file into a byte array.
        byte[] sourceFile = File.ReadAllBytes(sourceFileName);
        // Open the service.
        client = OpenService(ServiceURL);
        // Extract the embedded files.
        BatchResults results = client.ExtractEmbeddedFiles(sourceFile, openOptions);
        // Check if there are any results. If not, show a console message.
        if (results == null || results.Results == null)
        {
            Console.WriteLine("No results returned");
        }
        else
        {
            // Create the target folder (no effect if it already exists).
            Directory.CreateDirectory(targetFolder);
            Console.WriteLine($"Output to: {targetFolder}");
            // For each result returned.
            foreach (BatchResult result in results.Results)
            {
                // Keep only the file name part so an embedded name that
                // contains a path cannot write outside the target folder.
                string filename = Path.GetFileName(result.FileName);
                Console.WriteLine(filename);
                // Write the file contents to the new output file.
                File.WriteAllBytes(Path.Combine(targetFolder, filename), result.File);
            }
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.Message);
    }
    finally
    {
        // If the client has been opened, close it.
        if (client != null)
        {
            CloseService(client);
        }
    }
}

Troubleshooting

No attachments found: Results return null or empty

  • Verify that the PDF contains embedded attachments, not just images that are part of the page content
  • Check that the PDF is not corrupted or password-protected
  • Ensure the PDF was created with embedded files rather than just visual content

Service connection error: Cannot connect to DCS

  • Ensure Nutrient Document Converter Services is running and accessible
  • Verify the service URL in your code matches your DCS installation
  • Check that no firewall is blocking the connection
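
To tell a network problem apart from a DCS problem, it can help to test whether anything is listening at the host and port of the service URL. The sketch below is an assumption-laden illustration: `ServiceProbe` and `CanReach` are hypothetical names, and a successful TCP connection only shows that the port is open, not that DCS itself is healthy or licensed.

```csharp
using System;
using System.Net.Sockets;

static class ServiceProbe
{
    // Hypothetical helper (not a DCS API): returns true when a TCP connection
    // to the host and port of serviceUrl succeeds within timeoutMs.
    public static bool CanReach(string serviceUrl, int timeoutMs = 3000)
    {
        if (!Uri.TryCreate(serviceUrl, UriKind.Absolute, out Uri uri))
        {
            return false; // Not a well-formed absolute URL.
        }

        try
        {
            using (var tcp = new TcpClient())
            {
                // Wait returns false on timeout and throws if the
                // connection attempt itself fails.
                bool completed = tcp.ConnectAsync(uri.Host, uri.Port).Wait(timeoutMs);
                return completed && tcp.Connected;
            }
        }
        catch (Exception)
        {
            // Refused connection, unresolved host, or invalid port.
            return false;
        }
    }
}
```

If `ServiceProbe.CanReach(ServiceURL)` returns false, the checks in the list above (service status, URL, firewall) are the place to start; if it returns true, the failure is more likely in the service configuration or the request itself.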

File access error: Permission denied

  • Verify that the application has read access to the source PDF file
  • Check that the target folder has write permissions
  • Ensure the PDF file isn’t locked by other applications
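
Write access to the target folder can be tested up front by creating and immediately deleting a small probe file. This is a sketch under stated assumptions: `FolderAccess` and `CanWriteTo` are hypothetical names, the helper creates the folder if it is missing, and an invalid path string (for example, an empty one) still throws `ArgumentException` rather than returning false.

```csharp
using System;
using System.IO;

static class FolderAccess
{
    // Hypothetical helper (not a DCS API): returns true when a file can be
    // created in the folder. The folder is created if it does not exist.
    public static bool CanWriteTo(string folder)
    {
        try
        {
            Directory.CreateDirectory(folder);
            string probe = Path.Combine(folder, Path.GetRandomFileName());

            // DeleteOnClose removes the probe file when the stream is disposed.
            using (File.Create(probe, 1, FileOptions.DeleteOnClose))
            {
            }

            return true;
        }
        catch (UnauthorizedAccessException)
        {
            return false; // No permission to create the folder or file.
        }
        catch (IOException)
        {
            return false; // Path unavailable, disk error, or similar.
        }
    }
}
```

Calling `FolderAccess.CanWriteTo(targetFolder)` before extraction lets the application report a permissions problem without first making a round trip to the service.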

Large file processing: Slow performance or timeouts

  • For PDFs with many attachments, consider implementing progress tracking
  • Increase timeout values for the service client if processing large files
  • Monitor memory usage when extracting multiple large attachments
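
Progress tracking for a batch of PDFs can be as simple as printing a counter and elapsed time per file. The sketch below assumes the extraction step is passed in as a delegate, so it does not depend on any DCS type; `BatchRunner` and `Run` are hypothetical names. Failures are counted only when the delegate throws, so an extraction method that catches its own exceptions, as the sample method in this guide does, would need to rethrow or signal failure some other way to be counted.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

static class BatchRunner
{
    // Hypothetical helper (not a DCS API): runs `extract` on every file,
    // prints progress, and returns the number of files that threw.
    public static int Run(IReadOnlyList<string> files, Action<string> extract)
    {
        int failures = 0;
        Stopwatch total = Stopwatch.StartNew();

        for (int i = 0; i < files.Count; i++)
        {
            Stopwatch perFile = Stopwatch.StartNew();
            try
            {
                extract(files[i]);
            }
            catch (Exception ex)
            {
                failures++;
                Console.WriteLine($"Failed: {files[i]}: {ex.Message}");
            }

            Console.WriteLine(
                $"[{i + 1}/{files.Count}] {files[i]} ({perFile.ElapsedMilliseconds} ms)");
        }

        Console.WriteLine(
            $"Done: {files.Count - failures} succeeded, {failures} failed, " +
            $"{total.ElapsedMilliseconds} ms total");
        return failures;
    }
}
```

A call might look like `BatchRunner.Run(pdfPaths, path => ExtractAttachmentsFromFile(ServiceURL, path, targetFolder));`, where `pdfPaths` is a list of source files supplied by the caller.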

Unsupported file types: Some attachments not extracted

  • DCS can extract most common file types embedded in PDFs
  • Unsupported or corrupted embedded files will be skipped
  • Check the console output for warnings about skipped files

What’s next

Now that you can extract attachments from PDFs with C#, explore these related document processing capabilities: