Extract PDF attachments with C#
This guide demonstrates how to extract embedded attachments from PDF documents using C# and Nutrient Document Converter Services (DCS). You can extract various types of embedded files including images (PNG, JPG, GIF), documents (PDF, DOCX, XLSX), and other file types that are attached to or embedded within PDF documents.
Common use cases
PDF attachment extraction is useful for:
- Digital asset recovery - Extract images and documents from PDF portfolios for separate processing
- Content auditing - Inventory all embedded files within PDF documents for compliance
- Batch processing workflows - Automatically extract attachments from multiple PDFs for further analysis
- Document migration - Recover embedded files when migrating from PDF-based storage systems
- Legal discovery - Extract all embedded content from PDF documents for legal review processes
Prerequisites
Before extracting PDF attachments, ensure you have:
- Nutrient Document Converter Services (DCS) installed, licensed, and running
- .NET Framework or .NET Core development environment
- Implementations of the OpenService() and CloseService() methods from the DocumentConverterServiceClient sample code
- PDF files containing embedded attachments for testing
- Write permissions for the target output folder
Input requirements
- PDF files with embedded attachments (not just visual images on pages)
- File path accessible to the application
- PDF files that are not password-protected or corrupted
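A quick local check can catch obviously unsuitable input before it is sent to the service. The sketch below is a hypothetical helper, not part of the DCS API. It only confirms that the file can be opened and starts with the `%PDF-` signature. It does not detect password protection or deeper corruption, and it will reject the occasional PDF that carries stray bytes before its header.

```csharp
using System;
using System.IO;

public static class PdfPreflight
{
    // Hypothetical helper (not part of DCS): true when the file can be
    // opened for reading and begins with the "%PDF-" signature.
    public static bool LooksLikePdf(string path)
    {
        byte[] signature = { 0x25, 0x50, 0x44, 0x46, 0x2D }; // "%PDF-"
        try
        {
            using (FileStream stream = File.OpenRead(path))
            {
                byte[] head = new byte[signature.Length];
                int total = 0;
                // Read may return fewer bytes than requested, so loop until done.
                while (total < head.Length)
                {
                    int read = stream.Read(head, total, head.Length - total);
                    if (read == 0) return false; // File is shorter than the signature.
                    total += read;
                }
                for (int i = 0; i < signature.Length; i++)
                {
                    if (head[i] != signature[i]) return false;
                }
                return true;
            }
        }
        catch (IOException) { return false; }                 // Missing, locked, or unreadable.
        catch (UnauthorizedAccessException) { return false; } // No read permission.
    }
}
```

Calling `PdfPreflight.LooksLikePdf(sourceFileName)` before reading the file into memory gives a cheap early exit for files that are clearly not PDFs.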
Output format
- Original embedded files are extracted with their original filenames and extensions
- Files are saved to the specified target folder
- Supported extraction formats include common file types (images, documents, spreadsheets)
- Unsupported or corrupted embedded files are skipped with console warnings
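Because extracted files keep their embedded names, two attachments with the same name would overwrite each other, and a name containing path separators could point outside the target folder. The following is a minimal sketch of one way to guard against both. `OutputPaths` and `GetSafeOutputPath` are hypothetical names introduced here, not part of DCS, and the " (1)" suffix scheme is just one possible convention.

```csharp
using System;
using System.IO;

public static class OutputPaths
{
    // Hypothetical helper (not part of DCS): builds a path inside targetFolder
    // that cannot escape the folder and does not overwrite an existing file.
    public static string GetSafeOutputPath(string targetFolder, string embeddedName)
    {
        string name = embeddedName ?? string.Empty;

        // Keep only the last path segment, treating both separators as boundaries.
        int cut = name.LastIndexOfAny(new[] { '/', '\\' });
        if (cut >= 0) name = name.Substring(cut + 1);

        // Replace characters that are invalid in file names on this platform.
        foreach (char c in Path.GetInvalidFileNameChars())
        {
            name = name.Replace(c, '_');
        }

        // Fall back to a placeholder when nothing usable is left.
        if (string.IsNullOrWhiteSpace(name) || name == "." || name == "..")
        {
            name = "attachment";
        }

        string stem = Path.GetFileNameWithoutExtension(name);
        string extension = Path.GetExtension(name);
        string candidate = Path.Combine(targetFolder, name);

        // Append " (1)", " (2)", ... until the name is free.
        for (int i = 1; File.Exists(candidate); i++)
        {
            candidate = Path.Combine(targetFolder, $"{stem} ({i}){extension}");
        }
        return candidate;
    }
}
```

In an extraction loop, the write would then become `File.WriteAllBytes(OutputPaths.GetSafeOutputPath(targetFolder, result.FileName), result.File);`.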
Sample code
The following example demonstrates how to extract all embedded attachments from a PDF file and save them to a target folder:
    /// <summary>
    /// Extract attachments from a PDF file.
    /// </summary>
    /// <param name="ServiceURL">URL endpoint for the PDF Converter service.</param>
    /// <param name="sourceFileName">Source filename.</param>
    /// <param name="targetFolder">Target folder to receive the output file.</param>
    static void ExtractAttachmentsFromFile(string ServiceURL, string sourceFileName, string targetFolder)
    {
        Console.WriteLine($"Extracting attachments from {sourceFileName}");

        // Open a service client.
        DocumentConverterServiceClient client = null;

        // Create an `OpenOptions` instance with minimum properties.
        OpenOptions openOptions = new OpenOptions();
        openOptions.FileExtension = Path.GetExtension(sourceFileName);
        openOptions.OriginalFileName = Path.GetFileName(sourceFileName);

        try
        {
            // Read the source file into a byte array.
            byte[] sourceFile = File.ReadAllBytes(sourceFileName);

            // Open the service.
            client = OpenService(ServiceURL);

            // Perform the conversion.
            BatchResults results = client.ExtractEmbeddedFiles(sourceFile, openOptions);

            // Check if there are any results. If not, show a console message.
            if (results == null)
            {
                Console.WriteLine($"No results returned");
            }
            else
            {
                // If the target folder does not exist, create it.
                if (!Directory.Exists(targetFolder))
                {
                    Directory.CreateDirectory(targetFolder);
                }
                Console.WriteLine($"Output to: {targetFolder}");

                // For each result returned.
                foreach (BatchResult result in results.Results)
                {
                    string filename = result.FileName;
                    Console.WriteLine(filename);

                    // Write the file contents to the new output file.
                    File.WriteAllBytes(Path.Combine(targetFolder, filename), result.File);
                }
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine($"{ex.Message}");
        }
        finally
        {
            // If the client has been opened, close it.
            if (client != null)
            {
                CloseService(client);
            }
        }
    }
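The sample handles a single file. For the batch scenario mentioned under common use cases, one possible approach is a thin wrapper that walks a folder and hands each PDF to the extraction method. The sketch below is an assumption rather than part of the DCS sample code: `BatchExtraction` and `ExtractFromFolder` are hypothetical names, and the extraction call is passed in as a delegate so the wrapper itself has no dependency on the service client.

```csharp
using System;
using System.IO;

public static class BatchExtraction
{
    // Hypothetical wrapper (not part of DCS): calls extractOne(pdfPath, outputFolder)
    // for every *.pdf file directly inside sourceFolder and returns the number of
    // files handed to the delegate.
    public static int ExtractFromFolder(string sourceFolder, string outputRoot,
                                        Action<string, string> extractOne)
    {
        int processed = 0;
        foreach (string pdfPath in Directory.EnumerateFiles(sourceFolder, "*.pdf"))
        {
            // One subfolder per PDF keeps identical attachment names from colliding.
            string outputFolder = Path.Combine(
                outputRoot, Path.GetFileNameWithoutExtension(pdfPath));
            extractOne(pdfPath, outputFolder);
            processed++;
        }
        return processed;
    }
}
```

With the sample method in scope, a call might look like `BatchExtraction.ExtractFromFolder(inputFolder, outputRoot, (pdf, folder) => ExtractAttachmentsFromFile(serviceUrl, pdf, folder));`, where `inputFolder`, `outputRoot`, and `serviceUrl` are placeholders for your own values. Note that the sample opens and closes a service client for each file, which is simple but may not be the fastest option for large batches.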

Troubleshooting
No attachments found: Results return null or empty
- Verify that the PDF contains embedded attachments, not just images that are part of the page content
- Check that the PDF is not corrupted or password-protected
- Ensure the PDF was created with embedded files rather than just visual content
Service connection error: Cannot connect to DCS
- Ensure Nutrient Document Converter Services is running and accessible
- Verify the service URL in your code matches your DCS installation
- Check that no firewall is blocking the connection
File access error: Permission denied
- Verify that the application has read access to the source PDF file
- Check that the target folder has write permissions
- Ensure the PDF file isn’t locked by other applications
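The checks above can be probed in code before extraction starts. This is a minimal sketch using only standard .NET file APIs. `FileAccessChecks` is a hypothetical helper name, and the results are a snapshot: permissions or locks can still change between the check and the actual read or write, so the extraction code should keep its own exception handling.

```csharp
using System;
using System.IO;

public static class FileAccessChecks
{
    // True when the file can be opened for reading right now.
    public static bool CanRead(string path)
    {
        try
        {
            using (new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read)) { }
            return true;
        }
        catch (IOException) { return false; }                 // Missing or locked.
        catch (UnauthorizedAccessException) { return false; } // No read permission.
    }

    // True when a probe file can be created in the folder. The folder is
    // created if it does not exist, and the probe is deleted when closed.
    public static bool CanWriteTo(string folder)
    {
        try
        {
            Directory.CreateDirectory(folder); // No-op when the folder already exists.
            string probe = Path.Combine(folder, Path.GetRandomFileName());
            using (new FileStream(probe, FileMode.CreateNew, FileAccess.Write,
                                  FileShare.None, 4096, FileOptions.DeleteOnClose)) { }
            return true;
        }
        catch (IOException) { return false; }
        catch (UnauthorizedAccessException) { return false; }
    }
}
```

Reporting `CanRead(sourceFileName)` and `CanWriteTo(targetFolder)` separately makes it easier to tell which side of the operation is failing.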
Large file processing: Slow performance or timeouts
- For PDFs with many attachments, consider implementing progress tracking
- Increase timeout values for the service client if processing large files
- Monitor memory usage when extracting multiple large attachments
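To get a rough picture of how long an extraction takes and how much managed memory it holds on to, the call can be wrapped in a small measuring helper. The sketch below uses only `Stopwatch` and `GC.GetTotalMemory` from the base class library. `ExtractionMetrics` is a hypothetical name, and the memory figure is an approximation of the managed heap only, so treat it as an indicator rather than a precise measurement.

```csharp
using System;
using System.Diagnostics;

public static class ExtractionMetrics
{
    // Runs the action, returns the elapsed time, and reports the approximate
    // change in managed heap size (in bytes) through managedBytesDelta.
    public static TimeSpan Measure(Action action, out long managedBytesDelta)
    {
        long before = GC.GetTotalMemory(false);
        Stopwatch watch = Stopwatch.StartNew();

        action();

        watch.Stop();
        long after = GC.GetTotalMemory(false);
        managedBytesDelta = after - before;
        return watch.Elapsed;
    }
}
```

Wrapping one extraction call per PDF in `ExtractionMetrics.Measure(...)` and logging the two values gives a simple baseline for deciding whether a timeout needs adjusting.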
Unsupported file types: Some attachments not extracted
- DCS can extract most common file types embedded in PDFs
- Unsupported or corrupted embedded files will be skipped
- Check the console output for warnings about skipped files
What’s next
Now that you can extract attachments from PDFs with C#, explore these related document processing capabilities:
- Table extraction - Discover how to extract tabular data from PDFs for data processing workflows
- Python implementation - Compare approaches by extracting text using Python for cross-language insights
- Document conversion - Explore document conversion with C# to transform between different formats