Extract data from PDF form fields in C# .NET
To extract data from form fields in a PDF document, follow the steps below:
- Create a `GdPicturePDF` object.
- Load the PDF file with the `LoadFromFile` method.
- Get the total number of form fields in the PDF document with the `GetFormFieldsCount` method.
- Loop through all the form fields.
- Use any method to get the form field data and save it to a variable. For more information, refer to the read form fields guide.
The following code example gets the field ID, type, and page location of all form fields and saves them to a CSV file:
// Namespaces required for `String`, `File`, and `StringBuilder`.
using System;
using System.IO;
using System.Text;

using GdPicturePDF gdpicturePDF = new GdPicturePDF();
// Create a `StringBuilder` variable to store data.
StringBuilder data = new StringBuilder();
// Add headers to the first line.
String[] headers = {"Field ID", "Field Type", "Field Location Page"};
data.AppendLine(string.Join(";", headers));

gdpicturePDF.LoadFromFile(@"C:\temp\source.pdf");
// Get the form field count.
int fieldCount = gdpicturePDF.GetFormFieldsCount();
// Loop through all form fields.
for (int i = 0; i < fieldCount; i++)
{
    // Get the field ID, type, and page location of each form field.
    int fieldID = gdpicturePDF.GetFormFieldId(i);
    PdfFormFieldType fieldType = gdpicturePDF.GetFormFieldType(fieldID);
    int location = gdpicturePDF.GetFormFieldPage(fieldID);
    // Add a new line to the `StringBuilder` with the form field data.
    String[] newLine = { fieldID.ToString(), fieldType.ToString(), location.ToString() };
    data.AppendLine(string.Join(";", newLine));
}
// Save the collected data to a CSV file.
String formData = @"C:\temp\output.csv";
File.AppendAllText(formData, data.ToString());
' Namespaces required for `File` and `StringBuilder`.
Imports System.IO
Imports System.Text

Using gdpicturePDF As GdPicturePDF = New GdPicturePDF()
    ' Create a `StringBuilder` variable to store data.
    Dim data As StringBuilder = New StringBuilder()
    ' Add headers to the first line.
    Dim headers = {"Field ID", "Field Type", "Field Location Page"}
    data.AppendLine(String.Join(";", headers))

    gdpicturePDF.LoadFromFile("C:\temp\source.pdf")
    ' Get the form field count.
    Dim fieldCount As Integer = gdpicturePDF.GetFormFieldsCount()
    ' Loop through all form fields.
    For i = 0 To fieldCount - 1
        ' Get the field ID, type, and page location of each form field.
        Dim fieldID As Integer = gdpicturePDF.GetFormFieldId(i)
        Dim fieldType As PdfFormFieldType = gdpicturePDF.GetFormFieldType(fieldID)
        Dim location As Integer = gdpicturePDF.GetFormFieldPage(fieldID)
        ' Add a new line to the `StringBuilder` with the form field data.
        Dim newLine As String() = {fieldID.ToString(), fieldType.ToString(), location.ToString()}
        data.AppendLine(String.Join(";", newLine))
    Next
    ' Save the collected data to a CSV file.
    Dim formData = "C:\temp\output.csv"
    File.AppendAllText(formData, data.ToString())
End Using
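The example writes only numbers and enumeration names, so joining the columns with `;` is safe. If you extend it to export free text from the form fields, a value may itself contain a semicolon, a quotation mark, or a line break, which would break the column layout of the CSV file. The following is a minimal sketch of one way to guard against that, using the common CSV quoting convention with a semicolon delimiter. The `CsvLine` class and its `Escape` and `Join` methods are hypothetical helpers introduced for illustration; they are not part of GdPicture.NET.

```csharp
using System;
using System.Linq;

// Hypothetical helper class, not part of GdPicture.NET.
public static class CsvLine
{
    // Quotes a value only when it contains the delimiter,
    // a double quote, or a line break.
    public static string Escape(string value, char delimiter = ';')
    {
        if (value == null) return string.Empty;

        bool needsQuotes = value.IndexOf(delimiter) >= 0
            || value.IndexOf('"') >= 0
            || value.IndexOf('\n') >= 0
            || value.IndexOf('\r') >= 0;
        if (!needsQuotes) return value;

        // Double any embedded quotes, then wrap the whole value in quotes.
        return "\"" + value.Replace("\"", "\"\"") + "\"";
    }

    // Escapes each value and joins the results with the delimiter.
    public static string Join(char delimiter, params string[] values)
    {
        return string.Join(delimiter.ToString(), values.Select(v => Escape(v, delimiter)));
    }
}
```

With such a helper in place, a call like `data.AppendLine(CsvLine.Join(';', newLine));` could take the place of the `string.Join` call in the loop.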
Related topics