Extract data from bank statements using C#

The key-value pair (KVP) extraction engine in Nutrient .NET SDK (formerly GdPicture.NET) recognizes related data items in a document, such as a label and the value that belongs to it, so that you can export them to an external destination such as a spreadsheet.

To extract data items from a bank statement, follow the steps below:

  1. Create a GdPictureOCR object and a GdPictureImaging object.
  2. Select the bank statement by passing its path to the CreateGdPictureImageFromFile method of the GdPictureImaging object.
  3. Configure the OCR process with the GdPictureOCR object in the following way:
    • Select the bank statement image to process by passing its image ID to the SetImage method.
    • Set the path to the OCR resource folder with the ResourceFolder property. The default language resources are located in GdPicture.NET 14\Redist\OCR. For more information on adding language resources, see the language support guide.
    • With the AddLanguage method, add the language resources that Nutrient .NET SDK uses to recognize text in the image. This method takes a member of the OCRLanguage enumeration.
  4. Run the OCR process with the RunOCR method of the GdPictureOCR object.
  5. Get the number of key-value pairs detected during the OCR process with the GetKeyValuePairCount method of the GdPictureOCR object, and loop through them.
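  6. Get the key-value pairs, the data types, and the confidence scores with the following methods of the GdPictureOCR object. Each method takes the OCR result ID and the index of the pair:
    • GetKeyValuePairKeyString returns the key.
    • GetKeyValuePairValueString returns the value.
    • GetKeyValuePairDataType returns the data type.
    • GetKeyValuePairConfidence returns the confidence score.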
  7. Write the output to the console.
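  8. Release unnecessary resources with the ReleaseGdPictureImage method of the GdPictureImaging object and the ReleaseOCRResults method of the GdPictureOCR object.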

The example below retrieves key-value pairs from the following bank statement.

Sample bank statement

Download the sample bank statement and run the code below, or check out our demo.


using System;
using System.Text;
using GdPicture14;

using GdPictureOCR gdpictureOCR = new GdPictureOCR();
using GdPictureImaging gdpictureImaging = new GdPictureImaging();
// Load the source document.
int imageId = gdpictureImaging.CreateGdPictureImageFromFile(@"C:\temp\source.png");
// Configure the OCR process.
gdpictureOCR.ResourceFolder = @"C:\GdPicture.NET 14\Redist\OCR";
gdpictureOCR.AddLanguage(OCRLanguage.English);
gdpictureOCR.SetImage(imageId);
// Run the OCR process.
string ocrResultId = gdpictureOCR.RunOCR();
// Get the number of detected key-value pairs once, then read each pair.
int pairCount = gdpictureOCR.GetKeyValuePairCount(ocrResultId);
StringBuilder keyValuePairsData = new StringBuilder();
for (int pairIndex = 0; pairIndex < pairCount; pairIndex++)
{
    keyValuePairsData.Append(
        $"| Key: {gdpictureOCR.GetKeyValuePairKeyString(ocrResultId, pairIndex)} | " +
        $"Value: {gdpictureOCR.GetKeyValuePairValueString(ocrResultId, pairIndex)} | " +
        $"Data Type: {gdpictureOCR.GetKeyValuePairDataType(ocrResultId, pairIndex)} | " +
        $"Confidence Level: {Math.Round(gdpictureOCR.GetKeyValuePairConfidence(ocrResultId, pairIndex), 1)}% |\n");
}
// Write the output to the console.
Console.WriteLine(keyValuePairsData.ToString());
// Release unnecessary resources.
gdpictureImaging.ReleaseGdPictureImage(imageId);
gdpictureOCR.ReleaseOCRResults();


Format the output to obtain the following table:

Key             Value                        Data type    Confidence level
IBAN            FR7611808009101234567890147  IBAN         100%
Phone           786-315-0313                 PhoneNumber  100%
BIC             12345678901                  Number       66.4%
Bank code       11808                        Number       99.4%
Counter code    00914                        Number       100%
Number account  12345678901                  Number       99.3%
Bank key        47                           Number       74.2%
River bank      100                          Number       74%
Account owner   David Bricklane              String       100%
Domiciliation   East Bank Summerfield        String       97.5%
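
Because several values in the sample output have confidence scores well below 100%, it can help to flag them when exporting the results to a spreadsheet. The sketch below is one possible way to do that and is not part of the SDK: it builds CSV lines from three rows copied from the sample output above and adds a review flag. The MinConfidence threshold of 90, the Escape helper, the column names, and the variable names are all assumptions made for illustration. In a real application, the rows would come from the GetKeyValuePair methods called in the loop.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Rows copied from the sample output above (illustration only).
var rows = new List<(string Key, string Value, string DataType, double Confidence)>
{
    ("IBAN", "FR7611808009101234567890147", "IBAN", 100.0),
    ("BIC", "12345678901", "Number", 66.4),
    ("Account owner", "David Bricklane", "String", 100.0),
};

// Assumption: pairs below this confidence score are flagged for manual review.
const double MinConfidence = 90.0;

// Hypothetical helper: quote a field if it contains a comma, a double quote, or a line break.
static string Escape(string field) =>
    field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0
        ? "\"" + field.Replace("\"", "\"\"") + "\""
        : field;

// Build a header line followed by one CSV line per key-value pair.
var csvLines = new List<string> { "Key,Value,Data type,Confidence,Needs review" };
csvLines.AddRange(rows.Select(row => string.Join(",",
    Escape(row.Key),
    Escape(row.Value),
    Escape(row.DataType),
    row.Confidence.ToString("0.#", CultureInfo.InvariantCulture),
    row.Confidence < MinConfidence ? "yes" : "no")));

// Print the CSV lines; File.WriteAllLines could write the same lines to a .csv file.
foreach (string line in csvLines)
{
    Console.WriteLine(line);
}
```

The confidence score is formatted with the invariant culture so that the decimal separator is always a period, regardless of the machine's regional settings. Note that spreadsheet applications may reinterpret numeric-looking values on import, for example by dropping the leading zeros of a value such as 00914, so such columns may need to be imported as text.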