Enhancements and bug fixes for document tagging

Version 1.4.2509.16

Enhancements

Added support for OAuth2 authentication for email

Ref: AT-67

Tagger now authenticates to connected mail services with Open Authorization 2.0 (OAuth2), replacing the deprecated Simple Mail Transfer Protocol (SMTP) authentication method. OAuth2 uses access tokens instead of sending a stored password to the mail server.
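
As an illustration of what OAuth2 authentication for mail commonly involves, the sketch below builds the SASL XOAUTH2 initial response that several mail providers accept over SMTP. The release note does not state which mechanism Tagger uses, so the choice of XOAUTH2, the function name, and the sample values are all assumptions.

```python
import base64


def build_xoauth2_response(user: str, access_token: str) -> str:
    """Return the base64-encoded SASL XOAUTH2 initial client response.

    Before encoding, the format is:
        user=<user>\\x01auth=Bearer <token>\\x01\\x01
    where \\x01 is the Ctrl-A control character.
    """
    raw = f"user={user}\x01auth=Bearer {access_token}\x01\x01"
    return base64.b64encode(raw.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    # Hypothetical values; a real access token is issued by the identity provider.
    print(build_xoauth2_response("tagger@example.com", "sample-token"))
```

The encoded string is what a client would send with the SMTP `AUTH XOAUTH2` command. The token itself is obtained separately from the identity provider.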

Migrated SharePoint authentication to Microsoft Authentication Library (MSAL)

Ref: AT-71

SharePoint authentication has been migrated from the deprecated Active Directory Authentication Library (ADAL) to the supported Microsoft Authentication Library (MSAL). This update enhances compatibility and ensures continued support for Microsoft’s latest authentication standards.

Version 1.2.2204.08

Bug fixes

Resolved FIPS compliance issue in cryptographic algorithms

Ref: AT-57

The implementation now uses cryptographic algorithms that are part of the Windows Platform Federal Information Processing Standards (FIPS) validated list. This ensures compliance with FIPS security requirements.

Fixed incomplete tagging in SharePoint on-premises site collections

Ref: AT-58

Tagger now correctly tags entire SharePoint on-premises site collections when running in VMware environments. This resolves the previous issue where only partial tagging occurred.

Version 1.2.2201.18

Bug fixes

Resolved issue with root SharePoint URLs not being processed

Ref: AT-55

The system now correctly processes root SharePoint URLs, ensuring that all site collections and content at the root level are handled as expected.

Fixed UnlimitedLengthInDocumentLibrary initialization exception

Ref: AT-56

The previous exception caused by the UnlimitedLengthInDocumentLibrary parameter not being initialized has been resolved. This ensures stable operation when working with document libraries of varying sizes.

Version 1.2.2111.30

Bug fixes

Fixed deletion of certificate keys after use

Ref: AT-54

The system now correctly deletes certificate keys from ProgramData\Microsoft\Crypto\RSA\MachineKeys after use. Previously, the keys were left behind. Removing them improves security and keeps the folder clean.

Version 1.2.2111.22

Enhancements

Added support for multiline text SharePoint fields

Ref: AT-46

The system now supports multiline text fields in SharePoint, ensuring accurate handling of larger or formatted text content.

Added option to tag terms with the highest frequency first

Ref: AT-48

You can now configure Tagger to prioritize tagging terms with the highest frequency in a document. To enable this, set tagItemsWithTheHighestFrequencyFirst to true in Tagger.config, and restart the Tagger service. This improves tagging accuracy for frequently occurring terms.
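
To illustrate the ordering this option describes, here is a minimal sketch that counts term occurrences and lists the most frequent terms first. It is not Tagger's implementation. The function name, the whitespace tokenization, and the sample terms are assumptions.

```python
from collections import Counter


def terms_by_frequency(text: str, terms: list[str]) -> list[tuple[str, int]]:
    """Return (term, count) pairs for terms found in text, most frequent first."""
    words = Counter(text.lower().split())
    found = [(t, words[t.lower()]) for t in terms if words[t.lower()] > 0]
    # sorted() is stable, so terms with equal counts keep their input order.
    return sorted(found, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    sample = "invoice contract invoice report invoice contract"
    print(terms_by_frequency(sample, ["report", "contract", "invoice", "memo"]))
```

Terms that do not occur in the text (here, "memo") are left out of the result entirely.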

Added support for SharePoint Online interactive authentication

Ref: AT-52

The system now supports interactive authentication for SharePoint Online, enabling seamless access in environments requiring user credentials or multi-factor authentication.

Added support for SharePoint app-only authentication

Ref: AT-53

SharePoint app-only authentication is now supported, allowing automated processes to access SharePoint resources securely without user credentials.

Bug fixes

Fixed issue with filtering documents by creation date

Ref: AT-45

The document filter now correctly processes creation dates, ensuring accurate results when filtering by this parameter.

Resolved tagging issues for terms containing ampersands

Ref: AT-49

Terms containing ampersands are now correctly tagged, preventing skipped or incomplete tagging of these terms.

Fixed missing results for low-frequency terms

Ref: AT-50

Terms that appear only once in a document are now correctly tagged when tagItemsWithTheHighestFrequencyFirst is set to true and/or minWordFrequency is set to 1. This ensures comprehensive term tagging.
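
The boundary case in this fix can be sketched as an inclusive threshold check: a term seen once must pass when the minimum is 1. The parameter name mirrors the minWordFrequency setting, but the function is a hypothetical illustration rather than Tagger's logic.

```python
from collections import Counter


def filter_by_min_frequency(counts: Counter, min_word_frequency: int = 1) -> dict[str, int]:
    """Keep terms whose count is at least min_word_frequency (inclusive)."""
    # Using >= rather than > is what lets single-occurrence terms through
    # when the threshold is 1.
    return {term: n for term, n in counts.items() if n >= min_word_frequency}
```

A strict comparison (`>`) in the same place would drop every term that occurs exactly as often as the threshold, which is the kind of off-by-one that produces missing results for low-frequency terms.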

Fixed scanning issues with unselected barcodes

Ref: AT-51

Tagger now correctly ignores barcodes that haven’t been selected, preventing unnecessary scans and improving performance.

Version 1.1.200615

Enhancements

Replaced barcode engine to improve recognition

Ref: AT-39

The system now uses an updated barcode engine, significantly improving barcode recognition accuracy and reliability across supported formats.

Version 1.1.200511

Upgrading from previous versions

We highly recommend creating a backup of your database before upgrading. The database is located at [installation path]\data\Tagger.db and can be very large, depending on the number of runs.
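
One way to take the backup is a plain file copy with a timestamped name, sketched below. The function name and the backup folder are assumptions, the installation path is passed in rather than guessed, and the sketch assumes the Tagger service is stopped so the file is not being written during the copy.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_database(install_dir: Path, backup_dir: Path) -> Path:
    """Copy data/Tagger.db under install_dir to a timestamped file in backup_dir."""
    source = install_dir / "data" / "Tagger.db"
    if not source.is_file():
        raise FileNotFoundError(f"Database not found: {source}")
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"Tagger-{stamp}.db"
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    return target
```

Because the database can be very large, it is worth confirming that the backup location has enough free space before copying.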

Enhancements

Added support for modern authentication using Azure AD App-Only

Ref: AT-13

The system now supports modern authentication with Azure Active Directory App-Only. This enables secure access to SharePoint without requiring user credentials. To configure, follow these steps:

  • Create a self-signed certificate.
  • Register a test web app in Azure (no coding required).
  • Grant the web app permissions to access the SharePoint tenant.
  • Connect the certificate to the web app.

For more information, refer to the Microsoft documentation.

Updated Microsoft Natural Language Processing (NLP) service

Ref: AT-20

The NLP service now uses the latest Microsoft Cognitive Services. Previously, only keywords could be retrieved using the Keywords NLP entity in Tagger; now all entities can be retrieved. New configuration settings in Tagger.config include:

  • microsoftNLPServiceEndpoint — Set to the endpoint of a custom Azure Cognitive Service.
  • microsoftNLPServiceLanguage — Set the language of documents being processed.

SharePoint queries are now retried only for specific HTTP status codes, defined in the httpStatusCodesToRetry setting. The retry count and delay are controlled via webRequestRetries. Restart the Tagger service after updating these settings.
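
The retry behavior described here can be sketched as follows. This is an illustration of the general technique, not Tagger's code: the function name, the default status codes (429 and 503), and the default count and delay are assumptions, and the real values come from httpStatusCodesToRetry and webRequestRetries.

```python
import time
from typing import Callable, Iterable, Tuple


def request_with_retries(
    send: Callable[[], Tuple[int, str]],
    status_codes_to_retry: Iterable[int] = (429, 503),
    retries: int = 3,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() and retry only when the status code is in the retry set."""
    retry_set = set(status_codes_to_retry)
    status, body = send()
    for _ in range(retries):
        if status not in retry_set:
            break  # success or a non-retryable error: return it as-is
        sleep(delay_seconds)
        status, body = send()
    return status, body
```

Restricting retries to a known set of codes means a permanent failure such as 404 is reported immediately instead of being repeated until the retry count runs out.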

Improved TLS support for SharePoint servers

Tagger now supports SharePoint servers using TLS 1.1 and 1.2. Additional cryptographic protocol options have been added in Tagger.config.

Added configuration for handling invalid SSL certificates

Two configuration options now control access to servers with invalid SSL certificates:

  • recognizedCertificateThumbprints — Add SHA-1 thumbprints of trusted certificates; multiple thumbprints can be separated by commas.
  • ignoreAllCertificateErrors — Set to true to ignore all certificate errors.

The Tagger UI and service must be restarted after making these changes.
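
For reference, a certificate's SHA-1 thumbprint is the SHA-1 hash of its DER-encoded bytes. The sketch below computes one and checks it against a comma-separated list like the one recognizedCertificateThumbprints accepts. The function names and the normalization rules (ignoring case and spaces) are assumptions, not Tagger's documented behavior.

```python
import hashlib


def sha1_thumbprint(der_bytes: bytes) -> str:
    """Return the SHA-1 thumbprint of a DER-encoded certificate as uppercase hex."""
    return hashlib.sha1(der_bytes).hexdigest().upper()


def is_recognized(der_bytes: bytes, recognized_thumbprints: str) -> bool:
    """Check a certificate against a comma-separated thumbprint list."""
    allowed = {
        t.strip().replace(" ", "").upper()
        for t in recognized_thumbprints.split(",")
        if t.strip()
    }
    return sha1_thumbprint(der_bytes) in allowed
```

Pinning specific thumbprints trusts only the listed certificates, whereas ignoring all certificate errors accepts any certificate a server presents.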

Updated SharePoint CSOM and barcode reader versions

  • SharePoint Client-Side Object Model (CSOM) has been updated from version 15 to 16.
  • Barcode reader has been updated from version 2.2 to 2.3, improving barcode recognition.

Bug fixes

Fixed Retain Modified By setting

Ref: AT-11

The Retain Modified By setting now functions correctly when applied to documents.

Resolved issue with read-only SharePoint columns

Ref: AT-16

Read-only SharePoint columns are now correctly considered when using Filter Documents by Regular Expression.

Updated Tika to fix text extraction issues

Ref: AT-21

Tika has been updated to the latest version to resolve issues extracting text from certain document types.

Fixed issue adding new locations to existing jobs

Ref: AT-25

Users can now successfully add new locations to existing jobs without errors.

Fixed PDF417 barcode decoding

Ref: AT-26

PDF417 barcodes are now decoded correctly, resolving previous errors during processing.

Fixed scheduler behavior after reboot or service restart

Ref: AT-31

The scheduler no longer immediately runs all scheduled jobs after a reboot or service restart, respecting the intended time intervals between runs.
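
One way to get this behavior is to compute each job's next run from its last recorded run instead of from the service start time. The sketch below shows that idea. The function name and the rule for jobs that became overdue during downtime are assumptions, not a description of Tagger's scheduler.

```python
from datetime import datetime, timedelta


def next_run_time(last_run: datetime, interval: timedelta, now: datetime) -> datetime:
    """Return when a job should next run, based on its last run, not on service start."""
    due = last_run + interval
    # Assumption: a job that became due while the service was down runs at startup;
    # any other job waits out the remainder of its interval.
    return max(due, now)
```

With an hourly job that last ran at 08:00, a restart at 08:20 yields a next run at 09:00 rather than an immediate run.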

Improved Job tab handling for empty libraries

The Job tab is now greyed out if no jobs have been created. Users should add new jobs using the Add new job button on the Dashboard. This prevents errors caused by attempting to add locations to non-existent jobs.