Capturing Text from Images (OCR)

Advanced Process Automation lets you capture text from images using Optical Character Recognition (OCR). OCR uses specialized software to convert scanned images of text into electronic text so that the digitized text can be searched, indexed, and retrieved. Use cases include:

Looking for specific words in an image

Getting the entire text from an image

Getting the number of words in the image

Getting the number of words and their coordinates from an image. See Getting the Number of Words and Coordinates.

Getting text from a PDF document

Getting the tables from an image or PDF and the coordinates of the table's cells
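
The first few use cases can be illustrated with a minimal Python sketch that works on OCR output after it has been captured. This is not the product's API: the Word record, its field names, and the helper functions are hypothetical, and the sample data is hand-written rather than real engine output. It only assumes that an OCR engine returns each recognized word together with its bounding box.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One recognized word and its bounding box in pixels (hypothetical shape)."""
    text: str
    left: int
    top: int
    width: int
    height: int


def full_text(words):
    """Join the recognized words into a single string, in reading order."""
    return " ".join(w.text for w in words)


def word_count(words):
    """Return the number of recognized words."""
    return len(words)


def find_word(words, target):
    """Return every occurrence of target (case-insensitive), with coordinates."""
    needle = target.casefold()
    return [w for w in words if w.text.casefold() == needle]


# Hand-written sample standing in for real OCR output.
sample = [
    Word("Invoice", 40, 20, 120, 28),
    Word("Total", 40, 300, 80, 24),
    Word("42.50", 140, 300, 90, 24),
]

for match in find_word(sample, "total"):
    print(match.text, match.left, match.top)
```

Keeping the coordinates on each word record means that a search result can be used directly to locate the word in the image, for example to click on it or to read the value next to it.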

The following sections describe the objects that you can use to capture text. Which objects are available depends on whether FineReader is installed with the appropriate license. If it is installed, you can choose which object to use to recognize images. For Scenes, the Advanced OCR engine is used by default; if the Advanced engine is not installed, Nicomsoft is used instead. The following objects are available in the Designer:

Picture: Uses the Nicomsoft OCR engine to capture text from PDF documents, images (JPG, BMP, TIFF, GIF, PNG), or user interfaces of remote applications that are running (for example, Citrix sessions). See Using the Picture Object.

When you use the Picture object, the PDF must be open on the relevant page, and only text displayed on the desktop is captured. The Picture object cannot get tables from an image or PDF.

Advanced Picture: Uses the Advanced OCR engine to capture text and tables from images (JPG, BMP, TIFF, GIF, PNG), scenes and screen elements, or user interfaces of remote applications that are running. See Using the Advanced Picture Object.

Advanced PDF: Uses the Advanced OCR engine to capture text and tables from PDF documents (by pages, without opening the PDF). See Using the Advanced PDF Object.
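
The differences between the three objects can be summarized as a small decision helper. The Python sketch below is only an illustration of the rules described above: the function name, the source labels ("pdf", "image", "remote_ui"), and the parameters are hypothetical and are not part of the Designer. It also assumes that you prefer an Advanced object whenever the Advanced OCR engine is installed, although the Designer leaves that choice to you.

```python
def choose_object(source, advanced_installed, need_tables=False):
    """Suggest a Designer object for a capture task (illustrative only).

    source: "pdf", "image", or "remote_ui" (hypothetical labels).
    advanced_installed: True if the Advanced OCR engine is available.
    need_tables: True if tables and cell coordinates are required.
    """
    if source not in ("pdf", "image", "remote_ui"):
        raise ValueError(f"unknown source: {source!r}")

    if advanced_installed:
        # Advanced PDF reads PDF pages without opening the document;
        # Advanced Picture covers images, scenes, and remote UIs.
        return "Advanced PDF" if source == "pdf" else "Advanced Picture"

    # Without the Advanced engine, only the Nicomsoft-based Picture
    # object is left, and it cannot get tables.
    if need_tables:
        raise ValueError("tables require the Advanced OCR engine")
    return "Picture"


print(choose_object("pdf", advanced_installed=True, need_tables=True))
print(choose_object("image", advanced_installed=False))
```

Note that when the helper suggests Picture for a PDF, the document still has to be open on the relevant page, because only text displayed on the desktop is captured.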