AutoOCR – Folder processing / monitoring – what to consider?

Features – AutoOCR processing / monitoring folders:

1.) Processing input folders / structures: Here an input folder or an entire folder structure is processed. The generated PDF files are stored in the same folder structure with the same name as the original file. PDF files are a special case, however: some PDF files do not require OCR processing while others do, and it may also happen that only certain pages of a PDF file need OCR processing.

So that PDFs that AutoOCR has already processed are not processed again, these files are marked in their data structure with a “label”.

When the AutoOCR service starts, the folder structure is scanned completely to identify backlogged files. Each PDF file has to be checked for this “label”. Note that with large data sets this takes a long time, since every PDF file must be opened and checked.

Processing folder structures - replacing the original files
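
The start-up scan described in 1.) can be sketched roughly as follows. This is only an illustration in Python, not AutoOCR's implementation: how AutoOCR actually stores its label is not specified here, so the marker `/AutoOCRProcessed` and the function names `has_label` and `find_backlog` are hypothetical. The sketch mainly shows why the scan is slow on large data sets: every PDF has to be opened and read.

```python
import os

# Hypothetical marker; the real AutoOCR label format is not documented here.
HYPOTHETICAL_LABEL = b"/AutoOCRProcessed"


def has_label(pdf_path, label=HYPOTHETICAL_LABEL):
    """Return True if the file content contains the (assumed) label bytes."""
    with open(pdf_path, "rb") as handle:
        return label in handle.read()


def find_backlog(root):
    """Walk the folder structure and list PDFs that do not carry the label."""
    backlog = []
    for folder, _subfolders, files in os.walk(root):
        for name in sorted(files):
            if name.lower().endswith(".pdf"):
                path = os.path.join(folder, name)
                if not has_label(path):
                    backlog.append(path)
    return backlog
```

Because each file is read in full, the run time grows with both the number and the size of the PDF files in the structure.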

2.) Retain date / time of the original file: With this option, the date and time of creation, modification, and last access are transferred from the source file to the PDF file generated by OCR. The PDF document is thus replaced without changing these attributes.

Option to retain the date and time of the original file
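
The idea behind option 2.) can be illustrated with a minimal Python sketch. It is an assumption about how such a transfer could look, not AutoOCR's own code, and the function name `copy_timestamps` is hypothetical. It copies only the last-access and modification times; the Python standard library offers no portable call to set the creation time, so that part is left out.

```python
import os


def copy_timestamps(source_path, target_path):
    """Copy last-access and modification time from source to target."""
    info = os.stat(source_path)
    # Nanosecond values avoid rounding the original timestamps.
    os.utime(target_path, ns=(info.st_atime_ns, info.st_mtime_ns))
```

When the OCR result replaces the original file, the timestamps have to be read before the original is overwritten and applied to the new file afterwards.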

3.) Smart OCR processing of PDF files: PDFs can be pure image files without text, “normal” PDF files that already contain text, or mixed documents, in which individual pages are scanned images with no text and the remaining pages are normal PDF content with text. Without special functionality, the whole PDF document is always OCR processed, i.e. all pages regardless of their content. This takes time and resources and increases the size of the PDF files unnecessarily. That is why you should activate the intelligent OCR processing: only those documents and pages are OCR processed where it is necessary. “Normal” PDF files are not processed, but only marked with a “label” (see 1.).

iOCR - Intelligent OCR processing  Abbyy - Intelligent OCR processing option
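
The page-level decision described in 3.) can be sketched as follows. The criterion used here (a page contains an image and no extractable text) is an assumption for illustration; AutoOCR's actual detection rules are not described in this text. `PageInfo`, `pages_needing_ocr` and `classify` are hypothetical names, and the page data would in practice come from a PDF library.

```python
from dataclasses import dataclass


@dataclass
class PageInfo:
    """Hypothetical summary of one PDF page."""
    text_chars: int   # number of extractable text characters
    has_image: bool   # True if the page contains a raster image


def pages_needing_ocr(pages):
    """Return the zero-based indices of image pages without text."""
    return [i for i, p in enumerate(pages) if p.has_image and p.text_chars == 0]


def classify(pages):
    """Classify a document as 'image', 'normal' or 'mixed'."""
    todo = pages_needing_ocr(pages)
    if not todo:
        return "normal"   # nothing to OCR, the file would only be labelled
    if len(todo) == len(pages):
        return "image"    # every page is a scan without text
    return "mixed"        # only some pages need OCR
```

For a three-page document whose second page is a scan, `pages_needing_ocr` returns `[1]`, so only that page would be passed to the OCR engine.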

4.) Folder monitoring, File System Events / block processing: If newly added files must be detected and processed immediately while processing is running, “File System Events” must be selected. If “block processing” is selected, newly added files are not detected automatically. “Block processing” is specifically designed for the initial processing of large volumes of documents. After the initial processing, you should switch to “File System Events” so that newly added files are processed immediately. Whenever the AutoOCR service is stopped and restarted, the complete folder structure is always searched for unprocessed files first.

AutoOCR - Folder monitoring - via events or block-wise
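
The difference between the two modes in 4.) can be illustrated with a small Python sketch. It is not how AutoOCR is implemented: real file system events are delivered by the operating system, whereas this sketch uses a simple snapshot comparison as a stand-in. All function names are hypothetical.

```python
import os
from itertools import islice


def list_files(root):
    """Return the set of all file paths below root."""
    return {
        os.path.join(folder, name)
        for folder, _subfolders, files in os.walk(root)
        for name in files
    }


def in_blocks(paths, block_size):
    """Yield the paths in fixed-size blocks (a fixed snapshot, no new files)."""
    iterator = iter(sorted(paths))
    while True:
        block = list(islice(iterator, block_size))
        if not block:
            return
        yield block


def newly_added(root, already_seen):
    """Return files that appeared since the last snapshot (polling stand-in)."""
    return list_files(root) - already_seen
```

Block-wise processing works through a snapshot taken at the start, which is why files added later are not picked up until a new scan or an event-based mode takes over.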

5.) Process files / folders from network shares: After installation, the AutoOCR service runs by default under the “Local System Account”. If files and folders on network shares are to be processed, you have to set up a user account for the AutoOCR service that has the appropriate rights to access the network shares used.

AutoOCR - Service account configuration
