Features – AutoOCR processing / monitoring folders:
1.) Processing input folders / structures: An input folder or an entire folder structure is processed. The generated PDF files are stored in the same folder structure under the same name as the original file. PDF files are a special case, because some PDF files do not require OCR processing while others do. It may also happen that only certain pages of a PDF file need OCR processing.
So that PDFs already processed by AutoOCR are not processed again, these files are marked with a “label” in the data structure.
When the AutoOCR service starts, the folder structure is scanned completely to identify backlogged files. Each PDF file has to be checked for this “label”. Note that with large data sets this process takes a long time, since every PDF file must be opened and checked.
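
The start-up scan described above can be sketched roughly as follows. This is only an illustration in Python: how AutoOCR actually stores its “label” is not described here, so the marker bytes and the function names `is_labeled` and `find_backlog` are hypothetical.

```python
from __future__ import annotations

from pathlib import Path

# Hypothetical marker: the real AutoOCR label format is not documented here.
MARKER = b"AutoOCR-Processed"


def is_labeled(pdf_path: Path) -> bool:
    """Open the file and look for the (assumed) marker bytes."""
    return MARKER in pdf_path.read_bytes()


def find_backlog(root: str) -> list[Path]:
    """Return all PDFs below root that do not carry the marker yet."""
    backlog = []
    for path in sorted(Path(root).rglob("*")):
        is_pdf = path.is_file() and path.suffix.lower() == ".pdf"
        # Every PDF has to be opened once, which is why large trees are slow.
        if is_pdf and not is_labeled(path):
            backlog.append(path)
    return backlog
```

Because each candidate file is read, the cost of the scan grows with the number and size of the PDFs in the structure, which matches the note about large data sets.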
2.) Retain date / time of the original file: With this option, the date and time of creation, modification, and last access are transferred from the source file to the PDF file generated by OCR. The PDF document is thus replaced without changing these attributes.
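
As a rough illustration of the idea, the following Python sketch replaces a file with its OCR result and then restores the original timestamps. The function name `replace_preserving_times` is made up for this example, and it is not AutoOCR's implementation. Note that `os.utime` only sets the last-access and modification times; restoring the creation time needs platform-specific calls and is left out here.

```python
import os


def replace_preserving_times(original: str, ocr_output: str) -> None:
    """Replace `original` with `ocr_output`, keeping access/modification times."""
    # Remember the timestamps of the source file before it is overwritten.
    st = os.stat(original)
    # Move the OCR result over the original file.
    os.replace(ocr_output, original)
    # Write the remembered access and modification times back.
    os.utime(original, ns=(st.st_atime_ns, st.st_mtime_ns))
```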
3.) Smart OCR processing of PDF files: PDFs can be pure image files without text, “normal” PDF files that already contain text, or mixed documents. In mixed documents, individual pages are scanned images with no text and the remaining pages are normal PDF content with text. Without this special functionality, the whole PDF document is always OCR processed, that is, all pages regardless of their content. This costs time and resources and enlarges the PDF files unnecessarily. That is why you should activate the “intelligent OCR processing”. Then only those documents and pages are OCR processed where it is necessary. “Normal” PDF files are not processed, but only marked with a “label” – see 1.).
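
The per-page decision can be sketched as below. This is a simplified illustration in Python, not AutoOCR's actual logic: it assumes the text of each page has already been extracted by some PDF library, and the names `pages_needing_ocr`, `classify` and the `min_chars` threshold are hypothetical.

```python
from __future__ import annotations


def pages_needing_ocr(page_texts: list[str], min_chars: int = 1) -> list[int]:
    """Return the 0-based indices of pages that contain no usable text."""
    return [
        index
        for index, text in enumerate(page_texts)
        if len(text.strip()) < min_chars
    ]


def classify(page_texts: list[str]) -> str:
    """Classify a document as 'text', 'image' or 'mixed'."""
    todo = pages_needing_ocr(page_texts)
    if not todo:
        return "text"   # "normal" PDF: nothing to OCR, only label it
    if len(todo) == len(page_texts):
        return "image"  # pure image PDF: OCR every page
    return "mixed"      # OCR only the pages listed in `todo`
```

For a four-page document whose second and third pages are scans, `pages_needing_ocr(["Hello", "", "  \n", "text"])` returns `[1, 2]`, so only those two pages would be sent to OCR.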
4.) Folder Monitor – File System Events / block processing: If files that are newly added during ongoing processing must be detected and processed immediately, “File System Events” must be selected. If “block processing” is selected, newly added files are not detected automatically. “Block processing” is specifically designed for the initial processing of large volumes of documents. After the initial processing, you should switch to “File System Events” so that newly added files are processed immediately. Whenever the AutoOCR service is stopped and restarted, the complete folder structure is always searched for unprocessed files first.
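
The difference between the two modes can be illustrated with a small Python sketch. It is only a model of the behaviour: real file system events are delivered by the operating system, whereas the hypothetical `poll_new_files` below uses simple polling as a stand-in, and none of the function names come from AutoOCR.

```python
from pathlib import Path


def list_pdfs(root):
    """Return the set of PDF files currently below root."""
    return {
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() == ".pdf"
    }


def block_processing(root, process):
    """Process only the files present when the snapshot is taken."""
    batch = sorted(list_pdfs(root))
    for pdf in batch:
        process(pdf)
    # Files added after the snapshot are not noticed in this mode.
    return set(batch)


def poll_new_files(root, seen):
    """Polling stand-in for file system events: report files not seen before."""
    current = list_pdfs(root)
    new = sorted(current - seen)
    seen.update(current)
    return new
```

A typical sequence would be one `block_processing` run for the initial backlog, followed by repeated `poll_new_files` calls (or, in a real service, event notifications) to pick up files that arrive later.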
5.) Process files / folders from network shares: After installation, the AutoOCR service runs by default under the “Local System Account”. If files and folders on network shares are to be processed, you have to create a “user account” for the AutoOCR service that has the appropriate rights to access the network shares used.