Category: AutoOCR

AutoOCR Version 1.15.3 available

What's new in AutoOCR version 1.15.3:

  • New iOCR engine: We replaced the previous standard iOCR engine with a new product, vsOCR. This yields a better recognition rate as well as significantly better performance on multicore / multiprocessor computers. With the new OCR engine, we now support parallel / multithreaded processing of multi-page TIFF and PDF documents. The OCR processing speed is thereby multiplied if, for example, 4 or 8 cores are available.

Setup option - download and install iOCR

  • iOCR – PDF rendering resolution configurable: Since OCR can only process image documents, PDF documents are always rendered to images before OCR processing. The rendering resolution can now be configured for b&w and for color; the default value for both is 300 dpi.

iOCR - Option - PDF Rendering resolution for b&w and color

  • Abbyy OCR – new default settings: Based on our experience so far, we have redefined the default settings to achieve the best possible recognition rate as well as the highest possible OCR performance. A single option can change the processing speed by a factor of 5, 10 or more, especially for multi-page documents with a lot of text. A 10-page document can take either 10 sec. or 5 min., depending on whether the “Recognize font formatting” option is enabled.

AutoOCR - Abbyy default settings #1  AutoOCR - Abbyy default settings #2  AutoOCR - Abbyy default settings #3  AutoOCR - Abbyy default settings #4  AutoOCR - Abbyy default settings #5  AutoOCR - Abbyy default settings #6

  • Remove black border: Added as a new general image processing function for iOCR and Abbyy. A black border, if present, is detected and removed in all documents before OCR processing. The page size is not changed.

New option - remove black border

  • Configurable response to an invalid license: Either the service is stopped (default value) or a demo stamp is placed on the document.

Response if the license is invalid

  • Further adjustments: Autostart of the AutoOCR user interface is now activated by default. An error when creating the optional TXT file with iOCR has been fixed. Read-only PDF documents no longer cause endless loops during processing. The temporary Abbyy folder is now correctly deleted after the set number of days. Language-specific special characters are now encoded correctly in the Abbyy PDF/A output.

Download – AutoOCR – OCR Server without iOCR (vsOCR) Engine (ca. 10MB) >>>

Download – iOCR (vsOCR) Engine (ca. 270MB) >>>

See also: AutoOCR – Installation requirements from version 1.15.3

Demo licenses for the Abbyy OCR Engine version 10 are available for 30 days or 500 pages; you can request them by e-mail.

Download – Abbyy FineReader 10.x Rel 4 OCR Engine Setup (ca. 460MB) >>>

AutoOCR – Folder processing / monitoring – what to consider?

Features – AutoOCR processing / monitoring folders:

1.) Processing input folders / structures: Here an input folder or an entire folder structure is processed. The generated PDF files are stored in the same folder structure under the same name as the original file. PDF files are a special case, however, as some PDF files do not require OCR processing and others do. It may also happen that only certain pages of a PDF file need OCR processing.

So that PDFs that have already been processed by AutoOCR are not processed again, these files are marked with a “label” in their data structure.

When the AutoOCR service starts, the folder structure is scanned completely to identify backlogged files. Each PDF file has to be checked for this “label”. Note that with large data sets this takes a long time, since every PDF file must be opened and checked.

Processing of folder structures - replacing the original files
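
The start-up scan described in 1.) can be pictured roughly as follows. This is only a minimal sketch: the marker `ASSUMED_LABEL` and the functions `has_label` and `find_backlog` are hypothetical, since the actual label AutoOCR writes into a PDF is not documented here. It mainly illustrates why the scan is slow on large data sets: every PDF has to be opened and read.

```python
import os
from pathlib import Path
from typing import List

# Hypothetical marker for illustration only; the real AutoOCR label is not documented here.
ASSUMED_LABEL = b"/AutoOCRProcessed"


def has_label(pdf_path: Path, label: bytes = ASSUMED_LABEL) -> bool:
    """Open the PDF and look for the marker. Reading every file is the costly part."""
    return label in pdf_path.read_bytes()


def find_backlog(root: Path) -> List[Path]:
    """Return all PDFs below root that do not carry the label yet, sorted by path."""
    backlog = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".pdf"):
                path = Path(dirpath) / name
                if not has_label(path):
                    backlog.append(path)
    return sorted(backlog)
```

Calling `find_backlog(Path("D:/Input"))` (the path is just an example) would return the PDFs that still need to be looked at.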

2.) Retain date / time of the original file: With this option, the date and time of creation, modification, and last access are transferred from the source file to the PDF file generated by OCR. The PDF document is thus replaced without changing these attributes.

Option to retain the date and time of the original file
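
As an illustration of the idea behind option 2.), the following sketch replaces a file with its OCR result and then restores the original timestamps. It is not how AutoOCR implements the option; the function name `replace_keeping_times` is made up for this example. Note that the Python standard library can only set the access and modification times portably, so the creation time, which the option also retains, is left out here.

```python
import os
from pathlib import Path


def replace_keeping_times(original: Path, ocr_result: Path) -> None:
    """Replace original with ocr_result, keeping the original access/modification times."""
    stat = original.stat()                # remember the timestamps before replacing
    os.replace(ocr_result, original)      # overwrite the original with the OCR result
    # Restore access and modification time (creation time needs a platform-specific API).
    os.utime(original, ns=(stat.st_atime_ns, stat.st_mtime_ns))
```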

3.) Smart OCR processing of PDF files: PDFs can be pure image files without text, “normal” PDF files that already contain text, or mixed documents in which individual pages are scanned images with no text and the remaining pages are normal PDF content with text. Without special functionality, the whole PDF document is always OCR processed, i.e. all pages regardless of their content. This takes time and resources and enlarges the PDF files unnecessarily. That is why you should activate intelligent OCR processing: only those documents and pages are OCR processed where it is necessary. “Normal” PDF files are not processed but only marked with a “label” (see 1.).

iOCR - intelligent OCR processing  Abbyy - intelligent OCR processing option
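
The decision described in 3.) can be sketched as follows. The criterion used here, "a page with no extractable text needs OCR", is an assumption for illustration; the actual heuristics of the iOCR and Abbyy engines are not documented in this post. The function names are hypothetical, and the input is simply the text already extracted from each page.

```python
from typing import List, Sequence


def pages_needing_ocr(page_texts: Sequence[str]) -> List[int]:
    """Return the 0-based indices of pages without extractable text (assumed scanned images)."""
    return [i for i, text in enumerate(page_texts) if not text.strip()]


def classify(page_texts: Sequence[str]) -> str:
    """Classify a PDF as 'normal' (label only), 'image' (OCR all pages) or 'mixed'."""
    todo = pages_needing_ocr(page_texts)
    if not todo:
        return "normal"   # nothing to OCR, the file would only be labeled
    if len(todo) == len(page_texts):
        return "image"    # every page is a scan
    return "mixed"        # only the pages in todo need OCR
```

For a mixed document, only the indices returned by `pages_needing_ocr` would be passed on to the OCR engine, which is what saves time and keeps the file size down.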

4.) Folder monitor, file system events / block processing: If files added during ongoing processing must be detected and processed immediately, “File System Events” must be selected. If “block processing” is selected, newly added files are not detected automatically. Block processing is designed specifically for the initial processing of large volumes of documents. After the initial processing, you should switch to file system events so that newly added files are processed immediately. Whenever the AutoOCR service is stopped and restarted, the complete folder structure is always searched for unprocessed files first.

AutoOCR - folder monitoring - via events or block by block

5.) Process files / folders from network shares: After installation, the AutoOCR service runs under the “Local System Account” by default. If files and folders on network shares are to be processed, you have to create a user account for the AutoOCR service that has the appropriate rights to access the network shares used.

AutoOCR - service account configuration

AutoOCR 1.10.17 available

Because PDFs can already contain text, and therefore not all documents / pages need to be subjected to OCR processing, we implemented intelligent OCR processing. Previously, this feature was available only for PDF output.

In the Alfresco integration, AutoOCR can also be configured for plain text output. In this case AutoOCR generates only the text required for the Alfresco full-text search. With AutoOCR 1.10.17, intelligent OCR processing is now available not only for PDF output but also for plain text output. Only PDF image files are OCR processed; for normal PDFs the text is extracted directly without OCR processing. This saves time and resources.

Download – AutoOCR – OCR Server incl. iOCR Engine (ca. 150MB) >>>

Demo licenses for the Abbyy OCR Engine version 10 are available for 30 days or 500 pages; these can be requested from us.

Download – Abbyy FineReader 10.x Rel 4 OCR Engine Setup (ca. 460MB) >>>

DropOCR – version 1.3.2 available

With DropOCR version 1.3.2, the parallel upload as well as the communication with the AutoOCR server were completely revised, which fixes all deficiencies of the previous version. Especially with large documents with many pages, long processing times and a large number of documents to be processed, there were problems: not all documents were processed, errors that had not actually occurred were logged, or the communication with the AutoOCR server was aborted. All of these problems are fixed in version 1.3.2.

DropZone & DropOCR icon tray context menu  DropOCR - configuration

Download – DropOCR >>>

OwnCloud integration for FileConverterPro and AutoOCR available

OwnCloud is one of the most popular and widespread open-source cloud software packages and is used for both private and public clouds. OwnCloud is available as an open-source community edition as well as an extended Enterprise version. We also use OwnCloud in our company to make files externally accessible and to share them with our partners quickly and easily. Files can be synchronized automatically and can also be retrieved via mobile apps on a smartphone or tablet.

The functionality of OwnCloud can be extended with server apps, and a large number of apps for various purposes are already available to download for free.

On this basis, we developed an OwnCloud integration for our AutoOCR and FileConverterPro servers / services. With it, most Office, CAD and e-mail document formats can be converted to searchable PDF, PDF/A-1b or PDF/A-3b, incl. OCR, automatically or interactively / manually, directly from OwnCloud.

1_AutoOCR & FileConverterPro plugin for OwnCloud  2_Configuration is done via the admin functions  3.1_Automatic conversion to PDF can be configured via the MIME type mapping  4_The Convert function also allows all supported file types to be converted to PDF interactively  5_A scanned image becomes a searchable PDF file with a text layer  6_OwnCloud and the integrated PDF viewer offer search functions to find documents by their content  7_Container files such as ZIP or MSG are turned into an overall PDF with bookmark structures

Supported FCpro file formats:

  • DOC, DOCX, DOCM, RTF, TXT, ODT
  • XLS, XLSX, XLSM
  • PPT, PPTX, PPS, PPSX
  • FDF, XFDF (Adobe forms)
  • XML
  • PNG, BMP, TIF, TIFF, JPG, JPEG, GIF
  • ZIP, RAR, 7Z
  • MSG, EML
  • PDF
  • HTM, HTML, MHTML
  • PMTX (PDFMerge)
  • DWG, DXF, DWF
  • Abbyy: PDF, TIF, TIFF, PNG, JPG, JPEG, BMP, GIF, PCX, DCX, JP2, JPC, DJV, DJVU, WDP
  • iOCR: PDF, TIFF, JPEG, PNG

Supported output formats:

  • PDF
  • PDF/A-1b
  • PDF/A-3b
  • ZUGFeRD

How the conversion takes place is determined by the processing profiles stored on the FileConverterPro / AutoOCR server. Each profile can hold a whole set of options and settings that control not only the conversion but, with FCpro, also additional extended functions, e.g.:

  • underlay stationery
  • add stamps or watermarks
  • pagination
  • insert header and footer
  • generate table of contents
  • control file-permissions and document-security

Container formats like ZIP, RAR, 7Zip or e-mail containers like MSG / EML, which can contain multiple files or nested attachments, are unpacked by FCpro, converted and merged into an overall PDF with bookmarks.

Download – OwnCloud App – integration with AutoOCR / FileConverterPro >>>

Caution: The conversion is initiated from the OwnCloud server via a cron job with a configurable interval. To shorten the waiting time, the interval, which is set to 15 min. by default, should be set to 1 to 5 min.

Android App for FileConverterPro and AutoOCR available


There is now also a freely available Android app that works together with our FileConverterPro and AutoOCR server / service. With it, a multitude of document formats can be converted to searchable PDF or PDF/A files, incl. text recognition (OCR). Using a ZIP container, multiple files can also be combined into an overall PDF (merge).

AfterStart  ProfileSelect  FilePickerNew  Converting  MyFiles

Scope:

  • mobile / tablet as scanner – creates searchable PDFs
  • create documents on the go – e.g. with Google DOCS – and then add stationery or watermarks to them as PDF
  • combine multiple files as ZIP – creation of an overall PDF with bookmarks

Supported input formats:

  • DOC, DOCX, DOCM, RTF, TXT, ODT
  • XLS, XLSX, XLSM
  • PPT, PPTX, PPS, PPSX
  • FDF, XFDF (Adobe forms)
  • XML
  • PNG, BMP, TIF, TIFF, JPG, JPEG, GIF
  • ZIP, RAR, 7Z
  • MSG, EML
  • PDF
  • HTM, HTML, MHTML
  • DWG, DXF, DWF
  • Abbyy: PDF, TIF, TIFF, PNG, JPG, JPEG, BMP, GIF, PCX, DCX, JP2, JPC, DJV, DJVU, WDP
  • iOCR: PDF, TIFF, JPEG, PNG

Supported output-formats:

  • PDF
  • PDF/A-1b
  • PDF/A-3b
  • ZUGFeRD

How the conversion is done is determined by processing profiles that are stored on the server and can be selected in the app. Each profile can hold a whole set of options and settings that control not only the conversion but also additional extended functions, e.g.:

  • underlay stationery
  • add watermarks and stamps
  • pagination
  • add header and footer
  • create table of contents
  • control rights and document-security

As a useful addition to our app, we recommend also installing the following apps:

  • Google DOCS / tables – to write Word / Excel files – these can then be converted to PDF immediately and e.g. underlaid with stationery
  • Scanbot – to quickly and easily scan documents
  • ES File Explorer – to easily manage documents and files and to create ZIP files – single documents can be merged into an overall PDF via ZIP

After installation, the app is preconfigured with our hosted FileConverterPro test server, so tests with your own documents can be run immediately and at no further cost.


Android Library for FileConverterPro and AutoOCR available

To make it quick and easy to develop apps that interact with our FileConverterPro or AutoOCR servers through the REST interface, we have published an Android library. Our FileConverterPro Android app, which is now available for download on Google Play, was also developed on the basis of this library. The library can also be used as a basis for Java applications on other platforms.

Download – Android library for FileConverterPro and AutoOCR >>>

AutoOCR – installation on a different drive

Sometimes AutoOCR needs to be installed on a drive other than “C:”. This required some adjustments, which we implemented in AutoOCR starting with version 1.10.16.

The following functions were implemented for this:

  • selection of the installation path in the setup
  • configuration of the drive / folder in which the Abbyy OCR engine was installed (TXT file in the installation directory)
  • configuration of the drive / folder in which Abbyy should store its *.tmp files
  • configuration of the number of days (default = 2) after which the Abbyy *.tmp files are deleted automatically

Procedure:

1.) Installation of the Abbyy OCR engine – definition of the target directory via command-line parameters:

e.g.: msiexec.exe /i FREngine10R4_x86.msi TARGETDIR=D:\FREngine

2.) Installation of AutoOCR – selection of the target directory in the setup:

Setup now allows selecting the install folder

3.) Creation of a “FREngine10.txt” file with a single line that contains the path to the Abbyy OCR DLL (FREngine.dll). This file is then copied into the installation directory of AutoOCR. If the file is present, the Abbyy OCR DLL is looked up at the path stated in it; if it is not present, the default (C:\Program Files (x86)\Common Files\MAYComputer\OCR10\FREngine.dll) is used.

4.) Configuration of the folder for the Abbyy *.tmp files:

Configure the temp folder for the Abbyy OCR engine

5.) Configuration of the number of days after which the Abbyy *.tmp files should be deleted automatically:

AutoOCR - Clear Temp files from Abbyy OCR processing
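
The lookup rule from step 3.) can be sketched as follows. This only illustrates the described behavior and is not AutoOCR's actual code: the function name `resolve_engine_dll` is hypothetical, and two details are assumptions not stated above, namely that the file is read as UTF-8 and that an empty file falls back to the default path.

```python
from pathlib import Path

# Default location as stated in step 3.)
DEFAULT_DLL = r"C:\Program Files (x86)\Common Files\MAYComputer\OCR10\FREngine.dll"


def resolve_engine_dll(install_dir: Path) -> str:
    """Return the DLL path from FREngine10.txt if the file exists, otherwise the default."""
    config = install_dir / "FREngine10.txt"
    if config.is_file():
        lines = config.read_text(encoding="utf-8").splitlines()  # encoding is an assumption
        if lines and lines[0].strip():
            return lines[0].strip()   # single line containing the path to FREngine.dll
    return DEFAULT_DLL                # file missing (or empty, by assumption)
```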

Download – AutoOCR – OCR server incl. iOCR engine (ca. 150MB) >>>
Download – Abbyy FineReader 10.x Rel 4 OCR engine setup (ca. 460MB) >>>

AutoOCR version 1.10.12.1 – older Abbyy *.tmp files get deleted automatically

With the current version of the Abbyy OCR engine and a high document volume, it could happen that the *.tmp files created during processing are not deleted and fill up the hard drive. To solve this problem we implemented a function that automatically deletes *.tmp files older than x days from the folder c:\windows\temp\Abbyy Finereader Engine 10. The default setting is 2 days.

AutoOCR - clear temp folders
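
A clean-up of this kind can be sketched in a few lines. The sketch below is an illustration, not the AutoOCR implementation: the function name `delete_old_tmp_files` is hypothetical, and using the file's modification time as its age is an assumption. Only *.tmp files directly in the given folder are considered.

```python
import time
from pathlib import Path
from typing import List, Optional


def delete_old_tmp_files(folder: Path, max_age_days: float = 2,
                         now: Optional[float] = None) -> List[Path]:
    """Delete *.tmp files in folder older than max_age_days; return the deleted paths."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400        # 86400 seconds per day
    deleted = []
    for path in folder.glob("*.tmp"):
        # Age is judged by modification time (an assumption for this sketch).
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            deleted.append(path)
    return sorted(deleted)
```

With the default of 2 days, a call such as `delete_old_tmp_files(Path(r"c:\windows\temp\Abbyy Finereader Engine 10"))` would remove only the older temporary files and leave recent ones untouched.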

Download – AutoOCR – OCR server incl. iOCR engine (ca. 150MB) >>>

Demo licenses for the Abbyy OCR Engine version 10 are available for 30 days or 500 pages; they can be requested from us.

Download – Abbyy FineReader 10.x Rel 4 OCR Engine setup (ca. 460MB) >>>
