Color scans usually produce quite large files: at 300 dpi in color with JPEG compression, one page requires approx. 300 kB of storage space. To make the PDF output files as compact as possible, the JPEG2000 compression in AutoOCR / iOCR has been improved and an additional parameter has been added. JPEG2000 compression considerably reduces the size of the color images contained in the PDF, which makes the searchable PDF files correspondingly smaller. It has no influence on the OCR recognition rate.
With JPEG2000, both “lossless” and “lossy” compression are available. To create small files, one should normally use the lossy JPEG2000 compression. It has an additional parameter (ratio: 1 to 999) that controls the compression rate and thus the file size and visual quality. Higher ratio values give smaller files at lower visual quality.
The following table shows a test with different JPEG / JPEG2000 compression settings and their effect on the PDF file size. The initial file was a 7-page scan at 300 dpi, 24-bit color, with JPEG compression and a size of 2082 kB (about 297 kB per page).
This shows that JPEG2000, depending on the parameters, can reduce the file size by roughly 30 to 80%:
- JPEG2000 / lossy / 75-100 = high quality / larger files – 32-49% reduction
- JPEG2000 / lossy / 125-150 = medium quality / medium file size – 59-65% reduction
- JPEG2000 / lossy / 200-250 = low quality / small files – 74-79% reduction
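To make the percentages above more tangible, the following minimal sketch converts the listed reduction ranges into estimated file sizes for the 2082 kB, 7-page test scan. The helper names (`estimated_size_kb`, `size_range_kb`) and the `TIERS` table are hypothetical and only illustrate the arithmetic; they are not part of AutoOCR / iOCR, and real results will vary with the content of the scan.

```python
# Reduction ranges copied from the list above:
# (ratio range, quality label, reduction range in percent)
TIERS = [
    ((75, 100), "high", (32, 49)),
    ((125, 150), "medium", (59, 65)),
    ((200, 250), "low", (74, 79)),
]

ORIGINAL_KB = 2082  # size of the 7-page JPEG-compressed test scan
PAGES = 7


def estimated_size_kb(original_kb, reduction_percent):
    """Return the file size left after reducing it by the given percentage."""
    return original_kb * (100 - reduction_percent) / 100


def size_range_kb(original_kb, reduction_range):
    """Return (smallest, largest) estimated size for a reduction range."""
    low_reduction, high_reduction = reduction_range
    # The higher reduction yields the smaller file.
    return (
        estimated_size_kb(original_kb, high_reduction),
        estimated_size_kb(original_kb, low_reduction),
    )


for ratio, quality, reduction in TIERS:
    smallest, largest = size_range_kb(ORIGINAL_KB, reduction)
    print(
        f"ratio {ratio[0]}-{ratio[1]} ({quality} quality): "
        f"{smallest:.0f}-{largest:.0f} kB total, "
        f"{smallest / PAGES:.0f}-{largest / PAGES:.0f} kB per page"
    )
```

The estimates are plain percentage arithmetic on the test file, so they should be read as an order-of-magnitude guide for choosing a ratio, not as a guarantee for other documents.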