
PDFmdx Version 3.20.0

News / improvements PDFmdx version 3.20.0:

  • Repetitions & waiting time for locked files: If a file in one of the monitored input folders is still locked by another application when processing starts, a *.lock file is automatically created for this file. This happens, for example, when a scanner stores its scans directly in a watched folder. Often the PDF file is created first, locked, and then filled page by page. Depending on the size of the batch of documents, it can take a few minutes until the file is completely written and unlocked. To handle such situations, two parameters were implemented that automatically remove *.lock files again. This allows initially locked files to be processed automatically once they have been released.

The “Number of repetitions” and the “Waiting time between repetitions” can be configured. PDFmdx repeatedly checks whether the file is still locked; once it has been released, the *.lock file is deleted and the PDF is processed.
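
The retry behaviour can be pictured with a short Python sketch. It is only an illustration and not PDFmdx code: the function name `wait_for_release` and the `is_locked` callback are hypothetical, and the two arguments merely stand in for the “Number of repetitions” and “Waiting time between repetitions” settings.

```python
import time
from typing import Callable


def wait_for_release(
    is_locked: Callable[[], bool],
    repetitions: int,
    wait_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Check up to `repetitions` times whether a file has been released.

    Returns True as soon as `is_locked()` reports False, and False if the
    file is still locked after the last check.
    """
    for attempt in range(repetitions):
        if not is_locked():
            return True  # released: the *.lock marker could be removed now
        if attempt < repetitions - 1:
            sleep(wait_seconds)  # pause before the next check
    return False
```

With, say, 10 repetitions and 30 seconds of waiting time, a file in this sketch would be given about four and a half minutes to be released before it is left locked.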

  • Open PDF password – XML export of templates / layouts: During the XML export of templates / layouts, the “Open PDF” passwords stored in the PDFmdx layout for opening protected PDF files can now also be exported.

Download – PDFmdx Template Editor & Processor >>>

PDFmdx Version 3.19.0

Innovations / improvements PDFmdx version 3.19.0:

  • Improved barcode performance through preprocessing: There is now a new option to speed up barcode recognition and processing. The previous barcode processing is still available because there may be use cases where the previous processing offers an advantage. Both implementations have their advantages depending on the situation. Barcode recognition requires an image and not a normal PDF structure with text and lines. The PDF must therefore be rendered beforehand for barcode recognition.

In the previous implementation, only the area marked by the field is rendered. Since this does not take long, and in order to achieve a better result even with bad scans, the rendering is carried out three times – with 200, 300, and 400 dpi. Advantage: suitable if barcode recognition only takes place in a single small area, so that not everything has to be rendered, or if scans of inferior quality are processed.

With the new “Barcode preprocessing” function, the entire PDF is rendered in advance with 300 dpi, all barcode types defined in the template are recognized, and the results are cached for later use. This is faster because rendering and barcode recognition are carried out only once with 300 dpi and the recognized values are saved for further processing. Advantage: suitable if there are several areas with barcodes on a page, or if the document is of good quality and multiple rendering is not required.

  • Process protected PDF – Open password: Previously, password-protected PDFs could not be processed. Now there is a function to store the “Open PDF” password in the layout. All passwords stored for the layouts selected in the job are loaded for processing. If PDFmdx recognizes a protected PDF file, it tries the passwords in the list one after the other to open and decrypt the PDF. If a password matches, the PDF is opened and decrypted. A new, unprotected PDF is then created from it and processed normally by PDFmdx.
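
Trying the stored passwords one after the other amounts to a simple loop, sketched here in Python. The `try_open` callback is a hypothetical stand-in for whatever PDF library actually attempts to open the file; it is not a PDFmdx function.

```python
from typing import Callable, Iterable, Optional


def find_open_password(
    passwords: Iterable[str],
    try_open: Callable[[str], bool],
) -> Optional[str]:
    """Return the first password that opens the PDF, or None if none match."""
    for password in passwords:
        if try_open(password):
            return password
    return None
```

If the function returns None in this sketch, the file would have to be treated as not processable.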


PDFmdx Version 3.18.0

Innovations / improvements PDFmdx Version 3.18.0:

  • Improved text extraction – The text extraction has been improved on the basis of problematic PDFs provided by customers. This also made certain “problematic” PDFs processable, e.g. PDFs in which phantom spaces were inserted for very small text, the “.” or “,” was output in a shifted position, or horizontal dividing lines made of strung-together “_” characters prevented reading in these lines.
  • Processing completed PDF forms – Forms are now “rendered”. The PDF is “flattened” and converted into a normal PDF that can no longer be changed. Thus, the fields can be read out and processed by PDFmdx as with any other PDF.
  • New counter variables for the metadata output – %COUNTER_GLOBAL% – a “global” counter for which a start value can be specified. It is increased each time a data record is output. %COUNTER_LOCAL% – the “local” counter is only incremented for the data records of the current processing and starts again at 1 with each processing job.
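
The difference between the two counters can be illustrated with a small Python sketch. The class `Counters` is hypothetical; it also assumes that the first record receives the configured start value, which the release notes do not state, and it does not model how the global value is persisted between runs.

```python
from typing import Dict


class Counters:
    """Illustrative model of %COUNTER_GLOBAL% and %COUNTER_LOCAL%."""

    def __init__(self, global_start: int = 1) -> None:
        self.next_global = global_start  # continues across processing jobs
        self.next_local = 1              # restarts with every processing job

    def start_job(self) -> None:
        """Reset only the local counter at the start of a processing job."""
        self.next_local = 1

    def next_record(self) -> Dict[str, int]:
        """Return the counter values for one output record, then advance both."""
        values = {
            "%COUNTER_GLOBAL%": self.next_global,
            "%COUNTER_LOCAL%": self.next_local,
        }
        self.next_global += 1
        self.next_local += 1
        return values
```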

  • Performance optimization when using the file name to select the layout directly via the name and not via conditions.
  • PDFmdx Editor – Split Test Function – is now based on the same routine as the real processing. Error messages are displayed directly and correspond to the error messages that are also output during processing.
  • PDFmdx Editor – Show fields / sliding groups on other pages – Usually fields are positioned on a specific page of the sample document. However, fields can also be read out on other or several pages, e.g. when reading out position data via sliding groups / subgroups. In order to be able to check how / where the fields are positioned on other pages, or which values are read out for “floating fields”, there is now a function to show the positioned fields on other pages. This display is automatically deactivated when you switch to another page. With <CTRL> + <Up / Down Arrow> the fields can also be moved vertically; together with the <Shift> key, they move in larger increments. If you click in a displayed field, the field name and the text read out are displayed in the status line.

  • Configuration of the output name – If an illegal character, e.g. a “\”, is used in the field for the PDF output file name, every character that is not allowed is automatically replaced by a “_” during processing.
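
A replacement of this kind can be sketched in a few lines of Python. Which characters PDFmdx actually treats as illegal is not listed here; the sketch assumes the characters that Windows does not allow in file names, and the function name is hypothetical.

```python
import re

# Characters that Windows does not allow in file names (an assumption about
# which characters count as "illegal" here).
ILLEGAL_CHARS = re.compile(r'[\\/:*?"<>|]')


def sanitize_file_name(name: str) -> str:
    """Replace every illegal character with an underscore."""
    return ILLEGAL_CHARS.sub("_", name)
```

In this sketch a configured name such as `Invoice\2021:01.pdf` would be written as `Invoice_2021_01.pdf`.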

  • ZUGFeRD XML via variables – The variables available via PDFmdx can be used to define the ZUGFeRD XML file to be embedded.

  • Combining PDF files – The new “Combine” function allows you to combine additional PDF files with the processed file to form a single PDF.

Combining functions:

    • “Combine” allows you to combine one or more PDFs with the file generated via PDFmdx to form a complete PDF.
    • The order of the files to be merged can be specified via a list.
    • Criteria (AND / OR / NOT) can be used to decide on the basis of texts read from the document whether or not a “merge” should take place.
    • Several different “merge lists” can be created. Such a list can be linked to a specific layout. A layout can be recognized and selected using criteria.
    • In the merge list, you can either use fixed files or paths / names that are dynamically generated during processing using texts read from the document.
  • New implementation of the email function:
    • A new email component has been implemented for sending and processing email.
    • This means that all current encryption protocols (SSL2, SSL3, TLS1, TLS1.1, TLS1.2) are supported. At the beginning of communication with the SMTP email server, the most suitable and supported protocol is automatically negotiated and determined.
    • Configurable maximum size of an email: When sending email, there is an option to combine all emails that are to be sent to the same email address into a single email message. There is now a separate parameter to limit the size of a single combined message, e.g. to 10 MB. If this value is exceeded, the message is divided into several emails so that the configured maximum size is not exceeded.
    • The PDFmdx processor now has its own logging / protocol function for sending emails. This can be activated individually for the successfully sent as well as for the error mail. The storage path can also be configured.
    • The logged emails are saved as EML files in two configurable folders (Success / Failure). An XML file with metadata and error message is also created for an error email. A separate subfolder can be created for each day to store the successfully sent email.
    • Display of the sent / error email in the PDFmdx processor as a list, accessible via 2 buttons. Functions: sorting of the list by date / time, to, subject, error message, open folder, display email message, open file attachment. Delete individual messages, delete all messages.
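
How a combined message might be divided once it exceeds the configured maximum size can be sketched as a simple greedy grouping in Python. This is an assumption about the general approach, not the actual PDFmdx implementation: the function name is hypothetical, only attachment sizes are counted (no message body or encoding overhead), and a single attachment larger than the limit still ends up alone in its own message.

```python
from typing import List, Tuple

Attachment = Tuple[str, int]  # (file name, size in bytes)


def split_by_size(
    attachments: List[Attachment], max_bytes: int
) -> List[List[Attachment]]:
    """Group attachments into messages whose total size stays within max_bytes."""
    messages: List[List[Attachment]] = []
    current: List[Attachment] = []
    current_size = 0
    for attachment in attachments:
        size = attachment[1]
        # Start a new message if adding this file would exceed the limit.
        if current and current_size + size > max_bytes:
            messages.append(current)
            current, current_size = [], 0
        current.append(attachment)
        current_size += size
    if current:
        messages.append(current)
    return messages
```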

 

Corrections PDFmdx version 3.18.0:

  • Read barcodes were not assigned to variables and therefore not written to the metadata file.
  • Text that has been read out was not prepared according to the configuration for variables in different places.
  • Barcodes could not be read in PDF with page rotation – the display page rotation was not taken into account.
  • “Received value” function did not receive the values for the sub-records.
  • MRC-PDF (Mixed Raster Content) could not be processed. A defective PDF was issued.
  • PDFmdx Service Processor – The start at a specified / repeated time did not work.
  • PDFmdx processor – During parallel processing, sometimes it was not possible to write to the metadata XLS (X) because it was still open and therefore blocked by another process.
  • PDFmdx Editor – Import of templates – The overwrite confirmation prompt always appeared, even if no file was going to be overwritten.
  • PDFmdx Editor – First start without templates & layouts, then import of a template – No automatic refresh was performed and the imported templates & layouts were not visible.
  • PDFmdx Editor – When moving / copying a layout from one template to another, no prompt appeared to warn that a layout with the same name already existed.
  • PDFmdx Editor – Deleting a layout with multiple linked conditions. The AND / OR nodes were retained in the editor and were not automatically cleaned up, which caused processing problems.
  • PDFmdx Editor – Images in the HTML body were not inserted into the email when the email was merged to the same recipient and were missing in the sent message.
  • Sending emails – If an error occurred when sending emails under certain conditions, processing was aborted or the service was stopped. For this reason, the sending of emails has been re-implemented and logging of successful and error emails has been added. (see innovations).


PDFmdx Version 3.16.6

New features PDFmdx template editor:

  • Templates / template groups: An additional “group” level has been implemented to keep a large number of templates clear and manageable. This allows templates to be grouped together. In the PDFmdx Editor, the groups form the top level in the tree view. Groups can be created, renamed, deleted, or moved up and down. Templates of a selected group can be exported or imported as a whole or individually. Templates in a group can be moved to a different or a new group.

 

  • E-Mail – Configuration via profiles: Several email configurations can now be created in one template. The e-mail configurations are managed using profiles with names. E-mail profiles can be created, copied, renamed and deleted. A specific email profile can be assigned to the email conditions. This means that not only the e-mail function itself, but also the e-mail profile to be used can be controlled via conditions.

 

  • E-Mail – Use of variables for attachments: Up to now, only fixed, predefined attachments could be sent by email together with the PDFmdx main document. Now it is also possible to use variables for the path and file name of the additional attachments. An option determines whether the email is sent anyway if an attachment does not exist or cannot be found, or whether this is treated as an error. During configuration, a file is first selected and inserted into the list; the entry is then edited and the variables are added.

  • E-Mail – Main document always as the first attachment: When sending an e-mail, additional attachments can be sent in addition to the main document generated via PDFmdx. Now the main document generated by PDFmdx is always inserted as the first attachment; the additional attachments are always inserted after it.
  • Sliding group over 2 pages: With some documents, a data record of a sliding group begins on one page and continues on the next page. Up to now, sliding groups have only been identified via a condition (DG) for the start of the group. The data record is recognized, but cannot be read out completely, as part of it is on the following page. In order to include the information on the following page in such cases, at least two “DG” conditions linked with OR are required: one for the beginning and one for the rest of the record on the next page. This means that the entire data record is recognized and processed, even if it is spread over separate pages.

  • Metadata as XLS: Some downstream applications can only process the old XLS format and not the current XLSX format. That is why the XLS format for the metadata output has now been implemented again.
  • New PDF preview functions – display text layer / display text blocks: A scanned PDF that has been made searchable via OCR consists of both an image layer and a text layer. The image layer only shows the scanned document. From this view it cannot be seen whether, where, or which text is behind it. In the PDFmdx Editor preview there is now the option of showing only the text layer and hiding the image layer, as well as the option of showing the boundaries of the text blocks as green rectangles in each of these displays. This makes it easy and quick to see which text is available and where exactly the text blocks are located.

 

  • Fields / Variables – Replacement of values via XLSX table: Previously, only a CSV file could be used to replace the values of fields / variables. Now an XLSX file can also be used. The name of the column for the search key and the column for the replacement value are configured.

  • Counter as a stamp variable: There is now also a counter variable for text or barcode stamps. An option can be used to configure whether the counter should be increased when it is used several times within the document or only with each new document. The start value can also be specified.

 

  • Visual PDF signature – support for transparent image files: With the current version, transparent image files (PNG, TIFF) can also be used for the visual representation of the PDF signature.
  • Export templates as individual PMDX files: Previously, several or all templates could only be exported together as a single PMDX file. To export several or specific templates as individual files, the export had to be done separately for each template. Now, via an option of a template group, each template contained in the group can also be exported as a separate PMDX file. The template name is used as the file name.

 

Innovations PDFmdx processing:

  • Recognition and export of a ZUGFeRD XML file contained in the PDF: This option can be activated for the PDFmdx job. Each PDF file to be processed is checked in advance to determine whether it contains a ZUGFeRD XML file. If such an XML is contained in the PDF as an attachment, the XML is extracted from the PDF and saved in the configured folder under the PDF name with the extension *.XML. If the option “Move PDF” has been activated, the PDF is also moved to this folder and is not processed by PDFmdx.

  • Activate all layouts automatically: If a new layout is added to a template, this layout must also be activated for the job in the PDFmdx processor. Up until now, this had to be done manually and was sometimes “forgotten”, so that the newly created layout was not recognized. For a template as well as for the job configuration there is now an option to activate all layouts automatically. This means that all newly added layouts are always automatically activated and can no longer be “forgotten”.

 

  • Error log with header: To make the content of the individual columns of the error log (Error.csv) easier to interpret, a header line with the column names has been added.

 

PDFmdx corrections:

  • E-Mail HTML Body – File names for images can now contain spaces.
  • Anchor field not found – Linked fields now remain empty and are not assigned any values.
  • Fuzzy anchor search function has been improved.
  • Error message – XLSX blocked: Over time, an XLSX metadata file can become very large. It has happened that opening and writing this XLSX took longer and several processes wanted to access this file. This led to error messages, access was not possible because the XLSX was still blocked by another process. This problem has been resolved. Nevertheless, care should be taken to ensure that such metadata files do not become unnecessarily large.
  • Sliding group – Recognition was not possible with PDF with “Display Rotation”.


PDFmdx Version 3.13.2

New features PDFmdx template editor:

  • Split templates: Templates can contain a large number of layouts and conditions over time. E.g. if a template contains several hundred layouts, the processing of these becomes slow and confusing. For this it is advisable to use several templates and to divide the layouts. It makes no difference in terms of processing. There is now a separate function for dividing up existing templates. The number of layouts per template is specified. The PDFmdx Editor then automatically generates copies of templates and divides the layouts and the associated conditions. After that, each template only contains the specified number of layouts.

 

  • Copy / move layouts: Together with the “Split templates” function, the function for copying or moving layouts to one or more other templates has been expanded and implemented. So far it was only possible to copy one layout into another template. The layouts (one or more) and the target templates (one or more) can now be selected in a template. It is also possible to move the selected layouts and not just copy them. The fields and their position in the layout are retained or added to the target template.

  • Read out partial areas of fields: When reading out field contents, there is now a new function to get a specific part of a text in a targeted and simple manner, e.g. if a single field contains all information, separated by a “\” separator, such as “XKEY GmbH \ Gerstlgasse30 \ 1210 \ Vienna”. With the newly implemented Regex function, specifying “#SPLIT#\” plus the position in the string configures which part is read out and used for the assignment of the variables. For example, the zip code 1210 can be read out by specifying “#SPLIT#\” + “3”.
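
The effect of the split function on the example can be reproduced with a few lines of Python. The sketch assumes that the position is counted from 1, which matches the example (position 3 yields 1210), and that surrounding spaces are trimmed; the function name `split_part` is hypothetical.

```python
def split_part(text: str, separator: str, position: int) -> str:
    """Return the part at the 1-based `position` after splitting on `separator`."""
    parts = [part.strip() for part in text.split(separator)]
    if not 1 <= position <= len(parts):
        raise IndexError(f"position {position} is outside 1..{len(parts)}")
    return parts[position - 1]


address = "XKEY GmbH \\ Gerstlgasse30 \\ 1210 \\ Vienna"
zip_code = split_part(address, "\\", 3)  # "1210"
```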

      

  • Configurable page limit for processing: The information required for processing is often only found on the first page or the first few pages. In order to speed up the processing of very extensive PDF documents, which can contain several hundred pages, a page limit (e.g. 2) can now be set for the template. This defines that only the specified number of pages, and not all pages of the document, are read in and processed.

  • Condition editor
    • Individual conditions, substructures but also the entire condition tree can be copied from one template to another using the clipboard.
    • Separator lines can be copied / cut and pasted at any point in the structure.
    • (M)emory function to use an existing condition as the default for all newly added conditions. M sets the currently selected condition as the default, C deletes this default again. This preset is template-specific and is saved and restored with the template.
    • Checkbox to determine whether a new condition to be created is inserted at the beginning or at the end of the current level in the tree structure. Previously, a new condition was always inserted at the beginning of a node level, which meant that with large tree structures it was always necessary to move the condition downwards to get back to the starting point.

   

  • Do not use letterhead when printing / sending emails: If documents are output on a printer or sent as email, it may be necessary not to use the letterhead for printing because the printer already contains letterhead, but the email should be sent with letterhead. This option can be specifically controlled.

  • Combine individual documents in a sorted manner: In order not to have to pay attention to a certain order / sorting when entering documents, or to combine individual documents into an overall document sorted by a field content that has been read out, 2 new functions, “Append” and “Insert sorted”, have been added to the output configuration.
    • “Append” – If a PDF with the same name is found during the output, the new file is appended to the end of the existing file.
    • “Insert sorted” – During the configuration, a field must be selected according to which the sorting should take place. A PDF bookmark is created with the text of the selected sort field. If a file with the same name is found in a subsequent output, the new document is inserted or appended in the correct place in the PDF using the sort field. Documents with “empty” or identical sort field content are added at the end.
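
The placement logic of “Insert sorted” can be illustrated with Python's `bisect` module. The sketch works on a plain list of sort-field texts (one per document already in the combined PDF) rather than on real PDF bookmarks. It assumes plain string comparison, that documents with an empty sort field sit at the very end, and that a document with an already existing key is placed after its equals; the function name is hypothetical.

```python
import bisect
from typing import List


def insert_sorted(sort_keys: List[str], new_key: str) -> int:
    """Insert `new_key` into `sort_keys` and return the index it was placed at.

    `sort_keys` holds one entry per document already in the combined PDF:
    non-empty keys in ascending order, followed by any empty keys.
    """
    if not new_key:
        sort_keys.append(new_key)  # empty sort field: append at the end
        return len(sort_keys) - 1
    # Only the leading non-empty keys take part in the ordered search.
    ordered = sum(1 for key in sort_keys if key)
    # bisect_right places equal keys after the existing ones.
    index = bisect.bisect_right(sort_keys, new_key, 0, ordered)
    sort_keys.insert(index, new_key)
    return index
```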

 

  • PDF display rotation is taken into account: PDF files can contain a “display rotation” parameter (0, 90, 180, 270). For example, the display of documents that were scanned rotated can be corrected using the “Display Rotation” parameter so that the pages are always displayed in portrait format. However, the parameter only affects the display on the screen; internally, the PDF data structure is still rotated (e.g. upside down). The current version of PDFmdx recognizes the PDF “Display Rotation” and takes it into account so that the display corresponds to the processing and the fields are read from the correct positions.
  • EasyArchiv (IMP) Export Format: The EasyArchiv IMP metadata output format is a kind of CSV format. It contains an individually configurable header line as well as subsequent lines with the PDFmdx metadata. The field delimiter is “^” and the field separator is “,” (comma). As with all other formats, fields / variables are available for selection for the subsequent lines.

Example of an IMP header: @ FOLDER, FT: B2B_Netz, FN: Partner, FN: B2BMessageID, FN: MailMessageID, FN: RefNr, FN: Sender, FN: B2BSystem, BI: 2001
FT: = document type, FN: = field names. This line must be configured individually according to the archive into which it is to be imported.

  • OCR reliability in the OCR area: PDFmdx also offers the option of determining the text from the image using an OCR function for the positioned fields. Up to now, the OCR reliability was fixed internally at 60%. This threshold value can now be configured and is also output as information in the preview for the text output in the footer of the PDFmdx editor.

 

Innovations PDFmdx processing:

  • Job trigger function: The start of processing of one or more jobs can be triggered by the end of processing of another job. All jobs that are started via a trigger must be deactivated in the job list, otherwise processing is triggered via the monitored folder and not via the trigger. With these jobs, the start of processing is only triggered by the trigger of another job. Sorted processing can also be ensured by a trigger: the next processing step is only started after all files of the previous job have been processed.

  • Sorted processing of incoming files: Files in monitored folders can now also be processed in a sorted (ascending / descending) manner according to name, size, creation date, or modification date. To do this, the “Block processing” option must be activated. The start of sorted processing requires a defined point in time. It can be triggered by the interval timer, at a set time, by an *.rd file, by the trigger of another job, or by pressing “Start processing”.

  • Error email address per job: An individual error email address can be specified for each job. This overrides the error email address that is generally defined in the PDFmdx processor and applies to all jobs.

  • Parallel processing: Parallel processing takes place at the job level, i.e. several jobs can run in parallel, but several documents within one job are not processed in parallel.


PDFmdx Version 3.11.1

Innovations PDFmdx Version 3.11.1:

  • PDF password protection via variables: More and more PDF documents are to be sent by email. However, these often contain information that should not be sent unprotected by email, e.g. documents from payroll accounting. A password agreed individually with the recipient must be used for such applications. With the new PDFmdx version, variables, and thus values that are read from the document, can now also be used for the encryption and protection of the PDF. Thanks to the new function for using an external CSV file as a replacement table, password lists can also be kept externally, e.g. to use a customer number from the document as a key for the password assignment.
  • AES 256 – Encryption: By updating the PDFSecureSign component, it is now possible to protect the PDF files even better with AES 256 encryption.

  • CSV Lookup file: In the field definition there is a function to replace read values. Previously, the values had to be recorded and managed using the PDFmdx editor. Now it is also possible to use an external CSV file as a replacement table, with entries of the form <Text>; <Replace with>, e.g. for email addresses or password lists, where, for example, the customer number is used as the key. The CSV lookup is added to the replacement table configured via the editor and therefore has a lower priority.
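
The lookup and its priority can be sketched in Python as follows. The sketch assumes one `<Text>; <Replace with>` pair per line with surrounding spaces trimmed, and that a value without a match is left unchanged; the function names and the sample data are hypothetical.

```python
import csv
import io
from typing import Dict


def load_csv_lookup(csv_text: str) -> Dict[str, str]:
    """Parse lines of the form `<Text>; <Replace with>` into a dictionary."""
    table: Dict[str, str] = {}
    for row in csv.reader(io.StringIO(csv_text), delimiter=";"):
        if len(row) >= 2:
            table[row[0].strip()] = row[1].strip()
    return table


def replace_value(
    value: str, editor_table: Dict[str, str], csv_table: Dict[str, str]
) -> str:
    """Look up `value`, preferring the editor table over the CSV lookup."""
    if value in editor_table:
        return editor_table[value]
    return csv_table.get(value, value)  # unchanged if no entry matches
```

Checking the editor table first reflects the lower priority of the CSV lookup described above.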

  • ZUGFeRD XML Extraction: If a PDF file to be processed already contains ZUGFeRD XML, there is now a function to recognize this XML and export it to a configurable folder. The name of the XML is generated based on the PDF file created via PDFmdx.

  • Email sender via variables: In the email configuration, an individual sender address could previously only be specified for each template. Now it is also possible to assign the sender address for each document / email individually using variables.

  • Conditions for email and printing: So far, the sending of emails and the printing of a template could only be activated or deactivated. Now it is also possible to control the sending of emails and printouts individually per document via conditions. A condition editor is available for this, as for recognizing / splitting the documents.

  • Bug fix: Configuration changes in the PDFmdx editor were not saved.


PDFmdx Version 3.9.0

Innovations of PDFmdx Version 3.9.0:

  • Manual backup of templates: Function to trigger the backup of all or individual templates manually. The backup files (*.pmdx) are stored in a configurable folder and differentiated by their name, consisting of template / date / time.

 

  • Via conditions, move files to another folder: This means that files that are not to be processed can be sorted out via conditions before the actual PDFmdx processing or “redirected” to another processing folder. A normal template with conditions is used for this. If a condition is met, the file is moved from the incoming folder directly to the configured target folder without further PDFmdx processing.

  • The PDF2PrinterPrint integration has been revised and expanded: The variables for the control are selected directly via the fields of a template. In addition to the printer, the paper tray can now also be selected via the document content.

  • Variables for the file name of the metadata file (CSV, XLS …): Instead of a fixed name, all available variables can now be used.

  • Vertical dynamic fields can now be used not only for group but also for subgroup fields.

  • Field position is retained and can be restored: If a different PDF template file was selected for a layout, the positioning / size of the fields on the template could “get lost”, e.g. if the new PDF had fewer pages or fields were removed. The field position and size were not retained, so the fields had to be repositioned and the readout area redefined. Now this information is saved in the data structure; even if the field is no longer positioned on a page, the field position and size can be restored via “Add area”.

 

  • Store file / folder link in XLS: A new option allows you to store a link in the XLS output file under the columns for “%OUTPATH%”, “%OUTFILENAME%” and “%OUTFOLDER%”. This means that the PDF file or folder can be opened directly by clicking on the cell in the XLS.

  • Update of the PDFCompressor, PDFSign, PDF2PDFA basic components to the current status.
  • The final processing for PDF/A conversion, PDF compression and PDF signature is now no longer in the output folder, but in a temporary folder. The PDF file is only moved to the final destination folder after all processing steps have been completed.
  • Bug fixes: PDFmdx Editor: the email settings (embed all images, send email, email filter options) were not saved; no text could be read from PDF files created via iPaper; the metadata file was not generated if no fields were positioned on the layout.


PDFmdx version 3.8.1 – DataMatrix 2D barcode for Pitney Bowes Relay Inserting System

New features of PDFmdx version 3.8.1:

  • Pitney Bowes DataMatrix barcode:

With the Pitney Bowes Relay inserting system it is possible to automatically insert letters or invoices into envelopes. The inserter system has a camera that recognizes a DataMatrix 2D barcode applied to the page, reads it out, and uses the barcode to control the inserter. The 2D barcode must have a certain structure. It contains a 14-digit identifier of the document, e.g. the invoice number, the page number in the document, the number of pages of the document, and at the end a counter that must be continuous throughout the document. With this code, the inserter can recognize when a new letter begins and also determine whether a sheet is missing or not in the correct order in the stack.

The stamp variable definition now has its own “Pitney Bowes” checkbox to create such a predefined structure and apply it to the individual pages as a DataMatrix 2D barcode.

The input as well as the output files are processed and output sorted by file name, e.g. by the invoice number read from the invoice file. The Merge2Print command line application can then be used to create a sorted overall PDF file for the printout. However, because the entire process must be sorted, only the executable EXE processor of PDFmdx, but not the PDFmdx service, can be used. In addition, “block processing” must be activated.

 

Download – 2D barcode specification – Pitney Bowes Relay >>>

  • Start processing via *.rd file:

Previously, PDFmdx processing (executable EXE application or Windows service) could be started either by a timer (Timer, Date, Daily, Weekly) or by inserting PDF files into a monitored folder. However, there are applications in which it is important that all files are present in the input folder first and that sorted processing only starts then. There is now the *.rd option. If this option is activated, processing starts only when a *.rd file, e.g. “Ready.rd”, is copied to the monitored folder. This allows the processing to be started in a controlled manner at the desired time.
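
The idea of holding back processing until a *.rd file appears can be illustrated with a short Python sketch. It only models the check itself: the function name is hypothetical, and whether PDFmdx removes the trigger file after processing is not stated here and is not modelled.

```python
from pathlib import Path
from typing import List


def files_to_process(folder: Path) -> List[str]:
    """Return PDF names sorted by file name, but only once a *.rd file exists."""
    if not any(folder.glob("*.rd")):
        return []  # trigger file not present yet: keep waiting
    return sorted(path.name for path in folder.glob("*.pdf"))
```

An upstream application would first copy all PDF files into the folder and write a file such as `Ready.rd` last, so that the sorted list is complete when processing starts.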

 

Info: As of PDFmdx version 3.8.0 .NET Runtime version 4.5 is required.


PDFmdx version 3.7.4

New features PDFmdx Editor version 3.7.4:

  • Automatic backup of templates at startup: The function can be activated and the path for the backups configured. Backups are marked with date and time and replaced by rotation.

  • Search function for conditions: Forward/backward search, full-text search in the conditions. Using the context menu, the layout associated with the condition can be called up and opened directly.

 

  • Comment / separator lines in the condition editor can be deleted or moved up/down.

  • Warning for empty condition nodes: Empty condition nodes can lead to unpredictable results during processing. These are now recognized in the condition editor. A warning is displayed to perform a cleanup.

 

  • NOT for conditions: The logic of a condition can now be reversed.

  • Extensible fields: In a moving group, not every record has the same number of rows, so a field with a fixed vertical size may capture either too many rows or not all of them. With this option, the field can be defined vertically smaller, and all subsequent lines up to the next record are captured in the field. The character inserted at the end of each merged line is configurable (space, semicolon, comma).

  • Align the field position and set the optimal size: For capturing records of a moving group / subgroup, it is important that the fields are all at a roughly similar vertical position and have the correct vertical size. The size is optimal if the field just barely encloses the text area vertically, neither larger nor smaller. Setting this manually can be difficult when lines are narrow. A new automatic function aligns the fields vertically and sets them to the optimal size.

  • Invert area before OCR detection: OCR only works with dark text on a light background. For light text on a dark background, the area must be inverted before OCR recognition. There is now a special image processing function that can be activated for a field and is executed before the integrated OCR recognition.

  • Always run OCR: A PDF does not always have the correct text in its text layer, for example if the document contains inverted areas with white text on a black background. If “SmartOCR” processing is enabled, an area OCR is only executed if there is no text in the area. Individual areas can now be configured so that OCR is always executed despite existing text, e.g. to invert the area beforehand and get a usable result.

 

  • Compound fields: You can now also create fields that are composed of other fields and texts. These fields can be used for the output.

  • Default values for fields can be assigned based on the layout and not just globally.

 

  • Numeric fields can also accept negative values.
  • Create a new template from an existing template without the layouts it contains.

  • Transfer settings of a template to other templates: Selection of the settings tabs of the source template as well as selection of the target templates.

  • Export record filter: Conditions can be used to filter the data record export. Records that meet one of the defined conditions are filtered out and not exported. Filtered records are marked in red in the test function. Conditions can be built from text strings, substrings, regular expressions or “empty” checks on fields, layouts and the selection level (document, group, subgroup), and combined with AND/OR or NOT.
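The logic of the export record filter can be sketched in a few lines. This is a simplified model, not the PDFmdx implementation: records are assumed to be plain dictionaries of field values, and all function names (`equals`, `contains`, `matches`, `is_empty`, `negate`, `all_of`, `any_of`, `export_records`) are hypothetical.

```python
import re


def equals(field, text):
    return lambda record: record.get(field, "") == text


def contains(field, substring):
    return lambda record: substring in record.get(field, "")


def matches(field, pattern):
    regex = re.compile(pattern)
    return lambda record: regex.search(record.get(field, "")) is not None


def is_empty(field):
    return lambda record: record.get(field, "").strip() == ""


def negate(condition):  # NOT
    return lambda record: not condition(record)


def all_of(*conditions):  # AND
    return lambda record: all(c(record) for c in conditions)


def any_of(*conditions):  # OR
    return lambda record: any(c(record) for c in conditions)


def export_records(records, filters):
    """Keep only records that meet none of the filter conditions."""
    return [r for r in records if not any(f(r) for f in filters)]
```

For example, `export_records(records, [is_empty("article"), contains("article", "FREIGHT")])` would drop records without an article number as well as freight lines.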

New features PDFmdx Processor version 3.7.4:

  • Call a command line application: After all documents from the input area of a job have been processed, a command line application can be called. For example, pdfFM can be called to merge files with the same name from multiple folders into a single PDF. If processing takes place via the PDFmdx Windows service, the command line application must not display a dialog and must run in “silent” mode.

 

  • Locked files are detected and not processed: If a file to be processed is locked, it can neither be processed nor moved to an error folder. Such files are marked with a *.lock file and are not processed further. To process such a file later, only the *.lock file has to be deleted.

  • Output – repetition: If an output device (share / network drive) is not immediately available or responds too slowly, the waiting time and the number of repetitions can now be configured. Only after these are exhausted does processing treat the situation as an error and stop.
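The output repetition described above follows the usual retry pattern. A minimal sketch, assuming the output is a folder and the two settings map to the parameters `repetitions` and `wait_seconds` (both names, like `write_with_retry`, are hypothetical and not PDFmdx options):

```python
import shutil
import time
from pathlib import Path


def write_with_retry(source, target_dir, repetitions=3, wait_seconds=5.0,
                     sleep=time.sleep):
    """Copy `source` into `target_dir`, retrying while the target is unavailable.

    One initial attempt plus up to `repetitions` retries are made, with
    `wait_seconds` between attempts. The last error is raised if all fail.
    """
    target = Path(target_dir)
    last_error = None
    for attempt in range(repetitions + 1):
        try:
            if not target.is_dir():
                raise FileNotFoundError(f"output folder not available: {target}")
            return Path(shutil.copy2(source, target))
        except OSError as exc:
            last_error = exc
            if attempt < repetitions:
                sleep(wait_seconds)
    raise last_error
```

The `sleep` parameter exists only so the waiting can be replaced in tests. In normal use the default `time.sleep` applies.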

Download – PDFmdx Template Editor & Processor >>>

PDFmdx – Two-level reading of position data – Product video

In some industries, documents contain position data with an additional level. Such two-level position data occurs, e.g., for textiles or clothing, where an article (number, description) can also have a “sub-level” with sizes or color specifications. The article itself is listed only once, and the level below contains the quantities/prices of the individual variants.
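As an illustration of what two-level position data looks like once it has been read, the structure can be modelled as an article with a list of sub-level entries. The class and field names below are assumptions chosen for this sketch and do not reflect the PDFmdx export format.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class Variant:
    """Sub-level entry: one size/color combination of an article."""
    size: str
    color: str
    quantity: int
    unit_price: Decimal


@dataclass
class Article:
    """Top-level position: listed once, with its variants below."""
    number: str
    description: str
    variants: list = field(default_factory=list)

    def total(self):
        return sum((v.quantity * v.unit_price for v in self.variants), Decimal("0"))


def flatten(articles):
    """Return one flat row per variant, repeating the article data."""
    return [
        (a.number, a.description, v.size, v.color, v.quantity, v.unit_price)
        for a in articles
        for v in a.variants
    ]
```

Flattening is shown because downstream systems often expect one row per quantity/price line rather than a nested structure.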

PDFmdx can also recognize and read two-level position data. The following video shows how to do it:

Download – PDFmdx Template Editor & Processor >>>
