PDFmdx Version 3.18.0

Innovations / improvements PDFmdx Version 3.18.0:

  • Improved text extraction – Text extraction has been improved on the basis of problematic PDFs provided by customers. Certain PDFs that previously caused problems can now be processed, e.g. PDFs in which phantom spaces were inserted for very small text, in which “.” or “,” was output in a shifted position, or in which horizontal dividing lines made of strung-together “_” characters prevented the affected lines from being read.
  • Processing completed PDF forms – Filled-in forms are now “rendered”: the PDF is “flattened” and converted into a normal PDF that can no longer be changed. The field contents can thus be read and processed by PDFmdx as with any other PDF.
  • New counter variables for the metadata output:
    • %COUNTER_GLOBAL% – A “global” counter for which a start value can be specified. It is incremented each time a data record is output.
    • %COUNTER_LOCAL% – A “local” counter that is incremented only for the data records of the current processing run and starts again at 1 with each processing job.
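
The difference between the two counters can be illustrated with a small Python sketch. This is not PDFmdx code: the class RecordCounters and the helper expand are hypothetical names, and it is an assumption that the first record output receives the configured start value.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class RecordCounters:
    """Hypothetical model of the two counters (not the PDFmdx implementation)."""

    global_next: int = 1  # configurable start value, keeps counting across jobs
    local_next: int = 1   # restarts at 1 for every processing job

    def start_job(self) -> None:
        # Only the local counter is reset when a new processing job begins.
        self.local_next = 1

    def next_record(self) -> Dict[str, int]:
        # Return the values for the current record, then advance both counters.
        values = {
            "%COUNTER_GLOBAL%": self.global_next,
            "%COUNTER_LOCAL%": self.local_next,
        }
        self.global_next += 1
        self.local_next += 1
        return values


def expand(template: str, values: Dict[str, int]) -> str:
    """Replace each counter variable in a metadata template with its value."""
    for name, value in values.items():
        template = template.replace(name, str(value))
    return template
```

With a start value of 100, two records in the first job would yield 100/1 and 101/2; after start_job() the next record would yield 102/1.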

  • Performance optimization when the layout is selected directly via the file name rather than via conditions.
  • PDFmdx Editor – Split test function – The test is now based on the same routine as the actual processing. Error messages are displayed directly and match the error messages output during processing.
  • PDFmdx Editor – Show fields / sliding groups on other pages – Usually fields are positioned on a specific page of the sample document. However, fields can also be read out on other pages or on several pages, e.g. when reading out position data via sliding groups / subgroups. To check how and where the fields are positioned on other pages, or which values are read out for “floating fields”, there is now a function to show the positioned fields on other pages. This display is automatically deactivated when you switch to another page. With <CTRL> + <Up / Down Arrow> the fields can also be moved vertically; holding the <Shift> key as well moves them in larger increments. If you click in a displayed field, the field name and the text read out are displayed in the status line.

  • Configuration of the output name – If an illegal character, e.g. “\”, is used in the field for the PDF output file name, it is automatically replaced by “_” during processing.
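
A minimal Python sketch of this kind of replacement follows. The exact set of characters PDFmdx treats as illegal is not stated here; the sketch assumes the characters that Windows does not allow in file names, and the function name sanitize_output_name is hypothetical.

```python
import re

# Assumption: the characters Windows forbids in file names: \ / : * ? " < > |
ILLEGAL_CHARS = re.compile(r'[\\/:*?"<>|]')


def sanitize_output_name(name: str) -> str:
    """Replace every character that is not allowed in a file name with "_"."""
    return ILLEGAL_CHARS.sub("_", name)
```

For example, sanitize_output_name("Invoice\\2020:01.pdf") would give "Invoice_2020_01.pdf".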

  • ZUGFeRD XML via variables – The variables available via PDFmdx can be used to define the ZUGFeRD XML file to be embedded.

  • Combining PDF files – The new “Combine” function allows you to combine additional PDF files with the processed file to form a single PDF.

Combining functions:

    • “Combine” allows you to combine one or more PDFs with the file generated via PDFmdx to form a complete PDF.
    • The order of the files to be merged can be specified via a list.
    • Criteria (AND / OR / NOT) can be used to decide on the basis of texts read from the document whether or not a “merge” should take place.
    • Several different “merge lists” can be created. Such a list can be linked to a specific layout. A layout can be recognized and selected using criteria.
    • In the merge list, you can either use fixed files or paths / names that are dynamically generated during processing using texts read from the document.
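
The interplay of criteria, file order and dynamically generated paths can be sketched in Python as follows. This is an illustration only: MergeList, contains, all_of, any_of, negate, the PROCESSED placeholder and the "{FieldName}" path syntax are all hypothetical and do not reflect the actual PDFmdx configuration format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Fields = Dict[str, str]                # texts read from the document, by field name
Criterion = Callable[[Fields], bool]   # decides whether a merge should take place

PROCESSED = "<processed>"              # hypothetical placeholder for the PDFmdx output file


def contains(field: str, text: str) -> Criterion:
    return lambda fields: text in fields.get(field, "")


def all_of(*criteria: Criterion) -> Criterion:   # AND
    return lambda fields: all(c(fields) for c in criteria)


def any_of(*criteria: Criterion) -> Criterion:   # OR
    return lambda fields: any(c(fields) for c in criteria)


def negate(criterion: Criterion) -> Criterion:   # NOT
    return lambda fields: not criterion(fields)


@dataclass
class MergeList:
    criterion: Criterion
    entries: List[str]  # fixed paths or templates such as "terms/{Country}.pdf"

    def resolve(self, processed_pdf: str, fields: Fields) -> List[str]:
        """Return the ordered list of files that would form the combined PDF."""
        if not self.criterion(fields):
            return [processed_pdf]  # criteria not met: no merge takes place
        return [
            processed_pdf if entry == PROCESSED else entry.format_map(fields)
            for entry in self.entries
        ]
```

A list with the entries "cover.pdf", PROCESSED and "terms/{Country}.pdf" would, for a document whose Country field reads "DE", resolve to cover.pdf, the processed file and terms/DE.pdf, in that order.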
  • New implementation of the email function:
    • A new email component has been implemented for sending and processing email.
    • As a result, all current encryption protocols (SSL2, SSL3, TLS1, TLS1.1, TLS1.2) are supported. At the beginning of communication with the SMTP email server, the most suitable supported protocol is negotiated automatically.
    • Configurable maximum email size: When sending email, there is an option to combine all emails addressed to the same email address into a single message. A separate parameter now limits the size of such a combined message, e.g. to 10 MB. If the limit is exceeded, the message is divided into several emails so that none exceeds the configured maximum size.
    • The PDFmdx processor now has its own logging function for sending emails. It can be activated separately for successfully sent emails and for failed emails. The storage path can also be configured.
    • The logged emails are saved as EML files in two configurable folders (Success / Failure). For a failed email, an XML file with metadata and the error message is also created. A separate subfolder can be created for each day to store the successfully sent emails.
    • The sent and failed emails are displayed in the PDFmdx processor as lists, accessible via two buttons. Functions: sort the list by date / time, recipient, subject or error message; open folder; display email message; open file attachment; delete individual messages; delete all messages.
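
The size limit for combined messages described above can be pictured as a simple greedy split. The following Python sketch is an assumption about how such a split could work, not the PDFmdx implementation; the function name split_into_messages is hypothetical, and real message sizes would also include encoding overhead, which is ignored here.

```python
from typing import List, Tuple

Attachment = Tuple[str, int]  # (file name, size in bytes)


def split_into_messages(
    attachments: List[Attachment], max_bytes: int
) -> List[List[Attachment]]:
    """Group attachments for one recipient into messages of at most max_bytes.

    Attachments keep their original order. A single attachment larger than
    max_bytes is placed in a message of its own (assumption: it is still sent).
    """
    messages: List[List[Attachment]] = []
    current: List[Attachment] = []
    current_size = 0
    for attachment in attachments:
        _, size = attachment
        # Start a new message if adding this attachment would exceed the limit.
        if current and current_size + size > max_bytes:
            messages.append(current)
            current, current_size = [], 0
        current.append(attachment)
        current_size += size
    if current:
        messages.append(current)
    return messages
```

With a 10 MB limit, attachments of 4 MB, 5 MB and 3 MB for the same recipient would be sent as two emails: the first with the 4 MB and 5 MB files, the second with the 3 MB file.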


Corrections PDFmdx version 3.18.0:

  • Read barcodes were not assigned to variables and therefore not written to the metadata file.
  • In several places, text that had been read out was not prepared according to the configuration for variables.
  • Barcodes could not be read in PDFs with page rotation – the display page rotation was not taken into account.
  • “Received value” function did not receive the values for the sub-records.
  • MRC-PDF (Mixed Raster Content) could not be processed. A defective PDF was output.
  • PDFmdx Service Processor – The start at a specified / repeated time did not work.
  • PDFmdx processor – During parallel processing, it was sometimes not possible to write to the metadata XLS(X) file because it was still open and therefore locked by another process.
  • PDFmdx Editor – Import of templates – The overwrite prompt always appeared, even if no file would be overwritten.
  • PDFmdx Editor – First start without template & layout, followed by the import of a template – No automatic refresh was performed, and the imported templates & layouts were not visible.
  • PDFmdx Editor – When moving / copying a layout from one template to another, no prompt appeared to indicate that a layout with the same name already existed.
  • PDFmdx Editor – Deleting a layout with multiple linked conditions. The AND / OR nodes were retained in the editor and were not automatically cleaned up, which caused processing problems.
  • PDFmdx Editor – Images in the HTML body were not inserted into the email when the email was merged to the same recipient and were missing in the sent message.
  • Sending emails – If an error occurred when sending emails under certain conditions, processing was aborted or the service was stopped. For this reason, the sending of emails has been re-implemented, and logging of successful and failed emails has been added (see “New implementation of the email function” above).
