Category: PDFmdx

PDFmdx Version 3.11.1

Innovations PDFmdx Version 3.11.1:

  • PDF password protection via variables: More and more PDF documents are sent by email. However, they often contain information that should not be sent unprotected, e.g. documents from payroll accounting. Such applications require a password agreed individually with the recipient. With the new PDFmdx version, variables, and thus values read from the document, can now also be used for the encryption and protection of the PDF. Thanks to the new function for using external CSV files as replacement tables, password lists can also be kept externally, e.g. to use a customer number from the document as the key for password assignment.
  • AES 256 encryption: By updating the PDFSecureSign component, it is now possible to protect PDF files even better with AES 256 encryption.

  • CSV lookup file: The field definition has a function to replace read values. Previously, the values had to be recorded and managed in the PDFmdx editor. Now an external CSV file can also be used as a replacement table, with entries in the form <Text>; <Replace with>, e.g. for email addresses or password lists in which the customer number serves as the key. The CSV lookup is appended to the replacement table configured in the editor and therefore has lower priority.

  • ZUGFeRD XML extraction: If a PDF file to be processed already contains ZUGFeRD XML, there is now a function to recognize this XML and export it to a configurable folder. The name of the XML file is derived from the PDF file created via PDFmdx.

  • Email sender via variables: Previously, the email configuration allowed only one individual sender address per template. Now the sender address can also be assigned individually for each document / email using variables.

  • Conditions for email and printing: Previously, sending emails and printing could only be activated or deactivated for a whole template. Now both can also be controlled individually per document via conditions. A condition editor is available for this, like the one used for recognizing / splitting documents.

  • Bug fix: Configuration changes in the PDFmdx editor were not saved.
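
The interplay between the editor's replacement table and the external CSV lookup can be pictured with a short Python sketch. It is not PDFmdx code: the function names and sample values are hypothetical, and it assumes one `<Text>; <Replace with>` pair per line without a header row, and that a value without a matching entry stays unchanged.

```python
import csv
import io


def load_csv_lookup(csv_text):
    """Parse semicolon-separated '<Text>; <Replace with>' lines into a dict."""
    table = {}
    for row in csv.reader(io.StringIO(csv_text), delimiter=";"):
        if len(row) < 2:
            continue  # skip blank or incomplete lines
        table[row[0].strip()] = row[1].strip()
    return table


def replace_value(value, editor_table, csv_table):
    """Editor entries win; the CSV lookup is consulted only afterwards."""
    if value in editor_table:
        return editor_table[value]
    return csv_table.get(value, value)  # assumption: unchanged if nothing matches


# Hypothetical data: customer number -> individually agreed PDF password.
csv_text = "10023; s3cret-A\n10024; s3cret-B\n"
editor_table = {"10024": "editor-wins"}
csv_table = load_csv_lookup(csv_text)

print(replace_value("10023", editor_table, csv_table))  # found in the CSV only
print(replace_value("10024", editor_table, csv_table))  # editor entry has priority
print(replace_value("99999", editor_table, csv_table))  # no entry at all
```

Keeping the password list in a file outside the template means it can be maintained without opening the editor, while entries made in the editor still take precedence.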

Download – PDFmdx Template Editor & Processor >>>

PDFmdx Version 3.9.0

Innovations of PDFmdx Version 3.9.0:

  • Manual backup of templates: A function to trigger the backup of all or individual templates manually. The backup files (*.pmdx) are stored in a configurable folder and distinguished by a name consisting of template / date / time.

 

  • Move files to another folder via conditions: Files that are not to be processed can be sorted out via conditions before the actual PDFmdx processing or “redirected” to another processing folder. A normal template with conditions is used for this. If a condition is met, the file is moved from the incoming folder directly to the configured target folder without further PDFmdx processing.

  • The PDF2PrinterPrint integration has been revised and expanded: The variables for the control are now selected directly via the fields of a template. In addition to the printer, the paper tray can now also be selected via the document content.

  • Variables for the file name of the metadata file (CSV, XLS …): Instead of a fixed name, all available variables can now be used.

  • Vertical dynamic fields can now be used not only for group but also for subgroup fields.

  • Field position is retained and can be restored: If a different PDF template file was selected for a layout, the position / size of the fields on the template could “get lost”, e.g. if the new PDF had fewer pages or fields were removed. The field position and size were not retained, so the fields had to be repositioned and the readout area redefined. Now this information is saved in the data structure. Even if a field is no longer positioned on a page, its position and size can be restored via “Add area”.

 

  • Store file / folder link in XLS: A new option allows you to store a link in the XLS output file under the columns for “%OUTPATH%”, “%OUTFILENAME%” and “%OUTFOLDER%”. This means that the PDF file or folder can be opened directly by clicking on the cell in the XLS.

  • The PDFCompressor, PDFSign and PDF2PDFA basic components have been updated to the current status.
  • The final processing for PDF/A conversion, PDF compression and PDF signature now takes place in a temporary folder instead of the output folder. The PDF file is only moved to the final destination folder after all processing steps have been completed.
  • Bug fixes: In the PDFmdx editor, the email settings for embedding all images, sending email and the email filter options were not saved; no text could be read from PDF files created via iPaper; the metadata file was not generated if no fields were positioned on the layout.


PDFmdx version 3.8.1 – DataMatrix 2D barcode for Pitney Bowes Relay Inserting System

New features of PDFmdx version 3.8.1:

  • Pitney Bowes DataMatrix barcode:

The Pitney Bowes Relay inserting system can insert letters or invoices automatically. The inserter has a camera that recognizes a DataMatrix 2D barcode applied to the page, reads it and uses it to control the inserter. The 2D barcode must have a certain structure: a 14-digit identifier of the document (e.g. the invoice number), the page number in the document, the number of pages of the document and, at the end, a counter that must be continuous throughout the document. With this code, the inserter can recognize when a new letter begins and also determine whether a sheet is missing or out of order in the stack.

The stamp variable definition now has its own “Pitney Bowes” checkbox to create such a predefined structure and apply it to the individual pages as a DataMatrix 2D barcode.
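
To make the four parts of the barcode content more tangible, here is a rough Python sketch. Only the 14-digit identifier width comes from the description above. The widths of the other fields, the zero padding, the field order without separators and both function names are assumptions for illustration; the binding layout is the one in the Pitney Bowes specification.

```python
def page_payload(doc_id, page, page_count, counter):
    """Build one page's barcode text from the four parts named above.

    Assumption: 3 digits each for page and page count, 6 for the counter.
    """
    if not (doc_id.isdigit() and len(doc_id) <= 14):
        raise ValueError("identifier must be numeric with at most 14 digits")
    return f"{doc_id.zfill(14)}{page:03d}{page_count:03d}{counter:06d}"


def document_payloads(doc_id, page_count, first_counter=1):
    """Return the payloads for all pages of one document, counter running on."""
    return [
        page_payload(doc_id, page, page_count, first_counter + page - 1)
        for page in range(1, page_count + 1)
    ]


# Hypothetical invoice number 4711 with three pages.
payloads = document_payloads("4711", page_count=3)
for payload in payloads:
    print(payload)
```

Because every page carries its own page number, the page total and the counter, a reader of these values can tell where a document starts and whether a sheet is missing.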

Both the input and the output files are processed and output in sorted order by file name, e.g. by the invoice number read from the invoice file. The Merge2Print command line application can then be used to create a sorted, combined PDF file for printing. However, because the entire process must run in sorted order, only the executable EXE processor of PDFmdx can be used, not the PDFmdx service. In addition, “block processing” must be activated.

 

Download – 2D barcode specification – Pitney Bowes Relay >>>

  • Start processing via *.rd file:

Previously, PDFmdx processing (executable EXE application or Windows service) could be started either on a schedule (Timer, Date, Daily, Weekly) or by placing PDF files in a monitored folder. However, there are applications in which all files must be present in the input folder before sorted processing starts. For this there is now the *.rd option. If it is activated, processing starts only when a *.rd file, e.g. “Ready.rd”, is copied to the monitored folder. This allows processing to be started in a controlled manner at the desired time.
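
A delivering script could follow this pattern by copying all PDF files first and creating the trigger file only as the final step. The following Python sketch illustrates the idea with temporary folders; the function name and folder names are hypothetical, and it assumes that an empty file with the *.rd extension is sufficient as a trigger.

```python
import shutil
import tempfile
from pathlib import Path


def deliver_batch(source_dir, watched_dir, trigger_name="Ready.rd"):
    """Copy all PDFs first, then create the trigger file as the last step."""
    source_dir, watched_dir = Path(source_dir), Path(watched_dir)
    copied = []
    for pdf in sorted(source_dir.glob("*.pdf")):
        shutil.copy2(pdf, watched_dir / pdf.name)
        copied.append(pdf.name)
    (watched_dir / trigger_name).touch()  # signals that the batch is complete
    return copied


# Demo with temporary folders standing in for the real paths.
with tempfile.TemporaryDirectory() as tmp:
    src, dst = Path(tmp, "export"), Path(tmp, "input")
    src.mkdir()
    dst.mkdir()
    for name in ("RE-0002.pdf", "RE-0001.pdf"):
        (src / name).write_bytes(b"%PDF-1.4\n")
    delivered = deliver_batch(src, dst)
    trigger_exists = (dst / "Ready.rd").exists()
    print(delivered, trigger_exists)
```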

 

Info: As of PDFmdx version 3.8.0, .NET Runtime version 4.5 is required.


PDFmdx version 3.7.4

New features PDFmdx Editor version 3.7.4:

  • Automatic backup of templates at startup: The function can be activated and a path for the backups set. Backups are marked with date and time and replaced by rotation.

  • Search function for conditions: Forward/backward search, full-text search in the conditions. Using the context menu, the layout associated with the condition can be called up and opened directly.

 

  • Comment / separator lines in the condition editor can be deleted or moved up/down.

  • Warning for empty condition nodes: Empty condition nodes can lead to unpredictable results during processing. These are now recognized in the condition editor. A warning is displayed to perform a cleanup.

 

  • NOT for conditions: Allows the logic of a condition to be reversed.

  • Extensible fields: In a moving group, not every record has the same number of rows, so a field with a fixed vertical size may capture either too many rows or not all of them. With this option, the field can be defined vertically smaller, and all subsequent lines up to the next record are captured in the field. The character inserted at the end of each merged line is configurable (space, semicolon, comma).

  • Align the field position and adjust the optimal size: For capturing records of a moving group / subgroup, it is important that the fields are all at a roughly similar vertical position and have the correct vertical size. The size is optimal when the field just covers the text area vertically, no larger and no smaller. With narrow lines, setting the size manually can be difficult. There is now a function that automatically aligns the fields vertically and sets them to the optimal size.

  • Invert area before OCR detection: OCR only works with dark text on a light background. For light writing on a dark background, the area must be inverted before the OCR recognition. There is now a special image processing function that can be activated for a field and executed before the integrated OCR recognition.

  • Always run OCR: A PDF does not always have the correct text in the text layer, for example if the document contains inverted areas with white text on a black background. If “SmartOCR” processing is enabled, an area OCR is only executed if there is no text in the area. It can now be specified for individual areas that OCR is always executed despite existing text, e.g. to invert the area beforehand and get a usable result.

 

  • Compound fields: You can now also create fields that are composed of other fields and texts. These fields can be used for the output.

  • Default values for fields can be assigned based on the layout and not just globally.

 

  • Numeric fields can also accept negative values.
  • A template can be saved as a new template without the layouts it contains.

  • Transfer settings of a template to other templates: Selection of the settings tabs of the source template as well as selection of the target templates.

  • Export record filter: Conditions can be used to filter the data record export. Records that meet one of the defined conditions are filtered and not output. Filtered records are marked in red in the test function. Conditions can be built from text strings, substrings, regular expressions or “empty” checks over fields, layouts and selection level (document, group, subgroup), and combined with AND/OR or NOT relationships.
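
The logic of the export record filter can be sketched in a few lines of Python. This is only an illustration of the principle, not the editor's implementation: the helper names, field names and sample records are hypothetical.

```python
import re


def contains(field, part):
    """Substring condition on one field."""
    return lambda rec: part in rec.get(field, "")


def matches(field, pattern):
    """Regular-expression condition on one field."""
    return lambda rec: re.search(pattern, rec.get(field, "")) is not None


def is_empty(field):
    """'Empty' condition on one field."""
    return lambda rec: rec.get(field, "").strip() == ""


def all_of(*conds):
    """AND relationship between conditions."""
    return lambda rec: all(c(rec) for c in conds)


def negate(cond):
    """NOT relationship."""
    return lambda rec: not cond(rec)


def export_records(records, filter_conditions):
    """Drop every record that meets at least one filter condition."""
    return [r for r in records if not any(c(r) for c in filter_conditions)]


records = [
    {"ArticleNo": "A-100", "Text": "Shirt", "Qty": "2"},
    {"ArticleNo": "", "Text": "Subtotal", "Qty": ""},
    {"ArticleNo": "X-900", "Text": "Shipping", "Qty": "1"},
]
filters = [
    is_empty("ArticleNo"),
    all_of(matches("ArticleNo", r"^X-\d+$"), negate(contains("Text", "Shirt"))),
]
exported = export_records(records, filters)
print(exported)
```

In this sample, the subtotal line and the shipping line each meet one condition and are dropped, so only the first record is exported.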

New features PDFmdx Processor version 3.7.4:

  • Call a command line application: After all documents from the input area of a job have been processed, a command line application can be called, for example pdfFM, to merge files with the same name from multiple folders into a single PDF. If processing takes place via the PDFmdx Windows service, the command line application must not display a dialog and must run “silent”.

 

  • Locked files are detected and not processed: If a file to be processed is locked, it can neither be processed nor moved to an error folder. Such files are marked with a *.lock file and are not processed further. To process such a file later, only the *.lock file has to be deleted.

  • Output – repetition: If an output device (share / network drive) is not immediately available or responds too slowly, the waiting time and the number of repetitions can now be set before processing treats this as an error and stops.
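
Releasing files that were marked as locked amounts to deleting the *.lock markers. A small Python sketch of such a cleanup step could look like this; the function name is hypothetical, and both the marker name `invoice.pdf.lock` and its location next to the PDF are assumptions for the demo.

```python
import tempfile
from pathlib import Path


def release_locked_files(folder):
    """Delete all *.lock marker files in a folder and return their names."""
    removed = []
    for marker in sorted(Path(folder).glob("*.lock")):
        marker.unlink()
        removed.append(marker.name)
    return removed


# Demo in a temporary folder standing in for the input folder.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "invoice.pdf").write_bytes(b"%PDF-1.4\n")
    (folder / "invoice.pdf.lock").touch()  # hypothetical marker name
    removed = release_locked_files(folder)
    pdf_still_there = (folder / "invoice.pdf").exists()
    print(removed, pdf_still_there)
```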


PDFmdx – Two-level reading of position data – Product video

In some industries, there are documents where position data has another level. There are documents with 2-stage position data, e.g. in the case of textiles or clothing where an article (number, description) can also have a “sub-level” with sizes or color specifications. The article itself is just listed once and in the level below there are then the quantities/prices of the individual characteristics.

PDFmdx is also able to recognize and read two-stage position data. The following video shows how to do it:


PDFmdx – Reading position data via a sliding group – Product video

PDFmdx can read information from PDF documents via defined areas and assign it to a field. However, some information occurs several times in a document, for example the position data of invoices: quantity, article number, price etc. These are usually laid out as tables with fixed columns and a variable number of rows.

PDFmdx is also able to read position data from PDF documents with the help of “sliding groups”. Fields are assigned to a “sliding group” and positioned on the template document. Criteria define conditions to identify a row as a “sliding group” record. Two delimiters determine in which vertical area of the pages such data records are searched for.
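
The principle can be illustrated with a simplified Python sketch. PDFmdx works with positioned fields on the page, whereas this sketch works on plain text lines, so it is only an analogy: the regular expression stands in for the criteria, the two marker strings stand in for the delimiters, and all names and sample lines are hypothetical.

```python
import re

# Criterion: a line counts as a record if it looks like "qty article price".
ROW = re.compile(r"^(?P<qty>\d+)\s+(?P<article>[A-Z]-\d+)\s+(?P<price>\d+\.\d{2})$")


def read_positions(lines, start_marker, end_marker):
    """Collect matching rows between the start and end delimiter lines."""
    inside = False
    rows = []
    for line in lines:
        text = line.strip()
        if not inside:
            inside = start_marker in text  # the delimiter line itself is skipped
            continue
        if end_marker in text:
            break
        match = ROW.match(text)
        if match:
            rows.append(match.groupdict())
    return rows


page = [
    "Invoice 2024-001",
    "Qty  Article  Price",
    "2    A-100    19.90",
    "free shipping from 50.00",
    "1    B-200    5.00",
    "Total          44.80",
]
rows = read_positions(page, "Qty", "Total")
print(rows)
```

Lines inside the delimited area that do not meet the criterion, such as the note about shipping, are simply not treated as records.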

The following video also shows the use of “anchor fields” to find and read information which is “moving” on a page and has no fixed position, e.g. the final amount of an invoice.


PDFmdx – Recognize, read out and store invoices in a folder structure – Product video available

PDFmdx can recognize PDF documents (e.g. invoices) via criteria based on textual content, split them into individual documents and read out metadata. The fields and texts that have been read out can be used for the file name as well as for the structured storage or import of the documents.

The following video shows how to do it on the basis of incoming invoices:


PDFmdx – Split documents via barcode – Product video available

PDFmdx can also recognize documents by barcodes, split them into individual documents and use the read out barcode values to store them. Which barcode is used can be specified via the area, the barcode type or also conditions. The splitting can be done via a change of content or via conditions. Separating sheets containing barcodes can also be deleted.

The following video shows how it works:


PDFmdx – Read position data via group / subgroup fields

In addition to document fields, PDFmdx can also read position data. Position data are lists or tables with rows and columns, typically found on invoices to list several items in the document. We use the term “sliding group / subgroup”. One or more columns (= fields) in one or more rows, on one or more pages, are searched and read in a vertically defined area.

As of PDFmdx version 3.5.0 there is a 2-stage structure in which a subgroup level is possible in addition to the groups. One or more subgroup datasets can be recognized and read out for a group dataset. There are documents with 2-stage position data, e.g. in the case of textiles or clothing, where an item (number, description) can also have a “sub-level” with sizes or color specifications. The item itself is listed only once, and the level below contains the quantities / prices for the individual characteristics.

Two-level readout of position data:

  • “Document/Group/Subgroup” fields define the detection level.

  • An area defined by 2 red horizontal boundary lines will be scanned on all pages of the document for the group (red boxes) and subgroup (green boxes) records.

  • The specified conditions are used to identify and read out the group (G) and subgroup (U) data records.

  • Along with the lowest-level records, the information of the group and document fields is also available.
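
The last point, that every lowest-level record also carries the group and document fields, can be pictured as flattening a nested structure. The following Python sketch uses a hypothetical data layout and field names chosen for illustration; it does not reflect the internal PDFmdx data structure or export format.

```python
def flatten(document):
    """Return one row per subgroup record, enriched with group and document fields."""
    rows = []
    for group in document["groups"]:
        for sub in group["subgroups"]:
            rows.append({**document["fields"], **group["fields"], **sub})
    return rows


# Hypothetical invoice: one article (group) with two sizes (subgroups).
invoice = {
    "fields": {"InvoiceNo": "2024-001"},
    "groups": [
        {
            "fields": {"Article": "4711", "Description": "T-Shirt"},
            "subgroups": [
                {"Size": "M", "Qty": "3"},
                {"Size": "L", "Qty": "2"},
            ],
        },
    ],
}

rows = flatten(invoice)
for row in rows:
    print(row)
```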

For tests and as a starting point for your own tests, we have created two example templates with PDF test files. The *.pmdx templates only need to be imported into the PDFmdx Editor via drag&drop and the output path may need to be adjusted. For processing, it is then necessary to create a job with input and error folders in the PDFmdx processor and to select the two test templates for the job.

Download – PDFmdx – Templates and examples for two-level reading of position data >>>

PDFmdx version 3.5.3 available

New features PDFmdx version 3.5.3:

  • Field / Area OCR / Invert area / Always execute OCR:

Normally, PDFmdx processing uses PDF files as input that already contain text: either “normal” PDFs or scanned PDFs that have received an additional text layer via a previous OCR process (e.g. via AutoOCR or FileConverterPro).

PDFmdx also has an integrated OCR function to determine the text in the areas of the positioned fields from the image information.

With the general PDFmdx OCR settings it is possible to specify how the texts from the PDF are to be obtained – “Original”, “OCR” or “SmartOCR”. With “Original” the text is always taken from the PDF, with “OCR” the text is always obtained via a PDFmdx OCR process, even if a text already exists in the PDF. With the “SmartOCR” setting, the PDFmdx OCR function is only executed if there is no text in the PDF, otherwise the existing text in the PDF is taken. These settings generally apply to the entire template and all associated layouts.

In this context, there are now two new functions that make it possible to recognize white text on a black background.

Individual areas with white text on a black background cannot be recognized by an automatic OCR process, because the area would have to be inverted before the OCR process. This can only be done interactively by selecting the area manually.

In the PDFmdx Editor it is now possible to activate the option “Invert Area” in the field configuration. In this case, the field area is inverted for the OCR processing. This creates black text on a white background which can be recognized by the OCR.

There is another new field function “Execute OCR always” with which the general setting “SmartOCR” can be overridden. OCR recognition is then always executed for this field, even if an underlying text already exists.
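
How the three general settings and the field option “Execute OCR always” interact can be summarized as a small decision function. The Python sketch below is an interpretation of the description above, not PDFmdx code: the function name and parameters are hypothetical, and `run_ocr` is a stand-in for the actual OCR call.

```python
def text_for_field(mode, pdf_text, run_ocr, always_ocr=False):
    """Decide where a field's text comes from.

    mode:       general setting "Original", "OCR" or "SmartOCR"
    pdf_text:   text found in the PDF for the field area (may be empty)
    run_ocr:    callable that performs OCR on the area and returns text
    always_ocr: field option that overrides "SmartOCR"
    """
    if mode not in ("Original", "OCR", "SmartOCR"):
        raise ValueError(f"unknown OCR mode: {mode}")
    if mode == "OCR":
        return run_ocr()  # always OCR, even if text exists
    if mode == "SmartOCR" and (always_ocr or not pdf_text.strip()):
        return run_ocr()  # no text in the area, or OCR forced for this field
    return pdf_text  # take the existing text from the PDF


def fake_ocr():
    """Stand-in for the real OCR call."""
    return "OCR RESULT"


print(text_for_field("Original", "text layer", fake_ocr))
print(text_for_field("SmartOCR", "text layer", fake_ocr))
print(text_for_field("SmartOCR", "", fake_ocr))
print(text_for_field("SmartOCR", "text layer", fake_ocr, always_ocr=True))
```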

  

  • PDFmdx Editor – find condition, call layout: There is now a search function to search in the conditions for a (partial) string forward and backward. A line in the conditions can thus be jumped to directly. The linked layout can then be called directly from the condition line. This feature makes it easy to work with a large number of conditions.

  • The web service functions have been revised. In the web service example the metadata can now also be downloaded as XML.
  • For the metadata XML, the new variables JobID, JobName, JobDescription and ProcessID have been added.
