PDFmdx Version 3.3.0 available

Innovations PDFmdx Version 3.3.0:

  • Export additional formats – By integrating the PDF2DOCX Converter, HTML, DOCX, XML, TXT and XLS files can now be created in addition to the generated PDF. These additional files are created from the generated PDF and stored in the same output path as the PDF. One or more additional file formats can be created at the same time.

  • PDFmdx Editor – Save and load the conditions created in the editor as an XML file, so that different states of the conditions can be saved and reloaded quickly. When saving, a file name is suggested automatically, based on the template name, date and time.

  • PDFmdx Editor – Move conditions up / down or to the beginning / end. This makes it easy to reorder and group conditions and to place related rows directly beneath each other.

 

  • PDFmdx Editor – Conditions – Insert / rename a separator. Dividing lines can be added between conditions to improve the readability and clarity of large structures. An inserted dividing line can be removed and its text can be edited.

  • Bug fix – An action associated with a condition (Detect, Split, Delete, Sliding Groups) can be limited to specific pages, for example only the first page or only the first and second pages. This speeds up processing, because not all pages in a batch have to be processed. An issue was fixed where the page limit was not applied and all pages were searched. With version 3.3.0, only the specified pages are processed.

  • Keep field contents from deleted pages. If pages were deleted via conditions, it was previously not possible to use the field information from these pages for conditions, for the metadata output or for creating the path and file name. A typical case is using the barcode value of a cover page as a document identifier for separating a batch, selecting the layout and building the file name, and then deleting this separator page. To preserve field contents despite the deletion of pages, the field definition now has the option “Persistant value”. This makes it possible to identify a layout, split the stack, delete the pages and use the read-out value for the file name in a single condition and a single step.

  • PDFmdx Editor – Save template / layout structure as XML. The tree structure of the templates and layouts created in the PDFmdx Editor can be written to an XML file and automatically updated when the PDFmdx Editor is closed.

  • PDFmdx Editor – New field type “Filename” – The file name of the input file can now be used in the conditions for processing and layout recognition. For example, the layout to be used can be controlled by the file name or parts of it.

    

  • PDFmdx Editor – Conditions – Direct selection of the layout to be used via the option <VALUE>. To select a layout via the value of a variable (for example the file name), either a separate condition must be created for each layout and linked with “OR”, or the selection <VALUE> can be used under the conditions. This automatically checks the given variable against every layout name created for the template and selects the layout whose name matches the content of the field.

 

  • %FILENAME% variable – The case of the file name is preserved – previously the file name was always converted to lowercase.
  • Overwrite file / Append counter – There is now an option to overwrite files with the same name during processing. If this option is not checked, a new file is created as before and a counter is added to the file name.
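
The <VALUE> layout selection described above can be pictured as a simple name lookup. The following Python sketch is only an illustration: the function name, the layout names and the case-insensitive whole-name comparison are assumptions, since the release note does not say how PDFmdx compares the field content with the layout names.

```python
from pathlib import Path
from typing import Optional


def select_layout(field_value: str, layout_names: list[str]) -> Optional[str]:
    """Return the layout whose name matches the field value, or None.

    Assumption: whole names are compared case-insensitively; the real
    comparison rule used by PDFmdx is not documented in the release note.
    """
    wanted = field_value.strip().lower()
    for name in layout_names:
        if name.lower() == wanted:
            return name
    return None


# Hypothetical layouts of one template and a hypothetical input file.
layouts = ["Invoice", "DeliveryNote", "Reminder"]
file_stem = Path("invoice.pdf").stem  # content of a "Filename" field

print(select_layout(file_stem, layouts))
```

A single lookup like this replaces one “OR”-linked condition per layout, which is the saving the <VALUE> option is meant to provide.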

Download – PDFmdx Template Editor & Processor >>>

iPaper 3.x – MDX Option – Product Video available – Read content and use as variables

For iPaper version 3.x there is the “MDX – MetaDataXtraction” add-on module. Key features of the PDFmdx application have been integrated into iPaper. Documents can be recognized on the basis of content, the corresponding stationery can be selected, or field, template and layout definitions can be used to read out information from the document. Fields / variables are filled with values which can be used later on in the iPaper actions. Fixed information or information read from the document can also be “stamped” on the PDF as text or as 1D / 2D / QR barcode.

iPaper MDX applications:

  • Automatically select the stationery to be used on the basis of the document contents.
  • For mail merge letters or document stacks, the page at which a new document begins can be recognized, so that the stationery is selected again or restarted with the first stationery page.
  • Read e-mail addresses from the document and use them to send the document immediately.
  • Documents can be recognized on the basis of criteria, fields can be read out of the document via layout masks, and variables can be assigned and used for iPaper actions such as e-mail, save as, program call and so on.
  • QR codes (e.g. for quick bank transfers), 1D / 2D barcodes or text stamps can be applied to documents. Field contents read from the document can also be used for them.

iPaper MDX Product Video – Read content and use as file name:

PDFmdx Version 3.2.7 available

Innovations PDFmdx Version 3.2.7:

  • Multiline edit box for barcode and text stamps – Create QR codes for payment instructions – Up to now, only a single-line string could be specified for text and barcode stamps. CR / LF was not taken into account. Now there is a multi-line input field for entering the texts. Line breaks (CR / LF) and blank lines are transferred correctly to the stamps and barcodes. QR codes can now also be generated for SEPA payment instructions – see QR code “Zahlen mit Code”. The basis for this QR code is a standard of the European Payments Council. Many banks offer eBanking apps for smartphones that can read such QR codes. The information is then transferred automatically into a bank transfer.

    

  • Identify identical recipients – Up to now, every PDF file created could only be sent in a separate email message. Now it is also possible, when processing a job, to collect all documents with the same recipient address and send them in a single message. The recipient then receives one email containing all documents instead of several emails with one attachment each.

  • Remove characters – So far, certain characters could only be removed from the beginning and end of a field that was read. Now one or more fixed characters can also be removed from the whole extracted string, no matter where they occur.

  • Replace several characters at once – There was already a function to define several characters to be replaced. However, the replacements were not applied simultaneously but one after the other. It was therefore not possible, for example, to convert 1.234,56 to 1,234.56. This has been changed: all defined replacements are now applied at once, which makes such conversions possible.

  • XLSX instead of XLS, with configurable sheet name – The MS Excel XLS format has been replaced by the XLSX format. The sheet name can now be assigned freely. Previously, the sheet name in the XLS file was fixed to “PDFmdx”.

  • Run Job weekly – Time-controlled execution of a job – In addition to the “Daily” option, there is now also the option “Weekly”

 

  • Email address search – Document / Page – Bug fix – In addition to reading e-mail addresses via fields, it is also possible to search for all e-mail addresses in the document or on certain pages and to use them for sending.

  • HTML Body – Embed images – Bug fix for HTML email sending – With some email clients / web-based email services (e.g. Web.de), a message with images embedded in the body was displayed as HTML code / text and thus not rendered correctly.
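
The difference between replacing characters one after the other and replacing them all at once can be sketched in Python. This is not the PDFmdx implementation; the function names are hypothetical, and swapping the decimal point and the comma is an assumed example.

```python
def replace_sequentially(text: str, pairs: list[tuple[str, str]]) -> str:
    """Apply each replacement to the result of the previous one."""
    for old, new in pairs:
        text = text.replace(old, new)
    return text


def replace_at_once(text: str, pairs: list[tuple[str, str]]) -> str:
    """Apply all single-character replacements in one pass."""
    table = str.maketrans({old: new for old, new in pairs})
    return text.translate(table)


# Assumed example: swap the thousands separator and the decimal separator.
pairs = [(".", ","), (",", ".")]

print(replace_sequentially("1.234,56", pairs))  # 1.234.56
print(replace_at_once("1.234,56", pairs))       # 1,234.56
```

In the sequential variant the second replacement also hits the commas produced by the first one, so the swap is lost. The single pass looks at every original character exactly once.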

Download – PDFmdx Template Editor & Processor >>>

PDFmdx-CL Version 1.0.25 – Commandline application available for PDFmdx

PDFmdx-CL is a command line application that transfers PDF documents or whole folder structures to a PDFmdx service via the web service interface and stores the processing results in a target folder.

PDFmdx-CL is a free add-on for the PDFmdx server. It can be installed on any MS Windows workstation and requires no additional licensing.

PDFmdx-CL scope of application:

  • Recognize PDF documents via fields and their contents using stored criteria
  • Split document stacks into single documents by criteria
  • Read out field information from the documents and write it to a metadata (ASCII-TXT) file
  • PDF stationery underlay / overlay controlled via field contents
  • Sign PDF documents
  • Create PDF/A-1b or PDF/A-3b compliant documents
  • Fill the PDF info fields with the read metadata
  • Copy text / watermark – fixed or via contents / variables from the document
  • 1D / 2D Barcodes – fixed or via contents / variables from the document

The PDFmdx server can also rename the documents, save them on the server in a folder structure, send them by e-mail, or print them using the PDF2Printer print server. These functions can only be used directly on the PDFmdx server and not yet via the PDFmdx-CL application.

PDFmdx-CL features:

  • Command line application for PDFmdx.

 

  • Web service communication (SOAP) – local (host) or remote PDFmdx processing service.
  • Processing of individual PDF files as well as all PDFs of a folder / ZIP file or folder structures.
  • User interface for the configuration as well as to set default settings.

  • Create job templates (name / description) and select the processing template(s). Processing templates are created via the PDFmdx editor and are stored on the PDFmdx server.

  • New processing jobs can be created from an existing job template and filled with documents (individual files or entire folders). Required parameters are either specified explicitly or filled with default values.

  • The result documents (PDFs + metadata) are downloaded to the specified destination folder
  • Job details can be displayed through the job list.

 

Download – PDFmdx-CL Commandline Add-on Client for PDFmdx >>>

pdfFM – PDF Folder Merge – Merge documents with the same name into an overall PDF (/A)

With PDFmdx, document stacks can easily be split into single documents according to a wide range of criteria and named using field contents that were read out. Sometimes, however, it is also necessary to merge documents with the same name from different sources, in a defined sequence, into one overall document automatically.

For a customer project, we have developed pdfFM, an application in which 3 folders are specified. During processing, the folders are searched for documents with the same name. Matching documents are added to a new overall PDF in the order of the specified folders and stored in a destination folder. If a file is missing in one of the folders, the remaining documents with that name are moved to the error folder. A log file records the processing. Processing can be run either interactively or via a command line call.
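
The matching step described above can be sketched in Python. This is a minimal illustration and not the pdfFM code: the function name, the folder names and the `*.pdf` pattern are assumptions, and the actual PDF merge and the move to the error folder are left out.

```python
import tempfile
from pathlib import Path


def plan_merges(folders: list[Path]) -> tuple[dict[str, list[Path]], list[Path]]:
    """Group same-named PDFs across the given folders.

    Returns (complete, incomplete): `complete` maps a file name to its
    copies in folder order, `incomplete` lists files that are missing
    from at least one folder (candidates for the error folder).
    """
    names: set[str] = set()
    for folder in folders:
        names.update(p.name for p in folder.glob("*.pdf"))

    complete: dict[str, list[Path]] = {}
    incomplete: list[Path] = []
    for name in sorted(names):
        copies = [folder / name for folder in folders if (folder / name).is_file()]
        if len(copies) == len(folders):
            complete[name] = copies
        else:
            incomplete.extend(copies)
    return complete, incomplete


# Demo with three hypothetical source folders in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    folders = [root / "cover", root / "body", root / "appendix"]
    for folder in folders:
        folder.mkdir()
        (folder / "A100.pdf").touch()
    (folders[0] / "B200.pdf").touch()  # missing in the other two folders

    complete, incomplete = plan_merges(folders)
    merge_order = {name: [p.parent.name for p in copies] for name, copies in complete.items()}
    error_names = [p.name for p in incomplete]

print(merge_order)
print(error_names)
```

Keeping the copies in folder order means a later merge step can simply concatenate them in list order.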

In addition to merging into an overall PDF, the output file can also be converted to an ISO PDF/A-1b, 2b or 3b file.

pdfFM - Configuration  pdfFM - Command line parameters

PDFmdx – Video – Automatically send invoices via EMail

This PDFmdx application example shows how areas of a PDF document are read out and how the information is then used to send the finished invoice automatically by email.

  • Fields and areas are defined to read the company, the invoice number, the invoice date and the e-mail address from the document.
  • The file is named based on the information read out. A PDF stationery is underlaid. In addition, the read-out invoice number is applied to the invoice as a 1D barcode and as a 2D QR code with a web link.
  • As a last step, an email message is generated via an HTML EMail template. Variables which have been inserted in the subject and in the message text are replaced with the read-out information. The PDF invoice as well as additional files are inserted as attachments and then automatically sent via an SMTP EMail server.

 

PDFmdx Version 3.2.5 available

Innovations PDFmdx Version 3.2.5:

  • New option for sending HTML emails – So far, images in the message could only be referenced via external links that also had to be reachable by the recipient. Now the images are embedded directly into the HTML message – either “all images” or “only the local images”. This means that no external resources accessible to all recipients are needed.

HTML body - referenced images are sent embedded in the email

  • If the option to preserve the creation date / time is activated, this information is now also transferred from the source file to files or subfiles that are moved to the error folder.
  • The %COUNTER% variable now supports values > 9999
  • If the “Delete Blank Pages” function is active and a document consisting of only one blank page is processed, it now correctly ends up in the error folder and not in the destination folder.

Download – PDFmdx Template Editor & Processor >>>

PDFmdx Version 3.2.4 available

Innovations PDFmdx Version 3.2.4:

  • PDFmdx Editor – New HTML editor for the message text of the email function.

PDFmdx Editor - New HTML editor for the body text

  • E-mail sending – Fixed an issue where the option in the PDFmdx editor to activate e-mail sending was not visible and therefore could not be activated.
  • The PDF/A converter has been updated.
  • Bug fix in the PDFmdx Editor – An error during automatic saving when creating templates was fixed. The problem occurred only when creating the first aliases.

Download – PDFmdx Template Editor & Processor >>>

PDFmdx – Version 3.2.2 available

Innovations PDFmdx Version 3.2.2:

  • PDFmdx is now a 64-bit application and can only be installed on 64-bit Windows versions. On existing installations, this requires “moving” the license, that is, returning it to our license server and then retrieving it again. The new version also requires the .NET Runtime version 4.
  • New basic routines for PDF, Image, Barcode and OCR processing as well as for the extraction of text from the PDF
  • Extended list of supported 1D and 2D barcodes, both for detecting barcodes and for applying them to the document.

Extended barcode support for 1D and 2D barcodes

  • PDF/A-1b, 2b, 3b – Conversion and document output

PDFmdx can output documents in PDF/A-1 to PDF/A-3 format

  • New and improved feature to automatically detect and remove blank pages in black-and-white and color PDF documents. The percentage of blackening serves as the parameter. In addition, information about text in the background can be used as a criterion. The test function now also displays the blank pages identified in the selected sample document together with their degree of blackening. Blank pages are removed at the very beginning of PDFmdx processing.

Function to find and delete blank pages via a threshold value

  • The evaluation list of the blank page detection can be sorted by the displayed columns in ascending or descending order.

Test function shows which pages can be deleted at the configured threshold

  • In the test from the PDFmdx editor, the name of the layout identified by the D condition is now displayed. This can be used to determine whether and which layout is recognized with the document being tested.
  • Simplified creation and modification of the conditions in the PDFmdx editor, e.g. an AND / OR condition can now be inserted at the start node.

The conditions now offer all options for subsequent editing

  • Under the conditions, the page range definition is now processed correctly.

For each condition, the pages on which it is checked can be specified

  • Fuzzy / approximate search for conditions and anchor fields. You can specify how many characters may deviate from the specified string for it still to be accepted. This is available when the substring search is deactivated.

Fuzziness = fuzzy function for the conditions  Fuzziness = fuzzy function for the anchor field search
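
One common way to count “how many characters deviate” is the edit (Levenshtein) distance. The Python sketch below uses it as an assumption: the release note does not state which measure PDFmdx applies, and the function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions and substitutions that turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def fuzzy_equal(found: str, expected: str, max_deviation: int) -> bool:
    """Accept the found text if it deviates by at most max_deviation characters."""
    return edit_distance(found, expected) <= max_deviation


# Typical OCR confusion: lowercase "l" read instead of uppercase "I".
print(fuzzy_equal("lnvoice", "Invoice", max_deviation=1))
```

Such a tolerance is mainly useful for scanned documents, where single characters are often misread.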

  • Text areas / fields will now also be read out if the text box in the PDF exceeds the visible page margin.
  • Text search and selection / copy function: In the preview of the PDFmdx editor, a text can be searched forward or backward in the entire document. The text location found is highlighted. It is now also possible to highlight text in the editor and copy it to the clipboard.

In the PDFmdx Editor, text can be selected and copied in the preview in marking mode  In the PDFmdx Editor, text strings can be searched forward and backward - the location found is highlighted

  • Function to take over and keep the creation date and time of the source file for the target file. This information is also available as variables for the path / file name and for the metadata output, for example via XLS.

Creation date & time of the source file can be kept for the output file and used as variables

  • When the variable for the file name of the input file is used, the upper / lower case of the file name is retained. Previously, the file name was always converted to lowercase.
  • In the PDFmdx service processor, the maximum number of parallel processes can now be set to 1, 2, 3, 4, 5 or 10. Previously, the minimum value was 5.

The maximum number of parallel processing processes can be configured

  • The Web service interface via REST / SOAP is activated by default during installation.

The web service interface is now activated by default

New web service functions (REST / SOAP) for user and job template management.

The new features are part of the supplied .NET / C# sample project and can be tested with it. These extensions are required to implement the future PDFmdx Commandline application.

  • User management – Create new users, delete users, reset / set a new password. So far, there was only one “admin” user. Now it is also possible to create additional users. The jobs and the job templates are managed per user. Additional users can be created via the “admin” user. The “admin” password can be reset via the PDFmdx service processor.

Web service user management

  • Job template function – To create new jobs via the web service interface without much configuration effort, job templates can now be used. Job templates serve as a reference for new jobs. An existing job can be turned into a job template via a checkbox. Jobs created via a template reference that template.

1_Create a new job via the web service  2_An existing job can be used as a template to create new jobs from it  3_A new job number #2 was created by selecting it from a template

Download – PDFmdx Template Editor & Processor >>>

PDFmdx – Version 2.8.1 available

Innovations PDFmdx Version 2.8.1:

Template synchronization:

The PDFmdx template editor can now synchronize the locally created templates and layouts with one or more PDFmdx servers via a web service connection. This allows templates to be developed and tested locally and replicated to the processing servers later. Communication uses SOAP over http / https. This considerably simplifies and accelerates the synchronization and distribution of new and updated templates.

8_PDFmdx Editor - Synchronization of templates and layouts with remote servers_#1  9_PDFmdx Editor - Synchronization of templates and layouts with remote servers_#2

Text stamp with rotation angle

To apply a text not only horizontally but at any angle, there is now the additional “angle” parameter.

6_Text stamp with rotation angle option  7_Text stamp with rotation angle option - result

Anchor Field Search – New Features:

So far, the string used to position the anchor field was searched for on the entire page (from the top), and the first occurrence was taken as the position of the anchor field. However, the relevant occurrence may be not the first but a later one, with no other unique way to position the field via a search string. The function was therefore extended.

By default, the anchor field search now starts from the field position in the template. The next matching string is taken as the position of the anchor field. A new addition is the ‘hit’ option. If it is activated and a number is specified, the page is scanned from top to bottom and left to right for the anchor text. The number indicates which hit is used as the anchor field position, so that, for example, the 2nd hit on a page can be used.

4_Anchor fields with substring search - the number of the search hit can be specified
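
The ‘hit’ option can be illustrated with a small Python sketch. The `Word` type, the coordinate convention and the exact-match comparison are assumptions made for this example and do not describe the internal PDFmdx data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Word:
    text: str
    x: float  # distance from the left page edge
    y: float  # distance from the top page edge


def find_anchor(words: list[Word], anchor_text: str, hit: int = 1) -> Optional[Word]:
    """Return the hit-th occurrence of anchor_text, scanning the page
    from top to bottom and, within the same height, from left to right."""
    matches = sorted((w for w in words if w.text == anchor_text),
                     key=lambda w: (w.y, w.x))
    return matches[hit - 1] if 1 <= hit <= len(matches) else None


# Hypothetical page content: "Total" appears twice on the page.
page_words = [
    Word("Total", x=400, y=700),
    Word("Total", x=50, y=120),
    Word("Invoice", x=50, y=40),
]

second_hit = find_anchor(page_words, "Total", hit=2)
print(second_hit)
```

Sorting by the vertical position first and the horizontal position second reproduces the scan order described above, regardless of the order in which the text was extracted.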

AutoScale-function:

Especially with scanned documents, the contents of the page may vary not only in their horizontal or vertical position but also in scaling and size, for example when a printout was scanned at different scales. Although the relative position and size of the fields to be read are the same between the documents, the absolute values differ. The layout for reading the fields is created from a typical document and so far only considered the absolute distances and sizes of the fields. A document that appears about 10% smaller on the A4 page could not be processed, because compared to the created layout the fields match neither in position nor in size. For this, we have now implemented an AutoScale function, which can automatically compensate for such different scaling to a certain extent.

5_AutoScale for anchor fields - compensates for the scaling of the documents

Points to note:

  • The layout should be created from the “largest” version
  • An anchor field must be used that can be found without partial string search, e.g. via the string “Invoice” but not via “* Invoice *”
  • The “AutoScale” option must be activated.

Detect and remove blank pages:

When scanning documents, duplex scans may contain blank pages (unprinted back sides) in the document. Scanners do not always have a function to remove them automatically during scanning. For further processing and archiving, blank pages are a nuisance and should be removable. With the current PDFmdx version 2.8.1 there is now a function to automatically detect and remove blank pages. The criterion for detecting a blank page is a threshold value which is set to 95% by default. We recommend a value between 95 and 98%. The value specifies the percentage of “white pixels” on a page. A page is identified as “empty” as soon as the proportion of white pixels is greater than or equal to the set value, e.g. 95%. Blank pages are removed before all other PDFmdx processing steps are started.

1_Remove all blank pages from the document - before processing starts, with threshold parameter
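
The threshold rule can be sketched in a few lines of Python. The grayscale pixel grid, the `white_level` cut-off and the function names are assumptions for illustration. Only the comparison “proportion of white pixels greater than or equal to the threshold” is taken from the description above.

```python
def white_ratio(pixels: list[list[int]], white_level: int = 250) -> float:
    """Fraction of pixels counted as white in a grayscale page (0-255).

    Assumption: a pixel counts as white if its value is >= white_level.
    """
    total = sum(len(row) for row in pixels)
    if total == 0:
        return 1.0  # a page without pixels is treated as blank
    white = sum(1 for row in pixels for value in row if value >= white_level)
    return white / total


def is_blank(pixels: list[list[int]], threshold_percent: float = 95.0) -> bool:
    """A page is blank if its share of white pixels reaches the threshold."""
    return white_ratio(pixels) * 100 >= threshold_percent


# Hypothetical 10 x 10 page with 4 dark pixels, i.e. 96% white.
page = [[255] * 10 for _ in range(10)]
for column in range(4):
    page[0][column] = 0

print(is_blank(page))        # default threshold of 95%
print(is_blank(page, 98.0))  # stricter threshold
```

The example page counts as blank at the default threshold but not at 98%, which shows why a higher value keeps more pages with small marks or scanner noise.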

Remove pages / blank pages after the separator page:

If a document is split, the separator page found can also be deleted. A new addition is a function to delete the pages that follow the separator page. Either a fixed number of subsequent pages to be removed can be defined, or the automatic blank page detection / removal with a threshold value can be used.

2_Remove blank pages after the separator page with threshold parameter  3_Remove a configured number of pages after the separator page

Regular Expression Parameters to selectively extract numbers from a field:

The regular expression “\d+” can be used to return numbers from a field. If no parameter is specified, the first of the longest numbers found is returned automatically (e.g. if the read-out field content is “page 15/110”, “110” is returned). Together with the “Hits” parameter, the number at a specific position can be extracted from the string: with parameter = 1 the first number found, “15”, is returned, with 2 the second, “110”, and so on.

RegEx can also be used in combination with the other string processing functions:
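
The behaviour described above can be reproduced with Python's `re` module. This is a sketch of the described rule, not the PDFmdx code, and the function name and the `hit` argument are hypothetical.

```python
import re
from typing import Optional


def extract_number(field: str, hit: Optional[int] = None) -> Optional[str]:
    """Extract a number from a field value using the pattern \\d+.

    Without `hit`, the first of the longest numbers is returned.
    With `hit`, the number at that 1-based position is returned.
    """
    numbers = re.findall(r"\d+", field)
    if not numbers:
        return None
    if hit is None:
        # max() keeps the first candidate when several are equally long
        return max(numbers, key=len)
    return numbers[hit - 1] if 1 <= hit <= len(numbers) else None


print(extract_number("page 15/110"))         # 110
print(extract_number("page 15/110", hit=1))  # 15
print(extract_number("page 15/110", hit=2))  # 110
```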

Up to now, only the RegEx processing or alternatively the other string processing functions could be used. Now it is also possible to combine these two functions – RegEx can therefore be used together with the functions – partial string, remove – left / right / space / leading zeros as well as the function characters and type selection. The RegEx processing is executed first, regardless of the type of the field.

%TIME% variable – Now in 24-hour format

Update to SQL Compact Version 4.x – This version is now included in the setup and no longer has to be downloaded and installed separately, as was the case with version 3.5.

Download – PDFmdx Template Editor & Processor >>>