FileConverter – version 1.0.27 – supports MS-Office 2010 as converter

With the new version 1.0.27 of the FileConverter service, MS-Office 2010 is now also supported for converting MS-Word, MS-Excel and MS-PowerPoint files to PDF, PDF/A and TIFF. MS-Office can be configured and used in parallel with the existing direct conversion, or mixed, per folder or e-mail box. As with the direct conversion, the conversion runs silently in the background via the FileConverter Windows service. This requires that the 32-bit version of MS-Office is installed on the computer and that the user account of the service has opened each MS-Office application at least once.

Using MS-Office as the converter engine guarantees 100% quality and support for all MS-Office features, which can only be achieved with the original application.

To make use of the available computer resources and to ensure optimal throughput, conversions are also processed in parallel, depending on the configuration. The default is 5 parallel processes.
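The effect of a fixed number of parallel conversions can be sketched as follows. This is only an illustration in Python, not FileConverter code: the function names `convert` and `convert_all` are hypothetical, the conversion itself is a placeholder that merely derives the output name, and a thread pool stands in for the separate processes the service uses. Only the default of 5 comes from the text above.

```python
import os
from concurrent.futures import ThreadPoolExecutor

DEFAULT_PARALLEL_PROCESSES = 5  # default named in the text above


def convert(path: str) -> str:
    """Hypothetical placeholder for one conversion job.

    A real job would produce a PDF; this one only derives the output name.
    """
    stem, _ext = os.path.splitext(path)
    return stem + ".pdf"


def convert_all(paths, workers: int = DEFAULT_PARALLEL_PROCESSES):
    """Run at most `workers` jobs at the same time.

    pool.map returns the results in the order of the input paths,
    regardless of which job finishes first.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(convert, paths))


if __name__ == "__main__":
    print(convert_all(["a.docx", "b.xlsx", "c.pptx"]))
```

A larger worker count only helps as long as the machine has free CPU and I/O capacity, which is why the number is configurable.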

FileConverter - MS-Office as converter

Download – FileConverter – documents & e-mails to PDF, PDF/A and TIFF >>>

FileConverter – automatically convert documents and e-mails from folders or e-mail boxes to PDF, PDF/A and TIFF

The FileConverter is an application that can be installed as a service on MS-Windows (32- and 64-bit). It monitors folders and e-mail boxes and automatically converts the documents they contain to PDF, PDF/A or TIFF. Multiple folders as well as MS-Exchange and POP3 mailboxes can be configured and monitored.

The following input document formats are supported:

  • DOC, DOCX, RTF, TXT,
  • XLS, XLSX,
  • PPT, PPTX,
  • XFDF, FDF,
  • PNG, BMP, TIF, TIFF, JPG, JPEG,
  • ZIP, RAR, 7Z,
  • MSG, EML,
  • PDF,
  • HTM, HTML, MHTML,
  • PMT, PMTX

File format – features:

  • With ZIP/RAR/7Z containers, all contained and supported documents are automatically extracted and converted. The folder structure of the container is rebuilt in the output directory.
  • PMT and PMTX are PDFMerge XML data formats that contain hierarchical structure information as well as links to the documents or the documents themselves. From these files the FileConverter, like the PDFMerge server, produces a single overall PDF file that is merged from the individual documents after their conversion to PDF. The structure defined in the XML is shown as PDF bookmarks.

Conversion:

  • The PDF/TIFF conversion takes place directly, without using the source application, so no installation of MS-Office or Adobe Acrobat is necessary for processing. Optionally, the PDFs can also be exported in the ISO-standardized PDF/A-1b format.
  • The standard scope also includes the iOCR engine for creating searchable PDF(/A) files from PDF or image documents. Optionally, Abbyy, currently the most efficient OCR engine, can also be installed. During OCR processing, PDF documents are analyzed page by page, and only documents that do not yet contain text information are processed (intelligent OCR processing). This saves resources and increases both quality and processing speed.

Functions – general:

  • MS-Windows service application for converting MS-Office, PDF, image, HTML, ZIP, MSG and e-mail documents to PDF, PDF/A or TIFF
  • Multiple folders as well as MS-Exchange and POP3 e-mail boxes can be monitored and processed in parallel.
  • Direct conversion without additional source applications (MS-Office, Adobe Acrobat) or printer drivers.
  • Flattening of filled PDF forms: PDF forms (XFDF, FDF) can be converted into normal PDF documents. The forms can either be stored permanently or loaded anew every time.
  • Parallel processing with a configurable number of processes allows optimal use of the hardware and ensures fast processing.
  • Logging of all conversion operations, forwarding of failed e-mail conversions, or sending of error e-mails via SMTP

In / out folder processing:

  • Processing of files and folders from configured in / out folders, triggered by a time interval or a “ready” file, including subfolder processing (one level)
  • Creation of an index text file covering all files generated during a processing run.
  • After processing: deleting, moving to an archive folder, or renaming of the files or folders (.con / .err)
  • Configuration of the filename extensions that should not be converted; these are ignored and not processed. E-mails with attachments that have unidentifiable extensions are handled as errors and forwarded to an e-mail address.
  • Single-page output with a configurable number of digits for the page index
  • Configuration of the TIFF conversion: compression / color depth / resolution / JPEG quality
  • Extensive parameters for OCR processing (iOCR or Abbyy); the FileConverter has the same OCR functions as AutoOCR
  • Parameters for the HTML conversion: page size and margins. HTML documents and e-mails are scaled automatically.

Processing of e-mail boxes:

  • Processing of POP3 / MS-Exchange e-mail boxes: forwarding or deleting after successful or failed processing, or moving into an archive / error folder under MS-Exchange. Direct access to MS-Exchange 2007/2010/2013 through the SOAP web service interface.
  • EML and MSG: body and attachments are converted, and the e-mail header information (from, date, to, subject) is generated in the body document
  • Output of an XML file listing the processed e-mails with their metadata and file links; configurable fields: from, to, cc, bcc, received, subject, body, attachments
  • Output per e-mail in separate subfolders or “flat” in the destination folder.

Screenshots: general settings (e-mail and folder processing), processing options, service configuration, SMTP server configuration, folder processing configuration, e-mail box processing configuration, MS-Exchange configuration, POP3 configuration, TIFF conversion settings, OCR settings, HTML conversion settings, log.


Office2PDFA – Scripting Support

Office2PDFA now also supports scripting. It supports two CLR languages: VB.NET and C#. CLR stands for Common Language Runtime, the runtime of the .NET Framework that C# and VB.NET code runs on; other CLR languages include J# and IronPython. VBScript is also supported, but only in the 32-bit version.

A script can be executed before and after the conversion. Script execution can be enabled or disabled. The before and after scripts can be written in different programming languages.

A script consists of a list of functions and declarations. The runtime generates a class from these functions and declarations and executes its “Run” method. The Run method has one parameter of type IScriptContext.

IScriptContext properties:

  • SkipConversion (boolean): if the value is set to true, the application will not convert the document to PDF. The script is then responsible for converting the document and writing the result to a temporary folder. The path of the result in the temporary folder should be stored in the DestinationFile property. The script should also specify the destination file type (extension) using the DestinationFileExt property.
  • DestinationFile (string): the fully qualified path of a file
  • DestinationFileExt (string): the extension of the destination file
  • FolderPath: the path of the folder of the input file
  • RootFolderPath: the root folder path; it can differ from the FolderPath property if subfolder monitoring is enabled
  • Error (boolean): the script should set this property if an error has occurred.
  • ErrorDescription (string): the script should set this property and provide a description of the error.
  • FilePath: the fully qualified path of the source file
  • Folder (IFolder): a reference to a folder monitored by Office2PDFA.
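Office2PDFA scripts themselves are written in C#, VB.NET or VBScript, so the following Python sketch is only a language-neutral model of how SkipConversion, DestinationFile and DestinationFileExt might work together. The class `ScriptContext`, the functions `pre_script`, `default_convert` and `process_file`, the order of the calls, and the extension format with a leading dot are all assumptions made for illustration; none of them is part of the Office2PDFA API.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass
class ScriptContext:
    """Hypothetical stand-in for IScriptContext (subset of the listed properties)."""
    file_path: str
    skip_conversion: bool = False
    destination_file: str = ""
    destination_file_ext: str = ""
    error: bool = False
    error_description: str = ""


def pre_script(ctx: ScriptContext) -> None:
    """Model of a pre-script that takes over the conversion of DOCX files."""
    if not ctx.file_path.lower().endswith(".docx"):
        return  # leave all other files to the built-in conversion
    stem = os.path.splitext(os.path.basename(ctx.file_path))[0]
    target = os.path.join(tempfile.gettempdir(), stem + ".doc")
    try:
        # A real script would call its own converter here; this placeholder
        # only creates the result file in the temporary folder.
        with open(target, "w", encoding="utf-8") as handle:
            handle.write("converted by script")
    except OSError as exc:
        ctx.error = True
        ctx.error_description = str(exc)
        return
    ctx.skip_conversion = True
    ctx.destination_file = target
    ctx.destination_file_ext = ".doc"  # assumption: extension includes the dot


def default_convert(ctx: ScriptContext) -> None:
    """Placeholder for the application's built-in PDF conversion."""
    ctx.destination_file = os.path.splitext(ctx.file_path)[0] + ".pdf"
    ctx.destination_file_ext = ".pdf"


def process_file(path: str) -> ScriptContext:
    """Assumed order: pre-script first, built-in conversion only if not skipped."""
    ctx = ScriptContext(file_path=path)
    pre_script(ctx)
    if not ctx.skip_conversion and not ctx.error:
        default_convert(ctx)
    return ctx
```

In this model, `process_file("report.docx")` leaves the result in the temporary folder and marks the conversion as skipped, while any other file type falls through to the placeholder for the built-in conversion.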

Folder properties – use this object to get additional information about the monitored folder:

  • Name (the name of the monitored folder)
  • InputFolder (path to the input folder)
  • OutputFolder (path to the output folder)
  • ErrorFolder (error folder)
  • ArchiveFolder (archive folder)

Sample VB.NET script to convert an MS-Word DOCX document to a DOC document. The following pre-script can be used for this conversion:

Download – Sample script to convert DOCX to DOC >>>

It is also possible to specify additional assembly references which are used by the scripts. For this script we used the following references:

  • System.Windows.Forms.dll
  • System.Data.dll
  • System.Drawing.dll
  • System.Xml.dll
  • Microsoft.VisualBasic.dll

The Office2PDFA workflow remains unchanged: the destination file is handled like a normal PDF conversion result. All options remain available, except that PDF metadata and other PDF properties cannot be applied.

Screenshots: Office2PDFA scripting sample, DOCX to DOC (#1 and #2).

Additional info about IScriptContext which is used by the Run method

It has 2 additional methods:

  • GetParam(name, defValue)
  • SetParam(name, value)

You can store script state data in the IScriptContext parameter if you want to pass data from the pre-script to the post-script. The same IScriptContext object is used for both scripts to ensure the correct workflow.
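The hand-over of state between the two scripts can be modeled as follows. Real scripts are written in C#, VB.NET or VBScript; this Python sketch only mirrors the idea. The class `ScriptContext`, the functions `pre_script`, `post_script` and `run_workflow`, and the parameter name `original_name` are hypothetical, and the behavior of GetParam returning the default value for an unknown name is an assumption based on its `defValue` argument.

```python
class ScriptContext:
    """Hypothetical stand-in for IScriptContext with its two parameter methods."""

    def __init__(self, file_path):
        self.file_path = file_path
        self.error = False
        self._params = {}

    def get_param(self, name, def_value=None):
        """Model of GetParam(name, defValue): stored value, else the default."""
        return self._params.get(name, def_value)

    def set_param(self, name, value):
        """Model of SetParam(name, value)."""
        self._params[name] = value


def pre_script(ctx):
    # Remember something only the pre-script knows, here the source path.
    ctx.set_param("original_name", ctx.file_path)


def post_script(ctx):
    # Read the value back; the default covers a disabled pre-script.
    original = ctx.get_param("original_name", "<unknown>")
    status = "failed" if ctx.error else "ok"
    return f"{original}: {status}"


def run_workflow(file_path, use_pre_script=True):
    ctx = ScriptContext(file_path)  # one context object for both scripts
    if use_pre_script:
        pre_script(ctx)
    # The conversion itself would run at this point.
    return post_script(ctx)
```

Because both scripts receive the same object, the post-script needs no global variables or temporary files to learn what the pre-script did.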

The post action script’s parameter contains all information about:

  • the source file/folder, subfolder
  • the destination file/folder, subfolder
  • the error (if any)

It is possible to use only the “post” action script, only the “pre” script, or both. Scripts can be written in C#, VB.NET or VBScript. All features of the .NET Framework and all features of these programming languages are available. The script security context (Evidence) is the security context of the Office2PDFA application, which means that a script runs with the same security context as the application.

Office2PDFA also supports VBScript:

The method signature should look like:

Sub Run(ByRef context)
    ' context provides the IScriptContext properties and methods described above
End Sub

All parameters are also available in VBScript.