Overview
The Text to XML preprocessor parses plain text documents into XML for subsequent formatting in an eFORMz document. This is especially useful when source data spans multiple pages.
Converting text to XML requires adding the Text to XML preprocessor to an eFORMz project file and configuring it to identify the structure of the data and assign the data an XML schema. This configuration includes rules and definitions for locating, extracting and naming data elements from the original print output. It uses a collection of lines, parameters, states and end states to define how the source data is parsed into the XML output.
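To make the idea of lines, parameters, states and end states more concrete, here is a minimal conceptual sketch in Python. It is not how eFORMz implements the preprocessor and does not reflect its configuration format. The rule names, regular expressions, state names and sample data are all hypothetical, chosen only to illustrate rule-based parsing of print output into XML.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical rules. Each "line" rule has a name, a regex whose named groups
# play the role of parameters, the state in which the rule is active, and the
# state to switch to after a match. "Total" acts as an end state here because
# it returns the parser to "start".
RULES = [
    {"name": "Header", "state": "start", "next": "body",
     "pattern": re.compile(r"^INVOICE\s+(?P<InvoiceNumber>\d+)$")},
    {"name": "Total", "state": "body", "next": "start",
     "pattern": re.compile(r"^TOTAL\s+(?P<Amount>[\d.]+)$")},
    {"name": "Item", "state": "body", "next": "body",
     "pattern": re.compile(r"^(?P<Sku>\S+)\s+(?P<Qty>\d+)$")},
]

# Hypothetical print output containing two documents.
SAMPLE = """INVOICE 1001
ABC-1 2
XYZ-9 5
TOTAL 19.50
INVOICE 1002
QRS-3 1
TOTAL 4.00
"""


def text_to_xml(text: str) -> ET.Element:
    """Apply the first matching rule for the current state to each text line."""
    root = ET.Element("Documents")
    state = "start"
    doc = None
    for raw in text.splitlines():
        for rule in RULES:
            if rule["state"] != state:
                continue
            match = rule["pattern"].match(raw.strip())
            if not match:
                continue
            if state == "start":
                # A match in the start state opens a new document.
                doc = ET.SubElement(root, "Document")
            line_el = ET.SubElement(doc, rule["name"])
            # Each named group becomes a child element of the line.
            for param, value in match.groupdict().items():
                ET.SubElement(line_el, param).text = value
            state = rule["next"]
            break
    return root


if __name__ == "__main__":
    print(ET.tostring(text_to_xml(SAMPLE), encoding="unicode"))
```

In this sketch the state decides which rules are eligible for a given line, so the same text could be treated differently depending on where it appears in the document.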
Requirements
You must have the following available to use the Text to XML preprocessor:
- A plain text file with your data.
- An eFORMz project (.efz) file that will process the data.
- A basic knowledge of XML.
- Optionally, a basic knowledge of regular expressions.
Procedure
Preparing Your File for Import
To ensure a smooth import process, please prepare your data in a plain text (.txt) format. PCL files are also supported. Your file can include one or multiple documents, and each document can vary in length.
⚠️ Note: PDF, XML, and other file formats are not supported and cannot be processed.
Keep in mind that very large documents (several thousand pages) may exceed system memory limits, since the entire file is loaded into memory when opened in the Text to XML Editor.
If you're working with CSV files, do not use the Text to XML preprocessor. Instead, use the Flat File preprocessor, as detailed in our guide on Importing CSV Records.
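If you want to sanity-check a file before loading it, a rough preflight along the following lines may help. This is a heuristic sketch, not an eFORMz feature: the function names are hypothetical, the format detection only looks at common leading byte signatures, and the page estimate assumes pages are separated by form feed characters, which may not be true of your data.

```python
def classify_input(data: bytes) -> str:
    """Guess the kind of file from its leading bytes (heuristic only)."""
    head = data.lstrip()[:5]
    if head.startswith(b"%PDF"):
        return "pdf"   # not supported by the preprocessor
    if head.startswith(b"<?xml"):
        return "xml"   # not supported by the preprocessor
    if b"\x1b" in data[:1024]:
        return "pcl"   # PCL commands begin with the ESC character
    return "text"


def estimate_pages(data: bytes) -> int:
    """Rough page count, assuming form feeds (0x0C) separate pages."""
    return data.count(b"\x0c") + 1


if __name__ == "__main__":
    sample = b"INVOICE 1001\x0cINVOICE 1002"
    print(classify_input(sample), estimate_pages(sample))
```

A very high page estimate suggests splitting the file into smaller samples before opening it in the Text to XML Editor.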
Add the Text to XML Converter Preprocessor
- Open your eFORMz project and load a text data file.
- Right-click the project name > Add Preprocessor > Text to XML Converter.
- In the Text to XML Converter Properties window:
  - Select (enable) the Show editor while in composer checkbox.
  - If your text file does not use the Roman8 character set, select the character set that it uses.
- Click Text to XML Editor.
  - Several Lines are prepopulated for your convenience.
  - Each one corresponds to a line in the document.
  - Lines can have one or more parameters.
  - The window looks like the image below.
  - Below the list of lines is a tab for viewing states.
  - Below the tabs is the Line Editor.
  - Most work is done in the Line Editor for lines and the State Editor for states.
- Click Exit at the lower right to exit the Text to XML Editor.
  - If you have unsaved changes, you are prompted to save them.
  - The Text to XML lines and states are stored in the eFORMz project file.
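The character set option in the Properties window matters because the same bytes decode to different characters under different encodings. The short sketch below illustrates this with Python's built-in `hp_roman8` codec. It is unrelated to how eFORMz reads files, and the helper name and sample bytes are hypothetical.

```python
def decode_print_data(data: bytes, charset: str = "hp_roman8") -> str:
    """Decode raw print data using the given character set."""
    return data.decode(charset)


if __name__ == "__main__":
    raw = b"Caf\xc5"  # hypothetical sample: byte 0xC5 is "é" in Roman8
    print(decode_print_data(raw))             # decoded as Roman8
    print(decode_print_data(raw, "latin-1"))  # same bytes, wrong character set
```

If accented characters look wrong in the editor, a mismatch like this between the file's actual encoding and the selected character set is a likely cause.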
Assistance
If any further assistance is needed, please contact our Support team.