Using Jakarta Ant

The Chaperon project includes an Ant task. With this task you can parse a set of text files and convert them to XML. You select the text files with the srcdir attribute and the include patterns. Using the given lexicon and grammar files, each text file is lexically analysed and parsed into an XML file. If you don't specify a grammar, the input files are only lexically analysed. The task supports mappers to map the input files to the output files.

<taskdef name="chaperon"
         classname="net.sourceforge.chaperon.adapter.ant.ParserTask"/>

<chaperon srcdir="src/examples/"
          destdir="build/"
          lexicon="src/grammars/test1.xlex"
          grammar="src/grammars/test1.xgrm">
 <include name="*.txt"/>
 <mapper type="glob" from="*.txt" to="*.xml"/>
</chaperon>
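
Because the grammar attribute is optional, the same task can also be used for lexical analysis alone. The following is a minimal sketch, assuming the taskdef above is already in place and reusing the lexicon path from the example; the build/tokens/ output directory is a hypothetical name chosen for illustration.

```xml
<!-- No grammar attribute: the input files are only lexically analysed. -->
<chaperon srcdir="src/examples/"
          destdir="build/tokens/"
          lexicon="src/grammars/test1.xlex">
 <include name="*.txt"/>
 <mapper type="glob" from="*.txt" to="*.xml"/>
</chaperon>
```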

The following attributes can be used.

Attribute | Description | Required
----------|-------------|---------
srcdir    | Location of the input files. | yes
destdir   | Location to store the XML files. | yes
cachedir  | Location to store intermediate files, which improves performance. | no
lexicon   | Location of the lexicon file. | yes
grammar   | Location of the grammar file. If you don't specify this attribute, the files are only lexically analysed. | no
includes  | Comma-separated list of patterns of files that must be included; all files are included when omitted. | no
excludes  | Comma-separated list of patterns of files that must be excluded; no files (except default excludes) are excluded when omitted. | no
indent    | Whether the generated XML files should be indented. | no
msglevel  | The logging level (DEBUG / INFO / WARN / ERROR / FATAL). | no
encoding  | Encoding of the text input documents. | no
inputtype | Whether the task consumes text files or XML files (text / xml). If you choose XML files, the task processes the text fragments marked with <text> elements. | no
flatten   | Whether the task should produce a flatter XML hierarchy, in which elements with the same name are collapsed. | no
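
Several of the optional attributes can be combined in a single call. The following is a sketch rather than a verified configuration: it assumes that indent accepts the usual Ant boolean value "true", and the cache directory, the encoding name and the file patterns are illustrative values that do not come from the Chaperon distribution.

```xml
<!-- Assumed values: build/cache/, ISO-8859-1, the file patterns and indent="true". -->
<chaperon srcdir="src/examples/"
          destdir="build/"
          cachedir="build/cache/"
          lexicon="src/grammars/test1.xlex"
          grammar="src/grammars/test1.xgrm"
          includes="*.txt"
          excludes="draft-*.txt"
          indent="true"
          msglevel="WARN"
          encoding="ISO-8859-1">
 <mapper type="glob" from="*.txt" to="*.xml"/>
</chaperon>
```

Here the includes and excludes attributes take the place of nested include and exclude elements, and msglevel uses one of the levels listed for that attribute.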

The following nested elements can be used.

Element | Attributes     | Description | Required
--------|----------------|-------------|---------
include | name           | Pattern of files that must be included. | no
exclude | name           | Pattern of files that must be excluded. | no
mapper  | type, from, to | Maps the input files to the output files. | no
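
To show how the task definition, the nested elements and the mapper fit together, here is a sketch of a complete build file. It assumes that the Chaperon classes and their dependencies are available in a hypothetical lib/chaperon.jar; the project name, the target name and the draft-*.txt pattern are likewise illustrative.

```xml
<?xml version="1.0"?>
<project name="chaperon-example" default="parse" basedir=".">

 <!-- lib/chaperon.jar is a hypothetical location for the Chaperon classes. -->
 <taskdef name="chaperon"
          classname="net.sourceforge.chaperon.adapter.ant.ParserTask"
          classpath="lib/chaperon.jar"/>

 <target name="parse">
  <chaperon srcdir="src/examples/"
            destdir="build/"
            lexicon="src/grammars/test1.xlex"
            grammar="src/grammars/test1.xgrm">
   <!-- Select the input files with nested patterns. -->
   <include name="*.txt"/>
   <exclude name="draft-*.txt"/>
   <!-- The glob mapper turns each selected name.txt into name.xml. -->
   <mapper type="glob" from="*.txt" to="*.xml"/>
  </chaperon>
 </target>

</project>
```

With such a file saved as build.xml, running ant in the same directory would invoke the parse target by default.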
by Stephan Michels