The XML representation of the grammar is not intended to be read or written by humans; it is meant to be easy for the Chaperon components to read. It is recommended to write grammars in this text format and convert them to the XML representation.


Compared to the text grammar of the standard parser, the new text format no longer separates tokens from definitions. Definitions, abbreviations, and special instructions can all be mixed together.

%ab int : "Integer" ;

integers : int ( ws int )*;

%start "integers" ;

The "%start" declaration names the root definition of the result document.


Definitions describe the XML elements that the parser output includes.

WORD : [A-Za-z] [a-z]* ;

A definition that occurs earlier in the grammar gets a higher priority than the definitions that follow it.
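
The priority rule can be illustrated outside Chaperon with a small Python sketch. This is not Chaperon's implementation: the KEYWORD definition and the classify helper are hypothetical, and Python's re module only stands in for the grammar's regular expressions. When two definitions match the same text, the one listed first wins.

```python
import re

# Hypothetical definitions, listed in priority order (first = highest).
DEFINITIONS = [
    ("KEYWORD", re.compile(r"if|else")),
    ("WORD", re.compile(r"[A-Za-z][a-z]*")),
]

def classify(text):
    """Return the name of the first definition that matches the whole text."""
    for name, pattern in DEFINITIONS:
        if pattern.fullmatch(text):
            return name
    return None
```

Here "if" matches both patterns, but classify reports KEYWORD because that definition comes first.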


Alternation means that one of the contained elements must match.

CHAR : [A-Za-z] | [0-9];
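
As a rough analogy, assuming Python's re module as a stand-in for the grammar notation, a letter-or-digit alternation like CHAR behaves as follows. The is_char helper is a hypothetical name used only for illustration.

```python
import re

# Either branch may match: one letter, or one digit.
CHAR = re.compile(r"[A-Za-z]|[0-9]")

def is_char(text):
    """True if the whole text is matched by one of the two alternatives."""
    return CHAR.fullmatch(text) is not None
```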


Concatenation means that all elements in a sequence must match.

IDENTIFIER : [A-Za-z] [A-Za-z0-9_]*;
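
The same sequence can be sketched in Python, again only as an analogy using the re module; is_identifier is a hypothetical helper. Both elements must match in order, so a text that starts with a digit is rejected even though digits are allowed later.

```python
import re

# Two elements in sequence: one letter, then any number of
# letters, digits, or underscores.
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

def is_identifier(text):
    """True if the whole text matches the letter-then-rest sequence."""
    return IDENTIFIER.fullmatch(text) is not None
```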

Character classes

A character class matches a single character against the set of characters that the class contains. There are two kinds: a plain character class, which matches any character in the set, and a negated character class, written with a leading ^, which matches any character that is not in the set.

PUNCTUATION : [.,;?!] ;
NOTNUMBER   : [^0-9] ;
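
Python's re module happens to use the same bracket notation, so the two classes above can be tried out directly. This is an illustration of the matching behaviour only, not of how Chaperon evaluates them.

```python
import re

PUNCTUATION = re.compile(r"[.,;?!]")  # any one of the listed characters
NOTNUMBER = re.compile(r"[^0-9]")     # any one character that is not a digit

def is_punctuation(ch):
    return PUNCTUATION.fullmatch(ch) is not None

def is_not_number(ch):
    return NOTNUMBER.fullmatch(ch) is not None
```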

Universal character

The universal character (.) matches any character except carriage return and line feed.

COMMENT : "//" .*;
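
A Python sketch of the same comment rule is shown here as an analogy. Note one difference: by default Python's "." excludes only the line feed, so the sketch spells out a class that excludes both carriage return and line feed to mirror the behaviour described above. The leading_comment helper is a hypothetical name.

```python
import re

# "//" followed by any run of characters other than CR and LF.
COMMENT = re.compile(r"//[^\r\n]*")

def leading_comment(text):
    """Return the comment at the start of text, or None if there is none."""
    match = COMMENT.match(text)
    return match.group(0) if match else None
```

Because the match stops at the first line break, a comment never runs into the following line.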


If a regular expression is used often, you can define an abbreviation for it.

%ab NUMBER : [0-9];
INT        : NUMBER+;
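
The effect of an abbreviation resembles reusing a named pattern fragment. The Python sketch below is an analogy under that assumption, not a description of how Chaperon expands abbreviations; is_int is a hypothetical helper.

```python
import re

NUMBER = r"[0-9]"                    # plays the role of the %ab abbreviation
INT = re.compile(f"(?:{NUMBER})+")   # one or more NUMBERs

def is_int(text):
    """True if the whole text is one or more digits."""
    return INT.fullmatch(text) is not None
```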
by Stephan Michels