I am looking for a Java application (or a Perl script, though my preference is Java) to parse XML files in SRGS format and render a simple text list of all possible combinations of phrases and words specified by each file.
The XML file will contain nested phrases for expansion as well as optional phrases. Furthermore, SRGS files can reference external files (via URIs) for additional content.
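
To make the expansion concrete, here is a minimal sketch of what it could look like in Java. It is only an illustration under stated assumptions: the class name `SrgsExpander` is hypothetical, only `<item>`, `<one-of>` and local `<ruleref uri="#id"/>` are handled, any `repeat` value starting with `0` is treated as "optional, at most once", other repeat counts are ignored, and recursive grammars are not guarded against.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical sketch: expands one SRGS XML rule into every phrase it can produce. */
final class SrgsExpander {
    private final Map<String, Element> rules = new HashMap<String, Element>();

    SrgsExpander(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList list = doc.getElementsByTagName("rule");
            for (int i = 0; i < list.getLength(); i++) {
                Element rule = (Element) list.item(i);
                rules.put(rule.getAttribute("id"), rule);
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse grammar", e);
        }
    }

    List<String> expandRule(String id) {
        Element rule = rules.get(id);
        if (rule == null) throw new IllegalArgumentException("Unknown rule id: " + id);
        return sequence(rule);
    }

    // Children of a rule or item form a sequence: combine every expansion of each child in order.
    private List<String> sequence(Element parent) {
        List<String> acc = Collections.singletonList("");
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.TEXT_NODE) {
                String text = n.getNodeValue().trim().replaceAll("\\s+", " ");
                if (!text.isEmpty()) acc = cross(acc, Collections.singletonList(text));
            } else if (n.getNodeType() == Node.ELEMENT_NODE) {
                List<String> part = element((Element) n);
                if (part != null) acc = cross(acc, part);
            }
        }
        return acc;
    }

    private List<String> element(Element e) {
        String name = e.getTagName();
        if (name.equals("item")) {
            List<String> body = new ArrayList<String>();
            // Simplification: repeat="0-1" (or anything starting with 0) means the item may be absent.
            if (e.getAttribute("repeat").startsWith("0")) body.add("");
            body.addAll(sequence(e));
            return body;
        }
        if (name.equals("one-of")) {
            List<String> all = new ArrayList<String>();
            for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.ELEMENT_NODE && "item".equals(((Element) n).getTagName())) {
                    all.addAll(element((Element) n));
                }
            }
            return all;
        }
        if (name.equals("ruleref") && e.getAttribute("uri").startsWith("#")) {
            return expandRule(e.getAttribute("uri").substring(1));
        }
        return null; // <tag>, <example> and external rulerefs contribute nothing in this sketch
    }

    // Joins every left phrase with every right phrase, skipping empty (optional) parts.
    private static List<String> cross(List<String> left, List<String> right) {
        List<String> out = new ArrayList<String>();
        for (String l : left) {
            for (String r : right) {
                out.add(l.isEmpty() ? r : r.isEmpty() ? l : l + " " + r);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String xml = "<grammar root=\"order\"><rule id=\"order\">"
                + "<item repeat=\"0-1\">please</item> call "
                + "<one-of><item>home</item><item>the <ruleref uri=\"#place\"/></item></one-of>"
                + "</rule><rule id=\"place\"><one-of><item>office</item><item>bank</item></one-of></rule></grammar>";
        for (String phrase : new SrgsExpander(xml).expandRule("order")) System.out.println(phrase);
    }
}
```

For the small grammar in `main`, this yields six phrases, one per line, starting with `call home` and ending with `please call the bank`.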
## Deliverables
I will need the option of **not** expanding ruleref URIs if they exist, while still reporting any that were encountered.
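
As a rough illustration of the reporting side, the sketch below lists every external ruleref URI found in a grammar without following it. The class and method names are hypothetical, and "external" is assumed to mean any `uri` attribute that does not start with `#`.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical sketch: reports external ruleref URIs instead of expanding them. */
final class RulerefReport {
    static List<String> externalRefs(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse grammar", e);
        }
        // LinkedHashSet keeps first-seen order and drops repeated URIs.
        Set<String> found = new LinkedHashSet<String>();
        NodeList refs = doc.getElementsByTagName("ruleref");
        for (int i = 0; i < refs.getLength(); i++) {
            String uri = ((Element) refs.item(i)).getAttribute("uri");
            // Assumption: a uri starting with '#' is local; rulerefs without a uri are skipped.
            if (!uri.isEmpty() && !uri.startsWith("#")) found.add(uri);
        }
        return new ArrayList<String>(found);
    }

    public static void main(String[] args) {
        String xml = "<grammar><rule id=\"r\"><ruleref uri=\"#local\"/>"
                + "<ruleref uri=\"http://example.com/cities.grxml#city\"/>"
                + "<ruleref uri=\"names.grxml\"/></rule></grammar>";
        for (String uri : externalRefs(xml)) System.out.println("unexpanded ruleref: " + uri);
    }
}
```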
I will need to be able to limit the output to a given number of entries, since some grammars could result in millions of combinations. A command-line option should specify the maximum count.
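
One way to honour such a limit without first building millions of strings is to enumerate combinations one at a time and stop at the cap. The sketch below is an assumption about how that could be done, not a required design: it treats a phrase as a list of "slots", each with its alternatives (an empty string standing for an omitted optional word), and steps through them like an odometer. The names `LimitedCombinations` and `firstN` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: enumerates slot combinations lazily and stops after max entries. */
final class LimitedCombinations {
    static List<String> firstN(List<List<String>> slots, int max) {
        List<String> out = new ArrayList<String>();
        if (max <= 0) return out;
        for (List<String> slot : slots) {
            if (slot.isEmpty()) return out; // a slot with no alternatives yields no phrases
        }
        int[] idx = new int[slots.size()]; // current alternative chosen in each slot
        while (true) {
            StringBuilder phrase = new StringBuilder();
            for (int i = 0; i < idx.length; i++) {
                String word = slots.get(i).get(idx[i]);
                if (word.isEmpty()) continue; // optional slot left out
                if (phrase.length() > 0) phrase.append(' ');
                phrase.append(word);
            }
            out.add(phrase.toString());
            if (out.size() >= max) return out; // cap reached: stop before generating more
            // Advance like an odometer: the rightmost slot changes fastest.
            int pos = idx.length - 1;
            while (pos >= 0 && ++idx[pos] == slots.get(pos).size()) {
                idx[pos] = 0;
                pos--;
            }
            if (pos < 0) return out; // every combination has been produced
        }
    }

    public static void main(String[] args) {
        List<List<String>> slots = new ArrayList<List<String>>();
        slots.add(Arrays.asList("", "please"));
        slots.add(Arrays.asList("call"));
        slots.add(Arrays.asList("home", "work", "mobile"));
        for (String phrase : firstN(slots, 4)) System.out.println(phrase);
    }
}
```

With a cap of 4, the example in `main` prints `call home`, `call work`, `call mobile` and `please call home`, and never constructs the remaining two combinations.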
I will need to be able to specify a rule id to expand from the command line. If a root rule is specified in the grammar, it should be used as the default when no rule id is provided.
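
The fallback described here could be sketched as follows. The `--rule` flag name is purely an assumption for illustration (the actual option names are up to the implementer), as are the class and method names; the `root` attribute on `<grammar>` is where SRGS XML declares the root rule.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical sketch: picks the rule to expand from the command line or the grammar root. */
final class RuleSelection {
    static String chooseRule(String[] args, String xml) {
        // An explicit "--rule <id>" (assumed flag name) always wins.
        for (int i = 0; i + 1 < args.length; i++) {
            if ("--rule".equals(args[i])) return args[i + 1];
        }
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse grammar", e);
        }
        // Otherwise fall back to the root attribute of the <grammar> element.
        String root = doc.getDocumentElement().getAttribute("root");
        if (root.isEmpty()) {
            throw new IllegalArgumentException("No rule id given and the grammar declares no root rule");
        }
        return root;
    }

    public static void main(String[] args) {
        String xml = "<grammar root=\"order\"><rule id=\"order\">call home</rule></grammar>";
        System.out.println(chooseRule(args, xml));
    }
}
```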
I do not need the <tag> responses for a given phrase to be expanded, although if this is easy to provide I would be willing to consider additional payment for it, so please feel free to quote for it.
I only need to run this utility from the command line; however, if it is supplied as a Java application, an Eclipse 3.2 project would also be acceptable.
I will need the ability to resolve URIs in both file: and http: form. If https can be achieved without additional effort or cost, this would be a useful feature.
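
For reference, a minimal sketch of URI handling with the standard `java.net.URI` class is shown below. It assumes a ruleref is resolved relative to the URI of the grammar that contains it and that the fragment names the target rule; the class and method names are hypothetical. `URI.toURL().openStream()` relies on the JDK's built-in protocol handlers, which cover file:, http: and https:.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;

/** Hypothetical sketch: resolves ruleref URIs against the grammar that contains them. */
final class GrammarUris {
    // Resolves a ruleref's uri attribute relative to the URI of the current grammar file.
    static URI resolve(URI base, String ref) {
        return base.resolve(ref);
    }

    // The fragment is taken to be the rule id; null means no specific rule was named.
    static String ruleName(URI resolved) {
        return resolved.getFragment();
    }

    // The document to fetch is the resolved URI with its fragment removed.
    static URI documentOf(URI resolved) {
        String text = resolved.toString();
        int hash = text.indexOf('#');
        return hash < 0 ? resolved : URI.create(text.substring(0, hash));
    }

    // Opens the grammar document; the URI must be absolute (file:, http: or https:).
    static InputStream open(URI document) throws IOException {
        return document.toURL().openStream();
    }

    public static void main(String[] args) {
        URI base = URI.create("http://example.com/g/main.grxml");
        URI target = resolve(base, "cities.grxml#city");
        System.out.println(documentOf(target) + " rule=" + ruleName(target));
    }
}
```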
With regard to the SRGS standard, I will only be using XML form grammars. The ABNF format can be completely ignored.
The output should be regular text suitable for processing with *nix utilities or importing into Microsoft Office applications.
The use of 3rd party components (under GPL, ASF, GNU etc.) is perfectly acceptable if they are appropriate.
Knowledge of VoiceXML may be useful background.
The intention of this utility is to test grammar coverage.
Thanks for looking - Robert.