This page describes programs that will be made available upon the second release of the ANC. Beta test versions of the programs can be downloaded from here.
Since no tools currently support standoff annotations as used in the ANC, we provide several tools that convert the document content and standoff annotations into XML documents. We also provide several Gate processing resources that enable Gate to load and save ANC documents.
Gate was used extensively during the development of the ANC. Since Gate represents documents as annotation graphs, it is a natural companion to the ANC. These plugins permit Gate to load ANC documents just like any other document type.
A processing resource that will load standoff annotations from a cesAna document and add the annotations to the selected document.
A language resource that allows Gate to load the document content and one or more of the standoff annotations at the same time. This resource will appear in Gate under "Language Resource" -> "new" -> "XCES Document".
A processing resource that can be used to write the text content of a document to a file.
A processing resource that will save selected annotations to a standoff annotation (cesAna) XML file.
We also provide a SAX-like parser and a graphical front end to the parser that can be used to generate XML documents from the ANC files. Care should be taken, however, because the parser will produce invalid XML if asked to merge edge sets that contain overlapping annotations. It is up to the user to ensure that the content and annotations form a valid XML document. The logical markup is always loaded with the content and should always form a valid XML document. The other standoff annotations are optional; however, only one set of part-of-speech tags should be loaded at a time, as the token annotations of different sets are not guaranteed to be free of overlaps. These tools should be considered proof-of-concept quality rather than production-grade programs.
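To make the overlap condition concrete, the following is a minimal sketch of a check a user might run before merging annotation sets. It is not part of the ANC tools: the `Span` class, its fields, and the `isWellNested` method are hypothetical names introduced for illustration, and the sketch assumes each annotation is given as a pair of character offsets into the document text. Annotations can be written out as XML elements only if every pair of spans is either disjoint or fully nested.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class NestingCheck {
    /** Hypothetical annotation span: character offsets [start, end) into the text. */
    static final class Span {
        final String label;
        final int start;
        final int end;

        Span(String label, int start, int end) {
            this.label = label;
            this.start = start;
            this.end = end;
        }
    }

    /** Returns true if every pair of spans is either disjoint or nested (no crossing). */
    static boolean isWellNested(List<Span> spans) {
        List<Span> sorted = new ArrayList<>(spans);
        // Order by start offset; for equal starts put the longer span first,
        // so that it is treated as the enclosing element.
        sorted.sort((a, b) -> a.start != b.start
                ? Integer.compare(a.start, b.start)
                : Integer.compare(b.end, a.end));

        Deque<Span> open = new ArrayDeque<>();
        for (Span s : sorted) {
            // Close every span that ends at or before the start of this one.
            while (!open.isEmpty() && open.peek().end <= s.start) {
                open.pop();
            }
            // The innermost open span must contain s entirely; otherwise they cross.
            if (!open.isEmpty() && s.end > open.peek().end) {
                return false;
            }
            open.push(s);
        }
        return true;
    }

    public static void main(String[] args) {
        List<Span> nested = Arrays.asList(
                new Span("s", 0, 20), new Span("tok", 0, 5), new Span("tok", 6, 10));
        List<Span> crossing = Arrays.asList(
                new Span("s", 0, 20), new Span("tok", 0, 5), new Span("np", 3, 8));
        System.out.println("nested sets mergeable:   " + isWellNested(nested));
        System.out.println("crossing sets mergeable: " + isWellNested(crossing));
    }
}
```

A check of this kind could be applied to the combined spans of the annotation sets selected for merging; if it fails, loading fewer sets (for example, a single set of part-of-speech tags) is the simplest remedy.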
This is a package that provides a JAXP-like set of classes, including implementations of SAXParserFactory, SAXParser, and XMLReader. Note: this package does not implement the full JAXP API, just enough to accomplish simple tasks. However, it is robust enough to be used as the "XML parser" with Saxon to apply XSLT style sheets to ANC documents.
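As an illustration of how a JAXP-style XMLReader is handed to an XSLT processor, here is a minimal sketch that uses only the standard JDK classes. It makes several assumptions: the JDK's default SAXParserFactory and TransformerFactory stand in for the ANC factory and for Saxon, the class and method names (`JaxpPipelineSketch`, `transform`) are hypothetical, and the `tok` element and the tiny input document are invented for the example. With the ANC package, the factory obtained in the first step would presumably be the ANC implementation instead.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

class JaxpPipelineSketch {
    /** Parses xml with a SAX XMLReader and applies the given style sheet to it. */
    static String transform(String xml, String xslt) throws Exception {
        // Assumption: the JDK default factory is used here as a stand-in.
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // XSLT processing expects namespace-aware events
        SAXParser parser = factory.newSAXParser();
        XMLReader reader = parser.getXMLReader();

        // Compile the style sheet with whichever XSLT processor JAXP finds.
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));

        // SAXSource pairs the reader with the input; the transformer drives the reader.
        StringWriter out = new StringWriter();
        transformer.transform(
                new SAXSource(reader, new InputSource(new StringReader(xml))),
                new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Invented input and style sheet: print the text of every <tok> element.
        String xml = "<s><tok>Hello</tok><tok>world</tok></s>";
        String xslt =
              "<xsl:stylesheet version='1.0' "
            + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='text'/>"
            + "<xsl:template match='/'>"
            + "<xsl:for-each select='//tok'>"
            + "<xsl:value-of select='.'/><xsl:text> </xsl:text>"
            + "</xsl:for-each>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";
        System.out.println(transform(xml, xslt));
    }
}
```

The point of the design is that the transformer never needs to know where the SAX events come from, which is why a parser that implements only part of the JAXP API can still serve as the input side of an XSLT pipeline.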
This is a graphical front end to the above parsers that can be used to preprocess the ANC files into XML.
Xoro is a simple scripting language for Gate that was written during the development of the ANC. Xoro is not a deliverable of the ANC; however, it has proved so useful that we decided to make it available. Please note that Xoro is still at a very early stage of development. You can find more information on our Xoro page.