The NITE Object Model Library for Handling Structured Linguistic Annotation on Multimodal Data Sets

J. Carletta, J. Kilgour, T. O’Donnell, S. Evert, Holger Voormann

NLPXML

Abstract

The NITE Object Model Library is an implemented set of routines for loading, accessing, manipulating, and serializing linguistic data. It is similar in spirit to the data handling provided by the Annotation Graph Toolkit, but it is aimed at data that is heavily cross-annotated with structured information, and it therefore trades processing speed for greater expressivity. We describe our open-source implementation and the XML-based data storage format that it assumes, and we discuss the circumstances under which it is a useful addition to previous data-handling techniques.