Apache UIMA, a robust software architecture for natural language processing
UIMA, which is in the process of being standardized by OASIS, defines a software architecture for processing unstructured data. The Apache implementation of the standard, originally developed by IBM and continued by the Apache Software Foundation, offers new perspectives for the NLP field, especially regarding component reusability, prototyping, and corpus sharing.
We will briefly introduce the key concepts of UIMA and discuss its strengths, especially for natural language processing.
Nicolas Hernandez is an Associate Professor in Computer Science at the University of Nantes and a member of the LINA laboratory (UMR 6241 CNRS). His research interests revolve around semantic and discourse analysis, discourse modeling, knowledge acquisition from corpora, hybrid approaches (statistics and linguistics), and NLP applications such as summarization and reader and writer assistance. As part of his research activities, he is involved in various research projects, among them ANR PIITHIE, ANR C-Mantic, and Region Miles. He is a member of the editorial board of the Discours.revues.org journal. In December 2008 he was selected as an IBM Initiative award recipient for his proposal to build a French-speaking community around the Apache UIMA framework.
