POS Tagger

A POS (part-of-speech) tagger assigns a part of speech to each token. As shown in Figure 8.5, CLAMP-Cancer currently provides only one POS tagger, DF_OpenNLP_pos_tagger, designed specifically for clinical text. This tagger was built by retraining the OpenNLP POS tagger on a dataset of clinical notes, the MiPACQ corpus (http://clear.colorado.edu/compsem/index.php?page=endendsystems&sub=mipacq). Advanced users can use the config.conf file to change the default POS tagger model, mipacq_pos.bin.

Figure 8.5: DF_OpenNLP_pos_tagger and its configuration files
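
To make the idea of assigning a part of speech to each token concrete, the following is a minimal sketch of the input and output shape of a tagger. It is a toy lookup table, not the statistical OpenNLP model that DF_OpenNLP_pos_tagger uses; the names TOY_LEXICON and tag_tokens, the example sentence, and the Penn Treebank-style tags are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical mini-lexicon with Penn Treebank-style tags.
# This is an illustration only; a real tagger such as the retrained
# OpenNLP model predicts tags from context instead of a fixed table.
TOY_LEXICON: Dict[str, str] = {
    "the": "DT",
    "patient": "NN",
    "denies": "VBZ",
    "chest": "NN",
    "pain": "NN",
    ".": ".",
}


def tag_tokens(tokens: List[str], default_tag: str = "NN") -> List[Tuple[str, str]]:
    """Pair each token with a tag; unknown tokens fall back to default_tag."""
    return [(tok, TOY_LEXICON.get(tok.lower(), default_tag)) for tok in tokens]


if __name__ == "__main__":
    # Tokens would normally come from the tokenizer component of the pipeline.
    for token, tag in tag_tokens(["The", "patient", "denies", "chest", "pain", "."]):
        print(f"{token}\t{tag}")
```

The sketch takes a list of tokens and returns one (token, tag) pair per token, which is the same shape of result a POS tagger component produces for the downstream components of a pipeline.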