Twitter Part-Of-Speech Tagger
Year: 2013
Journal: Recent Advances in Natural Language Processing
Languages: English
Programming languages: Java
Input data: Tweets
Output data: Tags
Project website: https://gate.ac.uk/wiki/twitter-postagger.html
Part-of-speech information is a pre-requisite in many NLP algorithms. However, Twitter text is difficult to part-of-speech tag: it is noisy, with linguistic errors and idiosyncratic style. Further, we present a novel approach to system combination for the case where available taggers use different tagsets, based on vote-constrained bootstrapping with unlabeled data.
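The idea of vote-constrained bootstrapping across different tagsets can be illustrated with a minimal Java sketch. It is not the tagger's actual implementation. It assumes that a fine-grained tag can be mapped onto a coarser one, and that an unlabeled tweet is kept as extra training data only when two taggers agree on every token after mapping. The class name, method names, and the small tag mapping are all hypothetical and chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: names and the tag mapping are assumptions,
// not taken from the tagger's source code.
class VoteConstrainedBootstrap {

    // Hypothetical, partial mapping from fine-grained (PTB-style) tags
    // to a coarser tagset used by a second tagger.
    static final Map<String, String> FINE_TO_COARSE = Map.of(
        "NN", "N", "NNS", "N",
        "VB", "V", "VBZ", "V", "VBD", "V",
        "JJ", "A",
        "PRP", "O",
        "DT", "D");

    // Two votes are compatible when the fine tag maps onto the coarse tag.
    // Unmapped fine tags are treated as incompatible.
    static boolean compatible(String fineTag, String coarseTag) {
        return coarseTag.equals(FINE_TO_COARSE.get(fineTag));
    }

    // A tweet is accepted only if the two taggers agree on every token.
    static boolean accept(List<String> fineTags, List<String> coarseTags) {
        if (fineTags.size() != coarseTags.size()) {
            return false;
        }
        for (int i = 0; i < fineTags.size(); i++) {
            if (!compatible(fineTags.get(i), coarseTags.get(i))) {
                return false;
            }
        }
        return true;
    }

    // Returns the indices of unlabeled tweets whose automatic taggings pass
    // the vote constraint and could be added to the training data.
    static List<Integer> selectForTraining(List<List<String>> fineOutputs,
                                           List<List<String>> coarseOutputs) {
        List<Integer> accepted = new ArrayList<>();
        for (int s = 0; s < fineOutputs.size(); s++) {
            if (accept(fineOutputs.get(s), coarseOutputs.get(s))) {
                accepted.add(s);
            }
        }
        return accepted;
    }
}
```

Under these assumptions, a tweet tagged PRP VBZ NN by the fine-grained tagger and O V N by the coarse one would be accepted, while the same tweet tagged O N N by the coarse tagger would be rejected because the second token's votes conflict. Requiring agreement on every token is a deliberately strict choice that trades the amount of bootstrapped data for cleaner labels.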