Year: 2013
Journal: Conference on Empirical Methods in Natural Language Processing
Languages: Arabic, Czech, English, German, Hungarian, Spanish
Programming languages: Java
Input data:

“unvocalized and pretokenized transliterations as input”

Output data:


In this paper, we demonstrate that fast and accurate CRF training and tagging is possible for large tagsets of even thousands of tags by approximating the CRF objective function using coarse-to-fine decoding. Our pruned CRF (PCRF) model has much smaller runtime than higher-order CRF models and may thus lead to an even broader application of CRFs across NLP tagging tasks.
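
The coarse-to-fine idea mentioned above can be illustrated with a minimal Java sketch. This is not the authors' implementation. It assumes a simplified setting in which a cheap zero-order pass computes per-position tag posteriors, tags below a threshold are pruned, and first-order Viterbi decoding then runs only over the surviving tags. The class name `CoarseToFineSketch`, the methods `pruneTags` and `viterbi`, the score matrices, and the threshold value are all hypothetical and chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: zero-order pruning followed by first-order Viterbi.
// All names and the scoring setup are assumptions, not taken from the paper.
final class CoarseToFineSketch {

    // Coarse pass: softmax over each position's unary scores,
    // keeping every tag whose posterior is at least `threshold`.
    public static List<List<Integer>> pruneTags(double[][] unary, double threshold) {
        List<List<Integer>> kept = new ArrayList<>();
        for (double[] scores : unary) {
            double max = Double.NEGATIVE_INFINITY;
            for (double s : scores) max = Math.max(max, s);
            double z = 0.0;
            for (double s : scores) z += Math.exp(s - max);
            List<Integer> candidates = new ArrayList<>();
            int best = 0;
            for (int t = 0; t < scores.length; t++) {
                if (scores[t] > scores[best]) best = t;
                if (Math.exp(scores[t] - max) / z >= threshold) candidates.add(t);
            }
            // Never leave a position without any candidate tag.
            if (candidates.isEmpty()) candidates.add(best);
            kept.add(candidates);
        }
        return kept;
    }

    // Fine pass: first-order Viterbi restricted to the surviving tags.
    // trans[p][t] is the score of moving from tag p to tag t.
    public static int[] viterbi(double[][] unary, double[][] trans,
                                List<List<Integer>> kept) {
        int n = unary.length;
        if (n == 0) return new int[0];
        double[][] score = new double[n][];
        int[][] back = new int[n][];
        for (int i = 0; i < n; i++) {
            List<Integer> cur = kept.get(i);
            score[i] = new double[cur.size()];
            back[i] = new int[cur.size()];
            for (int a = 0; a < cur.size(); a++) {
                int tag = cur.get(a);
                if (i == 0) {
                    score[i][a] = unary[i][tag];
                    back[i][a] = -1;
                    continue;
                }
                List<Integer> prev = kept.get(i - 1);
                double bestScore = Double.NEGATIVE_INFINITY;
                int bestIdx = 0;
                for (int b = 0; b < prev.size(); b++) {
                    double s = score[i - 1][b] + trans[prev.get(b)][tag];
                    if (s > bestScore) {
                        bestScore = s;
                        bestIdx = b;
                    }
                }
                score[i][a] = bestScore + unary[i][tag];
                back[i][a] = bestIdx;
            }
        }
        // Pick the best final candidate, then follow the back-pointers.
        int idx = 0;
        for (int a = 1; a < score[n - 1].length; a++) {
            if (score[n - 1][a] > score[n - 1][idx]) idx = a;
        }
        int[] path = new int[n];
        for (int i = n - 1; i >= 0; i--) {
            path[i] = kept.get(i).get(idx);
            idx = back[i][idx];
        }
        return path;
    }
}
```

With T tags per position, unpruned first-order decoding costs on the order of T² transitions per position, while the pruned lattice costs only the product of the surviving candidate counts at adjacent positions. That gap is why pruning matters for tagsets with thousands of tags. Calling `pruneTags` with a threshold of 0 keeps every tag, which gives an unpruned baseline for comparison.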

