MarMoT
Year: 2013
Venue: Conference on Empirical Methods in Natural Language Processing (EMNLP)
Languages: Arabic, Czech, English, German, Hungarian, Spanish
Programming languages: Java
Input data:
“unvocalized and pretokenized transliterations as input”
Output data:
Tags
In this paper, we demonstrate that fast and accurate CRF training and tagging are possible for large tagsets of even thousands of tags by approximating the CRF objective function using coarse-to-fine decoding. Our pruned CRF (PCRF) model has a much smaller runtime than higher-order CRF models and may thus lead to an even broader application of CRFs across NLP tagging tasks.
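The core idea of coarse-to-fine decoding can be illustrated with a minimal sketch: a cheap zeroth-order (per-token) model first prunes the tagset at each position, and the more expensive structured decoder then runs only over the surviving tags. The sketch below is an illustrative assumption, not MarMoT's actual implementation; the tags, scores, and threshold are toy values, and the fine pass is a plain first-order Viterbi rather than the paper's higher-order PCRF.

```python
import math

# Hedged sketch of coarse-to-fine pruning for CRF-style tagging.
# All tags, scores, and the threshold are hypothetical toy values.

def prune_tags(unigram_scores, threshold=0.1):
    """Coarse pass: keep, per token, only tags whose normalized
    zeroth-order posterior exceeds `threshold`."""
    pruned = []
    for scores in unigram_scores:
        z = sum(math.exp(s) for s in scores.values())
        pruned.append({t for t, s in scores.items() if math.exp(s) / z > threshold})
    return pruned

def viterbi(pruned, unigram_scores, bigram_scores):
    """Fine pass: first-order Viterbi restricted to the pruned lattice."""
    # DP table: per position, map tag -> (best score, backpointer)
    best = [{t: (unigram_scores[0][t], None) for t in pruned[0]}]
    for i in range(1, len(pruned)):
        col = {}
        for t in pruned[i]:
            prev_t, s = max(
                ((p, best[i - 1][p][0]
                  + bigram_scores.get((p, t), 0.0)
                  + unigram_scores[i][t]) for p in pruned[i - 1]),
                key=lambda x: x[1])
            col[t] = (s, prev_t)
        best.append(col)
    # Backtrace from the best final tag
    t = max(best[-1], key=lambda tag: best[-1][tag][0])
    path = [t]
    for i in range(len(best) - 1, 0, -1):
        t = best[i][t][1]
        path.append(t)
    return list(reversed(path))

# Toy example: "the dog runs" with a three-tag inventory.
unigram_scores = [
    {"DET": 2.0, "NOUN": -1.0, "VERB": -1.0},
    {"DET": -1.0, "NOUN": 2.0, "VERB": 0.5},
    {"DET": -1.0, "NOUN": 0.5, "VERB": 2.0},
]
bigram_scores = {("DET", "NOUN"): 1.0, ("NOUN", "VERB"): 1.0}

pruned = prune_tags(unigram_scores)
path = viterbi(pruned, unigram_scores, bigram_scores)
```

The speedup comes from the fine pass iterating only over surviving tag pairs: with a tagset of thousands of tags, pruning each position to a handful of candidates shrinks the transition work per token from millions of pairs to a few dozen.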