Keyword Extraction Given More Linguistic Knowledge
Year: 2003
Journal: Conference on Empirical Methods in Natural Language Processing
Languages: English
Programming languages: Shell
Input data: Plain text
Output data: Keywords
Project website: https://github.com/boudinfl/hulth-2003-pre
This paper discusses experiments on the automatic extraction of keywords from abstracts using a supervised machine learning algorithm. Its main finding is that adding linguistic knowledge to the representation (such as syntactic features), rather than relying only on statistics (such as term frequency and n-grams), yields better results when measured against keywords previously assigned by professional indexers.
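To make the evaluation setup concrete, here is a minimal Python sketch of a purely statistical baseline: it generates word n-gram candidates, counts term frequency, and scores the selected terms against manually assigned keywords using precision, recall and F1. This is not the paper's feature set or classifier. The function names (`ngram_candidates`, `evaluate`), the frequency threshold, and the toy abstract and gold keywords are all assumptions invented for illustration.

```python
from collections import Counter
import re


def ngram_candidates(text, max_n=3):
    """Count every word n-gram (n = 1..max_n) in the text.

    Simplification (assumption): tokens are lowercase alphanumeric runs,
    and n-grams may span sentence boundaries.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def evaluate(predicted, gold):
    """Precision, recall and F1 of predicted keywords against gold keywords."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, f1


# Toy abstract and toy "indexer" keywords, made up for this example only.
abstract = "Keyword extraction selects keywords. Keyword extraction uses term frequency."
gold_keywords = ["keyword extraction", "term frequency"]

counts = ngram_candidates(abstract)

# Naive statistical rule (assumption): keep bigrams that occur at least twice.
predicted = [
    term for term, tf in counts.items()
    if tf >= 2 and len(term.split()) == 2
]

precision, recall, f1 = evaluate(predicted, gold_keywords)
print(predicted, precision, recall, f1)
```

The frequency-only rule finds "keyword extraction" but misses "term frequency", which occurs only once. That is the kind of gap that additional features, such as the syntactic ones the paper argues for, are meant to close.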