Results 1 - 10 of 152
Named Entity Recognition in Wikipedia
"... Named entity recognition (NER) is used in many domains beyond the newswire text that comprises current gold-standard corpora. Recent work has used Wikipedia’s link structure to automatically generate near gold-standard annotations. Until now, these resources have only been evaluated on newswire corp ..."
Cited by 10 (0 self)
Augmenting Wikipedia with Named Entity Tags
"... Wikipedia is the largest organized knowledge repository on the Web, increasingly employed by natural language processing and search tools. In this paper, we investigate the task of labeling Wikipedia pages with standard named entity tags, which can be used further by a range of information extractio ..."
Cited by 19 (0 self)
Transforming Wikipedia into named entity training data
- In Proceedings of the Australasian Language Technology Association Workshop 2008, 2008
"... Statistical named entity recognisers require costly hand-labelled training data and, as a result, most existing corpora are small. We exploit Wikipedia to create a massive corpus of named entity annotated text. We transform Wikipedia’s links into named entity annotations by classifying the target ar ..."
Cited by 13 (2 self)
Entity ranking in Wikipedia
- In Proceedings of the 23rd Annual ACM Symposium on Applied Computing (SAC08), 2008
"... The traditional entity extraction problem lies in the ability of extracting named entities from plain text using natural language processing techniques and intensive training from large document collections. Examples of named entities include organisations, people, locations, or dates. There are man ..."
Cited by 19 (3 self)
Recall-Oriented Learning of Named Entities in Arabic Wikipedia
"... We consider the problem of NER in Arabic Wikipedia, a semisupervised domain adaptation setting for which we have no labeled training data in the target domain. To facilitate evaluation, we obtain annotations for articles in four topical groups, allowing annotators to identify domain-specific entity ..."
Cited by 7 (2 self)
Improved Text Categorisation for Wikipedia Named Entities
"... The accuracy of named entity recognition systems relies heavily upon the volume and quality of available training data. Improving the process of automatically producing such training data is an important task, as manual acquisition is both time consuming and expensive. We explore the use of a variety of machine learning algorithms for categorising Wikipedia articles, an initial step in producing the named entity training data. We were able to achieve a categorisation accuracy of 95% F-score over six coarse categories, an improvement of up to 5% F-score over previous methods."
Cited by 1 (0 self)
Using Wikipedia for Hierarchical Finer Categorization of Named Entities
"... Wikipedia is one of the largest growing structured resources on the Web and can be used as a training corpus in natural language processing applications. In this work, we present a method to categorize named entities under the hierarchical fine-grained categories provided by the Wikipedia ..."
Extracting geospatial entities from wikipedia
- IEEE International Conference on Semantic Computing, 2009
"... This paper addresses the challenge of extracting geospatial data from the article text of the English Wikipedia. In the first phase of our work, we create a training corpus and select a set of word-based features to train a Support Vector Machine (SVM) for the task of geospatial named entity recog ..."
Cited by 1 (1 self)
A Named Entity Labeler for German: exploiting Wikipedia and distributional clusters
"... Named Entity Recognition is a relatively well-understood NLP task, with many publicly available training resources and software for English. Other languages tend to be underserved in this area. For German, CoNLL-2003 provides training data, but there are no publicly available, ready-to-use tools. We ..."
Cited by 3 (3 self)
Named Entity Corpus Construction using Wikipedia and DBpedia Ontology
"... In this paper, we propose a novel method to automatically build a named entity corpus based on the DBpedia ontology. Since most named entity recognition systems require time- and effort-consuming annotation tasks for training data, work on NER has thus far been limited to certain languages like English that are resource-abundant in general. As an alternative, we suggest that the NE corpus generated by our proposed method can be used as training data. Our approach introduces Wikipedia as raw text and uses the DBpedia data set for named entity disambiguation. Our method is language ..."