Discovering Lexical Information by Tagging Arabic Newspaper Text
- University of Montreal
, 1998
Abstract - Cited by 4 (2 self)
In this paper we describe a system for building an Arabic lexicon automatically by tagging Arabic newspaper text. In this system we are using several techniques for tagging the words in the text and figuring out their types and their features. The major techniques that we are using are: finding phrases, analyzing the affixes of the words, and analyzing their patterns. Proper nouns are particularly difficult to identify in the Arabic language; we describe techniques for isolating them.
Semi-Supervised Named Entity Recognition: Learning to Recognize 100 Entity Types with Little Supervision
Cited by 4 (0 self)
Document Analysis at DFKI - Part 2: Information Extraction
, 1995
Abstract - Cited by 3 (1 self)
Document analysis is responsible for an essential progress in office automation. This paper is part of an overview about the combined research efforts in document analysis at DFKI. Common to all document analysis projects is the global goal of providing a high level electronic representation of documents in terms of iconic, structural, textual, and semantic information. These symbolic document descriptions enable an "intelligent" access to a document database. Currently there are three ongoing document analysis projects at DFKI: INCA, OMEGA, and PASCAL2000/PASCAL+. Although the projects pursue different goals in different application domains, they all share the same problems which have to be resolved with similar techniques. For that reason the activities in these projects are bundled to avoid redundant work. At DFKI we have divided the problem of document analysis into two main tasks, text recognition and information extraction, which themselves are divided into a set of s...
Algorithms
Abstract
The usual approach to named-entity detection is to learn extraction rules that rely on linguistic, syntactic, or document format patterns that are consistent across a set of documents. However, when there is no consistency among documents, it may be more effective to learn document-specific extraction rules. This paper presents a knowledge-based approach to learning rules for named-entity extraction. Document-specific extraction rules are created using a generate-and-test paradigm and a database of known named-entities. Experimental results show that this approach is effective on Web documents that are difficult for the usual methods.
promises
Abstract
This breakthrough method for sorting through reams of text, linking relevant information while ignoring the irrelevant, has stimulated research into natural language processing and

