An Empirical Study of Automated Dictionary Construction for Information Extraction in Three Domains (1996) [68 citations — 13 self]

Abstract:

In this paper, we describe experiments with AutoSlog in two additional domains: joint ventures and microelectronics. We compare the performance of AutoSlog across the three domains, discuss the lessons learned about the generality of this approach, and present results from two experiments which demonstrate that novice users can generate effective dictionaries using AutoSlog.

Preprint submitted to Elsevier Preprint 21 March

1 Introduction

Portability is a crucial concern for researchers in knowledge-based natural language processing (NLP). Knowledge-based NLP systems typically rely on a conceptual dictionary that has been manually encoded for a specific domain. Although knowledge-based systems have performed well on certain tasks (e.g., [2,4,5,11,16,23]), these systems will not be practical for real-world applications until the knowledge that they need can be acquired automatically.

We have developed a system called AutoSlog that generates conceptual dictionaries for information extraction automatically. Information extraction (IE) is essentially a form of text skimming, in which specific types of information are extracted from text. There has been a lot of work recently on information extraction in conjunction with the recent message understanding conferences [26--28]. Most information extraction systems rely on a manually encoded dictionary of extraction patterns (e.g., see [12,15,1]). Using AutoSlog, the UMass/MUC-4 system was the first system that could acquire domain-specific extraction patterns automatically [17,18]. In previous work, we showed that AutoSlog could create effective extraction patterns for the domain of terrorism [30]. A dictionary generated by AutoSlog for the terrorism domain achieved 98% of the performance of a handcrafted dictionary that required a...

Citations

2526 Induction of decision trees – Quinlan - 1986
1196 Building a large annotated corpus of English: the Penn Treebank – Marcus, Marcinkiewicz, et al. - 1993
523 Knowledge Acquisition via Incremental Concept Formation – Fisher - 1987
453 Explanation-based generalization: A unified view – Mitchell, Keller, et al. - 1986
322 Explanation-based learning: An alternative view – DeJong, Mooney - 1986
151 Automatically Constructing a Dictionary for Information Extraction Tasks – Riloff - 1993
144 Frequency Analysis of English Usage – Francis - 1982
111 Coping with ambiguity and unknown words through probabilistic models – Weischedel, Meteer, et al. - 1993
100 Information extraction as a basis for high-precision text classification – Riloff, Lehnert - 1994
65 ID5: An Incremental ID3 – Utgoff - 1988
63 An overview of the FRUMP system – DeJong - 1982
60 FOULUP: a program that figures out meanings of words from context – Granger - 1977
57 Construe/TIS: a system for content-based indexing of a database of news stories – Hayes, Weinstein - 1990
56 Script Application: Computer Understanding of Newspaper Stories, Research Report #116 – Cullingford - 1978
47 Symbolic/Subsymbolic Sentence Analysis: Exploiting the Best of Two Worlds – Lehnert - 1990
37 Retrieval performance in FERRET: a conceptual information retrieval system – Mauldin - 1991
35 Acquiring Lexical Knowledge from Text: A Case Study – Jacobs, Zernik - 1988
34 Subjective understanding: Computer models of belief systems – Carbonell - 1979
34 University of Massachusetts: Description of the CIRCUS system as used for MUC-4 – Lehnert, Cardie, et al. - 1992
31 Automatically deriving structured knowledge bases from on-line dictionaries – Dolan, Vanderwende, et al. - 1993
26 Structural patterns vs. string patterns for extracting semantic information from dictionaries – Montemagni, Vanderwende - 1992
24 Acquisition of Semantic Patterns for Information Extraction from Corpora – Kim, Moldovan - 1993
23 UMass/Hughes: Description of the CIRCUS system used for MUC-5 – Lehnert, McCarthy, et al. - 1993
23 Automatically constructing a dictionary for information extraction tasks – Riloff - 1993
20 Towards a Self-Extending Parser – Carbonell - 1979
16 University of Massachusetts: MUC-4 Test Results and Analysis – Lehnert, Cardie, et al. - 1992
8 SRI International: Description of the FASTUS System Used for MUC-4 – Hobbs, Appelt, et al. - 1992
8 Information extraction as a basis for high-precision text classification – Riloff, Lehnert - 1994
6 Information Extraction as a Basis for Portable Text Classification Systems – Riloff - 1994
6 ID5: An Incremental ID3 – Utgoff - 1988
5 University of Massachusetts: MUC-3 Test Results and Analysis – Lehnert, Williams, et al. - 1991
5 Automatically Acquiring Conceptual Patterns Without an Annotated Corpus – Riloff, Shoen - 1995
4 GE NLTOOLSET: Description of the System as Used for MUC-4 – Krupka, Jacobs, et al. - 1992
4 UMass/Hughes: Description of the CIRCUS System as Used for MUC-5 – Cardie, Peterson, et al. - 1993
3 A Dictionary Construction Experiment with Domain Experts – Riloff, Lehnert - 1993
1 BBN PLUM: Description of the PLUM System as Used for MUC-4 – Ayuso, Boisen, et al. - 1992
1 University of Massachusetts: MUC-4 Test Results and Analysis – Lehnert, Cardie, et al. - 1992
1 Information Extraction as a Basis for Portable Text Classification Systems – Riloff - 1994