An Empirical Study of Automated Dictionary Construction for Information Extraction in Three Domains (1996)
| Venue: | Artificial Intelligence |
| Citations: | 73 - 14 self |
BibTeX
@ARTICLE{Riloff96anempirical,
author = {Ellen Riloff},
title = {An Empirical Study of Automated Dictionary Construction for Information Extraction in Three Domains},
journal = {Artificial Intelligence},
year = {1996},
volume = {85},
pages = {101--134}
}
Abstract
In this paper, we describe experiments with AutoSlog in two additional domains: joint ventures and microelectronics. We compare the performance of AutoSlog across the three domains, discuss the lessons learned about the generality of this approach, and present results from two experiments which demonstrate that novice users can generate effective dictionaries using AutoSlog.

1 Introduction

Portability is a crucial concern for researchers in knowledge-based natural language processing (NLP). Knowledge-based NLP systems typically rely on a conceptual dictionary that has been manually encoded for a specific domain. Although knowledge-based systems have performed well on certain tasks (e.g., [2,4,5,11,16,23]), these systems will not be practical for real-world applications until the knowledge that they need can be acquired automatically.

We have developed a system called AutoSlog that automatically generates conceptual dictionaries for information extraction. Information extraction (IE) is essentially a form of text skimming, in which specific types of information are extracted from text. Much recent work on information extraction has been done in conjunction with the Message Understanding Conferences [26--28]. Most information extraction systems rely on a manually encoded dictionary of extraction patterns (e.g., see [12,15,1]). Using AutoSlog, the UMass/MUC-4 system was the first system that could acquire domain-specific extraction patterns automatically [17,18]. In previous work, we showed that AutoSlog could create effective extraction patterns for the domain of terrorism [30]. A dictionary generated by AutoSlog for the terrorism domain achieved 98% of the performance of a handcrafted dictionary that required a...







