Results 1 - 10 of 2,437
Towards a Standard Upper Ontology, 2001
Cited by 589 (22 self)
"... The Suggested Upper Merged Ontology (SUMO) is an upper level ontology that has been proposed as a starter document for The Standard Upper Ontology Working Group, an IEEE-sanctioned working group of collaborators from the fields of engineering, philosophy, and information science. The SUMO provides definitions for general-purpose terms and acts as a foundation for more specific domain ontologies. In this paper we outline the strategy used to create the current version of the SUMO, discuss some of the challenges that we faced in constructing the ontology, and describe in detail its most general concepts ..."
Automatically Constructing a Dictionary for Information Extraction Tasks, 1993
Cited by 263 (22 self)
"... Knowledge-based natural language processing systems have achieved good success with certain tasks but they are often criticized because they depend on a domain-specific dictionary that requires a great deal of manual knowledge engineering. This knowledge engineering bottleneck makes knowledge-based ..."
Automatically Generating Extraction Patterns from Untagged Text
Department of Computer Science, Graduate School of Arts and Science, New York University, 1996
Cited by 373 (32 self)
"... Many corpus-based natural language processing systems rely on text corpora that have been manually annotated with syntactic or semantic tags. In particular, all previous dictionary construction systems for information extraction have used an annotated training corpus or some form of annotated input ..."
Constructing Biological Knowledge Bases by Extracting Information from Text Sources, 1999
Cited by 265 (0 self)
"... Recently, there has been much effort in making databases for molecular biology more accessible and interoperable. However, information in text form, such as MEDLINE records, remains a greatly underutilized source of biological information. We have begun a research effort aimed at automatically mapping information from text sources into structured representations, such as knowledge bases. Our approach to this task is to use machine-learning methods to induce routines for extracting facts from text. We describe two learning methods that we have applied to this task --- a statistical text ..."
DBpedia -- A Crystallization Point for the Web of Data, 2009
Cited by 374 (36 self)
"... The DBpedia project is a community effort to extract structured information from Wikipedia and to make this information accessible on the Web. The resulting DBpedia knowledge base currently describes over 2.6 million entities. For each of these entities, DBpedia defines a globally unique identifier ... of information and covers domains such as geographic information, people, companies, films, music, genes, drugs, books, and scientific publications. This article describes the extraction of the DBpedia knowledge base, the current status of interlinking DBpedia with other data sources on the Web, and gives ..."
Information extraction: Identifying protein names from biological papers
In Proceedings of the Pacific Symposium on Biocomputing '98 (PSB'98), 1998
Cited by 286 (7 self)
"... To solve the mystery of the life phenomenon, we must clarify when genes are expressed and how their products interact with each other. But since the amount of continuously updated knowledge on these interactions is massive and is only available in the form of published articles, an intelligent information extraction (IE) system is needed. To extract this information directly from articles, the system must firstly identify the material names. However, medical and biological documents often include proper nouns newly made by the authors, and conventional methods based on domain specific dictionaries ..."
Learning to Construct Knowledge Bases from the World Wide Web, 2000
Cited by 242 (5 self)
"... The World Wide Web is a vast source of information accessible to computers, but understandable only to humans. The goal of the research described here is to automatically create a computer understandable knowledge base whose content mirrors that of the World Wide Web. Such a knowledge base would enable much more effective retrieval of Web information, and promote new uses of the Web to support knowledge-based inference and problem solving. Our approach is to develop a trainable information extraction system that takes two inputs. The first is an ontology that defines the classes (e.g., company ..."
Integration of Heterogeneous Databases Without Common Domains Using Queries Based on Textual Similarity, 1998
Cited by 247 (13 self)
"... Most databases contain "name constants" like course numbers, personal names, and place names that correspond to entities in the real world. Previous work in integration of heterogeneous databases has assumed that local name constants can be mapped into an appropriate global domain by normalization. However, in many cases, this assumption does not hold; determining if two name constants should be considered identical can require detailed knowledge of the world, the purpose of the user's query, or both. In this paper, we reject the assumption that global domains can be easily constructed ..."
Clustering by compression
IEEE Transactions on Information Theory, 2005
Cited by 297 (25 self)
"... We present a new method for clustering based on compression. The method does not use subject-specific features or background knowledge, and works as follows: First, we determine a parameter-free, universal, similarity distance, the normalized compression distance or NCD, computed from the l ..."
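The normalized compression distance named in the entry above can be sketched with a few lines of Python. This is a minimal illustration, not the authors' implementation: it uses zlib as a stand-in for the ideal compressor, and all names and sample strings are hypothetical.

```python
import zlib


def clen(data: bytes) -> int:
    # Compressed length C(data) under zlib, approximating an ideal compressor.
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    # NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y))
    cx, cy, cxy = clen(x), clen(y), clen(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical sample data: two near-duplicate texts and one unrelated byte pattern.
a = b"the quick brown fox jumps over the lazy dog " * 20
b = b"the quick brown fox jumps over the lazy dog " * 19 + b"pack my box"
z = bytes(range(256)) * 4

# Similar inputs compress well together, so their NCD is smaller.
assert ncd(a, b) < ncd(a, z)
```

Because the distance depends only on compressed lengths, the same function works unchanged on any byte sequences, which is the feature-free property the abstract emphasizes.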
Automatic Acquisition of Domain Knowledge for Information Extraction
In Proceedings of the 18th International Conference on Computational Linguistics, 2000
Cited by 72 (6 self)
"... In developing an Information Extraction (IE) system for a new class of events or relations, one of the major tasks is identifying the many ways in which these events or relations may be expressed in text. This has generally involved the manual analysis and, in some cases, the annotation of large qua ..."