Results 1 - 10 of 2,437

Towards a Standard Upper Ontology

by Ian Niles, Adam Pease, 2001
"... The Suggested Upper Merged Ontology (SUMO) is an upper level ontology that has been proposed as a starter document for The Standard Upper Ontology Working Group, an IEEE-sanctioned working group of collaborators from the fields of engineering, philosophy, and information science. The SUMO provides definitions for general-purpose terms and acts as a foundation for more specific domain ontologies. In this paper we outline the strategy used to create the current version of the SUMO, discuss some of the challenges that we faced in constructing the ontology, and describe in detail its most general concepts ..."
Cited by 589 (22 self)

Automatically Constructing a Dictionary for Information Extraction Tasks

by Ellen Riloff, 1993
"... Knowledge-based natural language processing systems have achieved good success with certain tasks but they are often criticized because they depend on a domain-specific dictionary that requires a great deal of manual knowledge engineering. This knowledge engineering bottleneck makes knowledge-based ..."
Cited by 263 (22 self)

Automatically Generating Extraction Patterns from Untagged Text

by Ellen Riloff - Department of Computer Science, Graduate School of Arts and Science, New York University, 1996
"... Many corpus-based natural language processing systems rely on text corpora that have been manually annotated with syntactic or semantic tags. In particular, all previous dictionary construction systems for information extraction have used an annotated training corpus or some form of annotated input. ..."
Cited by 373 (32 self)

Constructing Biological Knowledge Bases by Extracting Information from Text Sources

by Mark Craven, Johan Kumlien, 1999
"... Recently, there has been much effort in making databases for molecular biology more accessible and interoperable. However, information in text form, such as MEDLINE records, remains a greatly underutilized source of biological information. We have begun a research effort aimed at automatically mapping information from text sources into structured representations, such as knowledge bases. Our approach to this task is to use machine-learning methods to induce routines for extracting facts from text. We describe two learning methods that we have applied to this task: a statistical text ..."
Cited by 265 (0 self)

DBpedia -- A Crystallization Point for the Web of Data

by Christian Bizer, Jens Lehmann, Georgi Kobilarov, Sören Auer, Christian Becker, Richard Cyganiak, Sebastian Hellmann, 2009
"... The DBpedia project is a community effort to extract structured information from Wikipedia and to make this information accessible on the Web. The resulting DBpedia knowledge base currently describes over 2.6 million entities. For each of these entities, DBpedia defines a globally unique identifier ... of information and covers domains such as geographic information, people, companies, films, music, genes, drugs, books, and scientific publications. This article describes the extraction of the DBpedia knowledge base, the current status of interlinking DBpedia with other data sources on the Web, and gives ..."
Cited by 374 (36 self)

Information extraction: Identifying protein names from biological papers

by K. Fukuda, T. Tsunoda, A. Tamura, T. Takagi - In Proceedings of the Pacific Symposium on Biocomputing '98 (PSB'98), 1998
"... To solve the mystery of the life phenomenon, we must clarify when genes are expressed and how their products interact with each other. But since the amount of continuously updated knowledge on these interactions is massive and is only available in the form of published articles, an intelligent information extraction (IE) system is needed. To extract these information directly from articles, the system must firstly identify the material names. However, medical and biological documents often include proper nouns newly made by the authors, and conventional methods based on domain specific dictionaries ..."
Cited by 286 (7 self)

Learning to Construct Knowledge Bases from the World Wide Web

by Mark Craven, Dan DiPasquo, Dayne Freitag, Andrew McCallum, Tom Mitchell, Kamal Nigam, Sean Slattery, 2000
"... The World Wide Web is a vast source of information accessible to computers, but understandable only to humans. The goal of the research described here is to automatically create a computer understandable knowledge base whose content mirrors that of the World Wide Web. Such a knowledge base would enable much more effective retrieval of Web information, and promote new uses of the Web to support knowledge-based inference and problem solving. Our approach is to develop a trainable information extraction system that takes two inputs. The first is an ontology that defines the classes (e.g., company ..."
Cited by 242 (5 self)

Integration of Heterogeneous Databases Without Common Domains Using Queries Based on Textual Similarity

by William W. Cohen, 1998
"... Most databases contain "name constants" like course numbers, personal names, and place names that correspond to entities in the real world. Previous work in integration of heterogeneous databases has assumed that local name constants can be mapped into an appropriate global domain by normalization. However, in many cases, this assumption does not hold; determining if two name constants should be considered identical can require detailed knowledge of the world, the purpose of the user's query, or both. In this paper, we reject the assumption that global domains can be easily constructed ..."
Cited by 247 (13 self)

Clustering by compression

by Rudi Cilibrasi, Paul M. B. Vitányi - IEEE Transactions on Information Theory, 2005
"... We present a new method for clustering based on compression. The method does not use subject-specific features or background knowledge, and works as follows: First, we determine a parameter-free, universal, similarity distance, the normalized compression distance or NCD, computed from the l ..."
Cited by 297 (25 self)

Automatic Acquisition of Domain Knowledge for Information Extraction

by Roman Yangarber, Ralph Grishman, Pasi Tapanainen - In Proceedings of the 18th International Conference on Computational Linguistics, 2000
"... In developing an Information Extraction (IE) system for a new class of events or relations, one of the major tasks is identifying the many ways in which these events or relations may be expressed in text. This has generally involved the manual analysis and, in some cases, the annotation of large qua ..."
Cited by 72 (6 self)

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University