Declaration (2008)

by Dean Williams
BibTeX

@MISC{Williams08declaration,
    author = {Dean Williams},
    title = {Declaration},
    year = {2008}
}

Abstract

Improving the ability of computer systems to process text is a significant research challenge. Many applications are based on partially structured databases, where structured data conforming to a schema is combined with free text. Information is stored as text in these applications because the queries required are not all known in advance – allowing for text is an attempt to capture information that could be relevant in the future but cannot be anticipated when the database schema is being designed. Text is also used due to the limitations of conventional databases, where the schema cannot easily be extended as new entity types and relationships arise in the future. Information Extraction (IE) is the process of finding instances of pre-defined entity types within text, while Data Integration systems build a virtual global schema from available structured data sources. We argue that combining techniques from IE and data integration is a promising approach for supporting applications that access partially structured data: the virtual global schema and associated metadata can be used to partially configure an IE process, and the information extracted by the IE process can then be integrated into the virtual global database, supporting queries which could not otherwise be answered. In this thesis we describe the design and implementation of the Experimental System To Extract Structure from Text (ESTEST) that investigates this approach. We give examples of its use and experimental results from a number of application domains.
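
The general idea in the abstract, using structured data to configure extraction and then folding the extracted facts back into a queryable store, can be illustrated with a toy sketch. This is not ESTEST's implementation: the entity types, instance names, sample records, and all function names below are hypothetical, and a simple dictionary lookup stands in for a real IE pipeline.

```python
import re
from collections import defaultdict

# Hypothetical structured source: entity type -> known instance names.
# In the approach the abstract describes, this kind of information would
# come from the global schema and its metadata.
STRUCTURED_DATA = {
    "vehicle": {"car", "lorry", "bus"},
    "road": {"A1", "M25"},
}


def build_gazetteer(structured_data):
    """Map each known instance name (lower-cased) to its entity type."""
    return {
        instance.lower(): entity_type
        for entity_type, instances in structured_data.items()
        for instance in instances
    }


def extract_entities(text, gazetteer):
    """Return (entity_type, token) pairs for tokens found in the gazetteer."""
    tokens = re.findall(r"[A-Za-z0-9]+", text)
    return [(gazetteer[t.lower()], t) for t in tokens if t.lower() in gazetteer]


def integrate(records, gazetteer):
    """Index record ids by (entity_type, value) so the text becomes queryable."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for entity_type, token in extract_entities(text, gazetteer):
            index[(entity_type, token.lower())].add(record_id)
    return index


# Hypothetical free-text fields attached to otherwise structured records.
records = {
    1: "A lorry overturned on the M25 near junction 10.",
    2: "Bus collided with a car on the A1.",
    3: "Pedestrian slipped on ice.",
}

gazetteer = build_gazetteer(STRUCTURED_DATA)
index = integrate(records, gazetteer)


def records_mentioning(entity_type, value):
    """Answer a query that the structured fields alone could not."""
    return sorted(index.get((entity_type, value.lower()), set()))


print(records_mentioning("road", "M25"))
print(records_mentioning("vehicle", "car"))
```

The sketch keeps the two halves separate on purpose: `build_gazetteer` is the "configure extraction from structured data" step, and `integrate` is the "merge extracted facts back" step, so either could be swapped for something more capable without touching the other.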
