Automatically Refining the Wikipedia Infobox Ontology (2008)

by Fei Wu, Daniel S. Weld
Citations: 43 (7 self)

BibTeX

@MISC{Wu08automaticallyrefining,
    author = {Fei Wu and Daniel S. Weld},
    title = {Automatically Refining the Wikipedia Infobox Ontology},
    year = {2008}
}


Abstract

The combined efforts of human volunteers have recently extracted numerous facts from Wikipedia, storing them as machine-harvestable object-attribute-value triples in Wikipedia infoboxes. Machine learning systems, such as Kylin, use these infoboxes as training data, accurately extracting even more semantic knowledge from natural language text. But in order to realize the full power of this information, it must be situated in a cleanly structured ontology. This paper introduces KOG, an autonomous system for refining Wikipedia’s infobox-class ontology towards this end. We cast the problem of ontology refinement as a machine learning problem and solve it using both SVMs and a more powerful joint-inference approach expressed in Markov Logic Networks. We present experiments demonstrating the superiority of the joint-inference approach and evaluating other aspects of our system. Using these techniques, we build a rich ontology, integrating Wikipedia’s infobox-class schemata with WordNet. We demonstrate how the resulting ontology may be used to enhance Wikipedia with improved query processing and other features.
