The role of documents vs. queries in extracting class attributes from text, in CIKM (2007)

by Marius Pasca, Benjamin Van Durme, Nikesh Garera

Biperpedia: An Ontology for Search Applications

by Rahul Gupta, Alon Halevy, Xuezhi Wang, Steven Euijong Whang, Fei Wu
Abstract - Cited by 9 (2 self)
Search engines make significant efforts to recognize queries that can be answered by structured data and invest heavily in creating and maintaining high-precision databases. While these databases have a relatively wide coverage of entities, the number of attributes they model (e.g., GDP, CAPITAL, ANTHEM) is relatively small. Extending the number of attributes known to the search engine can enable it to more precisely answer queries from the long and heavy tail, extract a broader range of facts from the Web, and recover the semantics of tables on the Web. We describe Biperpedia, an ontology with 1.6M (class, attribute) pairs and 67K distinct attribute names. Biperpedia extracts attributes from the query stream, and then uses the best extractions to seed attribute extraction from text. For every attribute Biperpedia saves a set of synonyms and text patterns in which it appears, thereby enabling it to recognize the attribute in more contexts. In addition to a detailed analysis of the quality of Biperpedia, we show that it can increase the number of Web tables whose semantics we can recover by more than a factor of 4 compared with Freebase.

Citation Context

...sticated noun-phrase recognition is required. 10. RELATED WORK We have already touched on several related works throughout the paper. We focus here on other attribute extraction efforts. Pasca et al. [21, 22] were the first to explore the use of query stream data to generate attributes for entities, and their main result was that the query stream yields 45% more accurate extractions of attributes than tex...

Class-driven attribute extraction

by Ting Qian, Lenhart K. Schubert, Benjamin Van Durme - In Proceedings of the 22nd International Conference on Computational Linguistics (COLING-08), 2008
Abstract - Cited by 6 (1 self)
Abstract not found

Enhancing Search with Structure

by Soumen Chakrabarti, Sunita Sarawagi, S. Sudarshan
Abstract - Cited by 6 (0 self)
Abstract not found

Citation Context

...s denoting attributes like dimensions and crew size. The columns of the table are either specified by the user or inferred by referring to an offline extracted knowledge base of attributes and classes [39]. Typically, such tables are constructed from noisy consolidation of information from multiple sources. So, the cells of the table contain multiple ranked plausible answers. 4.2 Query structure Severa...

Attribute extraction and scoring: A probabilistic approach.

by Taesung Lee, Zhongyuan Wang, Haixun Wang, Seung-Won Hwang - In ICDE, 2013
Abstract - Cited by 6 (3 self)
Knowledge bases, which consist of concepts, entities, attributes and relations, are increasingly important in a wide range of applications. We argue that knowledge about attributes (of concepts or entities) plays a critical role in inferencing. In this paper, we propose methods to derive attributes for millions of concepts and we quantify the typicality of the attributes with regard to their corresponding concepts. We employ multiple data sources such as web documents, search logs, and existing knowledge bases, and we derive typicality scores for attributes by aggregating different distributions derived from different sources using different methods. To the best of our knowledge, ours is the first approach to integrate concept- and instance-based patterns into probabilistic typicality scores that scale to a broad concept space. We have conducted extensive experiments to show the effectiveness of our approach.

Citation Context

... (family resemblance) With this intuition, population is a typical attribute of a country, as such pairs are frequently observed in CB and IB lists. Further, population is a typical attribute of a country, as most country instances, such as China or Germany, share the same attribute population. This intuition can again justify using both CB and IB for quantifying P(a|c). We can observe frequency from both streams and using IB enables us to consider the resemblance across instances. In contrast, existing methods consider either one or none: [7], [10], [11], [12] consider only frequency, while [13] considers only resemblance. [14], [15] consider neither, and instead use contextual similarity of the extracted attributes with the seed attributes given a concept. In the following sections, we see how we substantialize frequency view from a CB list, and frequency and resemblance view from IB lists. B. Computing Typicality from a CB List Recall that a CB list is in the format (c, a, n(c, a)). Grouping this list by c, we can obtain a list of attributes observed about c and their frequency distribution. Given this information, typicality score P(a|c) can be straightforwardly obtained by norma...
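
The grouping step described in the excerpt above can be illustrated with a minimal sketch. The excerpt breaks off at "norma...", so reading the final step as plain normalization of counts per concept is an assumption, not a confirmed detail of the paper. The function name `typicality_from_cb_list` and the sample triples are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def typicality_from_cb_list(cb_list):
    """Estimate P(a|c) from (c, a, n(c, a)) triples.

    Assumption: P(a|c) = n(c, a) / sum of n(c, a') over all attributes a'
    observed for concept c. The source excerpt is truncated before it
    states the exact formula.
    """
    counts = defaultdict(lambda: defaultdict(float))  # counts[c][a] = n(c, a)
    totals = defaultdict(float)                       # totals[c] = sum_a n(c, a)

    # Group the list by concept c and accumulate attribute frequencies.
    for concept, attribute, n in cb_list:
        counts[concept][attribute] += n
        totals[concept] += n

    # Normalize each concept's frequency distribution so it sums to 1.
    return {
        concept: {a: n / totals[concept] for a, n in attrs.items()}
        for concept, attrs in counts.items()
        if totals[concept] > 0
    }


# Hypothetical sample data, not taken from the paper.
example_cb_list = [
    ("country", "population", 6),
    ("country", "capital", 3),
    ("country", "anthem", 1),
    ("city", "population", 2),
]
scores = typicality_from_cb_list(example_cb_list)
```

Here `scores["country"]` holds one probability per attribute observed for that concept. A sketch like this covers only the frequency view from a CB list; the resemblance view from IB lists mentioned in the excerpt is not modeled.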

Acquiring knowledge about human goals from search query logs

by Markus Strohmaier, Mark Kröll - Information Processing and Management, 2011
Abstract - Cited by 3 (2 self)
A better understanding of what motivates humans to perform certain actions is relevant for a range of research challenges including generating action sequences that implement goals (planning). A first step in this direction is the task of acquiring knowledge about human goals. In this work, we investigate whether Search Query Logs are a viable source for extracting expressions of human goals. For this purpose, we devise an algorithm that automatically identifies queries containing explicit goals such as "find home to rent in Florida". In our evaluation, the algorithm achieves useful precision/recall values. We apply the classification algorithm to two large Search Query Logs, recorded by AOL and Microsoft Research in 2006, and obtain a set of ∼110.000 queries containing explicit goals. To study the nature of human goals in Search Query Logs, we conduct qualitative, quantitative and comparative analyses. Our findings suggest that Search Query Logs (i) represent a viable source for extracting human goals, (ii) contain a great variety of human goals and (iii) contain human goals that can be employed to complement existing commonsense knowledge bases. Finally, we illustrate the potential of goal knowledge for addressing the following application scenario: to refine and extend commonsense knowledge with human goals from Search Query Logs. This work is relevant for (i) knowledge engineers interested in acquiring human goals from textual corpora and constructing knowledge bases of human goals and (ii) researchers interested in studying characteristics of human goals in Search Query Logs.

A Scalable Machine-Learning Approach for Semi-Structured Named Entity Recognition

by Utku Irmak, Reiner Kraft, 2010
Abstract - Cited by 3 (0 self)
Named entity recognition studies the problem of locating and classifying parts of free text into a set of predefined categories. Although extensive research has focused on the detection of person, location and organization entities, there are many other entities of interest, including phone numbers, dates, times and currencies (to name a few examples). We refer to these types of entities as semi-structured named entities, since they usually follow certain syntactic formats according to some conventions, although their structure is typically not well-defined. Regular expression solutions require a significant amount of manual effort, and supervised machine learning approaches rely on large sets of labeled training data. Therefore, these approaches do not scale when we need to support many semi-structured entity types in many languages and regions. In this paper, we study this problem and propose a novel three-level bootstrapping framework for the detection of semi-structured entities. We describe the proposed techniques for phone, date and time entities, and perform extensive evaluations on English, German, Polish, Swedish and Turkish documents. Despite the minimal input from the user, our approach can achieve 95% precision and ...

Studying Databases of Intentions: Do Search Query Logs Capture Knowledge about Common Human Goals?

by Markus Strohmaier, Mark Kröll
Abstract - Cited by 2 (1 self)
Access to knowledge about common human goals has been found critical for realizing the vision of intelligent agents acting upon user intent on the web. Yet, the acquisition of knowledge about common human goals represents a major challenge. In a departure from existing approaches, this paper investigates a novel resource for knowledge acquisition: the utilization of search query logs for this task. By relating goals contained in search query logs with goals contained in existing commonsense knowledge bases such as ConceptNet, we aim to shed light on the usefulness of search query logs for capturing knowledge about common human goals. The main contribution of this paper consists of insights generated from an empirical study comparing common human goals contained in two large search query logs (AOL and Microsoft Research) with goals contained in the commonsense knowledge base ConceptNet. The paper sketches ways in which goals from search query logs could be used to address the goal acquisition and goal coverage problem related to commonsense knowledge bases.

Citation Context

... as “buying a house”) that can be analyzed, studied and used for different purposes. While search query logs have been utilized successfully for knowledge acquisition in a range of different contexts [18], they have not been used to capture explicit knowledge about human goals, partly because query logs pose a number of ch... (Footnote 1: http://battellemedia.com/archives/000063.php, last accessed on April 15, 2009.)

Instance Sense Induction from Attribute Sets

by Ricardo Martin-Brualla, Enrique Alfonseca, Marius Pasca, Keith Hall, Enrique Robledo-Arnuncio, Massimiliano Ciaramita
Abstract - Cited by 1 (0 self)
This paper investigates the new problem of automatic sense induction for instance names using automatically extracted attribute sets. Several clustering strategies and data sources are described and evaluated. We also discuss the drawbacks of the evaluation metrics commonly used in similar clustering tasks. The results show improvements in most metrics with respect to the baselines, especially for polysemous instances.

Notes on the Acquisition of Conditional Knowledge

by Benjamin Van Durme, 2008
Abstract - Cited by 1 (1 self)
Supported by NSF grants IIS-0328849 and IIS-0535105. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the ...
Research in Information Extraction has been overly focused on the extraction of facts concerning individuals as compared to general knowledge pertaining to classes of entities and events. In addition, preference has been given to simple techniques in order to enable high volume throughput. In what follows we give examples of existing work in the field of knowledge acquisition, then follow with ideas on areas for exploration beyond the current state of the art, specifically with respect to the extraction of conditional knowledge, making use of deeper linguistic analysis than is currently ...

1. EXTRACTION OF ATTRIBUTES

by Joseph Reisinger
Abstract
As an alternative to previous studies on extracting class attributes from unstructured text, which consider either Web documents or query logs as the source of textual data, a bootstrapped method extracts class attributes simultaneously from both sources, using a small set of seed attributes. The method improves extraction precision and also improves attribute relevance across 40 test classes.

Citation Context

...across all ranks. Extraction from Relevant Web Documents: As a specific application of combining multiple textual data sources for attribute extraction, we use attributes extracted from query logs in [3] as noisy supervision for finding attributes in relevant Web documents. Web documents relevant to a particular class are found by performing search queries for each instance of that class and collecti...
