M.: Extracting relations in social networks from the web using similarity between collective contexts
- In: Proceedings of the 5th International Semantic Web Conference (ISWC 2006). Volume 4273 of LNCS., Athens, GA, Springer (2006) 487 – 500
Cited by 9 (1 self)
Abstract. Social networks have recently garnered considerable interest. With the intention of utilizing social networks for the Semantic Web, several studies have examined automatic extraction of social networks. However, most methods have addressed only the extraction of the strength of relations. Our goal is to extract the underlying relations between entities that are embedded in social networks. To this end, we propose a method that automatically extracts labels that describe relations among entities. Fundamentally, the method clusters similar entity pairs according to their collective contexts in Web documents. The descriptive labels for relations are obtained from the results of clustering. The proposed method is entirely unsupervised and is easily incorporated into existing social network extraction methods. Our method also contributes to ontology population by elucidating relations between instances in social networks. Our experiments conducted on entities in political social networks achieved clustering with high precision and recall. We extracted appropriate relation labels to represent the entities.
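The clustering step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words context vectors, the cosine threshold, the greedy single-link grouping, the most-frequent-term labeling, and all entity pairs and context words below are assumptions chosen only for illustration.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_pairs(contexts, threshold=0.5):
    """Greedy single-link clustering of entity pairs by context similarity.

    `contexts` maps an entity pair to the words collected around its
    co-occurrences (hypothetical input; the threshold is an assumption).
    """
    clusters = []
    for pair, words in contexts.items():
        vec = Counter(words)
        for cluster in clusters:
            if any(cosine(vec, other) >= threshold for _, other in cluster):
                cluster.append((pair, vec))
                break
        else:
            # No existing cluster is similar enough: start a new one.
            clusters.append([(pair, vec)])
    return clusters


def label(cluster):
    """Use the most frequent context term in a cluster as its relation label."""
    total = Counter()
    for _, vec in cluster:
        total.update(vec)
    return total.most_common(1)[0][0]


# Made-up entity pairs and context words, for illustration only.
contexts = {
    ("A", "B"): ["rival", "election", "rival", "debate"],
    ("C", "D"): ["rival", "debate", "campaign"],
    ("E", "F"): ["minister", "cabinet", "minister", "appointed"],
    ("G", "H"): ["cabinet", "minister", "appointed"],
}
clusters = cluster_pairs(contexts)
labels = [label(c) for c in clusters]
```

With this toy input the pairs fall into two groups, one labeled by a rivalry term and one by a cabinet term; real contexts would need weighting and stop-word handling that this sketch omits.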
Wanderlust: Extracting Semantic Relations from Natural Language Text Using Dependency Grammar Patterns
, 2009
Cited by 3 (0 self)
A great share of applications in modern information technology can benefit from large-coverage, machine-accessible knowledge bases. However, the bigger part of today's knowledge is provided in the form of unstructured data, mostly plain text. As an initial step to exploit such data, we present Wanderlust, an algorithm that automatically extracts semantic relations from natural language text. The procedure uses deep linguistic patterns that are defined over the dependency grammar of sentences. Due to its linguistic nature, the method performs in an unsupervised fashion and is not restricted to any specific type of semantic relation. The applicability of the proposed approach is examined in a case study, in which it is put to the task of generating a semantic wiki from the English Wikipedia corpus. We present an exhaustive discussion about the insights obtained from this particular case study including considerations about the generality of the approach.
Coupled Temporal Scoping of Relational Facts
Cited by 2 (0 self)
Recent research has made significant advances in automatically constructing knowledge bases by extracting relational facts (e.g., Bill Clinton-presidentOf-US) from large text corpora. Temporally scoping such relational facts in the knowledge base (i.e., determining that Bill Clinton-presidentOf-US is true only during the period 1993-2001) is an important, but relatively unexplored problem. In this paper, we propose a joint inference framework for this task, which leverages fact-specific temporal constraints, and weak supervision in the form of a few labeled examples. Our proposed framework, CoTS (Coupled Temporal Scoping), exploits temporal containment, alignment, succession, and mutual exclusion constraints among facts from within and across relations. Our contribution is multi-fold. Firstly, while most previous research has focused on micro-reading approaches for temporal scoping, we pose it in a macro-reading fashion, as a change detection in a time series of facts’ features computed from a large number of documents. Secondly, to the best of our knowledge, there is no other work that has used joint inference for temporal scoping. We show that joint inference is effective compared to doing temporal scoping of individual facts independently. We conduct our experiments on large-scale, open-domain, publicly available time-stamped datasets, such as English Gigaword Corpus and Google Books Ngrams, demonstrating CoTS’s effectiveness.
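The idea of treating temporal scoping as change detection over a time series can be sketched in a few lines. This is a much simpler stand-in than CoTS: it scopes a single fact in isolation, uses no joint inference or constraints, and simply returns the longest run of consecutive years whose mention count stays above a fraction of the peak. The function name, the `ratio` parameter, and the yearly counts are all hypothetical.

```python
def scope_fact(counts, ratio=0.5):
    """Return (first_year, last_year) of the longest run of consecutive
    years whose count is at least `ratio` times the peak count.

    `counts` maps year -> number of documents mentioning the fact
    (hypothetical input). Returns None if there is no signal.
    """
    if not counts:
        return None
    peak = max(counts.values())
    if peak <= 0:
        return None
    cutoff = ratio * peak
    best = None
    start = prev = None
    for year in sorted(counts):
        if counts[year] >= cutoff:
            # Start a new run unless this year directly follows the last one.
            if start is None or year != prev + 1:
                start = year
            prev = year
            if best is None or prev - start > best[1] - best[0]:
                best = (start, prev)
        else:
            start = prev = None
    return best


# Made-up yearly mention counts for one fact, for illustration only.
mentions = {
    1990: 2, 1991: 3, 1992: 40, 1993: 120, 1994: 110, 1995: 100,
    1996: 130, 1997: 115, 1998: 140, 1999: 105, 2000: 95, 2001: 70,
    2002: 20, 2003: 10,
}
scope = scope_fact(mentions)
```

On these invented counts the detected scope is 1993 to 2001. The abstract's point is that scoping facts jointly, under containment, succession, and mutual exclusion constraints, works better than this kind of independent per-fact decision.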
Applications of probabilistic constraints
, 2007
Cited by 1 (0 self)
Relational database systems are a successful platform to manage large amounts of data, but do not cope well with uncertainty. However, the amount of uncertain data is growing at an unprecedented rate from both traditional sources (e.g. integrating enterprise data) and from next generation sources (e.g. information extraction). This trend has prompted the database community to investigate a promising new technique, probabilistic databases, which natively handle uncertainty. In this nascent area, it is an open question which techniques from traditional database management apply. A remarkably useful technique in standard relational databases is to allow users to enrich the semantics of their data by declaring constraints. Two traditional uses of constraints are to prevent errors while updating the data and to optimize queries. More recently, constraints provided an elegant solution to the problem of data exchange. These successes give us reason to believe that constraints will play a large role in the theory and implementation of probabilistic databases. This report proposes to generalize constraints to handle uncertainty in the data and the constraints themselves. We identify several traditional and emerging applications that are naturally modeled with probabilistic constraints.
SCAD: Collective Discovery of Attribute Values
Cited by 1 (1 self)
Search engines today offer a rich user experience, no longer restricted to “ten blue links”. For example, the query “Canon EOS Digital Camera” returns a photo of the digital camera, and a list of suitable merchants and prices. Similar results are offered in other domains like food, entertainment, travel, etc. All these experiences are fueled by the availability of structured data about the entities of interest. To obtain this structured data, it is necessary to solve the following problem: given a category of entities with its schema, and a set of Web pages that mention and describe entities belonging to the category, build a structured representation for the entity under the given schema. Specifically, collect structured numerical or discrete attributes of the entities. Most previous approaches regarded this as an information extraction problem on individual documents, and made no special use of numerical attributes. In contrast, we present an end-to-end framework which leverages signals not only from the Web page context, but also from a collective analysis of all the pages corresponding to an entity, and from constraints related to the actual values within the domain. Our current implementation uses a general and flexible Integer Linear Program (ILP) to integrate all these signals into holistic decisions over all attributes. There is one ILP per entity and it is small enough to be solved in under 38 milliseconds in our experiments. We apply the new framework to a setting of significant practical importance: catalog expansion for Commerce search engines, using data from Bing Shopping. Finally, we present experiments that validate the effectiveness of the framework and its superiority to local extraction.
Semantic Labeling of Compound Nominalization in Chinese
This paper discusses the semantic interpretation of compound nominalizations in Chinese. We propose four coarse-grained semantic roles of the noun modifier and use a Maximum Entropy Model to label such relations in a compound nominalization. The feature functions used for the model are web-based statistics acquired via role-related paraphrase patterns, which are formed by a set of word instances of prepositions, support verbs, feature nouns and aspect markers. By applying a sub-linear transformation and discretization of the raw statistics, a rate of approximately 77% is obtained for classification of the four semantic relations.
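The preprocessing mentioned at the end of this abstract, a sub-linear transformation followed by discretization of raw web counts, can be sketched as follows. The abstract does not say which transformation or bin layout was used; the natural logarithm, the bin width, and the cap on the number of bins are assumptions made here for illustration.

```python
import math


def discretize(count, width=2.0, max_bin=5):
    """Map a raw web hit count to a small integer feature value.

    log1p compresses the heavy-tailed counts sub-linearly; dividing by
    `width` and truncating yields a bin index, capped at `max_bin`.
    Both parameters are hypothetical choices, not taken from the paper.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    return min(int(math.log1p(count) / width), max_bin)


# Made-up hit counts for four paraphrase patterns, for illustration only.
raw_counts = [0, 10, 1000, 10**6]
bins = [discretize(c) for c in raw_counts]
```

Binned values like these can then serve as discrete feature functions for a maximum entropy classifier, so that a pattern with a million hits does not dominate one with a thousand.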
Refining Non-Taxonomic Relation Labels with External Structured Data to Support Ontology Learning
This paper presents a method to integrate external knowledge sources such as DBpedia and OpenCyc into an ontology learning system that automatically suggests labels for unknown relations in domain ontologies based on large corpora of unstructured text. The method extracts and aggregates verb vectors from semantic relations identified in the corpus. It composes a knowledge base which consists of (i) verb centroids for known relations between domain concepts, (ii) mappings between concept pairs and the types of known relations, and (iii) ontological knowledge retrieved from external sources. Applying semantic inference and validation to this knowledge base yields a refined relation label suggestion. A formal evaluation compares the accuracy and average ranking precision of this hybrid method with the performance of methods that solely rely on corpus data and those that are only based on reasoning and external data sources.

