Learning Question Classifiers, 2002
Cited by 113 (6 self)
In order to respond correctly to a free-form factual question given a large collection of texts, one needs to understand the question to a level that allows determining some of the constraints the question imposes on a possible answer. These constraints may include a semantic classification of the sought-after answer and may even suggest using different strategies when looking for and verifying a candidate answer.
Kernel Methods for Relation Extraction, 2002
Cited by 106 (0 self)
We present an application of kernel methods to extracting relations from unstructured natural language sources.
On kernel methods for relational learning - In Proc. of the International Conference on Machine Learning, 2003
Cited by 53 (5 self)
Kernel methods have gained a great deal of popularity in the machine learning community as a method to learn indirectly in high-dimensional feature spaces. Those interested in relational learning have recently begun to cast learning from structured and relational data in terms of kernel operations. We describe a general family of kernel functions built up from a description language of limited expressivity and use it to study the benefits and drawbacks of kernel learning in relational domains. Learning with kernels in this family directly models learning over an expanded feature space constructed using the same description language. This allows us to examine issues of time complexity in terms of learning with these and other relational kernels, and how these relate to generalization ability. The tradeoffs between using kernels in a very high-dimensional implicit space versus a restricted feature space are highlighted through two experiments, in bioinformatics and in natural language processing.
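The contrast between an implicit kernel space and an explicit feature space can be illustrated with a minimal sketch. It does not use the relational kernel family from the paper; it assumes a plain degree-2 polynomial kernel on numeric vectors, and the function names (`explicit_features`, `quadratic_kernel`) are hypothetical.

```python
from itertools import product


def explicit_features(x):
    """Map x to an explicit degree-2 feature space: all ordered products x_i * x_j."""
    return [a * b for a, b in product(x, repeat=2)]


def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def quadratic_kernel(x, y):
    """Compute the same inner product implicitly, without building the features."""
    return dot(x, y) ** 2


# Assumed toy inputs, not data from the paper.
x = [1, 2, 3]
y = [4, 5, 6]

implicit = quadratic_kernel(x, y)                            # works in 3 dimensions
explicit = dot(explicit_features(x), explicit_features(y))   # works in 9 dimensions
```

Both values agree, but the explicit map grows quadratically with the input dimension while the kernel does not, which is one simple instance of the cost tradeoff the abstract refers to.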
Probabilistic Reasoning for Entity & Relation Recognition, 2002
Cited by 47 (10 self)
This paper develops a method for recognizing relations and entities in sentences, while taking mutual dependencies among them into account. E.g., the kill (Johns, Oswald) relation in: "J. V. Oswald was murdered at JFK after his assassin, K. F. Johns..." depends on identifying Oswald and Johns as people, JFK being identified as a location, and the kill relation between Oswald and Johns; this, in turn, enforces that Oswald and Johns are people. In our
Finding advertising keywords on web pages - In Proceedings of WWW, 2006
Cited by 37 (2 self)
A large and growing number of web pages display contextual advertising based on keywords automatically extracted from the text of the page, and this is a substantial source of revenue supporting the web today. Despite the importance of this area, little formal, published research exists. We describe a system that learns how to extract keywords from web pages for advertisement targeting. The system uses a number of features, such as term frequency of each
Statistical Relational Learning for Document Mining, 2003
Cited by 35 (5 self)
A major obstacle to fully integrated deployment of statistical learners is the assumption that data sits in a single table, even though most real-world databases have complex relational structures. In this paper, we introduce an integrated approach to building regression models from data stored in relational databases. Potential features are generated by structured search of the space of queries to the database, and then tested for inclusion in a logistic regression. We present experimental results for the task of predicting where scientific papers will be published based on relational data taken from CiteSeer. This data includes word counts in the document, frequently cited authors or papers, co-citations, publication venues of cited papers, word co-occurrences, and word counts in cited or citing documents. Our approach results in classification accuracies superior to those achieved when using classical "flat" features. Our classification task also serves as a "where to publish?" conference/journal recommendation task.
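The idea of generating candidate features by querying a relational database can be sketched roughly as follows. This is only an illustration under assumed names: the `paper` and `cites` tables, the toy rows, and the `venue_citation_feature` helper are hypothetical, and the sketch covers neither the structured search over queries nor the logistic regression step described in the abstract.

```python
import sqlite3

# Assumed toy schema: papers with a venue, plus a citation link table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE paper (id INTEGER PRIMARY KEY, venue TEXT);
    CREATE TABLE cites (citing INTEGER, cited INTEGER);
    INSERT INTO paper VALUES (1, 'ICML'), (2, 'ICML'), (3, 'AAAI'), (4, NULL);
    INSERT INTO cites VALUES (4, 1), (4, 2), (4, 3);
""")


def venue_citation_feature(conn, paper_id, venue):
    """Count how many papers cited by paper_id appeared in the given venue."""
    row = conn.execute(
        "SELECT COUNT(*) FROM cites JOIN paper ON paper.id = cites.cited "
        "WHERE cites.citing = ? AND paper.venue = ?",
        (paper_id, venue),
    ).fetchone()
    return row[0]


# Each candidate query yields one column of a flat feature table for paper 4.
candidate_venues = ["ICML", "AAAI", "WWW"]
features = {
    f"cites_{v}": venue_citation_feature(conn, 4, v) for v in candidate_venues
}
```

Each entry of `features` is one query-derived candidate; a selection procedure would then decide which of them to keep in the model.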
A maximum entropy approach to information extraction from semi-structured and free text - In Proceedings of the Eighteenth National Conference on Artificial Intelligence, 2002
Cited by 35 (0 self)
In this paper, we present a classification-based approach towards single-slot as well as multi-slot information extraction (IE). For single-slot IE, we worked on the domain of Seminar Announcements, where each document contains information on only one seminar. For multi-slot IE, we worked on the domain of Management Succession. For this domain, we restrict ourselves to extracting information sentence by sentence, in the same way as (Soderland 1999). Each sentence can contain information on several management succession events. By using a classification approach based on a maximum entropy framework, our system achieves higher accuracy than the best previously published results in both domains.
Bayesian information extraction network - In Proc. 18th Int. Joint Conf. Artificial Intelligence, 2003
Cited by 23 (0 self)
Dynamic Bayesian networks (DBNs) offer an elegant way to integrate various aspects of language in one model. Many existing algorithms developed for learning and inference in DBNs are applicable to probabilistic language modeling. To demonstrate the potential of DBNs for natural language processing, we employ a DBN in an information extraction task. We show how to assemble a wealth of emerging linguistic instruments for shallow parsing, syntactic and semantic tagging, morphological decomposition, named entity recognition, etc., in order to incrementally build a robust information extraction system. Our method outperforms previously published results on an established benchmark domain.
Specific-to-General Learning for Temporal Events with Application to Learning . . . - Journal of Artificial Intelligence Research, 2002
Cited by 23 (2 self)
We develop, analyze, and evaluate a novel, supervised, specific-to-general learner for a simple temporal logic and use the resulting algorithm to learn visual event definitions from video sequences. First, we introduce a simple, propositional, temporal, event-description language called AMA that is sufficiently expressive to represent many events yet sufficiently restrictive to support learning. We then give algorithms, along with lower and upper complexity bounds, for the subsumption and generalization problems for AMA formulas. We present a positive-examples-only specific-to-general learning method based on these algorithms. We also present a polynomial-time-computable "syntactic" subsumption test that implies semantic subsumption without being equivalent to it. A generalization algorithm based on syntactic subsumption can be used in place of semantic generalization to improve the asymptotic complexity of the resulting learning algorithm. Finally
Learning with feature description logics - Proceedings of the 12th International Conference on Inductive Logic Programming, 2002
Cited by 18 (4 self)
We present a paradigm for efficient learning and inference with relational data using propositional means. The paradigm utilizes description logics and concept graphs in the service of learning relational models using efficient propositional learning algorithms. We introduce a Feature Description Logic (FDL), a relational (frame-based) language that supports efficient inference, along with a generation function that uses inference with descriptions in the FDL to produce features suitable for use by learning algorithms. These are used within a learning framework that is shown to learn relational representations efficiently and accurately in terms of the FDL descriptions. The paradigm was designed to support learning in domains that are relational but where the amount of data and size of representation learned are very large; we exemplify it here, for clarity, on the classical ILP task of learning family relations. This paradigm provides a natural solution to the problem of learning and representing relational data; it extends and unifies several lines of work in KRR and Machine Learning in ways that provide hope for a coherent usage of learning and reasoning methods in large-scale intelligent inference.

