Results 1 - 10 of 101
Evaluating collaborative filtering recommender systems
ACM Transactions on Information Systems, 2004
Cited by 365 (9 self)
© ACM, 2004. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM
Markov Logic Networks
Machine Learning, 2006
Cited by 363 (27 self)
We propose a simple approach to combining first-order logic and probabilistic graphical models in a single representation. A Markov logic network (MLN) is a first-order knowledge base with a weight attached to each formula (or clause). Together with a set of constants representing objects in the domain, it specifies a ground Markov network containing one feature for each possible grounding of a first-order formula in the KB, with the corresponding weight. Inference in MLNs is performed by MCMC over the minimal subset of the ground network required for answering the query. Weights are efficiently learned from relational databases by iteratively optimizing a pseudo-likelihood measure. Optionally, additional clauses are learned using inductive logic programming techniques. Experiments with a real-world database and knowledge base in a university domain illustrate the promise of this approach.
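As a reading aid, the distribution defined by a ground Markov network of this kind is usually written in log-linear form. The symbols below are notation assumed here rather than taken from the abstract: w_i is the weight of formula i, n_i(x) is the number of true groundings of formula i in the world x, and Z is the normalizing constant.

```latex
P(X = x) = \frac{1}{Z} \exp\Big( \sum_i w_i \, n_i(x) \Big),
\qquad
Z = \sum_{x'} \exp\Big( \sum_i w_i \, n_i(x') \Big)
```

Under this form, a world that violates a formula becomes less probable rather than impossible, and a larger weight makes the penalty for violating that formula steeper.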
High dimensional graphs and variable selection with the Lasso
Annals of Statistics, 2006
Cited by 232 (17 self)
The pattern of zero entries in the inverse covariance matrix of a multivariate normal distribution corresponds to conditional independence restrictions between variables. Covariance selection aims at estimating those structural zeros from data. We show that neighborhood selection with the Lasso is a computationally attractive alternative to standard covariance selection for sparse high-dimensional graphs. Neighborhood selection estimates the conditional independence restrictions separately for each node in the graph and is hence equivalent to variable selection for Gaussian linear models. We show that the proposed neighborhood selection scheme is consistent for sparse high-dimensional graphs. Consistency hinges on the choice of the penalty parameter. The oracle value for optimal prediction does not lead to a consistent neighborhood estimate. Controlling instead the probability of falsely joining some distinct connectivity components of the graph, consistent estimation for sparse graphs is achieved (with exponential rates), even when the number of variables grows as the number of observations raised to an arbitrary power.
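The per-node idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the coordinate-descent solver, the penalty value lam=0.2, and the 4-variable correlation matrix are all made up for the example. The sketch works on a correlation matrix directly, which is enough because a Lasso fit on standardized variables depends on the data only through their correlations.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; exactly 0.0 inside [-lam, lam]."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_neighbors(corr, node, lam, sweeps=200):
    """Lasso-regress `node` on all other variables by coordinate descent.

    Minimizes 0.5 * b'Cb - c'b + lam * ||b||_1, where C holds the
    correlations among the other variables and c their correlations
    with `node`. Returns the variables with nonzero coefficients.
    """
    others = [j for j in range(len(corr)) if j != node]
    coef = {j: 0.0 for j in others}
    for _ in range(sweeps):
        for j in others:
            # Correlation of x_j with the partial residual that leaves x_j out.
            rho = corr[j][node] - sum(
                corr[j][k] * coef[k] for k in others if k != j
            )
            coef[j] = soft_threshold(rho, lam) / corr[j][j]
    return {j for j in others if coef[j] != 0.0}


def neighborhood_selection(corr, lam, rule="and"):
    """Estimate graph edges by combining the per-node Lasso neighborhoods.

    rule="and" keeps an edge only if each node selects the other;
    rule="or" keeps it if either node selects the other.
    """
    p = len(corr)
    neighbors = [lasso_neighbors(corr, i, lam) for i in range(p)]
    edges = []
    for i in range(p):
        for j in range(i + 1, p):
            i_picks_j = j in neighbors[i]
            j_picks_i = i in neighbors[j]
            if rule == "and":
                keep = i_picks_j and j_picks_i
            else:
                keep = i_picks_j or j_picks_i
            if keep:
                edges.append((i, j))
    return edges


# Hypothetical correlations for a chain 0 - 1 - 2 plus an isolated variable 3.
# Variables 0 and 2 are correlated (0.25) but independent given variable 1.
corr = [
    [1.00, 0.50, 0.25, 0.00],
    [0.50, 1.00, 0.50, 0.00],
    [0.25, 0.50, 1.00, 0.00],
    [0.00, 0.00, 0.00, 1.00],
]
edges = neighborhood_selection(corr, lam=0.2)
print(edges)  # [(0, 1), (1, 2)]
```

Variables 0 and 2 are marginally correlated, yet no edge is placed between them, because each regression already explains that correlation through variable 1. The result depends on the penalty: the abstract's point is that choosing it well is what makes the estimate consistent, and the value used here is arbitrary.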
Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network
In Proceedings of HLT-NAACL, 2003
Cited by 181 (12 self)
We present a new part-of-speech tagger that demonstrates the following ideas: (i) explicit use of both preceding and following tag contexts via a dependency network representation, (ii) broad use of lexical features, including jointly conditioning on multiple consecutive words, (iii) effective use of priors in conditional loglinear models, and (iv) fine-grained modeling of unknown word features. Using these ideas together, the resulting tagger gives a 97.24% accuracy on the Penn Treebank WSJ, an error reduction of 4.4% on the best previous single automatically learned tagging result.
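The two figures in this abstract can be related with a short calculation. The sketch below assumes that "error reduction" means a relative reduction in error rate; the baseline accuracy it prints is inferred from the two quoted numbers and is not stated in the abstract. The function name is hypothetical.

```python
def implied_baseline_accuracy(accuracy_pct, relative_error_reduction):
    """Back out the baseline accuracy implied by a relative error reduction.

    If new_error = old_error * (1 - reduction), then
    old_error = new_error / (1 - reduction).
    """
    new_error = 100.0 - accuracy_pct
    old_error = new_error / (1.0 - relative_error_reduction)
    return 100.0 - old_error


# 97.24% accuracy reported as a 4.4% error reduction over the previous best.
baseline = implied_baseline_accuracy(97.24, 0.044)
print(round(baseline, 2))  # 97.11
```

In other words, an error rate of 2.76% that is 4.4% lower than its predecessor implies a previous error rate of roughly 2.89%.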
Sparse graphical models for exploring gene expression data
Journal of Multivariate Analysis, 2004
Cited by 98 (19 self)
DMS-0112069. Any opinions, findings, and conclusions or recommendations expressed in this material are
Dependency Networks for Relational Data
In Proceedings of the 4th IEEE International Conference on Data Mining, 2004
Cited by 55 (7 self)
Instance independence is a critical assumption of traditional machine learning methods contradicted by many relational datasets. For example, in scientific literature datasets there are dependencies among the references of a paper. Recent work on graphical models for relational data has demonstrated significant performance gains for models that exploit the dependencies among instances. In this paper, we present relational dependency networks (RDNs), a new form of graphical model capable of reasoning with such dependencies in a relational setting. We describe the details of RDN models and outline their strengths, most notably the ability to learn and reason with cyclic relational dependencies. We present RDN models learned on a number of real-world datasets, and evaluate the models in a classification context, showing significant performance improvements. In addition, we use synthetic data to evaluate the quality of model learning and inference procedures.
On the collective classification of email speech acts
In Proceedings of SIGIR-2005, 2005
Cited by 52 (3 self)
We consider classification of email messages as to whether or not they contain certain “email acts”, such as a request or a commitment. We show that exploiting the sequential correlation among email messages in the same thread can improve email-act classification. More specifically, we describe a new text classification algorithm based on a dependency-network-based collective classification method, in which the local classifiers are maximum entropy models based on words and certain relational features. We show that statistically significant improvements over a bag-of-words baseline classifier can be obtained for some, but not all, email-act classes. The performance improvement obtained by collective classification appears to be consistent across email acts suggested by prior speech-act theory.
Collective Classification with Relational Dependency Networks
Journal of Machine Learning Research, 2003
Cited by 49 (8 self)
In this paper, we present relational dependency networks (RDNs), extending recent work in dependency networks to a relational setting

