Mix-nets: Factored Mixtures of Gaussians in Bayesian Networks with Mixed Continuous and Discrete Variables (2000)

by S Davies, A Moore
Results 1 - 5 of 5

Disk Aware Discord Discovery: Finding Unusual Time Series in Terabyte Sized

by Dragomir Yankov, Eamonn Keogh
Abstract - Cited by 11 (4 self)
The problem of finding unusual time series has recently attracted much attention, and several promising methods are now in the literature. However, virtually all proposed methods assume that the data reside in main memory. For many real-world problems this is not the case. For example, in astronomy, multi-terabyte time series datasets are the norm. Most current algorithms, when faced with data that cannot fit in main memory, resort to multiple scans of the disk/tape and are thus intractable. In this work we show how one particular definition of unusual time series, the time series discord, can be discovered with a disk aware algorithm. The proposed algorithm is exact and requires only two linear scans of the disk with a tiny buffer of main memory. Furthermore, it is very simple to implement. We use the algorithm to provide further evidence of the effectiveness of the discord definition in areas as diverse as astronomy, web query mining, video surveillance, etc., and show the efficiency of our method on datasets which are many orders of magnitude larger than anything else attempted in the literature.
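To make the notion of a discord concrete, the following is a minimal in-memory sketch. It assumes the commonly used definition (the series whose distance to its nearest neighbor is largest) and Euclidean distance; neither is spelled out in the abstract above. This quadratic brute-force search is not the disk aware two-scan algorithm the abstract describes, and the names `euclidean` and `find_discord` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_discord(series):
    """Return (index, distance) of the series farthest from its nearest neighbor.

    Brute force, O(n^2) distance computations, everything held in memory.
    Assumes at least two series of equal length.
    """
    best_index, best_dist = -1, -1.0
    for i, s in enumerate(series):
        # Distance from series i to its closest other series.
        nn = min(euclidean(s, t) for j, t in enumerate(series) if j != i)
        if nn > best_dist:
            best_index, best_dist = i, nn
    return best_index, best_dist


# Toy data (made up for illustration): three similar series and one outlier.
data = [
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 1.1],
    [0.1, 1.0, 0.0, 1.0],
    [5.0, 5.0, 5.0, 5.0],
]
index, dist = find_discord(data)
print(index, round(dist, 3))
```

The brute-force version needs every series in memory at once, which is exactly the assumption the abstract argues does not hold for terabyte-sized data.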

Fast Factored Density Estimation and Compression with Bayesian Networks

by Scott Davies, John Lafferty, 2002
Abstract - Cited by 3 (1 self)
Many important data analysis tasks can be addressed by formulating them as probability estimation problems. For example, a popular general approach to automatic classification problems is to learn a probabilistic model of each class from data in which the classes are known, and then use Bayes's rule with these models to predict the correct classes of other data for which they are not known. Anomaly detection and scientific discovery tasks can often be addressed by learning probability models over possible events and then looking for events to which these models assign low probabilities. Many data compression algorithms such as Huffman coding and arithmetic coding rely on probabilistic models of the data stream in order to achieve high compression rates.
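The classification and anomaly-detection recipe in this abstract can be sketched in a few lines. The example below is only an illustration under simplifying assumptions: each class is modeled by a single one-dimensional Gaussian (the thesis itself concerns Bayesian networks, which are not implemented here), and all function names and the toy data are hypothetical.

```python
import math
from statistics import mean, pstdev


def fit_gaussian(values):
    """Fit a 1-D Gaussian (mean, standard deviation) to a list of numbers."""
    mu = mean(values)
    sigma = pstdev(values) or 1e-9  # guard against zero variance
    return mu, sigma


def log_density(x, model):
    """Log of the Gaussian density at x."""
    mu, sigma = model
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def fit_class_models(labelled):
    """labelled maps class name -> training values; returns (log prior, model) per class."""
    total = sum(len(v) for v in labelled.values())
    return {c: (math.log(len(v) / total), fit_gaussian(v)) for c, v in labelled.items()}


def classify(x, models):
    """Bayes's rule: choose the class maximising log P(class) + log p(x | class)."""
    return max(models, key=lambda c: models[c][0] + log_density(x, models[c][1]))


def log_evidence(x, models):
    """log p(x) = log sum_c P(c) p(x | c); unusually low values suggest an anomaly."""
    terms = [log_prior + log_density(x, m) for log_prior, m in models.values()]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


# Toy training data, made up for illustration.
training = {"low": [0.9, 1.0, 1.1, 1.2], "high": [4.8, 5.0, 5.1, 5.3]}
models = fit_class_models(training)

print(classify(1.05, models))
print(classify(4.9, models))
print(log_evidence(1.0, models) > log_evidence(20.0, models))
```

Working in log space keeps the comparison stable when densities are very small, which is the usual situation for points far from every class.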

Interpolating conditional density trees

by Scott Davies, Andrew Moore - A. Darwiche, N. Friedman (Eds.), Uncertainty in Artificial Intelligence, 2002
Abstract - Cited by 2 (0 self)
Joint distributions over many variables are frequently modeled by decomposing them into products of simpler, lower-dimensional conditional distributions, such as in sparsely connected Bayesian networks. However, automatically learning such models can be very computationally expensive when there are many datapoints and many continuous variables with complex nonlinear relationships, particularly when no good ways of decomposing the joint distribution are known a priori. In such situations, previous research has generally focused on the use of discretization techniques in which each continuous variable has a single discretization that is used throughout the entire network. In this paper, we present and compare a wide variety of tree-based algorithms for learning and evaluating conditional density estimates over continuous variables. These trees can be thought of as discretizations that vary according to the particular interactions being modeled; however, the density within a given leaf of the tree need not be assumed constant, and we show that such nonuniform leaf densities lead to more accurate density estimation. We have developed Bayesian network structure-learning algorithms that employ these tree-based conditional density representations, and we show that they can be used to practically learn complex joint probability models over dozens of continuous variables from thousands of datapoints. We focus on finding models that are simultaneously accurate, fast to learn, and fast to evaluate once they are learned.

Learning from Time Series in the Presence of Noise: Unsupervised and Semi-Supervised Approaches

by Dragomir Dimitrov Yankov, 2008
Abstract
Needless to say, I would not reach this stage of graduate school if it was not for my advisor Dr. Eamonn Keogh. I have never worked with another person with so much drive and passion for what they do, and I just hope that at least part of these qualities were acquired by me too. Eamonn taught me the basic practices and knowledge a data mining researcher needs to have, but for what it's worth, it is his attitude that probably made the biggest impact on me. Any single time I would talk to him, he would be positive and encouraging. Thank you, Eamonn, for being there for me and for the rest of your students! I would like to thank my dissertation committee, Dr. Vassilis Tsotras and Dr. Stefano Lonardi. Vassilis sent me my acceptance letter exactly five years ago promising that I will enjoy the atmosphere in UC Riverside. I really did! Stefano was there for my most important publication, the first one. From him I learned that every detail matters, that every word needs to be accurately placed. I had three great internships with Yahoo!. I worked with incredible people to whom I am greatly thankful. The first summer my mentor was Dr. Dennis DeCoste. Dennis inspired many of my subsequent interests, such as ensemble learning and support vector

Inference of gene pathways using mixture Bayesian networks

by Younhee Ko, Chengxiang Zhai, Ra Rodriguez-zas - BMC Systems Biology, 2008
Abstract
This is an Open Access article distributed under the terms of the Creative Commons Attribution License