The power of amnesia: learning probabilistic automata with variable memory length (1996)

by D Ron, Y Singer, N Tishby
Venue: Mach. Learn.

Results 11 - 20 of 104

Modeling Protein Families Using Probabilistic Suffix Trees

by Gill Bejerano, Golan Yona, 1999
Cited by 32 (5 self)
We present a method for modeling protein families by means of probabilistic suffix trees (PSTs). The method is based on identifying significant patterns in a set of related protein sequences. The input sequences do not need to be aligned, nor is delineation of domain boundaries required. The method is automatic, and can be applied, without assuming any preliminary biological information, with surprising success. Incorporating basic biological considerations, such as amino acid background probabilities and amino acid substitution probabilities, can improve the performance in some cases. The PST can serve as a predictive tool for protein sequence classification, and for detecting conserved patterns (possibly functionally or structurally important) within protein sequences. The method was tested on one of the state-of-the-art databases of protein families, namely, the Pfam database of HMMs, with satisfactory performance.

Modeling system calls for intrusion detection with dynamic window sizes

by Eleazar Eskin - In Proceedings of DARPA Information Survivability Conference and Exposition II (DISCEX , 2001
Cited by 24 (7 self)
We extend prior research on system call anomaly detection modeling methods for intrusion detection by incorporating dynamic window sizes. The window size is the length of the subsequence of a system call trace which is used as the basic unit for modeling program or process behavior. In this work we incorporate dynamic window sizes and show marked improvements in anomaly detection. We present two methods for estimating the optimal window size based on the available training data. The first method is an entropy modeling method which determines the optimal single window size for the data. The second method is a probability modeling method that takes into account context dependent window sizes. A context dependent window size model is motivated by the way that system calls are generated by processes. Sparse Markov transducers (SMTs) are used to compute the context dependent window size model. We show over actual system call traces that the entropy modeling methods lead to the optimal single window size. We also show that context dependent window sizes outperform traditional system call modeling methods.

Dynamic Bayesian Networks

by Kevin P. Murphy - Probabilistic Graphical Models, 2002
Cited by 24 (1 self)
Abstract not found

An Efficient Extension to Mixture Techniques for Prediction and Decision Trees

by Fernando C. Pereira, Yoram Singer - Machine Learning, 1999
Cited by 23 (4 self)
We present an efficient method for maintaining mixtures of prunings of a prediction or decision tree that extends the previous methods for "node-based" prunings (Buntine, 1990; Willems, Shtarkov, & Tjalkens, 1995; Helmbold & Schapire, 1997; Singer, 1997) to the larger class of edge-based prunings. The method includes an online weight-allocation algorithm that can be used for prediction, compression and classification. Although the set of edge-based prunings of a given tree is much larger than that of node-based prunings, our algorithm has similar space and time complexity to that of previous mixture algorithms for trees. Using the general online framework of Freund & Schapire (1997), we prove that our algorithm correctly maintains the mixture weights for edge-based prunings with any bounded loss function. We also give a similar algorithm for the logarithmic loss function with a corresponding weight-allocation algorithm. Finally, we describe experiments comparing node-based and edge-based mixture models for estimating the probability of the next word in English text, which show the advantages of edge-based models. Keywords: mixture models, decision and prediction trees, on-line learning, statistical language modeling.

Protein family classification using sparse markov transducers

by Eleazar Eskin, William Stafford Noble, Yoram Singer - Proc. 8th Int. Conf. Intelligent Systems for Molecular Biology, 2003
Cited by 23 (9 self)
We present a method for classifying proteins into families based on short subsequences of amino acids using a new probabilistic model called sparse Markov transducers (SMT). We classify a protein by estimating probability distributions over subsequences of amino acids from the protein. Sparse Markov transducers, similar to probabilistic suffix trees, estimate a probability distribution conditioned on an input sequence. SMTs generalize probabilistic suffix trees by allowing for wild-cards in the conditioning sequences. Since substitutions of amino acids are common in protein families, incorporating wild-cards into the model significantly improves classification performance. We present two models for building protein family classifiers using SMTs. As protein databases become larger, data driven learning algorithms for probabilistic models such as SMTs will require vast amounts of memory. We therefore describe and use efficient data structures to improve the memory usage of SMTs. We evaluate SMTs by building protein family classifiers using the Pfam and SCOP databases and compare our results to previously published results and state-of-the-art protein homology detection methods. SMTs outperform previous probabilistic suffix tree methods and under certain conditions perform comparably to state-of-the-art protein homology methods.

Conventional And Periodic N-Grams in the Transcription of Drum Sequences

by Jouni K. Paulus, Anssi P. Klapuri - In Proc. of IEEE International Conference on Multimedia and Expo, 2003
Cited by 22 (7 self)
In this paper, we describe a system for transcribing polyphonic drum sequences from an acoustic signal to a symbolic representation. Low-level signal analysis is done with an acoustic model consisting of a Gaussian mixture model and a support vector machine. For higher-level modeling, periodic N-grams are proposed to construct a "language model" for music, based on the repetitive nature of musical structure. Also, a technique for estimating relatively long N-grams is introduced. The performance of N-grams in the transcription was evaluated using a database of realistic drum sequences from different genres and yielded a performance increase of 7.6% compared to the use of only prior (unigram) probabilities with the acoustic model.

Adaptive Mixtures of Probabilistic Transducers

by Yoram Singer - Neural Computation, 1996
Cited by 19 (3 self)
We describe and analyze a mixture model for supervised learning of probabilistic transducers. We devise an on-line learning algorithm that efficiently infers the structure and estimates the parameters of each probabilistic transducer in the mixture. Theoretical analysis and comparative simulations indicate that the learning algorithm tracks the best transducer from an arbitrarily large (possibly infinite) pool of models. We also present an application of the model for inducing a noun phrase recognizer.
1 Introduction
Supervised learning of probabilistic mappings between temporal sequences is an important goal of natural data analysis and classification with a broad range of applications, including handwriting and speech recognition, natural language processing and biological sequence analysis. Research efforts in supervised learning of probabilistic mappings have been almost exclusively focused on estimating the parameters of a predefined model. For example, Giles et al. (1992) used a...

Blind construction of optimal nonlinear recursive predictors for discrete sequences

by Cosma Rohilla Shalizi - In “Uncertainty in Artificial Intelligence: Proceedings of the Twentieth Conference , 2004
Cited by 18 (2 self)
We present a new method for nonlinear prediction of discrete random sequences under minimal structural assumptions. We give a mathematical construction for optimal predictors of such processes, in the form of hidden Markov models. We then describe an algorithm, CSSR (Causal-State Splitting Reconstruction), which approximates the ideal predictor from data. We discuss the reliability of CSSR, its data requirements, and its performance in simulations. Finally, we compare our approach to existing methods using variable-length Markov models and cross-validated hidden Markov models, and show theoretically and experimentally that our method delivers results superior to the former and at least comparable to the latter.

Beyond Word N-Grams

by Fernando C. Pereira, Yoram Singer, Naftali Tishby - In Proceedings of the Third Workshop on Very Large Corpora, 1995
Cited by 15 (1 self)
We describe, analyze, and evaluate experimentally a new probabilistic model for word-sequence prediction in natural language based on prediction suffix trees (PSTs). By using efficient data structures, we extend the notion of PST to unbounded vocabularies. We also show how to use a Bayesian approach based on recursive priors over all possible PSTs to efficiently maintain tree mixtures. These mixtures have provably and practically better performance than almost any single model. We evaluate the model on several corpora. The low perplexity achieved by relatively small PST mixture models suggests that they may be an advantageous alternative, both theoretically and practically, to the widely used n-gram models.
1. Introduction
Finite-state methods for statistical prediction of word sequences in natural language have had an important role in language processing research since the pioneering investigations of Markov and Shannon (1951). It is clear that natural texts are not Markov proces...

A Survey of POMDP Solution Techniques

by Kevin P. Murphy, 2000
Cited by 14 (0 self)
In this paper, we assume all actions take one unit of discrete time at some (unspecified) time scale. If we allow actions to take variable lengths of time, we end up with a semi-Markov model; see e.g., [SPS99].