Results 1 - 10
of
57
Property Testing and its connection to Learning and Approximation
"... We study the question of determining whether an unknown function has a particular property or is ffl-far from any function with that property. A property testing algorithm is given a sample of the value of the function on instances drawn according to some distribution, and possibly may query the fun ..."
Abstract
-
Cited by 371 (48 self)
- Add to MetaCart
We study the question of determining whether an unknown function has a particular property or is ffl-far from any function with that property. A property testing algorithm is given a sample of the value of the function on instances drawn according to some distribution, and possibly may query the function on instances of its choice. First, we establish some connections between property testing and problems in learning theory. Next, we focus on testing graph properties, and devise algorithms to test whether a graph has properties such as being k-colorable or having a ae-clique (clique of density ae w.r.t the vertex set). Our graph property testing algorithms are probabilistic and make assertions which are correct with high probability, utilizing only poly(1=ffl) edge-queries into the graph, where ffl is the distance parameter. Moreover, the property testing algorithms can be used to efficiently (i.e., in time linear in the number of vertices) construct partitions of the graph which corre...
The Power of Amnesia: Learning Probabilistic Automata with Variable Memory Length
- Machine Learning
, 1996
"... . We propose and analyze a distribution learning algorithm for variable memory length Markov processes. These processes can be described by a subclass of probabilistic finite automata which we name Probabilistic Suffix Automata (PSA). Though hardness results are known for learning distributions gene ..."
Abstract
-
Cited by 148 (15 self)
- Add to MetaCart
. We propose and analyze a distribution learning algorithm for variable memory length Markov processes. These processes can be described by a subclass of probabilistic finite automata which we name Probabilistic Suffix Automata (PSA). Though hardness results are known for learning distributions generated by general probabilistic automata, we prove that the algorithm we present can efficiently learn distributions generated by PSAs. In particular, we show that for any target PSA, the KL-divergence between the distribution generated by the target and the distribution generated by the hypothesis the learning algorithm outputs, can be made small with high confidence in polynomial time and sample complexity. The learning algorithm is motivated by applications in human-machine interaction. Here we present two applications of the algorithm. In the first one we apply the algorithm in order to construct a model of the English language, and use this model to correct corrupted text. In the second ...
On the Learnability and Usage of Acyclic Probabilistic Finite Automata
- JOURNAL OF COMPUTER AND SYSTEM SCIENCES
, 1995
"... We propose and analyze a distribution learning algorithm for a subclass of Acyclic Probabilistic Finite Automata (APFA). This subclass is characterized by a certain distinguishability property of the automata's states. Though hardness results are known for learning distributions generated by general ..."
Abstract
-
Cited by 59 (3 self)
- Add to MetaCart
We propose and analyze a distribution learning algorithm for a subclass of Acyclic Probabilistic Finite Automata (APFA). This subclass is characterized by a certain distinguishability property of the automata's states. Though hardness results are known for learning distributions generated by general APFAs, we prove that our algorithm can efficiently learn distributions generated by the subclass of APFAs we consider. In particular, we show that the KL-divergence between the distribution generated by the target source and the distribution generated by our hypothesis can be made arbitrarily small with high confidence in polynomial time. We present two applications of our algorithm. In the first, we show how to model cursively written letters. The resulting models are part of a complete cursive handwriting recognition system. In the second application we demonstrate how APFAs can be used to build multiplepronunciation models for spoken words. We evaluate the APFA based pronunciation models...
Efficient Learning of Typical Finite Automata from Random Walks
, 1997
"... This paper describes new and efficient algorithms for learning deterministic finite automata. Our approach is primarily distinguished by two features: (1) the adoption of an average-case setting to model the ``typical'' labeling of a finite automaton, while retaining a worst-case model for the under ..."
Abstract
-
Cited by 44 (9 self)
- Add to MetaCart
This paper describes new and efficient algorithms for learning deterministic finite automata. Our approach is primarily distinguished by two features: (1) the adoption of an average-case setting to model the ``typical'' labeling of a finite automaton, while retaining a worst-case model for the underlying graph of the automaton, along with (2) a learning model in which the learner is not provided with the means to experiment with the machine, but rather must learn solely by observing the automaton's output behavior on a random input sequence. The main contribution of this paper is in presenting the first efficient algorithms for learning nontrivial classes of automata in an entirely passive learning model. We adopt an on-line learning model in which the learner is asked to predict the output of the next state, given the next symbol of the random input sequence; the goal of the learner is to make as few prediction mistakes as possible. Assuming the learner has a means of resetting the target machine to a fixed start state, we first present an efficient algorithm that
Streaming and sublinear approximation of entropy and information distances
- In ACM-SIAM Symposium on Discrete Algorithms
, 2006
"... In most algorithmic applications which compare two distributions, information theoretic distances are more natural than standard ℓp norms. In this paper we design streaming and sublinear time property testing algorithms for entropy and various information theoretic distances. Batu et al posed the pr ..."
Abstract
-
Cited by 33 (9 self)
- Add to MetaCart
In most algorithmic applications which compare two distributions, information theoretic distances are more natural than standard ℓp norms. In this paper we design streaming and sublinear time property testing algorithms for entropy and various information theoretic distances. Batu et al posed the problem of property testing with respect to the Jensen-Shannon distance. We present optimal algorithms for estimating bounded, symmetric f-divergences (including the Jensen-Shannon divergence and the Hellinger distance) between distributions in various property testing frameworks. Along the way, we close a (log n)/H gap between the upper and lower bounds for estimating entropy H, yielding an optimal algorithm over all values of the entropy. In a data stream setting (sublinear space), we give the first algorithm for estimating the entropy of a distribution. Our algorithm runs in polylogarithmic space and yields an asymptotic constant factor approximation scheme. An integral part of the algorithm is an interesting use of an F0 (the number of distinct elements in a set) estimation algorithm; we also provide other results along the space/time/approximation tradeoff curve. Our results have interesting structural implications that connect sublinear time and space constrained algorithms. The mediating model is the random order streaming model, which assumes the input is a random permutation of a multiset and was first considered by Munro and Paterson in 1980. We show that any property testing algorithm in the combined oracle model for calculating a permutation invariant functions can be simulated in the random order model in a single pass. This addresses a question raised by Feigenbaum et al regarding the relationship between property testing and stream algorithms. Further, we give a polylog-space PTAS for estimating the entropy of a one pass random order stream. This bound cannot be achieved in the combined oracle (generalized property testing) model. 1
XPathLearner: An On-Line Self-Tuning Markov Histogram for XML Path Selectivity Estimation
, 2002
"... The extensible mark-up language (XML) is gaining widespread use as a format for data exchange and storage on the World Wide Web. ..."
Abstract
-
Cited by 31 (3 self)
- Add to MetaCart
The extensible mark-up language (XML) is gaining widespread use as a format for data exchange and storage on the World Wide Web.
Evolutionary Trees can be Learned in Polynomial Time in the Two-State General Markov Model
- SIAM Journal on Computing
, 1998
"... The j-State General Markov Model of evolution (due to Steel) is a stochastic model concerned with the evolution of strings over an alphabet of size j . In particular, the TwoState General Markov Model of evolution generalises the well-known Cavender-FarrisNeyman model of evolution by removing the sy ..."
Abstract
-
Cited by 28 (2 self)
- Add to MetaCart
The j-State General Markov Model of evolution (due to Steel) is a stochastic model concerned with the evolution of strings over an alphabet of size j . In particular, the TwoState General Markov Model of evolution generalises the well-known Cavender-FarrisNeyman model of evolution by removing the symmetry restriction (which requires that the probability that a `0' turns into a `1' along an edge is the same as the probability that a `1' turns into a `0' along the edge). Farach and Kannan showed how to PAClearn Markov Evolutionary Trees in the Cavender-Farris-Neyman model provided that the target tree satisfies the additional restriction that all pairs of leaves have a sufficiently high probability of being the same. We show how to remove both restrictions and thereby obtain the first polynomial-time PAC-learning algorithm (in the sense of Kearns et al.) for the general class of Two-State Markov Evolutionary Trees. Research Report RR347, Department of Computer Science, University of Wa...
A Spectral Algorithm for Learning Hidden Markov Models
"... Hidden Markov Models (HMMs) are one of the most fundamental and widely used statistical tools for modeling discrete time series. In general, learning HMMs from data is computationally hard; practitioners typically resort to search heuristics (such as the Baum-Welch / EM algorithm) which suffer from ..."
Abstract
-
Cited by 27 (3 self)
- Add to MetaCart
Hidden Markov Models (HMMs) are one of the most fundamental and widely used statistical tools for modeling discrete time series. In general, learning HMMs from data is computationally hard; practitioners typically resort to search heuristics (such as the Baum-Welch / EM algorithm) which suffer from the usual local optima issues. We prove that under a natural separation condition (roughly analogous to those considered for learning mixture models), there is an efficient and provably correct algorithm for learning HMMs. The sample complexity of the algorithm does not explicitly depend on the number of distinct (discrete) observations—it implicitly depends on this number through spectral properties of the underlying HMM. This makes the algorithm particularly applicable to settings with a large number of observations, such as those in natural language processing where the space of observation is sometimes the words in a language. The algorithm is also simple: it employs only a singular value decomposition and matrix multiplications. 1
The complexity of approximating the entropy
- SIAM Journal on Computing
"... We consider the problem of approximating the entropy of a discrete distribution under several different models of oracle access to the distribution. In the evaluation oracle model, the algorithm is given access to the explicit array of probabilities specifying the distribution. In this model, linear ..."
Abstract
-
Cited by 26 (8 self)
- Add to MetaCart
We consider the problem of approximating the entropy of a discrete distribution under several different models of oracle access to the distribution. In the evaluation oracle model, the algorithm is given access to the explicit array of probabilities specifying the distribution. In this model, linear time in the size of the domain is both necessary and sufficient for approximating the entropy. In the generation oracle model, the algorithm has access only to independent samples from the distribution. In this ( case, we show that a γ-multiplicative approximation to the entropy can be obtained in O n (1+η)/γ2 log n time for distributions with entropy Ω(γ/η), where n is the size of the domain of the distribution and η is an arbitrarily small positive constant. We show that this model does not permit a multiplicative approximation to the entropy in general. For ( the class of distributions to which our upper bound applies, we obtain a lower bound of Ω n1/(2γ2) We next consider a combined oracle model in which the algorithm has access to both the
QUARK: Empirical assessment of automaton-based specification miners
- In WCRE
, 2006
"... Software is often built without specification. Tools to automatically extract specification from software are needed and many techniques have been proposed. One type of these specifications – temporal API specification – is often specified in the form of automaton. There has been much work on revers ..."
Abstract
-
Cited by 21 (8 self)
- Add to MetaCart
Software is often built without specification. Tools to automatically extract specification from software are needed and many techniques have been proposed. One type of these specifications – temporal API specification – is often specified in the form of automaton. There has been much work on reverse engineering or mining software temporal specification, using dynamic analysis techniques; i.e., analysis of software program traces. Unfortunately, the issues of scalability, robustness and accuracy of these techniques have not been comprehensively addressed. In this paper, we describe QUARK(QUality Assurance framewoRK) that enables assessments of the performance of a specification miner in generating temporal specification of software through traces recorded from its API interaction. QUARK requires the temporal specification produced by the miner to be expressed as an automaton. It accepts a user-defined simulator automaton and a specification miner. It produces quality assurance measures on the specification generated by the miner. Extensive experiments on 3 specification miners have been performed to demonstrate the usefulness of our proposed framework. 1

