Results 1 - 10 of 53,493
A gentle tutorial on the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models
, 1997
"... We describe the maximum-likelihood parameter estimation problem and how the Expectation-Maximization (EM) algorithm can be used for its solution. We first describe the abstract form of the EM algorithm as it is often given in the literature. We then develop the EM parameter estimation procedure for two applications: 1) finding the parameters of a mixture of Gaussian densities, and 2) fi ..."
Cited by 693 (4 self)
) finding the parameters of a hidden Markov model (HMM) (i.e., the Baum-Welch algorithm) for both discrete and Gaussian mixture observation models. We derive the update equations in fairly explicit detail but we do not prove any convergence properties. We try to emphasize intuition rather than mathematical
Maximum entropy Markov models for information extraction and segmentation
, 2000
"... Hidden Markov models (HMMs) are a powerful probabilistic tool for modeling sequential data, and have been applied with success to many text-related tasks, such as part-of-speech tagging, text segmentation and information extraction. In these cases, the observations are usually modeled as multinomial ..."
Cited by 561 (18 self)
as multinomial distributions over a discrete vocabulary, and the HMM parameters are set to maximize the likelihood of the observations. This paper presents a new Markovian sequence model, closely related to HMMs, that allows observations to be represented as arbitrary overlapping features (such as word
An analysis of transformations
 Journal of the Royal Statistical Society. Series B (Methodological)
, 1964
"... In the analysis of data it is often assumed that observations y_1, y_2, ..., y_n are independently normally distributed with constant variance and with expectations specified by a model linear in a set of parameters θ. In this paper we make the less restrictive assumption that such a normal, homoscedasti ..."
Cited by 1067 (3 self)
In the analysis of data it is often assumed that observations y_1, y_2, ..., y_n are independently normally distributed with constant variance and with expectations specified by a model linear in a set of parameters θ. In this paper we make the less restrictive assumption that such a normal
Probabilistic Principal Component Analysis
 JOURNAL OF THE ROYAL STATISTICAL SOCIETY, SERIES B
, 1999
"... Principal component analysis (PCA) is a ubiquitous technique for data analysis and processing, but one which is not based upon a probability model. In this paper we demonstrate how the principal axes of a set of observed data vectors may be determined through maximum-likelihood estimation of paramet ..."
Cited by 709 (5 self)
Principal component analysis (PCA) is a ubiquitous technique for data analysis and processing, but one which is not based upon a probability model. In this paper we demonstrate how the principal axes of a set of observed data vectors may be determined through maximum-likelihood estimation
Longitudinal data analysis using generalized linear models
 Biometrika
, 1986
"... SUMMARY This paper proposes an extension of generalized linear models to the analysis of longitudinal data. We introduce a class of estimating equations that give consistent estimates of the regression parameters and of their variance under mild assumptions about the time dependence. The estimating ..."
Cited by 1526 (8 self)
SUMMARY This paper proposes an extension of generalized linear models to the analysis of longitudinal data. We introduce a class of estimating equations that give consistent estimates of the regression parameters and of their variance under mild assumptions about the time dependence
Reflectance and texture of real-world surfaces
 ACM TRANS. GRAPHICS
, 1999
"... In this work, we investigate the visual appearance of real-world surfaces and the dependence of appearance on scale, viewing direction and illumination direction. At fine scale, surface variations cause local intensity variation or image texture. The appearance of this texture depends on both illumina ..."
Cited by 590 (23 self)
(bidirectional reflectance distribution function). We simultaneously measure the BTF and BRDF of over 60 different rough surfaces, each observed with over 200 different combinations of viewing and illumination direction. The resulting BTF database is comprised of over 12,000 image textures. To enable convenient use
The Dantzig selector: statistical estimation when p is much larger than n
, 2005
"... In many important statistical applications, the number of variables or parameters p is much larger than the number of observations n. Suppose then that we have observations y = Ax + z, where x ∈ R^p is a parameter vector of interest, A is a data matrix with possibly far fewer rows than columns, n ≪ ..."
Cited by 879 (14 self)
In many important statistical applications, the number of variables or parameters p is much larger than the number of observations n. Suppose then that we have observations y = Ax + z, where x ∈ R^p is a parameter vector of interest, A is a data matrix with possibly far fewer rows than columns, n
High dimensional graphs and variable selection with the Lasso
 ANNALS OF STATISTICS
, 2006
"... The pattern of zero entries in the inverse covariance matrix of a multivariate normal distribution corresponds to conditional independence restrictions between variables. Covariance selection aims at estimating those structural zeros from data. We show that neighborhood selection with the Lasso is a ..."
Cited by 736 (22 self)
show that the proposed neighborhood selection scheme is consistent for sparse high-dimensional graphs. Consistency hinges on the choice of the penalty parameter. The oracle value for optimal prediction does not lead to a consistent neighborhood estimate. Controlling instead the probability of falsely
On Power-Law Relationships of the Internet Topology
 IN SIGCOMM
, 1999
"... Despite the apparent randomness of the Internet, we discover some surprisingly simple power-laws of the Internet topology. These power-laws hold for three snapshots of the Internet, between November 1997 and December 1998, despite a 45% growth of its size during that period. We show that our power-l ..."
Cited by 1670 (70 self)
laws fit the real data very well resulting in correlation coefficients of 96% or higher. Our observations provide a novel perspective of the structure of the Internet. The power-laws describe concisely skewed distributions of graph properties such as the node outdegree. In addition, these power-laws can
On the Use of Windows for Harmonic Analysis With the Discrete Fourier Transform
 Proc. IEEE
, 1978
"... This paper makes available a concise review of data windows and their affect on the detection of harmonic signals in the presence of broad-band noise, and in the presence of nearby strong harmonic interference. ... compromise consists of applying windows to the sampled data set, or equivalently, smoothing the spectral samples. The two operations to which we subject ..."
Cited by 668 (0 self)
, windowing is less related to sampled windows for DFT's. There is much signal processing devoted to detection and estimation. Detection is the task of determining if a specific signal set is present in an observation, while estimation is the task of obtaining the values of the parameters