z; Department of Computer Science; University of Illinois at Urbana-Champaign
SVM HeaderParse 0.2
We study a learning problem which allows for a \fair" comparison between unsupervised learning methods|probabilistic model construction, and more traditional algorithms that directly learn a classication. The merits of each approach are intuitively clear: inducing a model is more expensive computationally, but may support a wider range of predictions. Its performance, however, will depend on how well the postulated probabilistic model ts that data. To compare the paradigms we consider a model which postulates a single binary-valued hidden variable on which all other attributes depend. In this model, nding the most likely value of any one variable (given known values for the others) reduces to testing a linear function of the observed values. We learn the model with two techniques: the standard EM algorithm, and a new algorithm we develop based on covariances. We compare these, in a controlled fashion, against an algorithm (a version of Winnow) that attempts to nd a good l...