Results 1 - 10 of 81
Unsupervised learning of finite mixture models
 IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002
Cited by 267 (20 self)
Abstract: This paper proposes an unsupervised algorithm for learning a finite mixture model from multivariate data. The adjective "unsupervised" is justified by two properties of the algorithm: 1) it is capable of selecting the number of components and 2) unlike the standard expectation-maximization (EM) algorithm, it does not require careful initialization. The proposed method also avoids another drawback of EM for mixture fitting: the possibility of convergence toward a singular estimate at the boundary of the parameter space. The novelty of our approach is that we do not use a model selection criterion to choose one among a set of pre-estimated candidate models; instead, we seamlessly integrate estimation and model selection in a single algorithm. Our technique can be applied to any type of parametric mixture model for which it is possible to write an EM algorithm; in this paper, we illustrate it with experiments involving Gaussian mixtures. These experiments testify to the good performance of our approach. Index Terms: Finite mixtures, unsupervised learning, model selection, minimum message length criterion, Bayesian methods, expectation-maximization algorithm, clustering.
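The algorithm in this entry modifies the standard EM procedure it criticizes. As a point of comparison only, here is a minimal sketch of that plain EM baseline for a one-dimensional Gaussian mixture; the function name and initialization scheme are mine, and the paper's MML-based component selection is not shown:

```python
import numpy as np

def em_gmm_1d(x, k, n_iter=200):
    """Plain EM for a 1-D Gaussian mixture (the baseline the paper improves on)."""
    n = len(x)
    w = np.full(k, 1.0 / k)                        # mixing weights
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)  # spread initial means over the data
    var = np.full(k, np.var(x))                    # start from the global variance
    for _ in range(n_iter):
        # E-step: responsibilities r[i, j] = P(component j | x_i)
        dens = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = w * dens
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances from responsibilities
        nk = r.sum(axis=0)
        w = nk / n
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-9
    return w, mu, var
```

Note the two weaknesses the paper targets: the fixed `k` must be chosen in advance, and a bad initialization (or a component collapsing onto one point) can derail the fit.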
MML clustering of multistate, Poisson, von Mises circular and Gaussian distributions
 Statistics and Computing, 2000
Cited by 32 (10 self)
Minimum Message Length (MML) is an invariant Bayesian point estimation technique which is also statistically consistent and efficient. We provide a brief overview of MML inductive inference.
Grammar Model-based Program Evolution
 In Proceedings of the 2004 IEEE Congress on Evolutionary Computation, 2004
Cited by 22 (1 self)
In Evolutionary Computation, genetic operators such as mutation and crossover are employed to perturb individuals to generate the next population. However, these fixed, problem-independent genetic operators may destroy subsolutions, usually called building blocks, instead of discovering and preserving them. One way to overcome this problem is to build a model based on the good individuals and sample this model to obtain the next population. There is a wide range of such work in Genetic Algorithms.
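The modeling idea summarized in this abstract (fit a distribution to the good individuals, then sample it instead of mutating) is the estimation-of-distribution family this paper extends to grammars. A minimal sketch of the simplest such scheme, a univariate marginal EDA solving the OneMax toy problem; all names and parameter values are mine, not the paper's:

```python
import random

def umda_onemax(n_bits=20, pop=100, elite=20, gens=30, seed=0):
    """Univariate marginal EDA: fit a per-bit probability model to the best
    individuals and sample it, rather than applying fixed mutation/crossover."""
    rng = random.Random(seed)
    p = [0.5] * n_bits                # per-bit probability model
    best = None
    for _ in range(gens):
        population = [[1 if rng.random() < p[i] else 0 for i in range(n_bits)]
                      for _ in range(pop)]
        population.sort(key=sum, reverse=True)   # OneMax fitness: count of ones
        if best is None or sum(population[0]) > sum(best):
            best = population[0]
        selected = population[:elite]
        # Model update: marginal frequency of each bit among the elite,
        # clamped away from 0/1 to retain sampling diversity.
        p = [min(0.95, max(0.05, sum(ind[i] for ind in selected) / elite))
             for i in range(n_bits)]
    return best
```

The per-bit marginals play the role of the "model of good individuals"; the paper replaces this flat bit-string model with a grammar model over programs.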
Information Assurance through Kolmogorov Complexity
2001
Cited by 21 (9 self)
The problem of Information Assurance is approached from the point of view of Kolmogorov Complexity and Minimum Message Length criteria. Several theoretical results are obtained, possible applications are discussed, and a new metric for measuring complexity is introduced. Utilization of Kolmogorov-Complexity-like metrics as conserved parameters to detect abnormal system behavior is explored. Data and process vulnerabilities are put forward as two different dimensions of vulnerability that can be discussed in terms of Kolmogorov Complexity. Finally, these results are utilized to conduct complexity-based vulnerability analysis.
1. Introduction
Information security (or lack thereof) is too often dealt with after security has been lost. Back doors are opened, Trojan horses are placed, passwords are guessed and firewalls are broken down; in general, security is lost as barriers to hostile attackers are breached and one is put in the undesirable position of detecting and patching holes. In ...
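Since Kolmogorov complexity is uncomputable, metrics of the kind this abstract discusses are in practice approximated by compressed length. A minimal sketch of that approximation, with an off-the-shelf zlib compressor standing in for the ideal one (function names are mine, not the paper's):

```python
import zlib

def complexity(data: bytes) -> int:
    """Upper-bound estimate of Kolmogorov complexity: length of a lossless compression."""
    return len(zlib.compress(data, 9))

def complexity_ratio(data: bytes) -> float:
    """Compressed size over raw size; a sudden shift in this 'conserved'
    quantity can flag abnormal system behavior."""
    return complexity(data) / max(len(data), 1)
```

Repetitive traffic (e.g. routine log lines) yields a low ratio, while high-entropy data (encrypted or random payloads) stays near 1, which is the kind of contrast an anomaly detector can monitor.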
Active Virtual Network Management Prediction
 In Parallel and Discrete Event Simulation Conference (PADS) '99, 1999
Cited by 20 (10 self)
Active Networking provides a framework in which executable code within data packets can execute upon intermediate network nodes. Active Virtual Network Management Prediction (AVNMP) provides a network prediction service that utilizes the capability of Active Networks to easily inject fine-grained models into the communication network to enhance network performance. The models injected into the network allow state to be predicted and propagated throughout an active network, enabling the network to operate simultaneously in real time and in the future. State information such as load, security intrusion, mobile location, and faults, as found in typical Management Information Bases (MIBs), is available to the management system both as current values and as values expected to exist in the future. AVNMP has been experimentally validated by implementing load-prediction and CPU-prediction applications. AVNMP implements a distributed, active, and truly proactive network management system. Active Networking enables the implementation of new concepts utilized in AVNMP, such as the ability to quickly and easily inject models into a network. In addition, Active Networking enables messages to refine their predictions as they travel through the network, as well as several enhancements to the basic AVNMP algorithm, including migration of AVNMP components and reduction in overhead by means of message fusion.
Privacy issues in knowledge discovery and data mining
 In Proc. of Australian Institute of Computer Ethics Conference (AICEC99), 1999
Cited by 14 (0 self)
Recent developments in information technology have enabled collection and processing of vast amounts of personal data, such as criminal records, shopping habits, credit and medical history, and driving records. This information is undoubtedly very useful in many areas, including medical research, law enforcement and national security. However, there is an increasing public concern about individuals' privacy. Privacy is commonly seen as the right of individuals to control information about themselves. The appearance of technology for Knowledge Discovery and Data Mining (KDDM) has revitalized concern about the following general privacy issues:
• secondary use of personal information,
• handling misinformation, and
• granulated access to personal information.
These issues demonstrate that existing privacy laws and policies are well behind the developments in technology, and no longer offer adequate protection. We also discuss new privacy threats posed by KDDM, which include massive data collection, data warehouses, statistical analysis and deductive learning techniques. KDDM uses vast amounts of data to generate hypotheses and discover general patterns. KDDM poses the following new challenges to privacy:
• stereotypes,
• guarding personal data from KDDM researchers,
• individuals from training sets, and
• combination of patterns.
We discuss the possible solutions and their impact on the quality of discovered patterns.
An algorithm for the unsupervised learning of morphology
 Natural Language Engineering, 2006
Cited by 14 (3 self)
This paper describes in detail an algorithm for the unsupervised learning of natural language morphology, with emphasis on challenges that are encountered in languages typologically similar to European languages. It utilizes the Minimum Description Length analysis described in Goldsmith 2001 and has been implemented in software that is available for downloading and testing.
1. Scope of this paper
This paper describes in detail an algorithm used for the unsupervised learning of natural language morphology which works well for European languages and other languages in which the average number of morphemes per word is not too high. It has been implemented and tested in Linguistica, and is based on the theoretical principles described in Goldsmith 2001. The present paper describes that framework briefly, but the reader is referred there for a more careful development. The executable for this program, and the source code as well, is available at
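The Minimum Description Length idea behind this morphology learner can be illustrated with a toy calculation: a factored lexicon of stems plus shared suffixes costs fewer bits than spelling out every word. This sketch is mine and omits the pointer and signature costs of the actual Linguistica model:

```python
import math

def description_length(strings):
    """Bits to list a set of strings: letters at log2(26) bits each,
    plus a small length field per string."""
    return sum(len(s) * math.log2(26) + math.log2(len(s) + 1) for s in strings)

words = ["walk", "walks", "walked", "talk", "talks", "talked"]
flat = description_length(words)  # spell out every inflected form
# Factored analysis: two stems plus one shared suffix set {NULL, -s, -ed}.
factored = description_length(["walk", "talk"]) + description_length(["", "s", "ed"])
```

Because `factored` comes out smaller than `flat`, MDL prefers the morphological segmentation; adding a spurious split would instead inflate the suffix inventory and lose the comparison.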
Reference analysis
 In Handbook of Statistics 25, 2005
Cited by 13 (2 self)
This chapter describes reference analysis, a method to produce Bayesian inferential statements which only depend on the assumed model and the available data. Statistical information theory is used to define the reference prior function as a mathematical description of that situation where data would best dominate prior knowledge about the quantity of interest. Reference priors are not descriptions of personal beliefs; they are proposed as formal consensus prior functions to be used as standards for scientific communication. Reference posteriors are obtained by formal use of Bayes theorem with a reference prior. Reference prediction is achieved by integration with a reference posterior. Reference decisions are derived by minimizing a reference posterior expected loss. An information theory based loss function, the intrinsic discrepancy, may be used to derive reference procedures for conventional inference problems in scientific investigation, such as point estimation, region estimation and hypothesis testing.
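In outline, the "missing information" criterion this abstract refers to is the expected Kullback-Leibler divergence from prior to posterior, and the reference prior is the prior maximizing an asymptotic version of this functional. A sketch in my notation, not the chapter's:

```latex
% Expected information about \theta provided by data x under prior \pi(\theta):
% the average KL divergence from the prior to the posterior.
I\{\pi\} \;=\; \int p(x) \int \pi(\theta \mid x)\,
    \log \frac{\pi(\theta \mid x)}{\pi(\theta)} \; d\theta \, dx
```

Maximizing this (in the limit of infinitely replicated experiments) formalizes "the data best dominating prior knowledge": the reference prior is the one the data would move the most, on average.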
Suboptimal behavior of Bayes and MDL in classification under misspecification
 In COLT, 2004
Cited by 13 (3 self)
We show that forms of Bayesian and MDL inference that are often applied to classification problems can be inconsistent. This means that there exists a learning problem such that for all amounts of data the generalization errors of the MDL classifier and the Bayes classifier relative to the Bayesian posterior both remain bounded away from the smallest achievable generalization error. From a Bayesian point of view, the result can be reinterpreted as saying that Bayesian inference can be inconsistent under misspecification, even for countably infinite models. We extensively discuss the result from both a Bayesian and an MDL perspective.
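The MDL classifier this abstract analyzes is typically the hypothesis minimizing a two-part code length: bits to describe the model plus bits to describe the data given the model. A minimal sketch of that criterion for 0/1 classification errors; the exact encoding is my simplification, not the paper's:

```python
import math

def two_part_bits(model_bits, n, n_errors):
    """Two-part MDL score: bits to state the classifier, plus bits to state
    how many of the n labels it gets wrong and which ones
    (log2 of the binomial count of possible error patterns)."""
    data_bits = math.log2(n + 1) + math.log2(math.comb(n, n_errors))
    return model_bits + data_bits
```

Under this score a 10-bit classifier making 100/1000 errors beats a 500-bit classifier making 80/1000, illustrating the complexity-versus-fit trade-off whose failure modes under misspecification the paper studies.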
Bayesian Ying Yang system, best harmony learning, and Gaussian manifold based family
 Computational Intelligence: Research Frontiers, WCCI2008 Plenary/Invited Lectures. Lecture Notes in Computer Science
"... five action circling ..."