Results 1–7 of 7
Model Selection by Normalized Maximum Likelihood
2005
Cited by 12 (3 self)
The Minimum Description Length (MDL) principle is an information theoretic approach to inductive inference that originated in algorithmic coding theory. In this approach, data are viewed as codes to be compressed by the model. From this perspective, models are compared on their ability to compress a data set by extracting useful information in the data apart from random noise. The goal of model selection is to identify the model, from a set of candidate models, that permits the shortest description length (code) of the data. Since Rissanen originally formalized the problem using the crude ‘two-part code’ MDL method in the 1970s, many significant strides have been made, especially in the 1990s, culminating in the development of the refined ‘universal code’ MDL method, dubbed Normalized Maximum Likelihood (NML), which represents an elegant solution to the model selection problem. The present paper provides a tutorial review of these latest developments with a special focus on NML. An application example of NML in cognitive modeling is also provided.
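The NML criterion sketched in this abstract normalizes the maximized likelihood of the observed data by the sum of maximized likelihoods over every data set the model could have produced. A minimal illustrative sketch for a Bernoulli model (the model choice and function name are assumptions of this listing, not taken from the paper):

```python
import math

def nml_codelength_bernoulli(k, n):
    """Illustrative sketch: NML code length (in nats) of observing
    k successes in n Bernoulli trials. Not the paper's code."""
    # Likelihood of the data under the maximum likelihood parameter.
    p_hat = k / n
    max_lik = p_hat ** k * (1 - p_hat) ** (n - k)
    # Normalizer: maximized likelihood summed over all possible outcomes.
    norm = sum(
        math.comb(n, j) * (j / n) ** j * (1 - j / n) ** (n - j)
        for j in range(n + 1)
    )
    # Shorter code length = better compression of the data by the model.
    return -math.log(max_lik / norm)
```

The normalizer is what penalizes model complexity: a model that can fit many data sets well pays for that flexibility through a larger sum.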
A simple method for generating additive clustering models with limited complexity
Machine Learning, 2002
Cited by 9 (3 self)
Additive clustering was originally developed within cognitive psychology to enable the development of featural models of human mental representation. The representational flexibility of additive clustering, however, suggests its more general application to modeling complicated relationships between objects in nonpsychological domains of interest. This paper describes, demonstrates, and evaluates a simple method for learning additive clustering models, based on the combinatorial optimization approach known as Population-Based Incremental Learning. The performance of this new method is shown to be comparable with previously developed methods over a set of ‘benchmark’ data sets. In addition, the method developed here has the potential, by using a Bayesian analysis of model complexity that relies on an estimate of data precision, to determine the appropriate number of clusters to include in a model.
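In an additive clustering model, the similarity of two objects is typically reconstructed as the summed weights of the clusters they share, plus an additive constant. A small sketch of that reconstruction step, with hypothetical names (`predicted_similarity`, `F`, `w`, `c`) that are not from the paper:

```python
import numpy as np

def predicted_similarity(F, w, c):
    """Illustrative sketch of the additive clustering prediction.
    F: (n_objects, n_clusters) 0/1 cluster membership matrix
    w: (n_clusters,) non-negative cluster weights
    c: additive constant (baseline similarity)"""
    # Objects sharing a cluster accrue that cluster's weight.
    S = (F * w) @ F.T + c
    np.fill_diagonal(S, 0.0)  # self-similarity is not modeled
    return S
```

Fitting the model then amounts to searching over membership matrices `F` (the combinatorial part that Population-Based Incremental Learning addresses) while estimating `w` and `c` to match an observed similarity matrix.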
Common and Distinctive Features in Stimulus Similarity: A Modified Version of the Contrast Model
2002
Cited by 5 (1 self)
Featural representations of similarity data assume that people represent stimuli in terms of a set of discrete properties. We consider the differences in featural representations that arise from making four different assumptions about how similarity is measured. Three of these similarity models (the common features model, the distinctive features model, and Tversky's seminal contrast model) have been considered previously. The other model is new, and modifies the contrast model by assuming that each individual feature only ever acts as a common or distinctive feature. Each of the four models is tested on previously examined similarity data, relating to kinship terms, and on a new data set, relating to faces. In fitting the models, we use the Geometric Complexity Criterion to balance the competing demands of data fit and model complexity. The results show that both common and distinctive features are important for stimulus representation, and we argue that the modified contrast model combines these two components in a more effective and interpretable way than Tversky's original formulation.
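Tversky's contrast model, which this paper modifies, scores similarity as a weighted measure of the features two stimuli share minus weighted measures of each stimulus's distinctive features. A minimal sketch, assuming set cardinality as the salience function and hypothetical default weights (none of these choices come from the paper):

```python
def contrast_similarity(a, b, theta=1.0, alpha=0.5, beta=0.5):
    """Illustrative sketch of Tversky's (1977) contrast model with
    set size as the salience function f:
        s(A, B) = theta*f(A & B) - alpha*f(A - B) - beta*f(B - A)"""
    a, b = set(a), set(b)
    common = len(a & b)          # features shared by both stimuli
    distinctive_a = len(a - b)   # features unique to stimulus A
    distinctive_b = len(b - a)   # features unique to stimulus B
    return theta * common - alpha * distinctive_a - beta * distinctive_b
```

When `alpha != beta` the measure is asymmetric, one of the model's signature properties; the modification described in the abstract instead constrains each feature to contribute only ever as a common or a distinctive feature.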
Representing Stimulus Similarity
2002
Cited by 2 (2 self)
Table of contents: Declaration; Acknowledgements; 1 Prelude (The Very Idea of Representation, Types of Similarity, Is Similarity Indeterminate?, The Role of Similarity in Cognition, Summary & General Discussion); 2 Theories of Similarity (Similarity Data Sets, Spatial Representation, Featural Representation, Tree Representation, Network Representation, Alignment-Based Similarity Models, Transformational Similarity Models, Summary & General Discussion); 3 On Representational Complexity (Approaches to Model Selection, Choosing an Additive Clustering Representation, Choosing an Additive Tree Representation, Choosing a Spatial Representation, Summary & General Discussion); 4 Featural Representation (A Menagerie of Featural Models, Clustering Models, Geometric Complexity Criteria, Algorithms for Fitting Featural Models, Monte Carlo Study I: Do the Algorithms Work?, Representations of Kinship Terms, Monte Carlo Study II: Complexity, Experiment I: Faces, Experiment II: Countries ...)
Clustering using the contrast model
2001
Cited by 2 (1 self)
An algorithm is developed for generating featural representations from similarity data using Tversky’s (1977) Contrast Model. Unlike previous additive clustering approaches, the algorithm fits a representational model that allows for stimulus similarity to be measured in terms of both common and distinctive features. The important issue of striking an appropriate balance between data fit and representational complexity is addressed through the use of the Geometric Complexity Criterion to guide model selection. The ability of the algorithm to recover known featural representations from noisy data is tested, and it is also applied to real data measuring the similarity of kinship terms.
An application of minimum description length clustering to partitioning learning curves
In Proceedings of the 2005 IEEE International Symposium on Information Theory, 2005
Cited by 1 (1 self)
We apply a Minimum Description Length–based clustering technique to the problem of partitioning a set of learning curves. The goal is to partition experimental data collected from different sources into groups of sources that are statistically the same. We solve this problem by defining statistical models for the data generating processes, then partitioning them using the Normalized Maximum Likelihood criterion. Unlike many alternative model selection methods, this approach is optimal (in a minimax coding sense) for data of any sample size. We present an application of the method to the cognitive modeling problem of partitioning human learning curves for different categorization tasks.
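The selection rule described in this abstract amounts to choosing the partition whose groups yield the shortest total NML code length. A toy sketch under simplifying assumptions of my own (each group's trials are pooled into a single Bernoulli model rather than a full learning-curve model; names and data layout are hypothetical):

```python
import math

def bernoulli_nml(k, n):
    """NML code length (nats) for k successes in n Bernoulli trials."""
    p = k / n
    max_lik = p ** k * (1 - p) ** (n - k)
    norm = sum(math.comb(n, j) * (j / n) ** j * (1 - j / n) ** (n - j)
               for j in range(n + 1))
    return -math.log(max_lik / norm)

def partition_codelength(curves, partition):
    """Total NML code length of a partition. `curves` maps a source id
    to (successes, trials); `partition` is a list of groups, each a
    list of source ids coded with one shared Bernoulli model."""
    total = 0.0
    for group in partition:
        k = sum(curves[s][0] for s in group)
        n = sum(curves[s][1] for s in group)
        total += bernoulli_nml(k, n)
    return total
```

Comparing `partition_codelength` across candidate partitions and keeping the minimum illustrates the trade-off the criterion encodes: merging statistically similar sources saves code length, while merging dissimilar ones costs more than the extra model it avoids.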