Hidden Markov models that use predicted local structure for fold recognition: alphabets of backbone geometry
Proteins, 2003
Cited by 71 (11 self)
An important problem in computational biology is predicting the structure of the large number of putative proteins discovered by genome sequencing projects. Fold-recognition methods attempt to solve the problem by relating the target proteins to known structures, searching for template proteins homologous to the target. Remote homologs, which may have significant structural similarity, are often not detectable by sequence similarities alone. To address this, we incorporated predicted local structure, a generalization of secondary structure, into two-track profile HMMs. We did not rely on a simple helix-strand-coil definition of secondary structure, ...
MML clustering of multistate, Poisson, von Mises circular and Gaussian distributions
Statistics and Computing, 2000
Cited by 39 (12 self)
Minimum Message Length (MML) is an invariant Bayesian point estimation technique which is also statistically consistent and efficient. We provide a brief overview of MML inductive inference
Analysis of Three-Dimensional Protein Images
Journal of Artificial Intelligence Research, 1997
Cited by 11 (1 self)
A fundamental goal of research in molecular biology is to understand protein structure. Protein crystallography is currently the most successful method for determining the three-dimensional (3D) conformation of a protein, yet it remains labor intensive and relies on an expert's ability to derive and evaluate a protein scene model. In this paper, the problem of protein structure determination is formulated as an exercise in scene analysis. A computational methodology is presented in which a 3D image of a protein is segmented into a graph of critical points. Bayesian and certainty factor approaches are described and used to analyze critical point graphs and identify meaningful substructures, such as α-helices and β-sheets. Results of applying the methodologies to protein images at low and medium resolution are reported. The research is related to approaches to representation, segmentation and classification in vision, as well as to top-down approaches to protein structure prediction. ...
An MML Classification of Protein Structure that knows about Angles and Sequence
Cited by 11 (5 self)
In this paper we apply a Hidden Markov Model to model the structure of a collection of known proteins. This Markov classification is able to take advantage of information implicit in the order of a sequence of observations and hence is better suited to modelling protein data than a classification model that assumes independence between observations. We use a Minimum Message Length (MML) information measure to evaluate our protein structure model, which enables us to find the model best supported by the known evidence
MML mixture modelling of multistate, Poisson, von Mises circular and Gaussian distributions
In Proc. 6th Int. Workshop on Artif. Intelligence and Statistics, 1997
Cited by 11 (5 self)
Minimum Message Length (MML) is an invariant Bayesian point estimation technique which is also consistent and efficient. We provide a brief overview of MML inductive inference (Wallace and Boulton (1968), Wallace and Freeman (1987)), and how it has both an information-theoretic and a Bayesian interpretation. We then outline how MML is used for statistical parameter estimation, and how the MML mixture modelling program, Snob (Wallace and Boulton (1968), Wallace (1986), Wallace and Dowe (1994)), uses the message lengths from various parameter estimates to enable it to combine parameter estimation with selection of the number of components. The message length is (to within a constant) the negative logarithm of the posterior probability of the theory. So, the MML theory can also be regarded as the theory with the highest posterior probability. Snob currently assumes that variables are uncorrelated, and permits multivariate data from Gaussian, discrete multistate, Poisson and von Mises circular dist...
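The link between message length and posterior probability stated in this abstract can be illustrated with a minimal sketch. It assumes a small, finite set of candidate theories with known priors and likelihoods; the theory names and numbers are hypothetical. It uses only the simple two-part length, -log2 P(H) - log2 P(D|H). This is not Snob's actual computation, which also accounts for the precision to which parameters are stated.

```python
import math

def message_length(prior, likelihood):
    """Two-part message length in bits: cost of the theory plus cost of the data given the theory."""
    return -math.log2(prior) - math.log2(likelihood)

def posterior(theories):
    """Normalised posterior P(H|D) for each theory, from (prior, likelihood) pairs."""
    evidence = sum(p * l for p, l in theories.values())
    return {name: p * l / evidence for name, (p, l) in theories.items()}

# Hypothetical candidates: name -> (prior P(H), likelihood P(D|H)).
theories = {
    "one_class": (0.5, 0.001),
    "two_classes": (0.3, 0.010),
    "three_classes": (0.2, 0.012),
}

lengths = {name: message_length(p, l) for name, (p, l) in theories.items()}
post = posterior(theories)

# The shortest message and the highest posterior pick out the same theory.
best_by_length = min(lengths, key=lengths.get)
best_by_posterior = max(post, key=post.get)

# length + log2(posterior) is the same constant, -log2 P(D), for every theory.
offsets = [lengths[name] + math.log2(post[name]) for name in theories]
```

In this toy setting the message length differs from the negative log posterior only by the constant -log2 P(D), so minimising one is the same as maximising the other.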
MML, HYBRID BAYESIAN NETWORK GRAPHICAL MODELS, STATISTICAL CONSISTENCY, INVARIANCE AND UNIQUENESS
Cited by 6 (5 self)
The problem of statistical (or inductive) inference pervades a large number of human activities and a large number of (human and nonhuman) actions requiring ‘intelligence’. Human and other ‘intelligent’ activity often entails making inductive inferences, remembering and recording observations from which one can make
CIRCULAR CLUSTERING BY MINIMUM MESSAGE LENGTH OF PROTEIN DIHEDRAL ANGLES
1995
Cited by 4 (4 self)
Early work on proteins identified the existence of helices and extended sheets in protein secondary structures, a high-level classification which remains popular today. Using the Snob program for information-theoretic Minimum Message Length (MML) intrinsic classification, we are able to take the protein dihedral angles as determined by X-ray crystallography, and cluster sets of dihedral angles into groups. Previous work by Hunter and States had applied a similar Bayesian classification method, AutoClass, to protein data with site position represented by 3 Cartesian coordinates for each of the α-Carbon, β-Carbon and Nitrogen, totalling 9 coordinates. By using the von Mises circular distribution in the Snob program rather than the Normal distribution in the Hunter and States model, we are instead able to represent local site properties by the two dihedral angles, φ and ψ. Since each site can be modelled as having 2 degrees of freedom, this orientation-invariant dihedral angle representation of the data is more compact than that of nine highly correlated Cartesian coordinates. Using the information-theoretic message length concepts discussed in the paper, such a more concise model is more likely to represent the underlying generating process from which the data comes. We report on the results of our classification, plotting the classes in (φ,ψ)-space and introducing a symmetric information-theoretic distance measure to build a minimum spanning tree between the classes. We also give a transition matrix between the classes and note the existence of three classes in the region φ ≈ −1.09 rad and ψ ≈ −0.75 rad which are close on the spanning tree and have high inter-transition probabilities. These properties give rise to a tight, abundant, self-perpetuating, α-helical structure.
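The dihedral-angle representation described in this abstract can be sketched as a pair of von Mises densities, one for φ and one for ψ. This is an illustrative sketch only. The class dictionary, the concentration values (kappa) and the helper names are hypothetical, and the two angles are assumed to be independent within a class. Only the class centre near φ ≈ −1.09 rad, ψ ≈ −0.75 rad is taken from the abstract.

```python
import math

def bessel_i0(kappa, terms=50):
    """Modified Bessel function of the first kind, order 0, via its power series."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        # Ratio of consecutive series terms: (kappa/2)^2 / (k+1)^2.
        term *= (kappa / 2.0) ** 2 / ((k + 1) ** 2)
    return total

def von_mises_pdf(angle, mean, kappa):
    """Von Mises density on the circle; angle and mean are in radians."""
    return math.exp(kappa * math.cos(angle - mean)) / (2.0 * math.pi * bessel_i0(kappa))

def site_log_density(phi, psi, cls):
    """Log density of one site's (phi, psi), assuming the two angles are independent."""
    return (math.log(von_mises_pdf(phi, cls["phi_mean"], cls["phi_kappa"]))
            + math.log(von_mises_pdf(psi, cls["psi_mean"], cls["psi_kappa"])))

# Hypothetical class centred on the helical region; kappa values are illustrative.
helix_like = {"phi_mean": -1.09, "phi_kappa": 20.0,
              "psi_mean": -0.75, "psi_kappa": 20.0}

near = site_log_density(-1.0, -0.8, helix_like)  # close to the class centre
far = site_log_density(-2.4, 2.4, helix_like)    # far from the class centre
```

Because the density depends on the angle only through a cosine, it wraps correctly at ±π, which a Normal distribution on raw angle values would not.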
Motif discovery in protein structure databases
In: Pattern Discovery in Biomolecular Data, Tools, Techniques, and Application, 1999
Cited by 2 (1 self)
This chapter overviews the topic of protein motif discovery. It presents current approaches to knowledge discovery, focusing on their applications to the protein domain. In general, a motif is considered an abstraction over a set of recurring patterns in a dataset. Although we are primarily concerned with protein structure motifs, the chapter also considers sequence motifs and combinations of sequence/structure motifs. The research described is motivated by our need to organize and understand the rapidly growing protein databases. Discovered motifs are also useful in automating the process of structure determination from crystallographic databases. The field of knowledge discovery is concerned with the theory and processes involved in the representation and extraction of patterns or motifs from large databases. Discovered patterns can be used to group data into meaningful classes, to summarize data or to reveal deviant entries. Motifs stored in a database can be brought to bear on difficult instances of structure prediction or determination from X-ray crystallography or NMR experiments. The need for automated discovery techniques is central to the understanding and analysis
Potential Properties of Turing Machines
2012
Cited by 2 (1 self)
In this paper we investigate the notion of potential properties for Turing machines, focussing especially on universality and intelligence. We consider several machine characterisations (non-interactive and interactive) and give definitions for each case, considering permanent and transitory potentials. From these definitions, we analyse the relation between some potential abilities, we bring out the dependency on the environment distribution and we suggest some ideas on how potential abilities can be measured.