## Structured Probabilistic Models of Proteins across Spatial and Fitness Landscapes (2011)

@MISC{Kamichetty11structuredprobabilistic,
author = {Hetunandan Kamisetty and Jaime Carbonell},
title = {Structured Probabilistic Models of Proteins across Spatial and Fitness Landscapes},
year = {2011}
}


4023 | Convex Optimization - Boyd, Vandenberghe - 2004 |
Citation Context: ...m can be solved extremely efficiently (in linear time) using an algorithm described in Schmidt et al. [2008]. Methods based on projected gradients are guaranteed to converge to a stationary point [Boyd and Vandenberghe, 2004], and convexity ensures that this stationary point is globally optimal. In order to scale the method to significantly larger domains, we can sub-divide the structure learning problem into two steps. ...

2491 | Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data - Lafferty - 2001 |
Citation Context: ... defined in terms of a Boltzmann factor. That is, $\phi_i(r_{\phi_i}) = \exp\left(-\frac{E(x_{\phi_i})}{k_B T}\right)$, where $x_{\phi_i}$ is the set of atoms that serve as arguments to $\phi_i$, and $E(x_{\phi_i})$ is... (Footnote ∗: Since we use G to represent a conditional probability distribution, this is also referred to as a Conditional Random Field (CRF). Since commonly used CRFs [Lafferty et al., 2001] are usually chain graphs, we use the more general term, MRF, to avoid confusion.)

2040 | Regression shrinkage and selection via the lasso - Tibshirani - 1996 |
Citation Context: ...del using block regularization to prevent over-fitting. Block regularization is most similar in spirit to the group Lasso [Yuan and Lin, 2006] and the multi-task Lasso [Argyriou et al., 2007]. Lasso [Tibshirani, 1994] is the problem of finding a linear predictor, by minimizing the squared loss of the predictor with an L1 penalty. It is well known that the shrinkage properties of the L1 penalty lead to sparse pred...
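
The sparsity property mentioned in this excerpt can be illustrated with a minimal sketch. It assumes the common objective 0.5 * ||y - Xw||^2 + lam * ||w||_1 and uses cyclic coordinate descent, which is one standard solver and not necessarily the method used in the cited works. The function names and the toy data are hypothetical.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; anything within [-t, t] becomes exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iters=200):
    """Minimize 0.5 * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent.

    X is a list of rows, y a list of targets (illustrative pure-Python layout).
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iters):
        for j in range(d):
            rho = 0.0      # correlation of feature j with the partial residual
            norm_sq = 0.0  # squared norm of feature column j
            for i in range(n):
                # Prediction for row i with feature j's contribution left out.
                pred_without_j = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm_sq += X[i][j] ** 2
            w[j] = soft_threshold(rho, lam) / norm_sq if norm_sq > 0 else 0.0
    return w


# Toy problem with orthonormal features (assumed data, for illustration only).
X = [[1.0, 0.0], [0.0, 1.0]]
y = [3.0, 0.5]
w = lasso_coordinate_descent(X, y, lam=1.0)
print(w)  # the weakly supported second coefficient is driven exactly to zero
```

With this toy design the large coefficient is shrunk by `lam` and the small one is set exactly to zero, which is the shrinkage-induced sparsity the excerpt refers to.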

937 | Profile hidden Markov models - Eddy - 1998 |
Citation Context: ...ucture using a variety of techniques, including: (i) our algorithm, GREMLIN; (ii) the greedy algorithm of Thomas et al. [2005, 2008b], denoted ‘GMRC method’; and (iii) Profile Hidden Markov Models [Eddy, 1998] used by Bateman et al. [2002]. We note that the GMRC method only considers edges that meet certain coupling criteria (see Thomas et al. [2005, 2008b] for details). In particular, we found that it re...

925 | Optimizing search engines using clickthrough data - Joachims - 2002 |
Citation Context: ...me loss-function L between G and y on B. Many approaches have been developed for the task of learning to rank, especially in IR tasks like document retrieval [Herbrich et al., 2000] and web-search [Joachims, 2002]. These tasks differ in their choice of the loss function L and the algorithms used to minimize it. While initial approaches to ranking approached the ranking problem as a large number of pair-wise c...

811 | The Protein Data Bank - Berman, Westbrook, et al. - 2000 |

737 | Information theory and statistical mechanics - Jaynes - 1957 |
Citation Context: ...ases and as $n \to \infty$, $S \to \infty$ and is completely unconnected to $S_{\text{physical}}$. This problem arises in many scenarios, most notably for our purposes, in information-theoretic treatments of statistical physics [Jaynes, 1963, 1968]. Fortunately, a solution to this problem is available, which to the best of our knowledge is due to E.T. Jaynes [Jaynes, 1963]. By using a measure (i.e. a possibly unnormalized probability dis...

670 | Worst-case equilibria - Koutsoupias, Papadimitriou - 1999 |

590 | Probabilistic inference using Markov chain Monte Carlo methods - Neal - 1993 |

560 | Hidden Markov models in computational biology: applications to protein modeling - Krogh - 1994 |

546 | A novel genetic system to detect protein-protein interactions. Nature (London) - Fields, Song - 1989 |
Citation Context: ... of the cell; transient or persistent complexes mediate processes including regulation, signaling, transport, and catalysis. While coarse-grained, high-throughput techniques such as yeast two-hybrid [Fields and Song, 1989] are primarily focused on which proteins interact, finer-grained techniques based on structural analysis address questions of how and why these interactions occur. By modeling the physical interactio...

489 | The genetical evolution of social behaviour - Hamilton - 1964 |

470 | Graphical models, exponential families and variational inference - Wainwright, Jordan - 2003 |
Citation Context: ...f variables of size $\Delta_i$) that are realizable by a joint distribution. This is analogous to the common definition of a marginal polytope over pair-wise marginals commonly used in approximate inference [Wainwright and Jordan, 2008], a set we shall refer to as $M_2$. $M_\Delta(G) = \{\mu \in \mathbb{R}^d \mid \exists p \text{ with marginals } \mu_i(\vec{a}_i)\}$ and $M_2(G) = \{\mu \in \mathbb{R}^d \mid \exists p \text{ with marginals } \mu_i(a_i, a_j)\ \forall (i, j) \in E\}$. For an acyclic graph, the marginal polytope $M_\Delta$ can be expresse...

446 | The Pfam protein families database. Nucleic Acids Res - Bateman, Coin, et al. - 2004 |

439 | Constructing free-energy approximations and generalized belief propagation algorithms - Yedidia, Freeman, et al. - 2005 |

406 | Subjectivity and Correlation in Randomized Strategies - Aumann - 1974 |
Citation Context: ... of a mixed strategy profile specifies that each player samples from $\pi_i$ independent of other players. Relaxing this requirement of independence results in equilibria called correlated equilibria (CE) [Aumann, 1974]. Thus, a CE is any joint distribution $\pi$ over the players’ actions such that $\forall i,\ \forall a_i, a'_i \in A_i$: $\sum_{a_{-i}} \pi(a_i, a_{-i})\, u_i(a_i, a_{-i}) \ge \sum_{a_{-i}} \pi(a_i, a_{-i})\, u_i(a'_i, a_{-i})$. It is easy to see that...
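
The correlated-equilibrium condition quoted in this excerpt can be checked directly for a small game. The sketch below is illustrative only: the function name, the data layout, and the "chicken" payoffs are assumptions chosen for the example and do not come from the cited thesis.

```python
from fractions import Fraction
from itertools import product


def is_correlated_equilibrium(pi, utils, actions):
    """Check the CE inequalities for a joint distribution over action profiles.

    pi:      dict mapping a joint action tuple to its probability
    utils:   utils[i] maps a joint action tuple to player i's payoff
    actions: actions[i] is the list of actions available to player i
    """
    n = len(actions)
    # pi must be a probability distribution over joint actions.
    if any(p < 0 for p in pi.values()) or sum(pi.values()) != 1:
        return False
    for i in range(n):
        others = [actions[j] for j in range(n) if j != i]
        for a_i, a_alt in product(actions[i], repeat=2):
            gain = 0
            for a_rest in product(*others):
                joint = a_rest[:i] + (a_i,) + a_rest[i:]
                deviated = a_rest[:i] + (a_alt,) + a_rest[i:]
                # Following the recommendation a_i versus deviating to a_alt.
                gain += pi.get(joint, 0) * (utils[i][joint] - utils[i][deviated])
            if gain < 0:
                return False
    return True


# Hypothetical game of chicken: D = dare, C = chicken out.
actions = [["D", "C"], ["D", "C"]]
payoffs = {("D", "D"): (0, 0), ("D", "C"): (7, 2),
           ("C", "D"): (2, 7), ("C", "C"): (6, 6)}
utils = [{a: payoffs[a][i] for a in payoffs} for i in range(2)]

third = Fraction(1, 3)
traffic_light = {("D", "C"): third, ("C", "D"): third, ("C", "C"): third}
always_chicken = {("C", "C"): Fraction(1)}

print(is_correlated_equilibrium(traffic_light, utils, actions))
print(is_correlated_equilibrium(always_chicken, utils, actions))
```

Exact rational arithmetic via `Fraction` is used so that the sum-to-one check and the inequality tests are not affected by floating-point rounding.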

395 | Fusion, propagation and structuring in belief networks - Pearl - 1986 |
Citation Context: ...oximations introduced by statistical physicists (e.g., Bethe [1935], Kikuchi [1951], Morita [1991], Morita et al. [1994]). For example, it is now known that Pearl’s Belief Propagation (BP) algorithm [Pearl, 1986] is equivalent to the Bethe approximation [Bethe, 1935] of the free energy. Unless otherwise specified, we use Belief Propagation for inference in the following sections. The term ‘belief’ in both BP...

384 | Learning to rank using gradient descent - Burges, Shaked, et al. - 2005 |
Citation Context: ...pproaches have shown the utility of using loss-functions based on the entire rank, or the so-called “list-wise” approaches [Cao et al., 2007, Xia et al., 2008]. Further, a “soft” approach to ranking [Burges et al., 2005] has allowed the use of gradient-based continuous optimization techniques instead of combinatorial optimization. We use a “list-wise” soft-ranking approach to ranking since it has been shown to have ...

368 | Inferring Phylogenies. Sinauer Associates - Felsenstein - 2004 |
Citation Context: ...n associated with a sequence-only approach to learning a statistical model for a domain family is that the correlations observed in the MSA can be inflated due to phylogeny [Pollock and Taylor, 1997, Felsenstein, 2003]. A pair of co-incident mutations at the root of the tree can appear as a significant dependency even though they correspond to just one co-incident mutation event. To test if this was the case with...

343 | Hidden Markov models for detecting remote protein homologies - Karplus, Barrett, et al. - 1998 |

320 | Correlated Equilibrium as an Expression of Bayesian Rationality - Aumann - 1986 |
Citation Context: ..., $a_{-i}$); $\forall a \in A$, $\pi(a) \ge 0$; $\sum_{a \in A} \pi(a) = 1$. CE have several properties that make them more attractive than NE: they can lead to more efficient outcomes; they can be viewed as a Bayesian alternative to NE [Aumann, 1987]; and they are easier to compute than NE (whose computation is PPAD-complete [Daskalakis et al., 2009]). Finally, there exist natural algorithms that allow players of a game to converge to a CE [Foster and Vohra, 19...

314 | Large margin rank boundaries for ordinal regression - Herbrich, Obermayer, et al. - 2000 |
Citation Context: ... score for each model that minimizes some loss-function L between G and y on B. Many approaches have been developed for the task of learning to rank, especially in IR tasks like document retrieval [Herbrich et al., 2000] and web-search [Joachims, 2002]. These tasks differ in their choice of the loss function L and the algorithms used to minimize it. While initial approaches to ranking approached the ranking problem ...

260 | Approximating probabilistic inference in Bayesian belief networks is NP-hard - Dagum, Luby - 1993 |
Citation Context: ... in Eq. 3.4 is straightforward for any given configuration of the random variables. Computing the partition function in Eq. 3.5, on the other hand, is computationally intractable in the general case [Dagum and Chavez, 1993] because it involves summing over every state. However, a number of rigorous approximation algorithms have been devised for performing inference in MRFs. Significantly, it has been shown that math...

248 | Updating quasi-Newton matrices with limited storage - Nocedal - 1980 |

235 | The complexity of computing a Nash equilibrium - Daskalakis, Goldberg, et al. |

230 | The logic of animal conflict - Smith, J, et al. - 1973 |

227 | Zur Theorie der Gesellschaftsspiele - Neumann - 1928 |

181 | Effective energy function for proteins in solution - Lazaridis, Karplus - 1999 |

159 | Design of a novel globular protein fold with atomic-level accuracy - Kuhlman, Dantas, et al. - 2003 |

145 | Learning to rank: from pairwise approach to listwise approach - Cao, Qin, et al. - 2007 |
Citation Context: ... pair-wise classifications [Herbrich et al., 2000, Joachims, 2002], recent approaches have shown the utility of using loss-functions based on the entire rank, or the so-called “list-wise” approaches [Cao et al., 2007, Xia et al., 2008]. Further, a “soft” approach to ranking [Burges et al., 2005] has allowed the use of gradient-based continuous optimization techniques instead of combinatorial optimization. We use ...

128 | Tertiary templates for proteins — use of packing criteria in the enumeration of allowed sequences for different structural classes - Ponder, Richards - 1987 |

123 | Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations - Guerois, Nielsen, et al. - 2002 |

121 | On the cut polytope - Barahona, Mahjoub - 1986 |
Citation Context: ...ersal must be different from at the beginning, a contradiction). The advantage of these constraints is that their violation can be detected and a violated constraint identified in graphs in polytime [Barahona and Mahjoub, 1986] by computing shortest paths in a related graph. Violated cycle inequalities are incorporated incrementally into the constraint set and the LP is re-solved until no more violations occur. Sontag and ...

118 | The weighted histogram analysis method for free energy calculations on biomolecules. 1. The method - Kumar - 1992 |

113 | Evolutionarily Conserved Pathways of Energetic Connectivity in Protein Families - Lockless, Ranganathan - 1999 |
Citation Context: ...osteric protein. The domain and its members have been studied extensively in multiple studies, using a wide range of techniques, from computational approaches based on statistical coupling [Lockless and Ranganathan, 1999] and Molecular Dynamics simulations [Dhulesia et al., 2008] to NMR-based experimental studies [Fuentes et al., 2004]. We use the MSA from Lockless and Ranganathan [1999]. The MSA is an alignment ...

110 | Efficiency of pseudo-likelihood estimation for simple Gaussian fields. Biometrika 64 - Besag - 1977 |

106 | Efficient structure learning of Markov networks using L1-regularization. NIPS - Lee, Ganapathi, et al. - 2006 |

91 | A simple physical model for binding energy hot spots in protein–protein complexes - Kortemme, Baker - 2002 |
Citation Context: ...al energy of those atoms as defined by a molecular force field. In theory, any molecular force field can be used. We specifically use the ROSETTA potential $E_{\text{Rosetta}}$ that ROSETTA uses in computing $\Delta\Delta G$ [Kortemme and Baker, 2002], which is composed of the following terms: • $E_{\text{ljatr}}$, $E_{\text{ljrep}}$, the attractive and repulsive parts of a 6-12 Lennard-Jones potential used to model van der Waals interactions. • $E_{\text{sol}}$, the Lazaridis-Karp...

89 | An orientation-dependent hydrogen bonding potential improves prediction of specificity and structure for proteins and protein-protein complexes - Kortemme, AV, et al. |

86 | Correlated mutations and residue contacts in proteins - Göbel - 1994 |
Citation Context: ...of much interest due to its wide utility. Much of the early work focused on detecting such pairs in order to predict contacts in a protein in the absence of a solved structure [Altschuh et al., 1988, Göbel et al., 1994] and to perform fold recognition. The pioneering work of Lockless and Ranganathan [1999] used an approach to determine probabilistic dependencies they call SCA and observed that analyzing such patter...

76 | High-dimensional graphical model selection using ℓ1-regularized logistic regression - Wainwright, Ravikumar, et al. - 2007 |
Citation Context: ...seudolikelihood converge to the true parameters. 6.3.2 L1 Regularization. The study of convex approximations to the complexity and goodness of fit metrics has received considerable attention recently [Wainwright et al., 2007, Lee et al., 2007b, Hofling and Tibshirani, 2009, Schmidt et al., 2008]. Of these, those based on L1 regularization are the most interesting because of their strong theoretical guarantees. In particu...

76 | Asparagine and glutamine: Using hydrogen atom contacts in the choice of side-chain amide orientation - Word, Lovell, et al. - 1999 |

70 | Statistical potentials extracted from protein structures: how accurate are they? - Thomas, Dill - 1996 |
Citation Context: ...n take hours to days on real proteins, making them infeasible for the task of in-silico Protein Structure Prediction. Faster coarse-grained methods exist, e.g., Muegge [2006], but it has been argued [Thomas and Dill, 1994] that they are not accurate enough. In contrast to these techniques, we estimate the integral for each $b_i$ by first discretizing $r_i$ and then performing approximate inference on a discrete Markov Rando...

68 | A Theory of Cooperative Phenomena - Kikuchi - 1951 |

61 | Approximate inference and protein-folding - Yanover, Weiss - 2002 |
Citation Context: ...ds [1987], McGregor et al. [1987]. While a common use of such rotamer libraries is in performing side-chain placement, i.e. finding the single most energetically favorable side-chain conformation $x_r$ [Yanover and Weiss, 2002, Xu, 2005, Kingsford et al., 2005, Canutescu et al., 2003], these rotamer libraries have also been used in computing free energies and conformational entropies of protein structures [Koehl and Delaru...

60 | A new look at the statistical model identification. Automatic Control - Akaike |
Citation Context: ...an Information Criterion (BIC) [Schwarz, 1978], is used to select parsimonious models and is known to be asymptotically consistent in selecting the true model. The Akaike Information Criterion (AIC) [Akaike, 2003] typically selects denser models than the BIC, but is known to be asymptotically consistent in selecting the model with lowest predictive error (risk). In general, however, they do not select the sam...

59 | Predicting protein structure using hidden Markov models. Proteins 1(Suppl.):134–139 - Karplus, Sjolander, et al. - 1997 |

57 | Application of a self-consistent mean field theory to predict protein side-chain’s conformation and estimate their conformational entropy - Koehl, Delarue - 1994 |
Citation Context: ... and Weiss, 2002, Xu, 2005, Kingsford et al., 2005, Canutescu et al., 2003], these rotamer libraries have also been used in computing free energies and conformational entropies of protein structures [Koehl and Delarue, 1994, Kamisetty et al., 2007, 2008, Lilien et al., 2005]. This approach of using a set of discrete rotameric states to compute the entropy faces a subtle problem. To understand this, let us consider an im...

57 | Theory of Games and Economic Behavior (Princeton Univ - Neumann - 1947 |

53 | Evolutionary Information for Specifying a Protein Fold - Socolich - 2005 |
Citation Context: ...etermine probabilistic dependencies they call SCA and observed that analyzing such patterns could provide insights into the allosteric behavior of the proteins and be used to design new sequences [Socolich et al., 2005]. Others have since developed similar methods [Fatakia et al., 2009, Fodor and Aldrich, 2004, Fuchs et al., 2007]. By focusing on co-variation or probabilistic dependencies between residues, such met...