Results 1–10 of 119
A tutorial introduction to the minimum description length principle
 In Advances in Minimum Description Length: Theory and Applications, 2005
Sampled traffic analysis by Internet-exchange-level adversaries
 In Privacy Enhancing Technologies (PET), LNCS, 2007
Abstract

Cited by 74 (4 self)
Abstract. Existing low-latency anonymity networks are vulnerable to traffic analysis, so location diversity of nodes is essential to defend against attacks. Previous work has shown that simply ensuring geographical diversity of nodes does not resist, and in some cases exacerbates, the risk of traffic analysis by ISPs. Ensuring high autonomous-system (AS) diversity can resist this weakness. However, ISPs commonly connect to many other ISPs in a single location, known as an Internet eXchange (IX). This paper shows that IXes are a single point where traffic analysis can be performed. We examine to what extent this is true, through a case study of Tor nodes in the UK. Also, some IXes sample packets flowing through them for performance analysis reasons, and this data could be exploited to de-anonymize traffic. We then develop and evaluate Bayesian traffic analysis techniques capable of processing this sampled data.
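The sampled-data inference the abstract describes can be conveyed with a toy sketch (our construction, not the paper's actual model): if an IX samples each packet independently with probability p, the count observed from a flow of n packets is Binomial(n, p), and Bayes' theorem ranks candidate flows by how well they explain the observed sample. All flow names, sizes, and rates below are invented.

```python
# Toy Bayesian ranking of candidate flows from a sampled packet count.
from math import comb

def binom_pmf(k, n, p):
    """P(k sampled packets | flow of n packets, sampling rate p)."""
    if k > n:
        return 0.0
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def posterior(observed_k, candidates, p):
    """candidates maps flow id -> true packet count; returns the posterior
    over which flow produced the sampled count, under a uniform prior."""
    prior = 1.0 / len(candidates)
    unnorm = {c: prior * binom_pmf(observed_k, n, p)
              for c, n in candidates.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}

# Three hypothetical flows; the IX samples 1 packet in 100.
flows = {"flow_a": 5000, "flow_b": 10000, "flow_c": 20000}
post = posterior(observed_k=110, candidates=flows, p=0.01)
# flow_b (expected sampled count 100) best explains the 110 observed packets.
```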
Lattice duality: The origin of probability and entropy
 In press: Neurocomputing, 2005
Abstract

Cited by 21 (7 self)
Bayesian probability theory is an inference calculus, which originates from a generalization of inclusion on the Boolean lattice of logical assertions to a degree of inclusion represented by a real number. Dual to this lattice is the distributive lattice of questions constructed from the ordered set of down-sets of assertions, which forms the foundation of the calculus of inquiry—a generalization of information theory. In this paper we introduce this novel perspective on these spaces in which machine learning is performed and discuss the relationship between these results and several proposed generalizations of information theory in the literature.
Reasoning about trust using argumentation: A position paper
 In Proceedings of the Workshop on Argumentation in Multiagent Systems, 2010
Abstract

Cited by 16 (12 self)
Abstract. Trust is a mechanism for managing the uncertainty about autonomous entities and the information they store, and so can play an important role in any decentralized system. As a result, trust has been widely studied in multiagent systems and related fields such as the semantic web. Managing information about trust involves inference with uncertain information, decision making, and dealing with commitments and the provenance of information, all areas to which systems of argumentation have been applied. Here we discuss the application of argumentation to reasoning about trust, identifying some of the components that an argumentation-based system for reasoning about trust would need to contain and sketching the work that would be required to provide such a system.
A Philosophical Treatise of Universal Induction
 In Entropy, 2011
Abstract

Cited by 16 (11 self)
Understanding inductive reasoning is a problem that has engaged mankind for thousands of years. This problem is relevant to a wide range of fields and is integral to the philosophy of science. It has been tackled by many great minds ranging from philosophers to scientists to mathematicians, and more recently computer scientists. In this article we argue the case for Solomonoff Induction, a formal inductive framework which combines algorithmic information theory with the Bayesian framework. Although it achieves excellent theoretical results and is based on solid philosophical foundations, the requisite technical knowledge necessary for understanding this framework has caused it to remain largely unknown and unappreciated in the wider scientific community. The main contribution of this article is to convey Solomonoff induction and its related concepts in a generally accessible form with the aim of bridging this current technical gap. In the process we examine the major historical contributions that have led to the formulation of Solomonoff Induction as well as criticisms of Solomonoff and induction in general. In particular we examine how Solomonoff induction addresses many issues that have plagued other inductive systems, such as the black ravens paradox and the confirmation problem, and compare this approach with other recent approaches.
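Since the full Solomonoff construction is incomputable, a toy restriction conveys its flavor: here hypotheses are repeating bit patterns rather than programs, each weighted 2^-length as in the universal prior, and prediction mixes every hypothesis consistent with the data. The pattern class is our simplification for illustration, not Solomonoff's definition.

```python
# Toy Solomonoff-style predictor: a prior-weighted mixture over simple
# hypotheses (repeating bit patterns), favoring short (simple) explanations.
from itertools import product

def hypotheses(max_len):
    """All repeating patterns up to max_len bits, with prior weight 2^-len."""
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits), 2.0 ** -n

def predict_next(observed, max_len=8):
    """P(next bit = '1' | observed), mixing all consistent pattern-hypotheses."""
    w1 = total = 0.0
    for pat, w in hypotheses(max_len):
        ext = pat * (len(observed) // len(pat) + 2)
        if ext.startswith(observed):          # hypothesis consistent with data
            total += w
            if ext[len(observed)] == "1":
                w1 += w
    return w1 / total

# After seeing "010101", short patterns like "01" dominate the mixture,
# so the predictor strongly expects "0" next.
p1 = predict_next("010101")
```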
Bayes in the sky: Bayesian inference and model selection in cosmology
 In Contemp. Phys.
Abstract

Cited by 16 (4 self)
The application of Bayesian methods in cosmology and astrophysics has flourished over the past decade, spurred by data sets of increasing size and complexity. In many respects, Bayesian methods have proven to be vastly superior to more traditional statistical tools, offering the advantage of higher efficiency and of a consistent conceptual basis for dealing with the problem of induction in the presence of uncertainty. This trend is likely to continue in the future, when the way we collect, manipulate and analyse observations and compare them with theoretical models will assume an even more central role in cosmology. This review is an introduction to Bayesian methods in cosmology and astrophysics and recent results in the field. I first present Bayesian probability theory and its conceptual underpinnings, Bayes' Theorem and the role of priors. I discuss the problem of parameter inference and its general solution, along with numerical techniques such as Markov Chain Monte Carlo methods. I then review the theory and application of Bayesian model comparison, discussing the notions of Bayesian evidence and effective model complexity, and how to compute and interpret those quantities. Recent developments in cosmological parameter extraction and Bayesian cosmological model building are summarized, highlighting the challenges that lie ahead.
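The MCMC parameter inference the review discusses can be sketched with a minimal Metropolis-Hastings sampler: here it samples the posterior of a Gaussian mean under a flat prior with known noise. The data, step size, and chain length are illustrative choices, not anything from the review itself.

```python
# Minimal Metropolis-Hastings sampler for the mean of Gaussian data.
import math
import random

random.seed(0)
data = [2.1, 1.9, 2.3, 2.0, 1.8, 2.2]   # toy observations
sigma = 0.2                              # assumed known measurement noise

def log_posterior(mu):
    # Flat prior, so the posterior is proportional to the Gaussian likelihood.
    return -sum((x - mu) ** 2 for x in data) / (2 * sigma ** 2)

def metropolis(n_steps=20000, step=0.1, mu0=0.0):
    mu, logp = mu0, log_posterior(mu0)
    chain = []
    for _ in range(n_steps):
        prop = mu + random.gauss(0.0, step)          # symmetric proposal
        logp_prop = log_posterior(prop)
        # Accept with probability min(1, posterior ratio).
        if random.random() < math.exp(min(0.0, logp_prop - logp)):
            mu, logp = prop, logp_prop
        chain.append(mu)
    return chain[n_steps // 2:]                      # discard burn-in

chain = metropolis()
post_mean = sum(chain) / len(chain)   # close to the sample mean, 2.05
```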
Simple and Efficient Clause Subsumption with Feature Vector Indexing
 In Proc. of the IJCAR-2004 Workshop on Empirically Successful First-Order Theorem Proving
Abstract

Cited by 12 (5 self)
Abstract. This paper describes feature vector indexing, a new, non-perfect indexing method for clause subsumption. It is suitable for both forward (i.e., finding a subsuming clause in a set) and backward (finding all subsumed clauses in a set) subsumption. Moreover, it is easy to implement, but still yields excellent performance in practice. As an added benefit, by restricting the selection of features used in the index, our technique immediately adapts to indexing modulo arbitrary AC theories with only minor loss of efficiency. Alternatively, the feature selection can be restricted to result in set subsumption. Feature vector indexing has been implemented in our equational theorem prover E, and has enabled us to integrate new simplification techniques making heavy use of subsumption. We experimentally compare the performance of the prover for a number of strategies using feature vector indexing and conventional sequential subsumption.
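The filtering idea behind feature vector indexing can be sketched briefly: if clause C subsumes clause D, every indexed symbol occurs in C at most as often as in D, so comparing fixed-length count vectors cheaply rejects most candidate pairs before the expensive full subsumption test. The string-based clause encoding and symbol list below are simplifications for illustration, not E's implementation.

```python
# Feature-vector prefilter for clause subsumption (necessary condition only).

FEATURES = ["p", "q", "f", "a"]   # symbols whose occurrence counts we index

def feature_vector(clause):
    """Count occurrences of each indexed symbol across the clause's literals."""
    text = " ".join(clause)
    return tuple(text.count(sym) for sym in FEATURES)

def may_subsume(fv_c, fv_d):
    """C can subsume D only if fv_c <= fv_d pointwise; a full test must follow."""
    return all(c <= d for c, d in zip(fv_c, fv_d))

c = ["p(f(a))"]                   # candidate subsumer
d = ["p(f(a))", "q(a)"]           # passes the filter; full test still needed
e = ["q(a)"]                      # lacks p and f, so it is rejected immediately
```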
Predicting and understanding the stability of G-quadruplexes
 In Bioinformatics, 2009
Abstract

Cited by 7 (1 self)
Motivation: G-quadruplexes are stable four-stranded guanine-rich structures that can form in DNA and RNA. They are an important component of human telomeres and play a role in the regulation of transcription and translation. The biological significance of a G-quadruplex is crucially linked with its thermodynamic stability. Hence the prediction of G-quadruplex stability is of vital interest. Results: In this paper we present a novel Bayesian prediction framework based on Gaussian process regression to determine the thermodynamic stability of previously unmeasured G-quadruplexes from the sequence information alone. We benchmark our approach on a large G-quadruplex dataset and compare our method to alternative approaches. Furthermore we propose an active learning procedure which can be used to iteratively acquire data in an optimal fashion. Lastly, we demonstrate the usefulness of our procedure on a genome-wide study of quadruplexes in the human genome. Availability: A data table with the training sequences is available as supplementary material. Source code is available online. Contact:
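A miniature Gaussian process regression, in the spirit of the framework the abstract describes, can be written in a few lines: an RBF kernel over a single made-up "sequence feature" stands in for real G-quadruplex features, and the posterior mean is obtained by solving (K + noise·I)·alpha = y. All data values here are invented.

```python
# Tiny GP regression: RBF kernel, exact posterior mean via a dense solve.
import math

def rbf(a, b, length=1.0):
    """Squared-exponential (RBF) kernel on scalars."""
    return math.exp(-((a - b) ** 2) / (2 * length ** 2))

def solve(A, b):
    """Gaussian elimination with partial pivoting for small dense systems."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def gp_predict(xs, ys, x_new, noise=1e-6):
    """Posterior mean at x_new: k(x_new, X) . (K + noise*I)^-1 y."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    alpha = solve(K, ys)
    return sum(rbf(x_new, xi) * ai for xi, ai in zip(xs, alpha))

xs, ys = [0.0, 1.0, 2.0], [60.0, 65.0, 63.0]   # feature -> stability (toy)
t_pred = gp_predict(xs, ys, 1.0)               # near 65.0 at a training point
```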
Computational Energybased Redesign of Robust Proteins
2010
Abstract

Cited by 6 (3 self)
The robustness of a system is a property that pervades all aspects of Nature. The ability of a system to adapt itself to perturbations due to internal and external agents, to aging, to wear, to environmental changes is one of the driving forces of evolution and a fundamental design principle. At the molecular level, understanding the robustness of a protein has a great impact on the in silico design of polypeptide chains and drugs; the chance of computationally checking the ability of a protein to preserve its structure and function in the native state can lead to the design of new compounds that can work in a living cell more effectively. Inspired by the well-known robustness analysis framework used in Electronic Design Automation, we introduced a notion of robustness for proteins and two dimensionless quantities: the energetic robustness and the energetic relative entropy. We used the energetic robustness in order to quantify the yield of a protein in terms of potential energy, and to detect sensitive regions and sensitive residues in the protein, whereas we adopted the energetic relative entropy to measure the discrepancy between two potential energy distributions. Subsequently, we implemented a new robustness-centered protein design algorithm called RobustProteinDesign (RPD); the aim of the algorithm is to discover new conformations with a specific function and with high robustness values. We performed an extensive characterization of the robustness property of many peptides, proteins, and drugs. Moreover, we found that robustness and relative entropy are conflicting objectives which constitute a tradeoff useful as design principle for new proteins and drugs. Finally, we used the RPD algorithm on the Crambin protein (1CRN); the obtained results confirm that the algorithm was able to find a Crambin-like protein that is 23% more robust than the wild type.
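The "energetic relative entropy" the abstract mentions measures discrepancy between two potential-energy distributions; assuming it is the standard Kullback-Leibler divergence over binned energies (our reading, not the paper's definition), a minimal computation looks like this. The histograms are invented.

```python
# Discrete KL divergence between two binned energy distributions.
import math

def kl_divergence(p, q, eps=1e-12):
    """D(P||Q) = sum p_i log(p_i / q_i) over energy bins, in nats."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

# Binned potential-energy histograms of a wild type and a mutant (toy numbers).
wild_type = [0.1, 0.4, 0.4, 0.1]
mutant    = [0.2, 0.3, 0.3, 0.2]
d = kl_divergence(wild_type, mutant)   # 0 iff the distributions coincide
```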
Discovering Morphological Paradigms from Plain Text Using a Dirichlet Process Mixture Model
Abstract

Cited by 6 (1 self)
We present an inference algorithm that organizes observed words (tokens) into structured inflectional paradigms (types). It also naturally predicts the spelling of unobserved forms that are missing from these paradigms, and discovers inflectional principles (grammar) that generalize to wholly unobserved words. Our Bayesian generative model of the data explicitly represents tokens, types, inflections, paradigms, and locally conditioned string edits. It assumes that inflected word tokens are generated from an infinite mixture of inflectional paradigms (string tuples). Each paradigm is sampled all at once from a graphical model, whose potential functions are weighted finite-state transducers with language-specific parameters to be learned. These assumptions naturally lead to an elegant empirical Bayes inference procedure that exploits Monte Carlo EM, belief propagation, and dynamic programming. Given 50–100 seed paradigms, adding a 10-million-word corpus reduces prediction error for morphological inflections by up to 10%.
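The infinite mixture over paradigms in this abstract is a Dirichlet process; a common way to sample cluster assignments from its marginal is the Chinese Restaurant Process, sketched here. The token count and concentration parameter are illustrative, and the paper's actual model layers transducer-based likelihoods on top of this prior.

```python
# Chinese Restaurant Process: sequential cluster assignment under a DP prior.
import random

random.seed(1)

def crp_assign(n_tokens, alpha=1.0):
    """Seat tokens one by one: join cluster k with prob count_k/(i+alpha),
    or open a new cluster with prob alpha/(i+alpha)."""
    counts = []                       # tokens per cluster ("table")
    assignment = []
    for i in range(n_tokens):
        weights = counts + [alpha]    # existing clusters, then a new one
        r = random.random() * (i + alpha)
        for k, w in enumerate(weights):
            r -= w
            if r <= 0:
                break
        if k == len(counts):
            counts.append(1)          # open a new paradigm cluster
        else:
            counts[k] += 1
        assignment.append(k)
    return assignment, counts

assignment, counts = crp_assign(100)
```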