A tutorial introduction to the minimum description length principle
 in Advances in Minimum Description Length: Theory and Applications. 2005
Sampled traffic analysis by Internet-exchange-level adversaries
 In Privacy Enhancing Technologies (PET), LNCS
, 2007

Cited by 78 (4 self)
Existing low-latency anonymity networks are vulnerable to traffic analysis, so location diversity of nodes is essential to defend against attacks. Previous work has shown that simply ensuring geographical diversity of nodes does not resist, and in some cases exacerbates, the risk of traffic analysis by ISPs. Ensuring high autonomous-system (AS) diversity can resist this weakness. However, ISPs commonly connect to many other ISPs in a single location, known as an Internet eXchange (IX). This paper shows that IXes are a single point where traffic analysis can be performed. We examine to what extent this is true, through a case study of Tor nodes in the UK. Also, some IXes sample packets flowing through them for performance analysis reasons, and this data could be exploited to deanonymize traffic. We then develop and evaluate Bayesian traffic analysis techniques capable of processing this sampled data.
Bayes in the sky: Bayesian inference and model selection in cosmology
 Contemp. Phys

Cited by 55 (6 self)
The application of Bayesian methods in cosmology and astrophysics has flourished over the past decade, spurred by data sets of increasing size and complexity. In many respects, Bayesian methods have proven to be vastly superior to more traditional statistical tools, offering the advantage of higher efficiency and of a consistent conceptual basis for dealing with the problem of induction in the presence of uncertainty. This trend is likely to continue in the future, when the way we collect, manipulate and analyse observations and compare them with theoretical models will assume an even more central role in cosmology. This review is an introduction to Bayesian methods in cosmology and astrophysics and recent results in the field. I first present Bayesian probability theory and its conceptual underpinnings, Bayes' Theorem and the role of priors. I discuss the problem of parameter inference and its general solution, along with numerical techniques such as Monte Carlo Markov Chain methods. I then review the theory and application of Bayesian model comparison, discussing the notions of Bayesian evidence and effective model complexity, and how to compute and interpret those quantities. Recent developments in cosmological parameter extraction and Bayesian cosmological model building are summarized, highlighting the challenges that lie ahead.
Lattice duality: The origin of probability and entropy
 In press: Neurocomputing
, 2005

Cited by 31 (10 self)
Bayesian probability theory is an inference calculus, which originates from a generalization of inclusion on the Boolean lattice of logical assertions to a degree of inclusion represented by a real number. Dual to this lattice is the distributive lattice of questions constructed from the ordered set of downsets of assertions, which forms the foundation of the calculus of inquiry—a generalization of information theory. In this paper we introduce this novel perspective on these spaces in which machine learning is performed and discuss the relationship between these results and several proposed generalizations of information theory in the literature.
Confronting Lemaitre-Tolman-Bondi models with Observational Cosmology
, 802

Cited by 21 (0 self)
The possibility that we live in a special place in the universe, close to the centre of a large void, seems an appealing alternative to the prevailing interpretation of the acceleration of the universe in terms of a ΛCDM model with a dominant dark energy component. In this paper we confront the asymptotically flat Lemaitre-Tolman-Bondi (LTB) models with a series of observations, from Type Ia Supernovae to Cosmic Microwave Background and Baryon Acoustic Oscillations data. We propose two concrete LTB models describing a local void in which the only arbitrary functions are the radial dependence of the matter density ΩM and the Hubble expansion rate H. We find that all observations can be accommodated within 1 sigma, for our models with 4 or 5 independent parameters. The best fit models have a χ² very close to that of the ΛCDM model. A general Fortran program for comparing LTB models with cosmological observations, that has been used to make the parameter scan in this paper, is made public, and can be downloaded at
A Philosophical Treatise of Universal Induction
 Entropy 2011

Cited by 19 (14 self)
Understanding inductive reasoning is a problem that has engaged mankind for thousands of years. This problem is relevant to a wide range of fields and is integral to the philosophy of science. It has been tackled by many great minds ranging from philosophers to scientists to mathematicians, and more recently computer scientists. In this article we argue the case for Solomonoff Induction, a formal inductive framework which combines algorithmic information theory with the Bayesian framework. Although it achieves excellent theoretical results and is based on solid philosophical foundations, the requisite technical knowledge necessary for understanding this framework has caused it to remain largely unknown and unappreciated in the wider scientific community. The main contribution of this article is to convey Solomonoff induction and its related concepts in a generally accessible form with the aim of bridging this current technical gap. In the process we examine the major historical contributions that have led to the formulation of Solomonoff Induction as well as criticisms of Solomonoff and induction in general. In particular we examine how Solomonoff induction addresses many issues that have plagued other inductive systems, such as the black ravens paradox and the confirmation problem, and compare this approach with other recent approaches.
Reasoning about trust using argumentation: A position paper
 In Proceedings of the Workshop on Argumentation in Multi-Agent Systems
, 2010

Cited by 16 (12 self)
Trust is a mechanism for managing the uncertainty about autonomous entities and the information they store, and so can play an important role in any decentralized system. As a result, trust has been widely studied in multiagent systems and related fields such as the semantic web. Managing information about trust involves inference with uncertain information, decision making, and dealing with commitments and the provenance of information, all areas to which systems of argumentation have been applied. Here we discuss the application of argumentation to reasoning about trust, identifying some of the components that an argumentation-based system for reasoning about trust would need to contain and sketching the work that would be required to provide such a system.
Simple and Efficient Clause Subsumption with Feature Vector Indexing
 Proc. of the IJCAR-2004 Workshop on Empirically Successful First-Order Theorem Proving

Cited by 12 (5 self)
This paper describes feature vector indexing, a new, non-perfect indexing method for clause subsumption. It is suitable for both forward (i.e., finding a subsuming clause in a set) and backward (finding all subsumed clauses in a set) subsumption. Moreover, it is easy to implement, but still yields excellent performance in practice. As an added benefit, by restricting the selection of features used in the index, our technique immediately adapts to indexing modulo arbitrary AC theories with only minor loss of efficiency. Alternatively, the feature selection can be restricted to result in set subsumption. Feature vector indexing has been implemented in our equational theorem prover E, and has enabled us to integrate new simplification techniques making heavy use of subsumption. We experimentally compare the performance of the prover for a number of strategies using feature vector indexing and conventional sequential subsumption.
Deriving laws from ordering relations
 In press: Bayesian Inference and Maximum Entropy Methods in Science and Engineering, Jackson Hole WY
, 2003

Cited by 11 (7 self)
The effect of Richard T. Cox’s contribution to probability theory was to generalize Boolean implication among logical statements to degrees of implication, which are manipulated using rules derived from consistency with Boolean algebra. These rules are known as the sum rule, the product rule and Bayes' Theorem, and the measure resulting from this generalization is probability. In this paper, I will describe how Cox’s technique can be further generalized to include other algebras and hence other problems in science and mathematics. The result is a methodology that can be used to generalize an algebra to a calculus by relying on consistency with order theory to derive the laws of the calculus. My goals are to clear up the mysteries as to why the same basic structure found in probability theory appears in other contexts, to better understand the foundations of probability theory, and to extend these ideas to other areas by developing new mathematics and new physics. The relevance of this methodology will be demonstrated using examples from probability theory, number theory, geometry, information theory, and quantum mechanics.
Predicting and understanding the stability of G-quadruplexes
 Bioinformatics
, 2009

Cited by 11 (4 self)
Motivation: G-quadruplexes are stable four-stranded guanine-rich structures that can form in DNA and RNA. They are an important component of human telomeres and play a role in the regulation of transcription and translation. The biological significance of a G-quadruplex is crucially linked with its thermodynamic stability. Hence the prediction of G-quadruplex stability is of vital interest. Results: In this paper we present a novel Bayesian prediction framework based on Gaussian process regression to determine the thermodynamic stability of previously unmeasured G-quadruplexes from the sequence information alone. We benchmark our approach on a large G-quadruplex dataset and compare our method to alternative approaches. Furthermore we propose an active learning procedure which can be used to iteratively acquire data in an optimal fashion. Lastly, we demonstrate the usefulness of our procedure on a genome-wide study of quadruplexes in the human genome. Availability: A data table with the training sequences is available as supplementary material. Source code is available online. Contact: