Results 11-20 of 74
Efficient reconstruction of haplotype structure via perfect phylogeny
Journal of Bioinformatics and Computational Biology, 2003
Abstract

Cited by 68 (10 self)
Each person’s genome contains two copies of each chromosome, one inherited from the father and the other from the mother. A person’s genotype specifies the pair of bases at each site, but does not specify which base occurs on which chromosome. The sequence of each chromosome separately is called a haplotype. The determination of the haplotypes within a population is essential for understanding genetic variation and the inheritance of complex diseases. The haplotype mapping project, a successor to the human genome project, seeks to determine the common haplotypes in the human population. Since experimental determination of a person’s genotype is less expensive than determining its component haplotypes, algorithms are required for computing haplotypes from genotypes. Two observations aid in this process: first, the human genome contains short blocks within which only a few different haplotypes occur; second, as suggested by Gusfield, it is reasonable to assume that the haplotypes observed within a block have evolved according to a perfect phylogeny, in which at most one mutation event has occurred at any site and no recombination occurred at the given region. We present a simple and efficient polynomial-time algorithm for inferring haplotypes from the genotypes of a set of individuals assuming a perfect phylogeny. Using a reduction to 2SAT we extend this algorithm to handle constraints that apply when we have genotypes from both parents and child. We also present a hardness result for the problem of removing the minimum number of individuals from a population to ensure that the genotypes of the remaining individuals are consistent with a perfect phylogeny. Our algorithms have been tested on real data and give biologically meaningful results. Our webserver
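The 2SAT reduction mentioned in this abstract can be decided in linear time with the classical implication-graph construction of Aspvall, Plass and Tarjan. The sketch below illustrates only that generic technique, not the authors' haplotyping algorithm; all function and variable names are ours:

```python
import sys
from collections import defaultdict

def two_sat(n, clauses):
    """Satisfiability of a 2-CNF over variables 1..n.
    Literals are +v / -v; each clause is a pair of literals.
    Returns an assignment {v: bool}, or None if unsatisfiable."""
    graph = defaultdict(list)
    def node(lit):                        # literal -> implication-graph node
        return 2 * abs(lit) + (lit < 0)
    for a, b in clauses:                  # (a or b) yields two implications
        graph[node(-a)].append(node(b))   # not-a implies b
        graph[node(-b)].append(node(a))   # not-b implies a

    # Tarjan's SCC algorithm (recursive; fine for small sketches).
    sys.setrecursionlimit(10000)
    index, low, comp = {}, {}, {}
    stack, on_stack = [], set()
    counters = [0, 0]                     # next DFS index, next component id
    def dfs(v):
        index[v] = low[v] = counters[0]; counters[0] += 1
        stack.append(v); on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                dfs(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:            # v roots a strongly connected comp.
            while True:
                w = stack.pop(); on_stack.discard(w)
                comp[w] = counters[1]
                if w == v:
                    break
            counters[1] += 1

    for u in range(2, 2 * n + 2):
        if u not in index:
            dfs(u)

    result = {}
    for v in range(1, n + 1):
        if comp[2 * v] == comp[2 * v + 1]:
            return None                   # v equivalent to not-v: unsat
        # Tarjan numbers sink components first, so the literal whose
        # component has the smaller id can safely be set true.
        result[v] = comp[2 * v] < comp[2 * v + 1]
    return result
```

A formula is unsatisfiable exactly when some variable and its negation land in the same strongly connected component; otherwise the component ordering yields a model directly.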
UnitWalk: A new SAT solver that uses local search guided by unit clause elimination
2002
Abstract

Cited by 63 (1 self)
In this paper we present a new randomized algorithm for SAT, i.e., the satisfiability problem for Boolean formulas in conjunctive normal form. Despite its simplicity, this algorithm performs well on many common benchmarks ranging from graph coloring problems to microprocessor verification.
Learning to reason
Journal of the ACM, 1994
Abstract

Cited by 57 (24 self)
We introduce a new framework for the study of reasoning. The Learning (in order) to Reason approach developed here views learning as an integral part of the inference process, and suggests that learning and reasoning should be studied together. The Learning to Reason framework combines the interfaces to the world used by known learning models with the reasoning task and a performance criterion suitable for it. In this framework, the intelligent agent is given access to its favorite learning interface, and is also given a grace period in which it can interact with this interface and construct a representation KB of the world W. The reasoning performance is measured only after this period, when the agent is presented with queries from some query language, relevant to the world, and has to answer whether W implies the query. The approach is meant to overcome the main computational difficulties in the traditional treatment of reasoning, which stem from its separation from the “world”. Since the agent interacts with the world when constructing its knowledge representation, it can choose a representation that is useful for the task at hand. Moreover, we can now make explicit the dependence of the reasoning performance on the environment the agent interacts with. We show how previous results from learning theory and reasoning fit into this framework and
The comparative linguistics of knowledge representation
In Proc. of IJCAI’95, 1995
Abstract

Cited by 55 (2 self)
We develop a methodology for comparing knowledge representation formalisms in terms of their "representational succinctness," that is, their ability to express knowledge situations relatively efficiently. We use this framework for comparing many important formalisms for knowledge base representation: propositional logic, default logic, circumscription, and model preference defaults; and, at a lower level, Horn formulas, characteristic models, decision trees, disjunctive normal form, and conjunctive normal form. We also show that adding new variables improves the effective expressibility of certain knowledge representation formalisms.
Unsatisfied variables in local search
Hybrid Problems, Hybrid Solutions, Tenth Biennial Conference on AI and Cognitive Science, IOS, 1995
Abstract

Cited by 45 (2 self)
Several local search algorithms for propositional satisfiability have been proposed which can solve hard random problems beyond the range of conventional backtracking procedures. In this paper, we explore the impact of focusing search in these procedures on the "unsatisfied variables"; that is, those variables which appear in clauses which are not yet satisfied. For random problems, we show that such a focus reduces the sensitivity to input parameters. We also observe a simple scaling law in performance. For nonrandom problems, we show that whilst this focus can improve performance, many problems remain difficult. We speculate that such problems will remain hard for local search unless constraint propagation techniques can be combined with hill-climbing.
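The "unsatisfied variable" focus described in this abstract is the core move of WalkSAT-style solvers: every flip is drawn from a clause that is currently false. A minimal sketch of that generic scheme (not the paper's exact procedure; parameter names are ours), assuming clauses are tuples of signed integers:

```python
import random

def walksat(clauses, n_vars, p_noise=0.5, max_flips=100000, seed=0):
    """Toy local search with the 'unsatisfied variable' focus: every
    flipped variable comes from a currently falsified clause.
    Returns a satisfying model {v: bool} or None if none was found."""
    rng = random.Random(seed)
    model = {v: rng.choice([True, False]) for v in range(1, n_vars + 1)}

    def sat(lit):
        return model[abs(lit)] == (lit > 0)

    for _ in range(max_flips):
        unsat = [c for c in clauses if not any(sat(l) for l in c)]
        if not unsat:
            return model                     # all clauses satisfied
        clause = rng.choice(unsat)           # focus: pick a false clause
        if rng.random() < p_noise:           # noise step: random walk
            v = abs(rng.choice(clause))
        else:                                # greedy step: minimize the
            def cost(v):                     # falsified clauses after flip
                model[v] = not model[v]
                c = sum(not any(sat(l) for l in cl) for cl in clauses)
                model[v] = not model[v]
                return c
            v = min((abs(l) for l in clause), key=cost)
        model[v] = not model[v]
    return None
```

Restricting flips to falsified clauses is what makes the search "focused"; the noise parameter trades off greediness against escaping local minima.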
Large scale reconstruction of haplotypes from genotype data
In Proc. RECOMB’03, 2003
Abstract

Cited by 45 (2 self)
Critical to the understanding of the genetic basis for complex diseases is the modeling of human variation. Most of this variation can be characterized by single nucleotide polymorphisms (SNPs), which are mutations at a single nucleotide position. To characterize an individual’s variation, we must determine an individual’s haplotype, i.e., which nucleotide base occurs at each position of these common SNPs for each chromosome. In this paper, we present results for a highly accurate method for haplotype resolution from genotype data. Our method leverages a new insight into the underlying structure of haplotypes which shows that SNPs are organized in highly correlated “blocks”. The majority of individuals have one of about four common haplotypes in each block. Our method partitions the SNPs into blocks and, for each block, we predict the common haplotypes and each individual’s haplotype. We evaluate our method over biological data. Our method predicts the common haplotypes perfectly and has a very low error rate (0.47%) when taking into account the predictions for the uncommon haplotypes. Our method is extremely efficient compared to previous methods (a matter of seconds where previous methods needed hours). Its efficiency allows us to find the block partition of the haplotypes, to cope with missing data, and to work with large data sets such as genotypes for thousands of SNPs for hundreds of individuals. The algorithm is available via webserver
Using Deep Structure to Locate Hard Problems
1992
Abstract

Cited by 41 (3 self)
One usually writes A.I. programs to be used on a range of examples which, although similar in kind, differ in detail. This paper shows how to predict where, in a space of problem instances, the hardest problems are to be found and where the fluctuations in difficulty are greatest. Our key insight is to shift emphasis from modelling sophisticated algorithms directly to modelling a search space which captures their principal effects. This allows us to analyze complex A.I. problems in a simple and intuitive way. We present a sample analysis, compare our model's quantitative predictions with data obtained independently, and describe how to exploit the results to estimate the value of preprocessing. Finally, we circumscribe the kinds of problems to which the methodology is suited.

Introduction
The qualitative existence of abrupt changes in computational cost has been predicted theoretically in (Purdom 1983, Franco & Paull 1983, Huberman & Hogg 1987) and observed empirically in (Pa...
Backbone Fragility and the Local Search Cost Peak
Journal of Artificial Intelligence Research, 2000
Abstract

Cited by 39 (3 self)
The local search algorithm WSat is one of the most successful algorithms for solving the satisfiability (SAT) problem. It is notably effective at solving hard Random 3SAT instances near the so-called `satisfiability threshold', but still shows a peak in search cost near the threshold and large variations in cost over different instances. We make a number of significant contributions to the analysis of WSat on high-cost random instances, using the recently introduced concept of the backbone of a SAT instance. The backbone is the set of literals which are entailed by an instance. We find that the number of solutions predicts the cost well for small-backbone instances but is much less relevant for the large-backbone instances which appear near the threshold and dominate in the overconstrained region. We show a very strong correlation between search cost and the Hamming distance to the nearest solution early in WSat's search. This pattern leads us to introduce a measure of the ba...
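The backbone, the set of literals entailed by the instance, can be computed for toy instances by simply enumerating all models. A brute-force sketch (our own illustration, exponential in the number of variables and unrelated to how large studies compute it):

```python
from itertools import product

def backbone(clauses, n_vars):
    """Backbone of a CNF formula: the literals true in *every* model.
    Brute force over all 2**n_vars assignments -- small instances only."""
    def sat(model, lit):                  # model is a tuple of bools
        return model[abs(lit) - 1] == (lit > 0)
    models = [m for m in product([False, True], repeat=n_vars)
              if all(any(sat(m, l) for l in c) for c in clauses)]
    if not models:
        return None                       # unsatisfiable: backbone undefined
    bb = set()
    for v in range(1, n_vars + 1):
        vals = {m[v - 1] for m in models}
        if vals == {True}:
            bb.add(v)                     # v entailed by the formula
        elif vals == {False}:
            bb.add(-v)                    # not-v entailed by the formula
    return bb
```

An instance with few models tends to have a large backbone, which is why backbone size and solution count interact in the cost analysis above.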
Learning to Reason with a Restricted View
1998
Abstract

Cited by 31 (15 self)
The Learning to Reason framework combines the study of Learning and Reasoning into a single task. Within it, learning is done specifically for the purpose of reasoning with the learned knowledge. Computational considerations show that this is a useful paradigm; in some cases, learning and reasoning problems that are intractable when studied separately become tractable when performed as a task of Learning to Reason. In this paper we study Learning to Reason problems where the interaction with the world supplies the learner with only partial information, in the form of partial assignments. Several natural interpretations of partial assignments are considered, and learning and reasoning algorithms using these are developed. The results presented exhibit a tradeoff between learnability, the strength of the oracles used in the interface, and the range of reasoning queries the learner is guaranteed to answer correctly.
A new approach to model counting
In 8th SAT, volume 3569 of LNCS, 2005
Abstract

Cited by 25 (7 self)
We introduce ApproxCount, an algorithm that approximates the number of satisfying assignments, or models, of a formula in propositional logic. Many AI tasks, such as calculating degree of belief and reasoning in Bayesian networks, are computationally equivalent to model counting. It has been shown that model counting in even the most restrictive logics, such as Horn logic, monotone CNF and 2CNF, is intractable in the worst case. Moreover, even approximate model counting remains a worst-case intractable problem. So far, most practical model counting algorithms are based on backtrack-style algorithms such as the DPLL procedure. These algorithms typically yield exact counts but are limited to relatively small formulas. Our ApproxCount algorithm is based on SampleSat, a new algorithm that samples from the solution space of a propositional logic formula near-uniformly. We provide experimental results for formulas from a variety of domains. The algorithm produces good estimates for formulas much larger than those that can be handled by existing algorithms.
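For contrast with sampling-based approximation, the exact counts produced by DPLL-style counters can be sketched with a toy splitting procedure (our own illustration, not ApproxCount or any production counter):

```python
def count_models(clauses, n_vars):
    """Exact #SAT by splitting on variables (a toy DPLL-style counter).
    Clauses are tuples of signed integers over variables 1..n_vars.
    Returns the number of satisfying assignments."""
    def simplify(cls, lit):
        """Condition the clause set on literal lit being true."""
        out = []
        for c in cls:
            if lit in c:
                continue                  # clause satisfied: drop it
            reduced = tuple(l for l in c if l != -lit)
            if not reduced:
                return None               # empty clause: contradiction
            out.append(reduced)
        return out

    def go(cls, free):
        if cls is None:
            return 0                      # this branch is contradictory
        if not cls:
            return 2 ** free              # remaining vars are unconstrained
        v = abs(cls[0][0])                # branch on an occurring variable
        return (go(simplify(cls, v), free - 1)
                + go(simplify(cls, -v), free - 1))

    return go(list(clauses), n_vars)
```

Branching doubles the work in the worst case, which is exactly the scaling limit that motivates near-uniform sampling approaches like the one described above.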