Generating Hard Satisfiability Problems
Artificial Intelligence, 1996
Cited by 98 (2 self)
We report results from large-scale experiments in satisfiability testing. As has been observed by others, testing the satisfiability of random formulas often appears surprisingly easy. Here we show that by using the right distribution of instances, and appropriate parameter values, it is possible to generate random formulas that are hard, that is, for which satisfiability testing is quite difficult. Our results provide a benchmark for the evaluation of satisfiability-testing procedures. In Artificial Intelligence, 81 (1996) 17-29.
1 Introduction
Many computational tasks of interest to AI, to the extent that they can be precisely characterized at all, can be shown to be NP-hard in their most general form. However, there is fundamental disagreement, at least within the AI community, about the implications of this. It is claimed on the one hand that since the performance of algorithms designed to solve NP-hard tasks degrades rapidly with small increases in input size, something ...
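As a rough illustration of drawing random formulas from a parameterized distribution, the following Python sketch generates fixed-clause-length random k-CNF instances. The function name `random_ksat`, the DIMACS-style integer encoding, and the clause-to-variable ratio of 4.3 used in the example are assumptions made for this sketch; the abstract above does not state which distribution or parameter values the authors used.

```python
import random


def random_ksat(num_vars, num_clauses, k=3, seed=None):
    """Return a random k-CNF formula as a list of clauses.

    Each clause is a tuple of k nonzero ints (DIMACS style):
    v means variable v appears positively, -v means it is negated.
    """
    rng = random.Random(seed)
    formula = []
    for _ in range(num_clauses):
        # Pick k distinct variables, then negate each with probability 1/2.
        variables = rng.sample(range(1, num_vars + 1), k)
        clause = tuple(v if rng.random() < 0.5 else -v for v in variables)
        formula.append(clause)
    return formula


# Example: 50 variables at an assumed clause-to-variable ratio of 4.3.
formula = random_ksat(num_vars=50, num_clauses=round(4.3 * 50), seed=0)
```

Varying `num_clauses` relative to `num_vars` is the knob such experiments would typically sweep; the sketch only produces instances and does not measure how hard they are.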
Tractable Reasoning via Approximation
Artificial Intelligence, 1995
Cited by 95 (0 self)
Problems in logic are well-known to be hard to solve in the worst case. Two different strategies for dealing with this aspect are known from the literature: language restriction and theory approximation. In this paper we are concerned with the second strategy. Our main goal is to define a semantically well-founded logic for approximate reasoning, which is justifiable from the intuitive point of view, and to provide fast algorithms for dealing with it even when using expressive languages. We also want our logic to be useful to perform approximate reasoning in different contexts. We define a method for the approximation of decision reasoning problems based on multivalued logics. Our work expands and generalizes in several directions ideas presented by other researchers. The major features of our technique are: 1) approximate answers give semantically clear information about the problem at hand; 2) approximate answers are easier to compute than answers to the original problem; 3) approxim...
A Taxonomy of Complexity Classes of Functions
Journal of Computer and System Sciences, 1992
Cited by 88 (12 self)
This paper comprises a systematic comparison of several complexity classes of functions that are computed nondeterministically in polynomial time or with an oracle in NP. There are three components to this work.
• A taxonomy is presented that demonstrates all known inclusion relations of these classes. For (nearly) each inclusion that is not shown to hold, evidence is presented to indicate that the inclusion is false. As an example, consider FewPF, the class of multivalued functions that are nondeterministically computable in polynomial time such that for each $x$, there is a polynomial bound on the number of distinct output values of $f(x)$. We show that $\mathrm{FewPF} \subseteq \mathrm{PF}^{\mathrm{NP}}_{tt}$. However, we show $\mathrm{PF}^{\mathrm{NP}}_{tt} \subseteq \mathrm{FewPF}$ if and only if $\mathrm{NP} = \mathrm{coNP}$, and thus $\mathrm{PF}^{\mathrm{NP}}_{tt} \subseteq \mathrm{FewPF}$ is likely to be false.
• Whereas it is known that $\mathrm{P}^{\mathrm{NP}}(O(\log n)) = \mathrm{P}^{\mathrm{NP}}_{tt} \subseteq \mathrm{P}^{\mathrm{NP}}$ [Hem87, Wagb, BH88], we show that $\mathrm{PF}^{\mathrm{NP}}(O(\log n)) = \mathrm{PF}^{\mathrm{NP}}_{tt}$ implies $\mathrm{P} = \mathrm{FewP}$ and $\mathrm{R} = \mathrm{NP}$. Also, we show that $\mathrm{PF}^{\mathrm{NP}}_{tt}$ = PF ...
Robust Trainability of Single Neurons
1995
Cited by 85 (0 self)
It is well known that (McCulloch-Pitts) neurons are efficiently trainable to learn an unknown halfspace from examples, using linear-programming methods. We want to analyze how the learning performance degrades when the representational power of the neuron is overstrained, i.e., if more complex concepts than just halfspaces are allowed. We show that the problem of learning a probably almost optimal weight vector for a neuron is so difficult that the minimum error cannot even be approximated to within a constant factor in polynomial time (unless RP = NP); we obtain the same hardness result for several variants of this problem. We considerably strengthen these negative results for neurons with binary weights 0 or 1. We also show that neither heuristical learning nor learning by sigmoidal neurons with a constant reject rate is efficiently possible (unless RP = NP).
Unit Disk Graph Recognition is NP-Hard
Computational Geometry: Theory and Applications, 1993
Cited by 79 (1 self)
Unit disk graphs are the intersection graphs of unit diameter closed disks in the plane. This paper reduces SATISFIABILITY to the problem of recognizing unit disk graphs. Equivalently, it shows that determining if a graph has sphericity 2 or less, even if the graph is planar or is known to have sphericity at most 3, is NP-hard. We show how this reduction can be extended to 3 dimensions, thereby showing that unit sphere graph recognition, or determining if a graph has sphericity 3 or less, is also NP-hard. We conjecture that K-sphericity is NP-hard for all fixed K greater than 1.
1 Introduction
A unit disk graph is the intersection graph of a set of unit diameter closed disks in the plane. That is, each vertex corresponds to a disk in the plane, and two vertices are adjacent in the graph if the corresponding disks intersect. The set of disks is said to realize the graph. Of course, the unit of distance is not critical, since the disks realize the same graph even if the coordina...
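The definition above can be sketched in Python by building the graph realized by a given set of disk centers. Two closed disks of equal diameter intersect exactly when their centers are at most one diameter apart, which is the test used here. The function name `unit_disk_graph` and the sample coordinates are illustrative assumptions; note that this is the easy direction (disks to graph), not the NP-hard recognition problem (graph to disks).

```python
from itertools import combinations
from math import dist


def unit_disk_graph(centers, diameter=1.0):
    """Return the edge set realized by closed disks of equal diameter.

    Vertices are the indices into `centers`; (i, j) with i < j is an edge
    when the two closed disks intersect, i.e. their centers are at most
    `diameter` apart.
    """
    edges = set()
    for (i, p), (j, q) in combinations(enumerate(centers), 2):
        if dist(p, q) <= diameter:
            edges.add((i, j))
    return edges


# Hypothetical example: three disks in a row plus one far away.
centers = [(0.0, 0.0), (0.8, 0.0), (1.6, 0.0), (5.0, 5.0)]
edges = unit_disk_graph(centers)

# Scaling coordinates and diameter together realizes the same graph.
scaled = [(2 * x, 2 * y) for x, y in centers]
scaled_edges = unit_disk_graph(scaled, diameter=2.0)
```

The `diameter` parameter reflects the remark that the unit of distance is not critical: only the ratio between center distances and the diameter matters.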
Branching Rules for Satisfiability
Journal of Automated Reasoning, 1995
Cited by 78 (2 self)
Recent experience suggests that branching algorithms are among the most attractive for solving propositional satisfiability problems. A key factor in their success is the rule they use to decide on which variable to branch next. We attempt to explain and improve the performance of branching rules with an empirical model-building approach. One model is based on the rationale given for the Jeroslow-Wang rule, variations of which have performed well in recent work. The model is refuted by carefully designed computational experiments. A second model explains the success of the Jeroslow-Wang rule, makes other predictions confirmed by experiment, and leads to the design of branching rules that are clearly superior to Jeroslow-Wang. Recent computational studies [2, 7, 13, 21] suggest that branching algorithms are among the most attractive for solving the propositional satisfiability problem. An important factor in their success (perhaps the dominant factor) is the branching rule they use [...
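For readers unfamiliar with the rule named above, here is a minimal Python sketch of the commonly cited one-sided form of the Jeroslow-Wang heuristic, in which each literal is scored by the sum of 2^(-|C|) over the clauses C that contain it. The abstract does not spell out the rule, so this formulation, the function name `jeroslow_wang_literal`, and the clause encoding (tuples of signed integers) are assumptions of the sketch; it does not reproduce the improved rules the paper proposes.

```python
from collections import defaultdict


def jeroslow_wang_literal(clauses):
    """Pick a branching literal by the one-sided Jeroslow-Wang score.

    `clauses` is a non-empty list of non-empty tuples of signed ints
    (v for a positive literal, -v for a negated one). Shorter clauses
    contribute more weight, since each adds 2 ** -len(clause).
    """
    scores = defaultdict(float)
    for clause in clauses:
        weight = 2.0 ** -len(clause)
        for literal in clause:
            scores[literal] += weight
    # Branch first on the literal with the highest accumulated score.
    return max(scores, key=scores.get)


# Hypothetical formula: literal 3 occurs in three clauses, one of them short.
clauses = [(1, -2), (1, 3, 4), (-1, 2, 3), (3, -4)]
chosen = jeroslow_wang_literal(clauses)
```

In a branching solver this choice would be recomputed on the simplified formula at every node of the search tree.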
Structure Identification in Relational Data
1997
Cited by 77 (2 self)
This paper presents several investigations into the prospects for identifying meaningful structures in empirical data, namely, structures permitting effective organization of the data to meet requirements of future queries. We propose a general framework whereby the notion of identifiability is given a precise formal definition similar to that of learnability. Using this framework, we then explore if a tractable procedure exists for deciding whether a given relation is decomposable into a constraint network or a CNF theory with desirable topology and, if the answer is positive, identifying the desired decomposition. Finally, we
Efficient reconstruction of haplotype structure via perfect phylogeny
Journal of Bioinformatics and Computational Biology, 2003
Cited by 73 (11 self)
Each person’s genome contains two copies of each chromosome, one inherited from the father and the other from the mother. A person’s genotype specifies the pair of bases at each site, but does not specify which base occurs on which chromosome. The sequence of each chromosome separately is called a haplotype. The determination of the haplotypes within a population is essential for understanding genetic variation and the inheritance of complex diseases. The haplotype mapping project, a successor to the human genome project, seeks to determine the common haplotypes in the human population. Since experimental determination of a person’s genotype is less expensive than determining its component haplotypes, algorithms are required for computing haplotypes from genotypes. Two observations aid in this process: first, the human genome contains short blocks within which only a few different haplotypes occur; second, as suggested by Gusfield, it is reasonable to assume that the haplotypes observed within a block have evolved according to a perfect phylogeny, in which at most one mutation event has occurred at any site, and no recombination occurred at the given region. We present a simple and efficient polynomial-time algorithm for inferring haplotypes from the genotypes of a set of individuals assuming a perfect phylogeny. Using a reduction to 2-SAT we extend this algorithm to handle constraints that apply when we have genotypes from both parents and child. We also present a hardness result for the problem of removing the minimum number of individuals from a population to ensure that the genotypes of the remaining individuals are consistent with a perfect phylogeny. Our algorithms have been tested on real data and give biologically meaningful results. Our webserver
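To make the perfect phylogeny assumption concrete, the Python sketch below checks a set of already-known haplotypes with the standard four-gamete condition: binary haplotypes fit an (unrooted) perfect phylogeny exactly when no pair of sites shows all four combinations 00, 01, 10 and 11. This is a well-known characterization, not the inference algorithm of the paper, which works from genotypes. The binary encoding of alleles and the function name `admits_perfect_phylogeny` are assumptions of the sketch.

```python
from itertools import combinations


def admits_perfect_phylogeny(haplotypes):
    """Four-gamete test on binary haplotypes.

    `haplotypes` is a list of equal-length tuples of 0/1 alleles, one
    entry per site. Returns False as soon as some pair of sites exhibits
    all four allele combinations, True otherwise.
    """
    if not haplotypes:
        return True
    num_sites = len(haplotypes[0])
    for a, b in combinations(range(num_sites), 2):
        gametes = {(h[a], h[b]) for h in haplotypes}
        if len(gametes) == 4:
            return False
    return True


# Hypothetical data: three haplotypes over two sites pass the test;
# adding the fourth combination (1, 1) violates it.
compatible = [(0, 0), (0, 1), (1, 0)]
conflicting = compatible + [(1, 1)]
```

The check is quadratic in the number of sites, which is adequate for the short blocks the abstract describes.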
Data allocation in distributed database systems
ACM Transactions on Database Systems, 1988
Cited by 72 (1 self)
The problem of allocating the data of a database to the sites of a communication network is investigated. This problem deviates from the well-known file allocation problem in several aspects. First, the objects to be allocated are not known a priori; second, these objects are accessed by schedules that contain transmissions between objects to produce the result. A model that makes it possible to compare the cost of allocations is presented; the cost can be computed for different cost functions and for processing schedules produced by arbitrary query processing algorithms. For minimizing the total transmission cost, a method is proposed to determine the fragments to be allocated from the relations in the conceptual schema and the queries and updates executed by the users. For the same cost function, the complexity of the data allocation problem is investigated. Methods for obtaining optimal and heuristic solutions under various ways of computing the cost of an allocation are presented and compared. Two different approaches to the allocation management problem are presented and their merits are discussed.
Lower bounds for random 3-SAT via differential equations
Theoretical Computer Science, 2001
