OPUS: An efficient admissible algorithm for unordered search
Journal of Artificial Intelligence Research, 1995
Cited by 75 (14 self)
OPUS is a branch and bound search algorithm that enables efficient admissible search through spaces for which the order of search operator application is not significant. The algorithm’s search efficiency is demonstrated with respect to very large machine learning search spaces. The use of admissible search is of potential value to the machine learning community as it means that the exact learning biases to be employed for complex learning tasks can be precisely specified and manipulated. OPUS also has potential for application in other areas of artificial intelligence, notably, truth maintenance.
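To make the idea of admissible search over an unordered space concrete, here is a minimal sketch. It is not the OPUS algorithm itself (OPUS additionally reorders and prunes the set of active operators); it only illustrates visiting each operator subset once and pruning with an optimistic bound. The names `unordered_search`, `value`, and `optimistic`, and the small knapsack-style example, are assumptions made for illustration.

```python
import math

def unordered_search(operators, value, optimistic):
    """Branch and bound over subsets of `operators`.

    Each node holds the operators applied so far and the operators it may
    still apply. A child may only use operators listed after the one that
    created it, so every subset is generated exactly once.
    `optimistic(state, rest)` must never underestimate the best value
    reachable from `state` using operators in `rest` (admissibility).
    """
    best_set = frozenset()
    best_val = value(best_set)
    stack = [(frozenset(), tuple(operators))]
    while stack:
        state, remaining = stack.pop()
        for i, op in enumerate(remaining):
            child = state | {op}
            rest = remaining[i + 1:]
            v = value(child)
            if v > best_val:
                best_set, best_val = child, v
            # Expand only if the optimistic bound can beat the incumbent.
            if rest and optimistic(child, rest) > best_val:
                stack.append((child, rest))
    return best_set, best_val

# Hypothetical example: pick items maximising total value within a weight cap.
ITEMS = {"a": (3, 4), "b": (4, 5), "c": (2, 3)}  # name -> (weight, value)
CAP = 5

def value(state):
    weight = sum(ITEMS[x][0] for x in state)
    return sum(ITEMS[x][1] for x in state) if weight <= CAP else -math.inf

def optimistic(state, rest):
    if sum(ITEMS[x][0] for x in state) > CAP:
        return -math.inf  # adding more items cannot repair an overweight set
    return sum(ITEMS[x][1] for x in state) + sum(ITEMS[x][1] for x in rest)

result = unordered_search(ITEMS, value, optimistic)
```

Because the bound never underestimates, pruning cannot discard the optimum, which is the sense in which such a search is admissible.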
Towards A Discipline Of Experimental Algorithmics
Cited by 36 (8 self)
The last 20 years have seen enormous progress in the design of algorithms, but very little of it has been put into practice, even within academia; indeed, the gap between theory and practice has continuously widened over these years. Moreover, many of the recently developed algorithms are very hard to characterize theoretically and, as initially described, suffer from large running-time coefficients. Thus the algorithms and data structures community needs to return to implementation as the standard of value; we call such an approach Experimental Algorithmics. Experimental Algorithmics studies algorithms and data structures by joining experimental studies with the more traditional theoretical analyses. Experimentation with algorithms and data structures is proving indispensable in the assessment of heuristics for hard problems, in the design of test cases, in the characterization of asymptotic behavior of complex algorithms, in the comparison of competing designs for tractabl...
Tight approximability results for test set problems in bioinformatics
Proc. 9th Scandinavian Workshop on Algorithm Theory (SWAT), volume 3111 of Lecture Notes in Computer Science, 2005
Cited by 14 (3 self)
In this paper, we investigate the test set problem and its variations that appear in a variety of applications. In general, we are given a universe of objects to be “distinguished” by a family of “tests”, and we want to find the smallest sufficient collection of tests. In the simplest version, a test is a subset of the universe and two objects are distinguished by our collection if one test contains exactly one of them. Variations allow tests to be multivalued functions or unions of “basic” tests, and different notions of the term distinguished. An important version of this problem that has applications in DNA sequence analysis has the universe consisting of strings over a small alphabet and tests that detect the presence (or absence) of a substring. For most versions of the problem, including the latter, we establish matching lower and upper bounds on the approximation ratio. When tests can be formed as unions of basic tests, we show that the problem is as hard as the graph coloring problem.
Algorithms and Experiments: The New (and Old) Methodology
J. Univ. Comput. Sci., 2001
Cited by 9 (4 self)
The last twenty years have seen enormous progress in the design of algorithms, but little of it has been put into practice. Because many recently developed algorithms are hard to characterize theoretically and have large running-time coefficients, the gap between theory and practice has widened over these years. Experimentation is indispensable in the assessment of heuristics for hard problems, in the characterization of asymptotic behavior of complex algorithms, and in the comparison of competing designs for tractable problems. Implementation, although perhaps not rigorous experimentation, was characteristic of early work in algorithms and data structures. Donald Knuth has throughout insisted on testing every algorithm and conducting analyses that can predict behavior on actual data; more recently, Jon Bentley has vividly illustrated the difficulty of implementation and the value of testing. Numerical analysts have long understood the need for standardized test suites to ensure robustness, precision and efficiency of numerical libraries. It is only recently, however, that the algorithms community has shown signs of returning to implementation and testing as an integral part of algorithm development. The emerging disciplines of experimental algorithmics and algorithm engineering have revived and are extending many of the approaches used by computing pioneers such as Floyd and Knuth and are placing on a formal basis many of Bentley's observations. We reflect on these issues, looking back at the last thirty years of algorithm development and forward to new challenges: designing cache-aware algorithms, algorithms for mixed models of computation, algorithms for external memory, and algorithms for scientific research.
Identifying Codes and the Set Cover Problem
2006
Cited by 7 (3 self)
We consider the problem of finding a minimum identifying code in a graph, i.e., a designated set of vertices whose neighborhoods uniquely overlap at any vertex on the graph. This identifying code problem was initially introduced in 1998 and has since been fundamentally connected to a wide range of applications, including fault diagnosis, location detection, environmental monitoring, and connections to information theory, superimposed codes, and tilings. Though this problem is NP-complete, its known reduction is from 3SAT and does not readily yield an approximation algorithm. In this paper we show that the identifying code problem is computationally equivalent to the set cover problem and present a Θ(log n)-approximation algorithm based on the greedy approach for set cover; we further show that, subject to reasonable assumptions, no polynomial-time approximation algorithm can do better. Finally, we show that a generalization of the identifying codes problem, for which no complexity results were known thus far, is NP-hard.
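The identifying-code condition can be checked directly on a small graph. The sketch below assumes the common formulation in which each vertex's closed neighborhood (the vertex plus its neighbors) must meet the code in a nonempty set that no other vertex shares; the function name and the adjacency-dictionary representation are illustrative choices, not taken from the paper.

```python
def is_identifying_code(adj, code):
    """Return True if `code` identifies every vertex of the graph `adj`.

    `adj` maps each vertex to an iterable of its neighbours.
    Assumption: closed neighbourhoods are used, and every vertex must
    see at least one code vertex.
    """
    code = set(code)
    seen = set()
    for v, nbrs in adj.items():
        signature = frozenset((set(nbrs) | {v}) & code)
        if not signature or signature in seen:
            return False  # uncovered vertex, or two vertices look alike
        seen.add(signature)
    return True

# Path on three vertices: 0 - 1 - 2
path = {0: [1], 1: [0, 2], 2: [1]}
```

On this path, the two endpoints form an identifying code, while the middle vertex alone does not, because the endpoints and the middle vertex would all see the same code set.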
OPUS: A systematic search algorithm and its application to categorical attribute-value data-driven machine learning.
1993
Cited by 5 (5 self)
OPUS is a branch and bound search algorithm that enables efficient systematic search through spaces in which the order in which the search operators are applied is not significant. OPUS achieves this by maximising the effect of each pruning action. While it is not possible to guarantee in the general case that any pruning shall occur, when pruning is possible, its effect is maximised. Experimental application of OPUS in data-driven machine learning demonstrates that NP-hard search problems in which it is not possible to guarantee a solution in reasonable time can be solved for real-world data within acceptable time frames. Indeed, OPUS is demonstrated to enable systematic search of extremely large search spaces in less time than is taken by common heuristic machine learning search algorithms. The use of systematic search in concept learning enables better experimental comparison of alternative inductive biases than was previously possible as the precise inductive bias can be described ...
Identifying codes and covering problems
IEEE Transactions on Information Theory, 2008
Cited by 5 (4 self)
The identifying code problem for a given graph involves finding a minimum set of vertices whose neighborhoods uniquely overlap at any given graph vertex. Initially introduced in 1998, this problem has demonstrated its fundamental nature through a wide variety of applications, such as fault diagnosis, location detection, and environmental monitoring, in addition to deep connections to information theory, superimposed and covering codes, and tilings. This work establishes efficient reductions between the identifying code problem and the well-known set-covering problem, resulting in a tight hardness of approximation result and novel, provably tight polynomial-time approximations. The main results are also extended to r-robust identifying codes and analogous set (2r + 1)-multicover problems. Finally, empirical support is provided for the effectiveness of the proposed approximations, including good constructions for well-known topologies such as infinite two-dimensional grids.
A tighter analysis of set cover greedy algorithm for test set
In ESCAPE, 2007
Cited by 4 (3 self)
The set cover greedy algorithm is a natural approximation algorithm for the test set problem. This paper gives a precise and tighter analysis of the approximation ratio of this algorithm. The author improves the approximation ratio 2 ln n, directly derived from set cover, to 1.14 ln n by applying the potential function technique of the derandomization method. In addition, the author gives a nontrivial lower bound (1 + α) ln n on the approximation ratio, where α is a positive constant. This lower bound, together with the matching bound of the information content heuristic, confirms that the information content heuristic is slightly better than the set cover greedy algorithm in the worst case.
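As a rough illustration of the set cover greedy approach applied to the test set problem, the sketch below treats each pair of objects as an element to be covered and each test as covering the pairs it distinguishes (a test distinguishes two objects when it contains exactly one of them). The function name and the input representation are assumptions for illustration; the paper's analysis is not reproduced here.

```python
from itertools import combinations

def greedy_test_set(universe, tests):
    """Greedily choose tests until every pair of objects is distinguished.

    `tests` is a list of sets; the result is a list of indices into it.
    Raises ValueError if the tests cannot distinguish all pairs.
    """
    undistinguished = set(combinations(sorted(universe), 2))
    chosen = []
    while undistinguished:
        def gain(i):
            t = tests[i]
            return sum((a in t) != (b in t) for a, b in undistinguished)

        best = max(range(len(tests)), key=gain)  # first index wins ties
        if gain(best) == 0:
            raise ValueError("tests cannot distinguish all pairs")
        chosen.append(best)
        t = tests[best]
        # Keep only the pairs the chosen test fails to separate.
        undistinguished = {(a, b) for a, b in undistinguished
                           if (a in t) == (b in t)}
    return chosen

picked = greedy_test_set({1, 2, 3, 4}, [{1, 2}, {1, 3}, {1}])
```

Here the first two tests together separate all six pairs, so the third test is never selected.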
Inclusive pruning: A new class of pruning rule for unordered search and its application to classification learning.
In Proceedings of the Nineteenth Australasian Computer Science Conference, 1996
Cited by 4 (1 self)
This paper presents a new class of pruning rule for unordered search. Previous pruning rules for unordered search identify operators that should not be applied in order to prune nodes reached via those operators. In contrast, the new pruning rules identify operators that should be applied and prune nodes that are not reached via those operators. Specific pruning rules employing both these approaches are identified for classification learning. Experimental results demonstrate that application of the new pruning rules can reduce by more than 60% the number of states from the search space that are considered during classification learning.
Efficient Algorithms for Masking and Finding Quasi-Identifiers
2007
Cited by 1 (0 self)
A quasi-identifier refers to a subset of attributes that can uniquely identify most tuples in a table. Incautious publication of quasi-identifiers will lead to privacy leakage. In this paper we consider the problems of finding and masking quasi-identifiers. Both problems are provably hard with severe time and space requirements. We focus on designing efficient approximation algorithms for large data sets. We first propose two natural measures for quantifying quasi-identifiers: distinct ratio and separation ratio. We develop efficient algorithms that find small quasi-identifiers with provable size and separation/distinct ratio guarantees, with space and time requirements sublinear in the number of tuples. We also design practical algorithms for finding all minimal quasi-identifiers. Finally we propose efficient algorithms for masking quasi-identifiers, where we use a random sampling technique to greatly reduce the space and time requirements, without much sacrifice in the quality of the results. Our algorithms for masking and finding minimum quasi-identifiers naturally apply to stream databases. Extensive experimental results on real-world data sets confirm efficiency and accuracy of our algorithms.
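The abstract names the two measures without defining them. The sketch below assumes one plausible reading: the distinct ratio is the fraction of distinct projected values among all tuples, and the separation ratio is the fraction of tuple pairs that differ on the chosen attributes. These definitions, the function names, and the sample table are assumptions for illustration, and the exact (non-sampling) computation shown here is not the paper's sublinear algorithm.

```python
from collections import Counter
from math import comb

def _project(rows, attrs):
    """Project each row (a dict) onto the attribute list `attrs`."""
    return [tuple(row[a] for a in attrs) for row in rows]

def distinct_ratio(rows, attrs):
    """Assumed definition: distinct projected values / number of tuples."""
    projected = _project(rows, attrs)
    return len(set(projected)) / len(projected)

def separation_ratio(rows, attrs):
    """Assumed definition: fraction of tuple pairs that differ on `attrs`.

    Requires at least two rows.
    """
    projected = _project(rows, attrs)
    total_pairs = comb(len(projected), 2)
    # Pairs that agree on every attribute fall inside one equivalence class.
    unseparated = sum(comb(c, 2) for c in Counter(projected).values())
    return (total_pairs - unseparated) / total_pairs

# Hypothetical table
table = [
    {"zip": "1", "age": 30},
    {"zip": "1", "age": 40},
    {"zip": "2", "age": 30},
    {"zip": "2", "age": 30},
]
```

Under these assumed definitions, adding `age` to `zip` raises both ratios on the sample table, which matches the intuition that a larger attribute set identifies more tuples.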