Memory-bounded A* graph search
In Proc. 15th International FLAIRS Conference, 2002
Cited by 18 (4 self)
We describe a framework for reducing the space complexity of graph search algorithms such as A* that use Open and Closed lists to keep track of the frontier and interior nodes of the search space. We propose a sparse representation of the Closed list in which only a fraction of already expanded nodes need to be stored to perform the two functions of the Closed list: preventing duplicate search effort and allowing solution extraction. Our proposal is related to earlier work on search algorithms that do not use a Closed list at all [Korf and Zhang, 2000]. However, the approach we describe has several advantages that make it effective for a wider variety of problems.
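To make the two roles of the Closed list concrete, here is a minimal sketch of textbook A* graph search with a full Closed list. It is not the sparse representation the paper proposes; it only illustrates the baseline the abstract refers to. The function name `astar`, the `neighbors` and `heuristic` callbacks, and the assumption of a consistent heuristic are all illustrative choices, not taken from the paper.

```python
import heapq


def astar(start, goal, neighbors, heuristic):
    """Textbook A* with explicit Open and Closed lists (illustrative sketch).

    neighbors(node) yields (successor, edge_cost) pairs; heuristic(node) is
    assumed to be consistent, so a node is never reopened once closed.
    """
    # Open list: frontier nodes, ordered by f = g + h.
    open_heap = [(heuristic(start), 0, start)]
    best_g = {start: 0}
    parent = {start: None}
    # Closed list: already expanded (interior) nodes.
    closed = set()

    while open_heap:
        _, g, node = heapq.heappop(open_heap)
        if node in closed:
            continue  # role 1: prevent duplicate search effort
        if node == goal:
            # Role 2: solution extraction by following parent pointers.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return g, path[::-1]
        closed.add(node)
        for succ, cost in neighbors(node):
            new_g = g + cost
            if succ not in closed and new_g < best_g.get(succ, float("inf")):
                best_g[succ] = new_g
                parent[succ] = node
                heapq.heappush(open_heap, (new_g + heuristic(succ), new_g, succ))
    return None  # goal unreachable
```

In this baseline every expanded node stays in `closed` and `parent`, which is the memory cost that a sparse Closed list aims to reduce.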
A polyhedral approach to sequence alignment problems
Discrete Appl. Math., 2000
Cited by 15 (1 self)
We study two new problems in sequence alignment from both a practical and a theoretical view, using tools from combinatorial optimization to develop branch-and-cut algorithms. The Generalized Maximum Trace formulation captures several forms of multiple sequence alignment problems in a common framework, among them the original formulation of Maximum Trace. The RNA Sequence Alignment Problem captures the comparison of RNA molecules on the basis of their primary sequence and their secondary structure. Both problems have a characterization in terms of graphs, which we reformulate in terms of integer linear programming. We then study the polytopes (or convex hulls of all feasible solutions) associated with the integer linear program for both problems. For each polytope we derive several classes of facet-defining inequalities and show that for some of these classes the corresponding separation problem can be solved in polynomial time. This leads to a polynomial time algorithm for pairwise sequence alignment that is not based on dynamic programming. Moreover, for multiple sequences the branch-and-cut algorithms for both sequence alignment problems are able to solve to optimality instances that are beyond the range of present dynamic programming approaches.
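For contrast, the dynamic programming approach that the abstract sets its polyhedral method against can be sketched as follows. This is the standard global pairwise alignment recurrence, not the branch-and-cut algorithm from the paper; the function name and the scoring values (`match`, `mismatch`, `gap`) are assumptions chosen only for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of strings a and b by dynamic programming.

    Uses a linear gap cost and keeps only two rows of the DP table.
    The scoring parameters are illustrative defaults, not from the paper.
    """
    # Row 0: aligning the empty prefix of a against each prefix of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # prefix of a aligned against the empty prefix of b
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                            prev[j] + gap,       # gap in b
                            curr[j - 1] + gap))  # gap in a
        prev = curr
    return prev[-1]
```

The table has one cell per pair of prefixes, so the work grows with the product of the sequence lengths; for many sequences the analogous table grows exponentially in their number, which is the limitation the abstract alludes to.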
Sweep A*: Space-efficient heuristic search in partially ordered graphs
In Proceedings of the 15th IEEE International Conference on Tools with Artificial Intelligence, 2003
Cited by 15 (3 self)
We describe a novel heuristic search algorithm, called Sweep A*, that exploits the regular structure of partially ordered graphs to substantially reduce the memory requirements of search. We show that it outperforms previous search algorithms in optimally aligning multiple protein or DNA sequences, an important problem in bioinformatics. Sweep A* also promises to be effective for other search problems with similar structure.
Memory-Efficient A* Heuristics for Multiple Sequence Alignment
In Proceedings of the 18th National Conference on Artificial Intelligence (AAAI-02), 2002
Cited by 14 (1 self)
The time and space needs of an A* search are strongly influenced by the quality of the heuristic evaluation function. Usually there is a
Multiple sequence alignment with arbitrary gap costs: Computing an optimal solution using polyhedral combinatorics
2002
K-group A* for multiple sequence alignment with quasinatural gap costs
In Proceedings of the 16th IEEE International Conference on Tools with Artificial Intelligence (ICTAI-04), 2004
Cited by 1 (1 self)
Alignment of multiple protein or DNA sequences is an important problem in bioinformatics. Previous work has shown that the A* search algorithm can find optimal alignments for up to several sequences, and that a K-group generalization of A* can find approximate alignments for much larger numbers of sequences [6]. In this paper, we describe the first implementation of K-group A* that uses quasinatural gap costs, the cost model used in practice by biologists. We also introduce a new method for computing gap-opening costs in profile alignment. Our results show that K-group A* can efficiently find optimal or close-to-optimal alignments for small groups of sequences, and, for large numbers of sequences, it can find higher-quality alignments than the widely used CLUSTAL family of approximate alignment tools. This demonstrates the benefits of A* in aligning large numbers of sequences, as typically compared by biologists, and suggests that K-group A* could become a practical tool for multiple sequence alignment.
Bioinformatics, 2003
Selection of significant genes via expression patterns is an important problem in microarray experiments. Owing to small sample size and the large number of variables (genes), the selection process can be unstable. This paper proposes a hierarchical Bayesian model for gene (variable) selection. We employ latent variables to specialize the model to a regression setting and use a Bayesian mixture prior to perform the variable selection. We control the size of the model by assigning a prior distribution over the dimension (number of significant genes) of the model. The posterior distributions of the parameters are not in explicit form, so we use a combination of truncated sampling and Markov Chain Monte Carlo (MCMC) based computation techniques to simulate the parameters from the posteriors. The Bayesian model is flexible enough to identify significant genes as well as to perform future predictions. The method is applied to cancer classification via cDNA microarrays where the genes BRCA1 and BRCA2 are associated with a hereditary disposition to breast cancer, and the method is used to identify a set of significant genes. The method is also applied successfully to the leukemia data.
Sorting with Line Storage Systems
We consider a problem that arises in car production. In particular, we focus on the sorting of car bodies with respect to their designated enamel colors for a paint shop. Our objective is to sort a given car body sequence so that the number of color changes within the sorted sequence is minimized. Current technology is to perform the realignment of a sequence by the use of a line storage system.
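The objective stated above, the number of color changes in a sequence, is easy to state in code. The sketch below only evaluates that objective for a given sequence; it does not model the line storage system or its constraints, and the function name and color labels are hypothetical.

```python
def color_changes(sequence):
    """Count adjacent positions whose colors differ (the quantity to minimize)."""
    return sum(1 for prev, curr in zip(sequence, sequence[1:]) if prev != curr)


# Hypothetical car body sequence, identified only by enamel color.
incoming = ["red", "blue", "red", "blue", "white"]
print(color_changes(incoming))          # changes in the incoming order
print(color_changes(sorted(incoming)))  # changes if bodies could be freely regrouped
```

With unrestricted reordering the best possible value is the number of distinct colors minus one; a line storage system generally allows only some reorderings, so the achievable value may be higher.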

