Constructing Computer Virus Phylogenies
, 1996
Cited by 18 (1 self)
There has been much recent algorithmic work on the problem of reconstructing the evolutionary history of biological species. Computer virus specialists are interested in finding the evolutionary history of computer viruses: a virus is often written using code fragments from one or more other viruses, which are its immediate ancestors. A phylogeny for a collection of computer viruses is a directed acyclic graph whose nodes are the viruses and whose edges map ancestors to descendants and satisfy the property that each code fragment is "invented" only once. To provide a simple explanation for the data, we consider the problem of constructing such a phylogeny with a minimum number of edges. This optimization problem is NP-hard, and we present positive and negative results for associated approximation problems. When tree solutions exist, they can be constructed and randomly sampled in polynomial time. 1 Introduction There are now several thousand different computer viruses in existenc...
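The phylogeny condition in the abstract above can be illustrated with a small checker. This is only a sketch under one reading of "invented only once": for every code fragment, exactly one virus containing it has no parent that also contains it. The function name `is_phylogeny` and the input layout (a dict of fragment sets plus a list of ancestor-to-descendant edges) are hypothetical and are not taken from the paper. The sketch checks validity only and does not attempt the NP-hard minimum-edge optimization.

```python
from graphlib import TopologicalSorter, CycleError


def is_phylogeny(fragments, edges):
    """Check a candidate virus phylogeny.

    fragments: dict mapping each virus to the set of code fragments it contains.
    edges: iterable of (ancestor, descendant) pairs over the same viruses.
    Assumed reading: each fragment has exactly one "inventor", i.e. one virus
    that contains it while none of its parents do.
    """
    parents = {v: set() for v in fragments}
    for ancestor, descendant in edges:
        parents[descendant].add(ancestor)

    # The graph must be acyclic; prepare() raises CycleError otherwise.
    try:
        TopologicalSorter(parents).prepare()
    except CycleError:
        return False

    all_fragments = set().union(*fragments.values())
    for f in all_fragments:
        inventors = [
            v for v, fs in fragments.items()
            if f in fs and not any(f in fragments[p] for p in parents[v])
        ]
        if len(inventors) != 1:
            return False
    return True
```

For example, with viruses A = {x}, B = {x, y}, C = {y} and edges A to B and B to C, fragment x is invented by A and fragment y by B, so the check passes. With no edges at all, x would be invented twice and the check fails.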
Efficient approximation of convex recolorings
 In Proceedings APPROX 2005: 8th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems, published with Proceedings RANDOM 2005
, 2005
Cited by 13 (1 self)
A coloring of a tree is convex if the vertices that pertain to any color induce a connected subtree; a partial coloring (which assigns colors to some of the vertices) is convex if it can be completed to a convex (total) coloring. Convex coloring of trees arises in areas such as phylogenetics, linguistics, etc.; e.g., a perfect phylogenetic tree is one in which the states of each character induce a convex coloring of the tree. Research on perfect phylogeny is usually focused on finding a tree so that few predetermined partial colorings of its vertices are convex. When a coloring of a tree is not convex, it is desirable to know "how far" it is from a convex one. In [18], a natural measure for this distance, called the recoloring distance, was defined: the minimal number of color changes at the vertices needed to make the coloring convex. This can be viewed as minimizing the number of "exceptional vertices" w.r.t. a closest convex coloring. The problem was proved to be NP-hard even for colored strings. In this paper we continue the work of [18], and present a 2-approximation algorithm for convex recoloring of strings whose running time is O(cn), where c is the number of colors and n is the size of the input, and an O(cn^2) 3-approximation algorithm for convex recoloring of trees. ∗ A preliminary version of the results in this paper appeared in [19].
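To make the definitions above concrete for strings, here is a minimal sketch. A string coloring is convex when each color occupies one contiguous block, and the recoloring distance is the fewest positions that must change to reach such a coloring. The function names are hypothetical. The distance routine is an exhaustive search for tiny inputs, not the 2-approximation algorithm the abstract describes. It searches only over colors that already occur in the input, since under unit costs a block with a fresh color can be merged into a neighbouring block at no extra cost.

```python
from itertools import product


def is_convex_string(colors):
    """Return True if every color forms a single contiguous block."""
    seen = set()
    prev = object()  # sentinel that equals no real color
    for c in colors:
        if c != prev:
            if c in seen:  # color reappears after its block ended
                return False
            seen.add(c)
            prev = c
    return True


def recoloring_distance_bruteforce(colors):
    """Minimum number of changed positions needed to make the string convex.

    Exponential-time exhaustive search; intended for illustration only.
    """
    palette = sorted(set(colors))
    best = len(colors)
    for candidate in product(palette, repeat=len(colors)):
        if is_convex_string(candidate):
            changes = sum(a != b for a, b in zip(colors, candidate))
            best = min(best, changes)
    return best
```

For instance, `"abab"` is not convex, and changing its third position yields the convex string `"abbb"`, so its recoloring distance is 1.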
Convex recolorings of strings and trees: Definitions, hardness results, and algorithms
 Proceedings WADS 2005: 9th International Workshop on Algorithms and Data Structures
, 2005
Cited by 12 (3 self)
A coloring of a tree is convex if the vertices that pertain to any color induce a connected subtree; a partial coloring (which assigns colors to some of the vertices) is convex if it can be completed to a convex (total) coloring. Convex colorings of trees arise in areas such as phylogenetics, linguistics, etc.; e.g., a perfect phylogenetic tree is one in which the states of each character induce a convex coloring of the tree. When a coloring of a tree is not convex, it is desirable to know "how far" it is from a convex one, and which convex colorings are "closest" to it. In this paper we study a natural definition of this distance, the recoloring distance, which is the minimal number of color changes at the vertices needed to make the coloring convex. We show that finding this distance is NP-hard even for a colored string (a path), and for some other interesting variants of the problem. On the positive side, we present algorithms for computing the recoloring distance under some natural generalizations of this concept: the first generalization is the uniform weighted model, where each vertex has a weight which is the cost of changing its color. The other is the nonuniform model, in which the cost of coloring a vertex v by a color d is an arbitrary nonnegative number cost(v, d). Our first algorithms find optimal convex recolorings of strings and bounded-degree trees under the nonuniform model in time which, for any fixed number of colors, is linear in the input size. Next we improve these algorithms for the uniform model to run in time which is linear in the input size for a fixed number of bad colors, which are colors that violate convexity in some natural sense. Finally, we generalize the above result to hold for trees of unbounded degree. ∗ A preliminary version of some of the results in this paper appeared in [17].
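The nonuniform model for strings can be illustrated with a simple dynamic program. This is a sketch under stated assumptions and not necessarily the algorithm from the paper. The state is the set of colors used so far together with the color of the current block, and each position either extends the current block or opens a block with an unused color. With k colors there are at most k·2^k states, so the running time is O(n·k^2·2^k), which is linear in n for fixed k. The function name and the `cost(v, d)` callback signature are hypothetical.

```python
def min_cost_convex_recoloring_string(n, colors, cost):
    """Minimum total cost of a convex coloring of positions 0..n-1.

    colors: sequence of available colors.
    cost(v, d): nonnegative cost of giving position v the color d
    (the nonuniform model; set cost(v, d) = 0 when d is v's current color).
    """
    if n == 0:
        return 0
    colors = list(colors)
    k = len(colors)
    inf = float("inf")

    # dp[(mask, j)]: cheapest coloring of the prefix where `mask` is the set
    # of colors used so far and the last position has color colors[j].
    dp = {(1 << j, j): cost(0, colors[j]) for j in range(k)}
    for v in range(1, n):
        nxt = {}
        for (mask, j), val in dp.items():
            # Option 1: extend the current block with the same color.
            cand = val + cost(v, colors[j])
            if cand < nxt.get((mask, j), inf):
                nxt[(mask, j)] = cand
            # Option 2: start a new block with a color not used before.
            for d in range(k):
                if not mask & (1 << d):
                    key = (mask | (1 << d), d)
                    cand = val + cost(v, colors[d])
                    if cand < nxt.get(key, inf):
                        nxt[key] = cand
        dp = nxt
    return min(dp.values())
```

The uniform weighted model is the special case `cost(v, d) = 0 if d == original[v] else weight[v]`, and the unweighted recoloring distance sets every weight to 1.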
Improved approximation algorithm for convex recoloring of trees
 In Proceedings Third Workshop on Approximation and Online Algorithms WAOA 2005
, 2005
Cited by 7 (0 self)
A pair (T, C) of a tree T and a coloring C is called a colored tree. Given a colored tree (T, C), any coloring C' of T is called a recoloring of T. Given a weight function on the vertices of the tree, the recoloring distance of a recoloring is the total weight of recolored vertices. A coloring of a tree is convex if for any two vertices u and v that are colored by the same color c, every vertex on the path from u to v is also colored by c. In the minimum convex recoloring problem we are given a colored tree and a weight function and our goal is to find a convex recoloring of minimum recoloring distance. The minimum convex recoloring problem naturally arises in the context of phylogenetic trees. Given a set of related species, the goal of phylogenetic reconstruction is to construct a tree that would best describe the evolution of this set of species. In this context a convex coloring corresponds to perfect phylogeny. Since perfect phylogeny is not always possible, the next best thing is to find a tree which is as close to convex as possible, or, in other words, a tree with minimum recoloring distance. We present a (2 + ε)-approximation algorithm for the minimum convex recoloring problem, whose running time is O(n^2 + n(1/ε)^2 4^(1/ε)). This result improves the previously known 3-approximation algorithm for this NP-hard problem. We also present an algorithm for computing
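The tree definitions above lend themselves to a short check. The sketch below relies on a standard fact: the vertices of one color induce a forest inside the tree, and a forest has (vertices − edges) connected components. A coloring is therefore convex exactly when every color class has one component, which matches the path-based definition in the abstract. All names and the input layout (an edge list plus dicts keyed by vertex) are hypothetical. The approximation algorithm itself is not reproduced here.

```python
from collections import Counter


def color_components(edges, coloring):
    """Number of connected components induced by each color in a tree.

    edges: list of (u, v) pairs forming a tree.
    coloring: dict mapping every vertex to its color.
    """
    comps = Counter(coloring.values())  # start with one component per vertex
    for u, v in edges:
        if coloring[u] == coloring[v]:
            comps[coloring[u]] -= 1  # a monochromatic edge merges two components
    return dict(comps)


def is_convex_tree(edges, coloring):
    """True if every color class induces a connected subtree."""
    return all(n == 1 for n in color_components(edges, coloring).values())


def recoloring_weight(old, new, weight):
    """Recoloring distance: total weight of the vertices whose color changed."""
    return sum(weight[v] for v in old if old[v] != new[v])
```

On the path 0-1-2-3 colored a, b, a, b, each color induces two components, so the coloring is not convex. Recoloring vertex 1 to a gives a convex coloring at a cost equal to the weight of vertex 1.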
Learning and Approximation Algorithms for problems motivated by Evolutionary Trees
, 1999
Cited by 2 (0 self)
Chapter 1 Introduction
  1.1 Overview
  1.2 Biological Background
    1.2.1 Data
    1.2.2 Models and Methods
  1.3 Learning in the General Markov Model
    1.3.1 The Model
    1.3.2 Learning Problems for Evolutionary Trees
  1.4 Layout of the thesis
Chapter 2 Learning Two-State Markov Evolutionary Trees
  2.1 Previous research
    2.1.1 The General Idea
    2.1.2 Previous work on learning the distribution
    2.1.3 Previous work on finding the topology
    2.1.4 Re...
doi:10.1093/comjnl/bxm049 Fixed-Parameter Algorithms
Cited by 1 (0 self)
We survey the use of fixed-parameter algorithms in the field of phylogenetics, which is the study of evolutionary relationships. The central problem in phylogenetics is the reconstruction of the evolutionary history of biological species, but its methods also apply to linguistics, philology or architecture. A basic computational problem is the reconstruction of a likely phylogeny (genealogical tree) for a set of species based on observed differences in the phenotype like color or form of limbs, based on differences in the genotype like mutated nucleotide positions in the DNA sequence, or based on given partial phylogenies. Ideally, one would like to construct so-called perfect phylogenies, which arise from a very simple evolutionary model, but in practice one must often be content with phylogenies whose 'distance from perfection' is as small as possible. The computation of phylogenies has applications in seemingly unrelated areas such as genomic sequencing and finding and understanding genes. The numerous computational problems arising in phylogenetics often are NP-complete, but for many natural parametrizations they can be solved using fixed-parameter algorithms.
Approximation Algorithms for the Fixed-Topology Phylogenetic Number Problem
, 1999
Cited by 1 (1 self)
In the ℓ-phylogeny problem, one wishes to construct an evolutionary tree for a set of species represented by characters, in which each state of each character induces no more than ℓ connected components. We consider the fixed-topology version of this problem for fixed topologies of arbitrary degree. This version of the problem is known to be NP-complete for ℓ ≥ 3 even for degree-3 trees in which no state labels more than ℓ + 1 leaves (and therefore there is a trivial (ℓ + 1)-phylogeny). We give a 2-approximation algorithm for all ℓ ≥ 3 for arbitrary input topologies and we give an optimal approximation algorithm that constructs a 4-phylogeny when a 3-phylogeny exists. Dynamic programming techniques, which are typically used in fixed-topology problems, cannot be applied to ℓ-phylogeny problems. Our 2-approximation algorithm is the first application of linear programming to approximation algorithms for phylogeny problems. We extend our results to a related problem in which characters are...
The Largest Compatible Subset Problem for Phylogenetic Data
, 2004
Phylogenetic tree construction aims to infer the evolutionary relationships between species from experimental data. However, experimental data are often imperfect and mutually conflicting. Therefore, it is important to extract the motif from the imperfect data. In the largest compatible subset problem, given a set of experimental data, we want to discard the minimum amount of data such that the remainder is compatible. The largest compatible subset problem can be viewed as the vertex cover problem from graph theory, which has been proven to be NP-hard. In this paper, we propose a hybrid Evolutionary Computing (EC) method for this problem. The proposed method combines the EC approach with an algorithmic approach for specially structured graphs. As a result, the complexity of the problem is dramatically reduced. Experiments were performed on randomly generated graphs with different edge densities. The vertex covers produced by the proposed method were then compared to the vertex covers produced by a 2-approximation algorithm. The experimental results showed that the proposed method consistently outperformed a classical 2-approximation algorithm. Furthermore, a significant improvement was found when the graph density was small.
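The abstract does not say which classical 2-approximation served as the baseline. One well-known candidate is the maximal-matching heuristic, sketched below: scan the edges and, whenever an edge is still uncovered, add both of its endpoints to the cover. The chosen edges form a matching and any cover must contain at least one endpoint of each, so the result is at most twice the optimum. The helper `compatible_subset` assumes, as an illustration only, that conflicts are supplied as pairs of mutually incompatible items. Both function names are hypothetical.

```python
def vertex_cover_2approx(edges):
    """Maximal-matching 2-approximation for vertex cover.

    edges: iterable of (u, v) pairs. Returns a set of vertices that touches
    every edge and has size at most twice the minimum cover.
    """
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            # Edge is uncovered: take both endpoints.
            cover.update((u, v))
    return cover


def compatible_subset(items, conflicts):
    """Discard a vertex cover of the conflict graph; what remains is conflict-free."""
    cover = vertex_cover_2approx(conflicts)
    return [x for x in items if x not in cover]
```

On the conflict path 1-2-3-4 the heuristic discards all four items, while an optimal cover {2, 3} would discard only two. This shows the factor-of-two gap that a better method can exploit.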