Constructing Computer Virus Phylogenies
1996
Cited by 17 (1 self)
There has been much recent algorithmic work on the problem of reconstructing the evolutionary history of biological species. Computer virus specialists are interested in finding the evolutionary history of computer viruses: a virus is often written using code fragments from one or more other viruses, which are its immediate ancestors. A phylogeny for a collection of computer viruses is a directed acyclic graph whose nodes are the viruses and whose edges map ancestors to descendants and satisfy the property that each code fragment is "invented" only once. To provide a simple explanation for the data, we consider the problem of constructing such a phylogeny with a minimum number of edges. This optimization problem is NP-hard, and we present positive and negative results for associated approximation problems. When tree solutions exist, they can be constructed and randomly sampled in polynomial time. 1 Introduction There are now several thousand different computer viruses in existenc...
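
The phylogeny condition in this abstract can be illustrated with a small validity check. This is only a sketch under one reading of "invented only once": the graph must be acyclic and, for each fragment, exactly one virus carrying it has no ancestor that also carries it. The function names and the input layout (a dict from virus name to fragment set, plus a list of ancestor/descendant pairs whose endpoints all appear in the dict) are assumptions made for illustration, not taken from the paper, and the sketch does not attempt the minimum-edge optimization.

```python
from collections import defaultdict, deque

def is_acyclic(nodes, edges):
    """Kahn's algorithm: True if the directed graph has no cycle."""
    indegree = {v: 0 for v in nodes}
    children = defaultdict(list)
    for ancestor, descendant in edges:
        children[ancestor].append(descendant)
        indegree[descendant] += 1
    queue = deque(v for v in indegree if indegree[v] == 0)
    processed = 0
    while queue:
        v = queue.popleft()
        processed += 1
        for c in children[v]:
            indegree[c] -= 1
            if indegree[c] == 0:
                queue.append(c)
    return processed == len(indegree)

def is_phylogeny(fragments, edges):
    """fragments: virus -> set of code fragments (hypothetical layout).
    edges: list of (ancestor, descendant) pairs."""
    if not is_acyclic(fragments, edges):
        return False
    all_fragments = set().union(*fragments.values())
    for f in all_fragments:
        carriers = {v for v, fs in fragments.items() if f in fs}
        # Carriers that have a direct ancestor which also carries f.
        inherited = {d for a, d in edges if a in carriers and d in carriers}
        # Assumed reading: exactly one carrier "invents" f.
        if len(carriers - inherited) != 1:
            return False
    return True

viruses = {"A": {"x"}, "B": {"x", "y"}, "C": {"x", "y", "z"}}
print(is_phylogeny(viruses, [("A", "B"), ("B", "C")]))
print(is_phylogeny(viruses, [("A", "B"), ("A", "C")]))
```

In the second call fragment `y` appears in both `B` and `C` with no edge between them, so under the assumed reading it would have been invented twice.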
Convex recolorings of strings and trees: Definitions, hardness results, and algorithms
Proceedings WADS 2005: 9th International Workshop on Algorithms and Data Structures, 2005
Cited by 10 (3 self)
A coloring of a tree is convex if the vertices that pertain to any color induce a connected subtree; a partial coloring (which assigns colors to some of the vertices) is convex if it can be completed to a convex (total) coloring. Convex colorings of trees arise in areas such as phylogenetics and linguistics; e.g., a perfect phylogenetic tree is one in which the states of each character induce a convex coloring of the tree. When a coloring of a tree is not convex, it is desirable to know "how far" it is from a convex one, and which convex colorings are "closest" to it. In this paper we study a natural definition of this distance: the recoloring distance, which is the minimal number of color changes at the vertices needed to make the coloring convex. We show that finding this distance is NP-hard even for a colored string (a path), and for some other interesting variants of the problem. On the positive side, we present algorithms for computing the recoloring distance under some natural generalizations of this concept: the first generalization is the uniform weighted model, where each vertex has a weight which is the cost of changing its color. The other is the non-uniform model, in which the cost of coloring a vertex v by a color d is an arbitrary nonnegative number cost(v, d). Our first algorithms find optimal convex recolorings of strings and bounded degree trees under the non-uniform model in time which, for any fixed number of colors, is linear in the input size. Next we improve these algorithms for the uniform model to run in time which is linear in the input size for a fixed number of bad colors, which are colors that violate convexity in some natural sense. Finally, we generalize the above result to hold for trees of unbounded degree. ∗ A preliminary version of some of the results in this paper appeared in [17].
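
The definition of a convex total coloring can be checked directly. The sketch below relies on a standard fact about trees: a vertex set of size k induces a connected subtree exactly when k - 1 tree edges have both endpoints in the set. The function name and the input layout (an edge list plus a dict from vertex to color) are assumptions made for illustration; partial colorings are not handled.

```python
from collections import Counter

def is_convex(edges, color):
    """edges: list of (u, v) pairs forming a tree.
    color: dict mapping every vertex to its color (total coloring)."""
    size = Counter(color.values())   # number of vertices per color
    internal = Counter()             # edges whose endpoints share a color
    for u, v in edges:
        if color[u] == color[v]:
            internal[color[u]] += 1
    # A color class is connected iff it spans size - 1 monochromatic edges.
    return all(internal[c] == size[c] - 1 for c in size)

path = [(1, 2), (2, 3), (3, 4)]
print(is_convex(path, {1: "a", 2: "a", 3: "b", 4: "b"}))
print(is_convex(path, {1: "a", 2: "b", 3: "a", 4: "b"}))
```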
Efficient approximation of convex recolorings
In Proceedings APPROX 2005: 8th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems, published with Proceedings RANDOM 2005, 2005
Cited by 8 (1 self)
A coloring of a tree is convex if the vertices that pertain to any color induce a connected subtree; a partial coloring (which assigns colors to some of the vertices) is convex if it can be completed to a convex (total) coloring. Convex coloring of trees arises in areas such as phylogenetics and linguistics; e.g., a perfect phylogenetic tree is one in which the states of each character induce a convex coloring of the tree. Research on perfect phylogeny is usually focused on finding a tree so that few predetermined partial colorings of its vertices are convex. When a coloring of a tree is not convex, it is desirable to know "how far" it is from a convex one. In [18], a natural measure for this distance, called the recoloring distance, was defined: the minimal number of color changes at the vertices needed to make the coloring convex. This can be viewed as minimizing the number of "exceptional vertices" w.r.t. a closest convex coloring. The problem was proved to be NP-hard even for colored strings. In this paper we continue the work of [18], and present a 2-approximation algorithm for convex recoloring of strings whose running time is $O(cn)$, where c is the number of colors and n is the size of the input, and an $O(cn^2)$ 3-approximation algorithm for convex recoloring of trees. ∗ A preliminary version of the results in this paper appeared in [19].
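
For very short strings, the exact recoloring distance can be found by exhaustive search, which gives a baseline for judging an approximation. The sketch below is not the paper's algorithm: it is an exponential-time brute force, it searches only over colors already present in the input (an assumption of this sketch), and the function names are hypothetical.

```python
from itertools import product

def is_convex_string(s):
    """True if every color occupies one contiguous block of s."""
    seen = set()
    prev = None
    for c in s:
        if c != prev:
            if c in seen:      # the color reappears after a gap
                return False
            seen.add(c)
            prev = c
    return True

def recoloring_distance(s):
    """Minimum number of positions to change so that s becomes convex."""
    colors = sorted(set(s))
    best = len(s)
    for candidate in product(colors, repeat=len(s)):
        if is_convex_string(candidate):
            changes = sum(a != b for a, b in zip(s, candidate))
            best = min(best, changes)
    return best

print(recoloring_distance("abab"))
print(recoloring_distance("abcab"))
```

By definition, a 2-approximation is allowed to return a convex recoloring that changes at most twice as many positions as this optimum.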
Improved approximation algorithm for convex recoloring of trees
In Proceedings Third Workshop on Approximation and Online Algorithms WAOA 2005, 2005
Cited by 7 (0 self)
A pair (T, C) of a tree T and a coloring C is called a colored tree. Given a colored tree (T, C), any coloring C' of T is called a recoloring of T. Given a weight function on the vertices of the tree, the recoloring distance of a recoloring is the total weight of recolored vertices. A coloring of a tree is convex if for any two vertices u and v that are colored by the same color c, every vertex on the path from u to v is also colored by c. In the minimum convex recoloring problem we are given a colored tree and a weight function and our goal is to find a convex recoloring of minimum recoloring distance. The minimum convex recoloring problem naturally arises in the context of phylogenetic trees. Given a set of related species, the goal of phylogenetic reconstruction is to construct a tree that would best describe the evolution of this set of species. In this context a convex coloring corresponds to perfect phylogeny. Since perfect phylogeny is not always possible, the next best thing is to find a tree which is as close to convex as possible, or, in other words, a tree with minimum recoloring distance. We present a $(2 + \varepsilon)$-approximation algorithm for the minimum convex recoloring problem, whose running time is $O(n^2 + n(1/\varepsilon)^2 4^{1/\varepsilon})$. This result improves the previously known 3-approximation algorithm for this NP-hard problem. We also present an algorithm for computing
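
The weighted recoloring distance defined in this abstract is a plain sum and can be written out in a few lines. The sketch assumes both colorings and the weight function are given as dicts keyed by vertex; the function name and that layout are illustrative choices, not taken from the paper.

```python
def weighted_recoloring_distance(original, recolored, weight):
    """Total weight of the vertices whose color differs between the
    original coloring and the recoloring (all dicts keyed by vertex)."""
    return sum(weight[v] for v in original if original[v] != recolored[v])

C = {1: "a", 2: "b", 3: "a"}
C_prime = {1: "a", 2: "a", 3: "a"}   # only vertex 2 is recolored
w = {1: 5, 2: 2, 3: 1}
print(weighted_recoloring_distance(C, C_prime, w))
```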
Approximation Algorithms for the Fixed-Topology Phylogenetic Number Problem
1999
Cited by 1 (1 self)
In the ℓ-phylogeny problem, one wishes to construct an evolutionary tree for a set of species represented by characters, in which each state of each character induces no more than ℓ connected components. We consider the fixed-topology version of this problem for fixed-topologies of arbitrary degree. This version of the problem is known to be NP-complete for ℓ ≥ 3 even for degree-3 trees in which no state labels more than ℓ + 1 leaves (and therefore there is a trivial (ℓ + 1)-phylogeny). We give a 2-approximation algorithm for all ℓ ≥ 3 for arbitrary input topologies and we give an optimal approximation algorithm that constructs a 4-phylogeny when a 3-phylogeny exists. Dynamic programming techniques, which are typically used in fixed-topology problems, cannot be applied to ℓ-phylogeny problems. Our 2-approximation algorithm is the first application of linear programming to approximation algorithms for phylogeny problems. We extend our results to a related problem in which characters are...
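
The ℓ-phylogeny condition can be evaluated for a tree whose nodes are all already labeled. The sketch below handles a single character and counts, with a small union-find, how many connected components each state induces. It does not solve the fixed-topology problem itself, where internal labels have to be chosen; the function names and input layout are assumptions made for illustration.

```python
from collections import defaultdict

def phylogenetic_number(edges, state):
    """Largest number of connected components induced by any one state.
    edges: list of (u, v) pairs forming a tree.
    state: dict mapping every node to its state for one character."""
    parent = {v: v for v in state}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    # Merge the endpoints of every edge whose two nodes share a state.
    for u, v in edges:
        if state[u] == state[v]:
            parent[find(u)] = find(v)

    components = defaultdict(set)
    for v in state:
        components[state[v]].add(find(v))
    return max((len(roots) for roots in components.values()), default=0)

def is_l_phylogeny(edges, state, l):
    return phylogenetic_number(edges, state) <= l

tree = [(1, 2), (2, 3), (3, 4), (4, 5)]
labels = {1: "a", 2: "b", 3: "a", 4: "b", 5: "a"}
print(phylogenetic_number(tree, labels))
```

With ℓ = 1 this reduces to the convexity condition used for perfect phylogeny.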
Learning and Approximation Algorithms for problems motivated by Evolutionary Trees
1999
Cited by 1 (0 self)
Chapter 1 Introduction
1.1 Overview
1.2 Biological Background
1.2.1 Data
1.2.2 Models and Methods
1.3 Learning in the General Markov Model
1.3.1 The Model
1.3.2 Learning Problems for Evolutionary Trees
1.4 Layout of the thesis
Chapter 2 Learning Two-State Markov Evolutionary Trees
2.1 Previous research
2.1.1 The General Idea
2.1.2 Previous work on learning the distribution
2.1.3 Previous work on finding the topology
2.1.4 Re...

