## Analyzing and Visualizing Sequence and Distance Data Using SplitsTree (1996)

Venue: | Discrete Appl. Math |

Citations: | 18 - 2 self |

### BibTeX

@ARTICLE{Dress96analyzingand,

author = {A. Dress and D. Huson and V. Moulton},

title = {Analyzing and Visualizing Sequence and Distance Data Using SplitsTree},

journal = {Discrete Appl. Math},

year = {1996},

volume = {71},

pages = {95--109}

}

### Abstract

In this paper, we describe and illustrate a tool for analyzing and visualizing sequence and distance data, called the splits-graph. The construction of this graph is based upon the split-decomposition technique, which is a procedure to decompose a given metric defined on a finite set in a canonical way into a sum of simpler metrics. In a way, this technique is comparable to Fourier analysis, which also decomposes a given object under consideration (that is, a periodic signal) into a sum of simpler such objects, in a canonical way. The splits-graph and the theory behind it have been developed mainly in Bielefeld over the last 5 years. The procedure for producing splits-graphs has been implemented in the SplitsTree program, which we also describe and which is available from the authors.

1 Introduction

One of the main problems in phylogenetic analysis is to find a good method for analyzing and visualizing a phylogenetic distance data set, in order to better understand the phylogenetic ...
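
The abstract's idea of writing a metric as a sum of simpler metrics can be illustrated with a small sketch. It assumes that the "simpler metrics" are split metrics, which assign distance 1 to two taxa separated by a split and 0 otherwise. The sketch only shows the reverse direction (building a distance table from weighted splits); it is not the split-decomposition procedure itself. The function names, taxa, and weights are hypothetical.

```python
from itertools import combinations


def split_metric(side_a, x, y):
    """Return 1 if the split with one side `side_a` separates x and y, else 0."""
    return int((x in side_a) != (y in side_a))


def sum_of_split_metrics(taxa, weighted_splits):
    """Distance table obtained by adding weight * split metric over all splits.

    `weighted_splits` is a list of (side_a, weight) pairs; the other side of
    each split is implicitly the complement of `side_a` in `taxa`.
    """
    return {
        (x, y): sum(w * split_metric(a, x, y) for a, w in weighted_splits)
        for x, y in combinations(taxa, 2)
    }


# Hypothetical example: four taxa, four trivial splits and one inner split.
taxa = ["a", "b", "c", "d"]
weighted_splits = [
    (frozenset({"a"}), 1.0),
    (frozenset({"b"}), 2.0),
    (frozenset({"c"}), 1.5),
    (frozenset({"d"}), 0.5),
    (frozenset({"a", "b"}), 3.0),
]
distances = sum_of_split_metrics(taxa, weighted_splits)
for pair, value in sorted(distances.items()):
    print(pair, value)
```

Because the five splits in this example are pairwise compatible, the resulting distances are those of a tree with an inner edge of length 3.0 separating {a, b} from {c, d}.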

### Citations

469 | Evolution of protein molecules - Jukes, Cantor - 1969 |

166 | The general stochastic model of nucleotide substitution - Rodrı́guez, Oliver, et al. - 1990 |

105 |
Reconstructing the shape of a tree from observed dissimilarity data
- Bandelt, Dress
- 1986
Citation Context ... the resulting system of splits may not fit into a tree since we may encounter pairs of incompatible splits, i.e. pairs of splits A, B and A', B' with U ∩ V ≠ ∅ for all U ∈ {A, B} and V ∈ {A', B'} (see [4] for a more detailed discussion of this concept). However, as we have indicated in Section 1, the resulting system can be represented by an associated, canonically defined network, which we call the spl... |

102 | A canonical decomposition theory for metrics on a finite set - Bandelt, Dress - 1992 |

80 | Estimation of evolutionary distances between homologous nucleotide sequences - Kimura - 1981 |

74 |
Trees, tight extensions of metric spaces, and the cohomological dimension of certain groups: A note on combinatorial properties of metric spaces
- Dress
- 1984
Citation Context ...1 This decomposition can be characterised abstractly by certain structural requirements relating to concepts from category theory, applied to the category of metric spaces and non-expanding maps, cf. [11, 13, 18]. In general, to measure the effectiveness of the split decomposition procedure, the splittability index, $100 \cdot \left( \sum_{\text{taxa } i,j} d^{1}_{ij} \Big/ \sum_{\text{taxa } i,j} d_{ij} \right)$, was introduced, which can be viewed as an indicatio... |

71 | Split decomposition: a new and useful approach to phylogenetic analysis of distance data. Molecular Phylogenetics and Evolution 1 - Bandelt, Dress - 1992 |

58 | Reconstructing evolutionary trees from DNA and protein sequences: paralinear distances - Lake - 1994 |

29 |
Spectral analysis of phylogenetic data
- Hendy, Penny
- 1993
Citation Context ...leading to trees only if the data set unambiguously supports a unique tree. These less restrictive methods include the spectral analysis of phylogenetic data sets, introduced by M. Hendy and D. Penny [16], the analysis of weak hierarchies associated with distance data sets [5, 8], and the split decomposition method which we describe in this paper. In general, it is impossible to reconstruct unambiguou... |

25 |
Weak hierarchies associated with similarity measures: an additive clustering technique, Bull
- Bandelt, Dress
- 1989
Citation Context ...ships is often called cluster theory. More precisely, cluster theory aims at structuring a set X by specifying a system C(X) of subsets of X, called clusters, subject to the following conditions (see [5]): The clusters should collect similar objects, that is, objects in a given cluster C ∈ C(X) should somehow be more similar to each other than to objects outside C. The clustering procedure should be re... |

21 |
Six theorems about metric spaces
- Isbell
- 1964
Citation Context ...1 This decomposition can be characterised abstractly by certain structural requirements relating to concepts from category theory, applied to the category of metric spaces and non-expanding maps, cf. [11, 13, 18]. In general, to measure the effectiveness of the split decomposition procedure, the splittability index, $100 \cdot \left( \sum_{\text{taxa } i,j} d^{1}_{ij} \Big/ \sum_{\text{taxa } i,j} d_{ij} \right)$, was introduced, which can be viewed as an indicatio... |

21 | Asynchronous distance between homologous DNA sequences - Barry, Hartigan - 1987 |

17 |
Recovery of a tree from the leaf coloration it generates under a Markov model
- Steel
- 1994
Citation Context ...sponding distance matrix, using one of the following transformations specified by the user: Hamming distances, Kimura 3ST, Jukes-Cantor, or the LogDet transformation, recently introduced by Mike Steel [23]. Moreover, you can specify which of the sequences in the input file should be used, the range of sites (or "positions"), and whether to consider gap sites, non-parsimony sites, or constant sites in the... |

14 |
Fourier calculus on evolutionary trees
- Székely, Steel, et al.
- 1993
Citation Context ...s is useful for determining which sites support the indicated splits. Additionally, the program contains two other methods for computing splits from a set of given sequences, namely spectral analysis [16, 24] followed by a greedy selection of a weakly compatible system of splits, and the calculation of p-splits [9]. Bootstrapping can also be performed on all calculations. 4 Examples of Applications of Spli... |

13 |
A relational approach to split decomposition
- Bandelt, Dress
- 1993
Citation Context ...er methods for computing splits from a set of given sequences, namely spectral analysis [16, 24] followed by a greedy selection of a weakly compatible system of splits, and the calculation of p-splits [9]. Bootstrapping can also be performed on all calculations. 4 Examples of Applications of SplitsTree Split decomposition has been applied successfully to numerous data sets mostly from biology and pysc... |

10 |
The marsupial mitochondrial genome and the evolution of placental mammals
- Janke, Feldmaier-Fuchs, et al.
- 1994
Citation Context ...raph depicted in Fig. 1. This graph is a visualisation of the phylogenetic distance data set obtained by analyzing mitochondrial DNA from the taxa whale, mouse, seal, rat, man, opossum, and cow (see [19] and [26] for more details). It is built up of parallelograms (sometimes also, more generally, from zonotopes, that is, center-symmetric polygons) and individual edges. Consequently, the geometric str... |

8 |
Zur Visualisierung abstrakter Ähnlichkeitsbeziehungen
- Wetzel
- 1995
Citation Context ...ly supported by the data set. It is the purpose of this paper to discuss one such technique, the above-mentioned split decomposition method, that has been fully implemented in the SplitsTree program [17, 26] and has proved useful in many different contexts. We explain some of the theory behind split decomposition in the next section and describe briefly our implementation in Section 3. Finally, we illustrat... |

7 | Combination of data in phylogenetic analysis - Bandelt - 1995 |

6 |
A canonical decomposition theory for metrics on a finite set
- Bandelt, Dress
- 1992
Citation Context ...the last 5 years in Bielefeld and which is based on the split-decomposition method, a method for decomposing metrics canonically into a sum of simpler metrics, developed jointly with H.-J. Bandelt in [7]. The mathematical field devoted to structuring and/or visualizing data sets according to pregiven (or readily deduced) similarity relationships is often called cluster theory. More precisely, cluster t... |

5 |
Split Decomposition: a New Technique to Analyse Viral Evolution, PNAS
- Dopazo, Dress, et al.
- 1993
Citation Context ...sTree Split decomposition has been applied successfully to numerous data sets mostly from biology and psychology. For example, it has been applied to the evolution of the foot and mouth disease virus [10], genetic relationships in human populations [2], and distinguishing fish populations [3]. Here, we give three brief examples, two from biology, and one from psychology, in order to illustrate the appli... |

3 | An order theoretic framework for overlapping clustering - Bandelt, Dress - 1994 |

3 |
Opinion: Nucleic acid sequence data are not per se reliable for inference of phylogenies
- Wagele, Wetzel
- 1993
Citation Context ... populations [3]. Here, we give three brief examples, two from biology, and one from psychology, in order to illustrate the application of SplitsTree. For further and more detailed examples, see also [2, 3, 6, 9, 25, 26, 21]. The first example, depicted in Fig. 7, is the splits-graph obtained from the 23S ribosomal RNA sequences of 6 archaebacteria, 6 eubacteria (including 2 chloroplasts), and 4 eukaryotes, studied by H. Le... |

2 |
Reticulate diagrams displaying genetic relationships between human populations, to appear in Annals of Human Biology
- Bandelt
Citation Context ...sfully to numerous data sets mostly from biology and psychology. For example, it has been applied to the evolution of the foot and mouth disease virus [10], genetic relationships in human populations [2], and distinguishing fish populations [3]. Here, we give three brief examples, two from biology, and one from psychology, in order to illustrate the application of SplitsTree. For further and more detai... |

2 |
Phylogenetic networks, Verhandl. Naturwiss. Vereins Hamburg
- Bandelt
- 1994
Citation Context ...m biology and psychology. For example, it has been applied to the evolution of the foot and mouth disease virus [10], genetic relationships in human populations [2], and distinguishing fish populations [3]. Here, we give three brief examples, two from biology, and one from psychology, in order to illustrate the application of SplitsTree. For further and more detailed examples, see also [2, 3, 6, 9, 25,... |

2 | Some mathematical problems arising in molecular bioinformatics, to appear in - Dress |

2 |
Multidimensional ratio scaling analysis of perceived color relations
- Helm
- 1964
Citation Context ...ld be noted that in this example the data set is again quite tree-like, which is reflected in the nature of the splits-graph. The final example comes from a data set obtained in cognitive psychology [15], see also [26]. In Helm's experiment, 10 people with normal eyesight and 4 color-blind people were each asked to rank the similarity of 10 colors. The experiment went as follows: For any three colors... |

1 |
A new and useful approach to phylogenetic analysis of distance data
- Bandelt, Dress
- 1992
Citation Context ... populations [3]. Here, we give three brief examples, two from biology, and one from psychology, in order to illustrate the application of SplitsTree. For further and more detailed examples, see also [2, 3, 6, 9, 25, 26, 21]. The first example, depicted in Fig. 7, is the splits-graph obtained from the 23S ribosomal RNA sequences of 6 archaebacteria, 6 eubacteria (including 2 chloroplasts), and 4 eukaryotes, studied by H. Le... |

1 |
The human organism – a place to thrive for the immuno-deficiency virus
- Dress, Wetzel
- 1993
Citation Context ...one which separate more than two taxa from the rest (thus producing an almost bush-like structure). The second example is an application to a data set arising from the AIDS virus (for more details see [14]). The splits-graph in Fig. 8 clearly shows the evolutionary history of the AIDS virus. While it seemingly co-evolved with the immune system of apes and monkeys, adapting to the evolutionary pressu... |

1 |
SplitsTree2 - A tool for analyzing and visualizing evolutionary data, Available from: ftp://ftp.uni-bielefeld.de/pub/math/splits
- Huson
- 1996
Citation Context ...ly supported by the data set. It is the purpose of this paper to discuss one such technique, the above-mentioned split decomposition method, that has been fully implemented in the SplitsTree program [17, 26] and has proved useful in many different contexts. We explain some of the theory behind split decomposition in the next section and describe briefly our implementation in Section 3. Finally, we illustrat... |

1 |
Leffers et al.
- 1987
Citation Context ...mple, depicted in Fig. 7, is the splits-graph obtained from the 23S ribosomal RNA sequences of 6 archaebacteria, 6 eubacteria (including 2 chloroplasts), and 4 eukaryotes, studied by H. Leffers et al. [20]. Biological data sets typically give rise to slightly more splits than can be fitted into a tree. This example illustrates that a large portion of these fit together on a tree. In addition, the split-pr... |

1 |
Are systematic biases in plastid sequences phylogenetically informative
- Lockhart, Barbrook, et al.
- 1996
Citation Context ... populations [3]. Here, we give three brief examples, two from biology, and one from psychology, in order to illustrate the application of SplitsTree. For further and more detailed examples, see also [2, 3, 6, 9, 25, 26, 21]. The first example, depicted in Fig. 7, is the splits-graph obtained from the 23S ribosomal RNA sequences of 6 archaebacteria, 6 eubacteria (including 2 chloroplasts), and 4 eukaryotes, studied by H. Le... |

1 |
NEXUS: an extendible file format for systematic information
- Maddison, Swofford, et al.
- 1995
Citation Context ...o open a file containing either a number of aligned sequences, a distance matrix describing the distances between some given taxa, or a system of splits. The application is based on the NEXUS file format [22]. Upon opening a file of sequences, the application first computes the corresponding distance matrix, using one of the following transformations specified by the user: Hamming distances, Kimura 3ST, Jukes-... |

1 | T-Theory - An Overview, Preprint - Dress, Moulton, et al. - 1995 |

1 | Recovering the correct tree under a more realistic model of evolution, in press: Molec - Lockhart, Steel, et al. - 1994 |
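
Two notions quoted in the citation contexts above lend themselves to a short sketch: the incompatibility of two splits (all four intersections U ∩ V nonempty) and the splittability index. The index is read here as the percentage of the total distance accounted for by a split-decomposable part `d1` of the metric `d`; this reading of the notation is an assumption, since `d1` is not defined in the excerpts. The function names and example data are hypothetical.

```python
def are_compatible(split_1, split_2):
    """Return True unless all four pairwise intersections are nonempty.

    Each split is a pair (A, B) of frozensets partitioning the taxa.
    Two splits are incompatible exactly when U & V is nonempty for
    every U in split_1 and every V in split_2.
    """
    return any(not (u & v) for u in split_1 for v in split_2)


def splittability_index(d, d1):
    """Percentage of the total distance carried by d1 (assumed interpretation).

    `d` and `d1` map pairs of taxa to distances; `d1` is taken to be the
    split-decomposable part of `d`.
    """
    total = sum(d.values())
    if total == 0:
        raise ValueError("total distance must be positive")
    return 100.0 * sum(d1[pair] for pair in d) / total


# Hypothetical splits on the taxa a, b, c, d.
s1 = (frozenset("ab"), frozenset("cd"))
s2 = (frozenset("ac"), frozenset("bd"))
s3 = (frozenset("a"), frozenset("bcd"))
print(are_compatible(s1, s2))  # False: all four intersections are nonempty
print(are_compatible(s1, s3))  # True: {c, d} and {a} do not intersect

# Hypothetical distances and their split-decomposable part.
d = {("a", "b"): 4.0, ("a", "c"): 6.0, ("b", "c"): 10.0}
d1 = {("a", "b"): 3.0, ("a", "c"): 6.0, ("b", "c"): 6.0}
print(splittability_index(d, d1))  # 75.0
```

A system of pairwise compatible splits fits on a tree; incompatible pairs such as `s1` and `s2` are what force the network representation described in the contexts.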