Results 1 - 5 of 5
Combinatorial algorithms for DNA sequence assembly
Algorithmica, 1993
Abstract

Cited by 45 (3 self)
The trend towards very large DNA sequencing projects, such as those being undertaken as part of the human genome initiative, necessitates the development of efficient and precise algorithms for assembling a long DNA sequence from the fragments obtained by shotgun sequencing or other methods. The sequence reconstruction problem that we take as our formulation of DNA sequence assembly is a variation of the shortest common superstring problem, complicated by the presence of sequencing errors and reverse complements of fragments. Since the simpler superstring problem is NP-hard, any efficient reconstruction procedure must resort to heuristics. In this paper, however, a four-phase approach based on rigorous design criteria is presented, and it has been found to be very accurate in practice. Our method is robust in the sense that it can accommodate high sequencing error rates and list a series of alternate solutions in the event that several appear equally good. Moreover, it uses a limited form ...
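To make the superstring formulation concrete, here is a minimal sketch of the textbook greedy merge heuristic, extended to try both orientations of each fragment. It is not the four-phase method of the paper: it assumes error-free reads and exact overlaps, and the names `reverse_complement`, `overlap`, and `greedy_assemble` are illustrative.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments):
    """Repeatedly merge the pair of fragments with the largest exact overlap.

    Both orientations of each fragment are tried, so a contig is only
    determined up to reverse complement. Assumes error-free fragments.
    """
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_k, best_pair, best_merged = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i == j:
                    continue
                for aa in (a, reverse_complement(a)):
                    for bb in (b, reverse_complement(b)):
                        k = overlap(aa, bb)
                        if k > best_k:
                            best_k, best_pair = k, (i, j)
                            best_merged = aa + bb[k:]
        if best_pair is None:  # no overlaps left: stop with several contigs
            break
        frags = [f for idx, f in enumerate(frags) if idx not in best_pair]
        frags.append(best_merged)
    return frags


target = "ATGCGTTACGGA"
reads = ["ATGCGT", "CGTTAC", reverse_complement("TACGGA")]
contigs = greedy_assemble(reads)
print(contigs)
```

Because the third read comes from the opposite strand, the single contig may be either the target or its reverse complement. Handling sequencing errors would require approximate overlap scoring in place of `overlap`.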
Discriminative Learning and Spanning Tree Algorithms for Dependency Parsing
2006
Abstract

Cited by 12 (0 self)
In this thesis we develop a discriminative learning method for dependency parsing using online large-margin training combined with spanning tree inference algorithms. We will show that this method provides state-of-the-art accuracy, is extensible through the feature set, and can be implemented efficiently. Furthermore, we demonstrate the language-independent nature of the method by evaluating it on over a dozen diverse languages, and we show its practical applicability through integration into a sentence compression system.
We start by presenting an online large-margin learning framework that is a generalization of the work of Crammer and Singer [34, 37] to structured outputs, such as sequences and parse trees. This will lead to the heart of this thesis: discriminative dependency parsing. Here we will formulate dependency parsing in a spanning tree framework, yielding efficient parsing algorithms for both projective and non-projective tree structures. We will then extend the parsing algorithm to incorporate features over larger substructures without an increase in computational complexity for the projective case. Unfortunately, the non-projective problem then becomes NP-hard, so we provide structurally motivated approximate algorithms. Having defined a set of parsing algorithms, we will also define a rich feature set and train various parsers using the online large-margin learning framework. We then compare our trained dependency parsers to other state-of-the-art parsers on 14 diverse languages: Arabic, Bulgarian, Chinese, Czech, Danish, Dutch, English, German, Japanese, Portuguese, Slovene, Spanish, Swedish and Turkish.
Having built an efficient and accurate discriminative dependency parser, this thesis will then turn to improving and applying the parser. First we will show how additional resources can provide useful features to increase parsing accuracy and to adapt parsers to new domains. We will also argue that the robustness of discriminative inference-based learning algorithms lends itself well to dependency parsing when feature representations or structural constraints do not allow for tractable parsing algorithms. Finally, we integrate our parsing models into a state-of-the-art sentence compression system to show its applicability to a real-world problem.
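The spanning tree view of non-projective parsing can be illustrated with a small decoder. The sketch below uses the Chu-Liu/Edmonds contraction scheme, a standard way to find a maximum-scoring directed spanning tree; the abstract does not name the algorithm, so treat that choice as an assumption. The arc scores are made-up numbers standing in for the output of a trained model, and `find_cycle` and `max_arborescence` are illustrative names.

```python
def find_cycle(head):
    """Return the nodes of one cycle in a {dependent: head} map, or None."""
    for start in head:
        seen = set()
        v = start
        while v in head and v not in seen:
            seen.add(v)
            v = head[v]
        if v in seen:  # walked back onto the path: v lies on a cycle
            cycle, u = [v], head[v]
            while u != v:
                cycle.append(u)
                u = head[u]
            return cycle
    return None


def max_arborescence(nodes, scores, root):
    """Maximum-scoring spanning arborescence as a {dependent: head} map.

    scores maps (head, dependent) arcs to numbers. This is the simple
    recursive contraction version, not an optimized implementation.
    """
    # Step 1: every non-root node greedily takes its best incoming arc.
    head = {}
    for d in nodes:
        if d == root:
            continue
        cands = [(s, h) for (h, dd), s in scores.items() if dd == d and h != d]
        if not cands:
            raise ValueError(f"node {d!r} has no incoming arc")
        head[d] = max(cands, key=lambda t: t[0])[1]
    cycle = find_cycle(head)
    if cycle is None:
        return head
    # Step 2: contract the cycle into a fresh node c and rescore arcs.
    cyc, c = set(cycle), ("cycle", tuple(cycle))
    new_scores, origin = {}, {}
    for (h, d), s in scores.items():
        if h in cyc and d in cyc:
            continue
        if d in cyc:    # entering arc: pay for the cycle arc it replaces
            key, adj = (h, c), s - scores[(head[d], d)]
        elif h in cyc:  # leaving arc
            key, adj = (c, d), s
        else:
            key, adj = (h, d), s
        if key not in new_scores or adj > new_scores[key]:
            new_scores[key], origin[key] = adj, (h, d)
    new_nodes = [v for v in nodes if v not in cyc] + [c]
    sub = max_arborescence(new_nodes, new_scores, root)
    # Step 3: expand c; the entering arc breaks the cycle at one node.
    result = {}
    for d, h in sub.items():
        orig_head, orig_dep = origin[(h, d)]
        result[orig_dep] = orig_head
    for v in cycle:
        result.setdefault(v, head[v])
    return result


# 0 = artificial root, 1 = "John", 2 = "saw", 3 = "Mary" (illustrative scores)
arc_scores = {
    (0, 1): 9, (0, 2): 10, (0, 3): 9,
    (1, 2): 20, (1, 3): 3,
    (2, 1): 30, (2, 3): 30,
    (3, 1): 11, (3, 2): 0,
}
tree = max_arborescence([0, 1, 2, 3], arc_scores, root=0)
print(tree)
```

In this example the greedy step first creates the cycle John ↔ saw; contraction then selects root → saw, giving a tree in which "saw" heads both "John" and "Mary".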
A Global Structural EM Algorithm for a Model of Cancer Progression
Abstract
Cancer has complex patterns of progression that include converging as well as diverging progressional pathways. Vogelstein’s path model of colon cancer was a pioneering contribution to cancer research. Since then, several attempts have been made at obtaining mathematical models of cancer progression, devising learning algorithms, and applying these to cross-sectional data. Beerenwinkel et al. provided what they termed EM-like algorithms for Oncogenetic Trees (OTs) and mixtures of such trees. Given the small size of current and future data sets, it is important to minimize the number of parameters of a model. For this reason, we too focus on tree-based models and introduce Hidden-variable Oncogenetic Trees (HOTs). In contrast to OTs, HOTs allow for errors in the data and thereby provide more realistic modeling. We also design global structural EM algorithms for learning HOTs and mixtures of HOTs (HOT-mixtures). The algorithms are global in the sense that, during the M-step, they find a structure that yields a global maximum of the expected complete log-likelihood rather than merely one that improves it. The algorithm for single HOTs performs very well on reasonable-sized data sets, while that for HOT-mixtures requires data sets of sizes obtainable only with tomorrow’s more cost-efficient technologies.
Lecture notes on “Analysis of Algorithms”: Directed Minimum Spanning Trees (More complete but still unfinished)
2013
Abstract
We describe an efficient implementation of Edmonds' algorithm for finding minimum directed spanning trees in directed graphs.
1 Minimum Directed Spanning Trees
Let G = (V, E, w) be a weighted directed graph, where w: E → R is a cost (or weight) function defined on its edges. Let r ∈ V. A directed spanning tree (DST) of G rooted at r is a subgraph T of G such that the undirected version of T is a tree and T contains a directed path from r to any other vertex in V. The cost w(T) of a directed spanning tree T is the sum of the costs of its edges, i.e., w(T) = ∑_{e ∈ T} w(e). A minimum directed spanning tree (MDST) rooted at r is a directed spanning tree rooted at r of minimum cost. A directed graph contains a directed spanning tree rooted at r if and only if all vertices in G are reachable from r. This condition can be easily tested in linear time. The proof of the following lemma is trivial and is left as an exercise.
Lemma 1.1 The following conditions are equivalent: (i) T is a directed spanning tree of G rooted at r. (ii) The indegree of r in T is 0, the indegree of every other vertex of G in T is 1, and T is acyclic, i.e., contains no directed cycles. (iii) The indegree of r in T is 0, the indegree of every other vertex of G in T is 1, and there are directed paths in T from r to all other vertices.
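The definitions above translate directly into a few short checks. The following sketch is not taken from the lecture notes: it assumes vertices are numbered 0..n-1 and edges are (tail, head) pairs, and the function names are illustrative. It tests reachability by breadth-first search, verifies condition (iii) of Lemma 1.1, and evaluates w(T).

```python
from collections import deque


def reachable_from(n, edges, r):
    """Set of vertices reachable from r, found by BFS in O(|V| + |E|) time."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
    seen = {r}
    queue = deque([r])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def has_dst(n, edges, r):
    """G contains a DST rooted at r iff every vertex is reachable from r."""
    return len(reachable_from(n, edges, r)) == n


def is_dst(n, tree_edges, r):
    """Check condition (iii) of Lemma 1.1 for the edge set tree_edges."""
    indeg = [0] * n
    for _, v in tree_edges:
        indeg[v] += 1
    if indeg[r] != 0:
        return False
    if any(indeg[v] != 1 for v in range(n) if v != r):
        return False
    return len(reachable_from(n, tree_edges, r)) == n


def tree_cost(tree_edges, w):
    """w(T): the sum of the edge costs, with w given as a dict over edges."""
    return sum(w[e] for e in tree_edges)


good = [(0, 1), (1, 2), (1, 3)]
bad = [(0, 1), (2, 3), (3, 2)]  # indegrees are right, but 2 and 3 form a cycle
print(is_dst(4, good, 0), is_dst(4, bad, 0))
```

The `bad` example shows why the indegree conditions alone are not enough: they must be paired with either acyclicity, as in (ii), or reachability from r, as in (iii).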
K-Best Spanning Tree Dependency Parsing With Verb Valency Lexicon Reranking
Abstract
A novel method for hybrid graph-based dependency parsing of natural language text is proposed. It is based on k-best maximum spanning tree dependency parsing and evaluation of the spanning trees by using a verb valency lexicon for a given language as a reranking knowledge base. The approach is compared with existing state-of-the-art transition-based and graph-based approaches to dependency parsing. As the proposed generic method was developed specifically for improving the accuracy of Croatian dependency parsing, the Croatian Dependency Treebank and the CROVALLEX verb valency lexicon are used in the experiment. The suggested approach scored approximately 77.21% LAS, outperforming the tested state-of-the-art approaches by at least 2.68% LAS.
Title and abstract in Croatian (translated): Ovisnosno parsanje pomoću k najboljih razapinjućih stabala i ponovnoga vrjednovanja valencijskim rječnikom glagola (Dependency parsing using the k best spanning trees and reranking with a verb valency lexicon). A new approach to hybrid dependency parsing of natural language text, based on graph theory, is proposed. The approach is based on dependency parsing using the k best spanning