Non-projective dependency parsing using spanning tree algorithms
In Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, 2005
Cited by 377 (10 self)
We formalize weighted dependency parsing as searching for maximum spanning trees (MSTs) in directed graphs. Using this representation, the parsing algorithm of Eisner (1996) is sufficient for searching over all projective trees in O(n^3) time. More surprisingly, the representation is extended naturally to non-projective parsing using Chu-Liu-Edmonds (Chu and Liu, 1965; Edmonds, 1967) MST algorithm, yielding an O(n^2) parsing algorithm. We evaluate these methods on the Prague Dependency Treebank using online large-margin learning techniques (Crammer et al., 2003; McDonald et al., 2005) and show that MST parsing increases efficiency and accuracy for languages with non-projective dependencies.
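As a rough illustration of the idea in this abstract, the sketch below finds the highest-scoring dependency tree with a plain recursive version of the Chu-Liu-Edmonds procedure: pick the best incoming edge for every word, and if that creates a cycle, contract the cycle into one node and recurse. It is not the authors' implementation and does not reach the O(n^2) bound quoted above. The function names, the edge-score dictionary, the integer node ids, and the toy scores are all assumptions made for this example.

```python
def find_cycle(head):
    """Return the nodes of one cycle in the dependent -> head map, or None."""
    for start in head:
        path, seen, node = [], set(), start
        while node in head and node not in seen:
            seen.add(node)
            path.append(node)
            node = head[node]
        if node in seen:  # walked back onto the current path: a cycle
            return path[path.index(node):]
    return None


def max_arborescence(scores, root=0):
    """Maximum spanning arborescence of a directed graph.

    scores maps (head, dependent) pairs of integer node ids to weights.
    Assumes every non-root node has at least one incoming edge.
    Returns a dict mapping each dependent to its chosen head.
    """
    nodes = {n for edge in scores for n in edge}
    # Step 1: greedily keep the best incoming edge of every non-root node.
    head = {}
    for d in nodes:
        if d == root:
            continue
        cands = [(w, h) for (h, dd), w in scores.items() if dd == d and h != d]
        head[d] = max(cands)[1]
    cycle = find_cycle(head)
    if cycle is None:
        return head
    # Step 2: contract the cycle into a fresh node c and rescore edges.
    cyc = set(cycle)
    c = max(nodes) + 1
    new_scores, origin = {}, {}
    for (h, d), w in scores.items():
        if h in cyc and d in cyc:
            continue
        if d in cyc:    # entering the cycle: gain from replacing d's cycle edge
            key, val = (h, c), w - scores[head[d], d]
        elif h in cyc:  # leaving the cycle
            key, val = (c, d), w
        else:
            key, val = (h, d), w
        if key not in new_scores or val > new_scores[key]:
            new_scores[key] = val
            origin[key] = (h, d)
    sub = max_arborescence(new_scores, root)
    # Step 3: expand c, keeping all cycle edges except the one that was broken.
    result = {}
    for d, h in sub.items():
        orig_head, orig_dep = origin[h, d]
        result[orig_dep] = orig_head
    for v in cycle:
        result.setdefault(v, head[v])
    return result


# Toy sentence: 0 = ROOT, 1 = John, 2 = saw, 3 = Mary (scores are made up).
example_scores = {
    (0, 1): 9, (0, 2): 10, (0, 3): 9,
    (1, 2): 20, (2, 1): 30, (2, 3): 30,
    (3, 2): 0, (1, 3): 3, (3, 1): 11,
}
tree = max_arborescence(example_scores)
print(tree)
```

In this toy graph the greedy step first picks the cycle between nodes 1 and 2; contraction then breaks it by attaching node 2 to the root.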
A Survey of Optimization by Building and Using Probabilistic Models
Computational Optimization and Applications, 1999
Cited by 338 (89 self)
This paper summarizes the research on population-based probabilistic search algorithms based on modeling promising solutions by estimating their probability distribution and using the constructed model to guide the further exploration of the search space. It situates the algorithms within the field of genetic and evolutionary computation, where they originated. All methods are classified into a few classes according to the complexity of the class of models they use. Algorithms from each of these classes are briefly described and their strengths and weaknesses are discussed.
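The loop this abstract describes (sample candidates from a model, select the promising ones, re-estimate the model) can be sketched with the simplest model class, one independent probability per bit. This is only an illustrative univariate example on the OneMax toy problem (fitness = number of ones), not code from the survey. The function name, parameters, and default values are assumptions chosen for the sketch.

```python
import random


def univariate_eda_onemax(n_bits=20, pop_size=100, n_select=50,
                          generations=30, seed=0):
    """Maximize the number of ones with a univariate probabilistic model."""
    rng = random.Random(seed)
    lo, hi = 1 / n_bits, 1 - 1 / n_bits  # keep probabilities away from 0 and 1
    probs = [0.5] * n_bits               # model: P(bit i = 1)
    best = None
    for _ in range(generations):
        # Sample a new population from the current model.
        pop = [[1 if rng.random() < p else 0 for p in probs]
               for _ in range(pop_size)]
        # Truncation selection: keep the fittest n_select candidates.
        pop.sort(key=sum, reverse=True)
        selected = pop[:n_select]
        if best is None or sum(pop[0]) > sum(best):
            best = pop[0]
        # Re-estimate each bit's probability from the selected candidates.
        for i in range(n_bits):
            freq = sum(ind[i] for ind in selected) / n_select
            probs[i] = min(max(freq, lo), hi)
    return best, probs


best, probs = univariate_eda_onemax()
print(sum(best), [round(p, 2) for p in probs])
```

Richer model classes replace the per-bit probabilities with models that capture dependencies between variables, while the sample, select, and re-estimate loop stays the same.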
CoNLL-X shared task on multilingual dependency parsing
In Proc. of CoNLL, 2006
Cited by 333 (2 self)
Each year the Conference on Computational Natural Language Learning (CoNLL) features a shared task, in which participants train and test their systems on exactly the same data sets, in order to better compare systems. The tenth CoNLL (CoNLL-X) saw a shared task on Multilingual Dependency Parsing. In this paper, we describe how treebanks for 13 languages were converted into the same dependency format and how parsing performance was measured. We also give an overview of the parsing approaches that participants took and the results that they achieved. Finally, we try to draw general conclusions about multilingual parsing: What makes a particular language, treebank or annotation scheme easier or harder to parse and which phenomena are challenging for any dependency parser? Acknowledgement: Many thanks to Amit Dubey and Yuval Krymolowski, the other two organizers of the shared task, for discussions, converting treebanks, writing software and helping with the papers.
Hierarchical Bayesian Optimization Algorithm = Bayesian Optimization Algorithm + Niching + Local Structures
2001
Cited by 327 (70 self)
The paper describes the hierarchical Bayesian optimization algorithm which combines the Bayesian optimization algorithm, local structures in Bayesian networks, and a powerful niching technique. The proposed algorithm is able to solve hierarchical traps and other difficult problems very efficiently.
Online Learning of Approximate Dependency Parsing Algorithms
In Proc. of EACL, 2006
Cited by 213 (11 self)
In this paper we extend the maximum spanning tree (MST) dependency parsing framework of McDonald et al. (2005c) to incorporate higher-order feature representations and allow dependency structures with multiple parents per word. We show that those extensions can make the MST framework computationally intractable, but that the intractability can be circumvented with new approximate parsing algorithms. We conclude with experiments showing that discriminative online learning using those approximate algorithms achieves the best reported parsing accuracy for Czech and Danish.
Simple semi-supervised dependency parsing
In Proc. ACL/HLT, 2008
Cited by 173 (9 self)
We present a simple and effective semi-supervised method for training dependency parsers. We focus on the problem of lexical representation, introducing features that incorporate word clusters derived from a large unannotated corpus. We demonstrate the effectiveness of the approach in a series of dependency parsing experiments on the Penn Treebank and Prague Dependency Treebank, and we show that the cluster-based features yield substantial gains in performance across a wide range of conditions. For example, in the case of English unlabeled second-order parsing, we improve from a baseline accuracy of 92.02% to 93.16%, and in the case of Czech unlabeled second-order parsing, we improve from a baseline accuracy of 86.13% to 87.13%. In addition, we demonstrate that our method also improves performance when small amounts of training data are available, and can roughly halve the amount of supervised data required to reach a desired level of performance.
The Primal-Dual Method for Approximation Algorithms and Its Application to Network Design Problems
Cited by 142 (5 self)
The primal-dual method is a standard tool in the design of algorithms for combinatorial optimization problems. This chapter shows how the primal-dual method can be modified to provide good approximation algorithms for a wide variety of NP-hard problems. We concentrate on results from recent research applying the primal-dual method to problems in network design.
Inferring Networks of Diffusion and Influence
2010
Cited by 112 (13 self)
Information diffusion and virus propagation are fundamental processes taking place in networks. While it is often possible to directly observe when nodes become infected, observing individual transmissions (i.e., who infects whom or who influences whom) is typically very difficult. Furthermore, in many applications, the underlying network over which the diffusions and propagations spread is actually unobserved. We tackle these challenges by developing a method for tracing paths of diffusion and influence through networks and inferring the networks over which contagions propagate. Given the times when nodes adopt pieces of information or become infected, we identify the optimal network that best explains the observed infection times. Since the optimization problem is NP-hard to solve exactly, we develop an efficient approximation algorithm that scales to large datasets and in practice gives provably near-optimal performance. We demonstrate the effectiveness of our approach by tracing information cascades in a set of 170 million blogs and news articles over a one-year period to infer how information flows through the online media space. We find that the diffusion network of news tends to have a core-periphery structure with a small set of core media sites that diffuse information to the rest of the Web. These sites tend to have stable circles of influence with more general news media sites acting as connectors between them.
Bayesian Optimization Algorithm: From Single Level to Hierarchy
2002
Cited by 101 (19 self)
There are four primary goals of this dissertation. First, design a competent optimization algorithm capable of learning and exploiting appropriate problem decomposition by sampling and evaluating candidate solutions. Second, extend the proposed algorithm to enable the use of hierarchical decomposition as opposed to decomposition on only a single level. Third, design a class of difficult hierarchical problems that can be used to test the algorithms that attempt to exploit hierarchical decomposition. Fourth, test the developed algorithms on the designed class of problems and several real-world applications. The dissertation proposes the Bayesian optimization algorithm (BOA), which uses Bayesian networks to model the promising solutions found so far and sample new candidate solutions. BOA is theoretically and empirically shown to be capable of both learning a proper decomposition of the problem and exploiting the learned decomposition to ensure robust and scalable search for the optimum across a wide range of problems. The dissertation then identifies important features that must be incorporated into the basic BOA to solve problems that are not decomposable on a single level, but that can still be solved by decomposition over multiple levels of difficulty. Hierarchical