Results 1-10 of 90
Better k-best parsing
2005
Abstract

Cited by 193 (16 self)
We discuss the relevance of k-best parsing to recent applications in natural language processing, and develop efficient algorithms for k-best trees in the framework of hypergraph parsing. To demonstrate the efficiency, scalability and accuracy of these algorithms, we present experiments on Bikel’s implementation of Collins’ lexicalized PCFG model, and on Chiang’s CFG-based decoder for hierarchical phrase-based translation. We show in particular how the improved output of our algorithms has the potential to improve results from parse reranking systems and other applications.
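One building block that k-best algorithms of this kind typically rely on is the lazy enumeration of the k cheapest combinations of two sorted candidate lists, without scoring every pair. The sketch below illustrates that idea only; it is not the paper's algorithm, and the function name `k_best_pairs` and the cost convention (lower is better, costs add) are assumptions made for the example.

```python
import heapq


def k_best_pairs(left, right, k):
    """Return the k cheapest (cost, i, j) combinations of two sorted lists.

    `left` and `right` hold costs in ascending order (hypothetical inputs,
    e.g. the ranked sub-derivations of two child nodes). Costs are assumed
    to combine by addition.
    """
    if not left or not right or k <= 0:
        return []
    # Start from the best pair and expand only its grid neighbours.
    frontier = [(left[0] + right[0], 0, 0)]
    seen = {(0, 0)}
    best = []
    while frontier and len(best) < k:
        cost, i, j = heapq.heappop(frontier)
        best.append((cost, i, j))
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(left) and nj < len(right) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(frontier, (left[ni] + right[nj], ni, nj))
    return best
```

Because both lists are sorted, every pair's predecessor in the grid is at least as cheap, so the heap never has to hold more than a thin frontier of candidates.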
OpenFst: A general and efficient weighted finite-state transducer library
Implementation and Application of Automata, 2007
Abstract

Cited by 105 (12 self)
Abstract. We describe OpenFst, an open-source library for weighted finite-state transducers (WFSTs). OpenFst consists of a C++ template library with efficient WFST representations and over twenty-five operations for constructing, combining, optimizing, and searching them. At the shell-command level, there are corresponding transducer file representations and programs that operate on them. OpenFst is designed to be both very efficient in time and space and to scale to very large problems. This library has key applications in speech, image, and natural language processing, pattern and string matching, and machine learning. We give an overview of the library, examples of its use, details of its design that allow customizing the labels, states, and weights, and the lazy evaluation of many of its operations. Further information and a download of the OpenFst library can be obtained from
The image foresting transform: Theory, algorithms, and applications
IEEE TPAMI, 2004
Abstract

Cited by 95 (31 self)
The image foresting transform (IFT) is a graph-based approach to the design of image processing operators based on connectivity. It naturally leads to correct and efficient implementations and to a better understanding of how different operators relate to each other. We give here a precise definition of the IFT, and a procedure to compute it (a generalization of Dijkstra’s algorithm) with a proof of correctness. We also discuss implementation issues and illustrate the use of the IFT in a few applications.
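To make the "generalization of Dijkstra's algorithm" concrete, here is a minimal sketch of a seeded optimum-path forest on a small grayscale image. Several details are assumptions chosen for illustration and are not fixed by the abstract: 4-connectivity, arc weights equal to the absolute intensity difference, and a path cost equal to the largest arc weight along the path.

```python
import heapq


def image_foresting_transform(image, seeds):
    """Seeded optimum-path forest on a 4-connected grid.

    image: list of rows of intensities.
    seeds: dict mapping (row, col) to a label.
    Path cost (an assumption here) is the maximum arc weight on the path.
    Returns (cost, pred, label) dictionaries keyed by pixel.
    """
    rows, cols = len(image), len(image[0])
    cost = {(r, c): float("inf") for r in range(rows) for c in range(cols)}
    pred = {p: None for p in cost}
    label = {p: None for p in cost}
    heap = []
    for pixel, seed_label in seeds.items():
        cost[pixel] = 0
        label[pixel] = seed_label
        heapq.heappush(heap, (0, pixel))
    while heap:
        current, (r, c) = heapq.heappop(heap)
        if current > cost[(r, c)]:
            continue  # stale queue entry, a cheaper path was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            q = (r + dr, c + dc)
            if q not in cost:
                continue  # outside the image
            arc = abs(image[q[0]][q[1]] - image[r][c])
            extended = max(current, arc)  # replaces Dijkstra's "current + arc"
            if extended < cost[q]:
                cost[q] = extended
                pred[q] = (r, c)
                label[q] = label[(r, c)]
                heapq.heappush(heap, (extended, q))
    return cost, pred, label
```

The only change from textbook Dijkstra is the path-extension rule; swapping `max(current, arc)` for another suitable cost function yields a different operator on the same skeleton.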
Graph Kernels
2007
Abstract

Cited by 94 (9 self)
We present a unified framework to study graph kernels, special cases of which include the random walk (Gärtner et al., 2003; Borgwardt et al., 2005) and marginalized (Kashima et al., 2003, 2004; Mahé et al., 2004) graph kernels. Through reduction to a Sylvester equation we improve the time complexity of kernel computation between unlabeled graphs with $n$ vertices from $O(n^6)$ to $O(n^3)$. We find a spectral decomposition approach even more efficient when computing entire kernel matrices. For labeled graphs we develop conjugate gradient and fixed-point methods that take $O(dn^3)$ time per iteration, where $d$ is the size of the label set. By extending the necessary linear algebra to Reproducing Kernel Hilbert Spaces (RKHS) we obtain the same result for $d$-dimensional edge kernels, and $O(n^4)$ in the infinite-dimensional case; on sparse graphs these algorithms only take $O(n^2)$ time per iteration in all cases. Experiments on graphs from bioinformatics and other application domains show that these techniques can speed up computation of the kernel by an order of magnitude or more. We also show that certain rational kernels (Cortes et al., 2002, 2003, 2004) when specialized to graphs reduce to our random walk graph kernel. Finally, we relate our framework to R-convolution kernels (Haussler, 1999) and provide a kernel that is close to the optimal assignment kernel of Fröhlich et al. (2006) yet provably positive semidefinite.
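As a rough illustration of where the cubic cost comes from, the sketch below evaluates a geometric random-walk kernel on two unlabeled graphs by fixed-point iteration, multiplying by the product-graph adjacency implicitly as two ordinary matrix products instead of building the $n^2 \times n^2$ matrix. It is a sketch under assumptions, not the paper's solver: uniform unit start and stop weights, a hypothetical `decay` parameter that must be small enough for the series to converge, and plain Python lists instead of a linear algebra library.

```python
def matmul(a, b):
    """Dense matrix product of two lists-of-lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def random_walk_kernel(adj1, adj2, decay=0.1, tol=1e-10, max_iter=1000):
    """Geometric random-walk kernel between two unlabeled graphs.

    Solves X = 1 + decay * adj1 @ X @ adj2^T by fixed-point iteration,
    where X[i][j] accumulates decayed walk counts starting at the
    product-graph node (i, j). The kernel value is the sum of X.
    """
    n1, n2 = len(adj1), len(adj2)
    adj2_t = transpose(adj2)
    x = [[1.0] * n2 for _ in range(n1)]
    for _ in range(max_iter):
        # One implicit multiplication by the product-graph adjacency.
        walked = matmul(matmul(adj1, x), adj2_t)
        new_x = [[1.0 + decay * walked[i][j] for j in range(n2)]
                 for i in range(n1)]
        delta = max(abs(new_x[i][j] - x[i][j])
                    for i in range(n1) for j in range(n2))
        x = new_x
        if delta < tol:
            break
    return sum(map(sum, x))
```

Each iteration costs two dense products of $n \times n$ matrices, which is cubic in $n$; with sparse adjacency structures the same update touches only existing edges.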
On trust models and trust evaluation metrics for ad hoc networks
IEEE Journal on Selected Areas in Communications, 2006
Abstract

Cited by 83 (6 self)
Abstract: Within the realm of network security, we interpret the concept of trust as a relation among entities that participate in various protocols. Trust relations are based on evidence created by the previous interactions of entities within a protocol. In this work, we focus on the evaluation of trust evidence in ad hoc networks. Because of the dynamic nature of ad hoc networks, trust evidence may be uncertain and incomplete. Also, no pre-established infrastructure can be assumed. The evaluation process is modeled as a path problem on a directed graph, where nodes represent entities, and edges represent trust relations. We give intuitive requirements and discuss design issues for any trust evaluation algorithm. Using the theory of semirings, we show how two nodes can establish an indirect trust relation without previous direct interaction. We show that our semiring framework is flexible enough to express other trust models, most notably PGP’s Web of Trust. Our scheme is shown to be robust in the presence of attackers. Index Terms: Trust evaluation, trust metric, trust model, semiring.
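The path-problem formulation can be sketched as a generalized shortest-path relaxation in which the semiring supplies two operators: one combines opinions along a path, the other aggregates opinions across alternative paths. The concrete choices below are assumptions made for illustration and are not taken from the abstract: opinions are (trust, confidence) pairs in [0, 1], combination multiplies componentwise, and aggregation keeps the opinion with the higher confidence.

```python
ZERO = (0.0, 0.0)  # assumed "no opinion" element
ONE = (1.0, 1.0)   # assumed opinion a node holds about itself


def combine(a, b):
    """Along a path: trust and confidence both attenuate (assumed rule)."""
    return (a[0] * b[0], a[1] * b[1])


def aggregate(a, b):
    """Across paths: keep the more confident opinion (assumed rule)."""
    return max(a, b, key=lambda opinion: (opinion[1], opinion[0]))


def indirect_opinions(edges, source):
    """Opinions of `source` about every node reachable in the trust graph.

    edges: dict mapping (truster, trustee) to a (trust, confidence) pair.
    Uses Bellman-Ford style relaxation with the two operators above.
    """
    nodes = {source}
    for truster, trustee in edges:
        nodes.update((truster, trustee))
    opinion = {node: ZERO for node in nodes}
    opinion[source] = ONE
    for _ in range(len(nodes) - 1):
        changed = False
        for (truster, trustee), direct in edges.items():
            candidate = aggregate(opinion[trustee],
                                  combine(opinion[truster], direct))
            if candidate != opinion[trustee]:
                opinion[trustee] = candidate
                changed = True
        if not changed:
            break
    return opinion
```

In a graph where A has direct opinions only about B and D, `indirect_opinions(edges, "A")` still yields an opinion about any node C that B or D vouch for, which is the indirect relation the abstract describes.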
Trust evaluation in ad-hoc networks
In 3rd ACM Workshop on Wireless Security, 2004
Abstract

Cited by 71 (3 self)
An important concept in network security is trust, interpreted as a relation among entities that participate in various protocols. Trust relations are based on evidence related to the previous interactions of entities within a protocol. In this work, we focus on the evaluation process of trust evidence in Ad Hoc Networks. Because of the dynamic nature of Ad Hoc Networks, trust evidence may be uncertain and incomplete. Also, no pre-established infrastructure can be assumed. The process is formulated as a path problem on a directed graph, where nodes represent entities, and edges represent trust relations. Using the theory of semirings, we show how two nodes can establish an indirect trust relation without previous direct interaction. The results are robust in the presence of attackers. We give intuitive requirements for any trust evaluation algorithm. The performance of the scheme is evaluated on three topologies.
Rational kernels: Theory and algorithms
Journal of Machine Learning Research, 2004
Abstract

Cited by 61 (8 self)
Many classification algorithms were originally designed for fixed-size vectors. Recent applications in text and speech processing and computational biology require, however, the analysis of variable-length sequences and more generally weighted automata. An approach widely used in statistical learning techniques such as Support Vector Machines (SVMs) is that of kernel methods, due to their computational efficiency in high-dimensional feature spaces. We introduce a general family of kernels based on weighted transducers or rational relations, rational kernels, that extend kernel methods to the analysis of variable-length sequences or more generally weighted automata. We show that rational kernels can be computed efficiently using a general algorithm of composition of weighted transducers and a general single-source shortest-distance algorithm. Not all rational kernels are positive definite and symmetric (PDS), or equivalently verify the Mercer condition, a condition that guarantees the convergence of training for discriminant classification algorithms such as SVMs. We present several theoretical results related to PDS rational kernels. We show that under some general conditions these kernels are
Data Transmission over Networks for Estimation and Control
Abstract

Cited by 40 (8 self)
We consider the problem of controlling a linear time-invariant process when the controller is located at a location remote from where the sensor measurements are being generated. The communication from the sensor to the controller is supported by a communication network with arbitrary topology composed of analog erasure channels. Using a separation principle, we prove that the optimal LQG controller consists of an LQ optimal regulator along with an estimator that estimates the state of the process across the communication network mentioned above. We then determine the optimal information processing strategy that should be followed by each node in the network so that the estimator is able to compute the best possible estimate in the minimum mean squared error sense. The algorithm is optimal for any packet-dropping process and at every time step, even though it is recursive and hence requires a constant amount of memory, processing and transmission at every node in the network per time step. For the case when the packet drop processes are memoryless and independent across links, we analyze the stability properties and the performance of the closed loop system. The algorithm is an attempt to escape the more commonly used viewpoint of treating a network of communication links as a single end-to-end link with the probability of successful transmission determined by some measure of the reliability of the network.
Augmented Statistical Models for Classifying Sequence Data
2006
Abstract

Cited by 21 (0 self)
Declaration: This dissertation is the result of my own work and includes nothing that is the outcome of work done in collaboration. It has not been submitted in whole or in part for a degree at any other university. Some of the work has been published previously in conference proceedings [66,69], two journal articles [36,68], two workshop papers [35,67] and a technical report [65]. The length of this thesis including appendices, bibliography, footnotes, tables and equations is approximately 60,000 words. This thesis contains 27 figures and 20 tables.
Estimating the margin of victory for instant-runoff voting
In Proceedings of the 2011 Electronic Voting Technology Workshop / Workshop on Trustworthy Elections (EVT/WOTE ’11), USENIX, 2011
Abstract

Cited by 21 (0 self)
A general definition is proposed for the margin of victory of an election contest. That definition is applied to Instant Runoff Voting (IRV), and several estimates for the IRV margin of victory are described: two upper bounds and two lower bounds. Given round-by-round vote totals, the time complexity for calculating these bounds does not exceed $O(C^2 \log C)$, where $C$ is the number of candidates. It is also shown that calculating the larger and more useful of the two lower bounds can be viewed, in part, as solving a longest path problem on a weighted, directed, acyclic graph. Worst-case analysis shows that neither these estimates, nor any estimates based only on tabulation round-by-round vote totals, are guaranteed to be within a constant factor of the margin of victory. These estimates are calculated for IRV elections in Australia and California. Pseudocode for calculating these estimates is provided.
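The longest-path subproblem mentioned above is tractable because the graph is acyclic: processing nodes in topological order lets each node's best distance be finalized in a single pass. The sketch below shows only that generic step. How the graph and its edge weights are derived from round-by-round vote totals is not stated in the abstract, so the inputs here are hypothetical.

```python
from collections import defaultdict, deque


def longest_path(edges, source, target):
    """Length of the heaviest source-to-target path in a weighted DAG.

    edges: dict mapping (u, v) to a numeric weight (hypothetical input).
    Returns float("-inf") if target is unreachable from source.
    """
    successors = defaultdict(list)
    indegree = defaultdict(int)
    nodes = {source, target}
    for (u, v), weight in edges.items():
        successors[u].append((v, weight))
        indegree[v] += 1
        nodes.update((u, v))
    best = {node: float("-inf") for node in nodes}
    best[source] = 0
    # Kahn's algorithm: repeatedly take nodes with no unprocessed predecessor.
    queue = deque(node for node in nodes if indegree[node] == 0)
    processed = 0
    while queue:
        u = queue.popleft()
        processed += 1
        for v, weight in successors[u]:
            if best[u] + weight > best[v]:
                best[v] = best[u] + weight
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if processed != len(nodes):
        raise ValueError("graph contains a cycle")
    return best[target]
```

The pass visits every node and edge once, so it runs in time linear in the size of the graph.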