## Bisimulation Minimisation for Weighted Tree Automata (2007)

Citations: 8 (6 self)

### BibTeX

@MISC{Högberg07bisimulationminimisation,
    author = {Johanna Högberg and Andreas Maletti and Jonathan May},
    title = {Bisimulation Minimisation for Weighted Tree Automata},
    year = {2007}
}

### Abstract

We generalise existing forward and backward bisimulation minimisation algorithms for tree automata to weighted tree automata. The obtained algorithms work for all semirings and retain the time complexity of their unweighted variants for all additively cancellative semirings. On all other semirings the time complexity is slightly higher (linear instead of logarithmic in the number of states). We discuss implementations of these algorithms on a typical task in natural language processing.
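To make the idea of minimising "with respect to a bisimulation" concrete, the following is a minimal sketch of a naive signature-based partition refinement for backward bisimulation. It is not the algorithm of the paper and makes no attempt to reach its complexity bounds. It assumes the usual definition of backward bisimulation (two states stay together if, for every symbol and every tuple of blocks, the summed weights of their incoming transitions agree), which is not spelled out on this page. The names `backward_signature` and `refine_backward`, the dictionary encoding of transitions, and the example automaton are all hypothetical.

```python
import operator


def backward_signature(state, transitions, block_of, add, zero):
    """Summed weight of the transitions into `state`, keyed by symbol and argument blocks."""
    sums = {}
    for (symbol, args, target), weight in transitions.items():
        if target == state:
            key = (symbol, tuple(block_of[a] for a in args))
            sums[key] = add(sums.get(key, zero), weight)
    # Drop zero sums so that "no transition" and "total weight zero" compare equal.
    return frozenset((k, w) for k, w in sums.items() if w != zero)


def refine_backward(states, transitions, add=operator.add, zero=0):
    """Refine the trivial partition until no block can be split any further."""
    block_of = {q: 0 for q in states}  # start with every state in one block
    while True:
        ids = {}
        refined = {}
        for q in states:
            # Keeping the old block in the key guarantees that blocks are only ever split.
            key = (block_of[q], backward_signature(q, transitions, block_of, add, zero))
            refined[q] = ids.setdefault(key, len(ids))
        if len(ids) == len(set(block_of.values())):
            return refined  # same number of blocks: the partition is stable
        block_of = refined


# Hypothetical wta: nullary symbol "x", unary symbol "f".
# Keys are (symbol, argument states, target state); values are weights.
states = ["a", "b", "c", "d"]
transitions = {
    ("x", (), "a"): 1,
    ("x", (), "b"): 1,
    ("x", (), "d"): 2,
    ("f", ("a",), "c"): 2,
    ("f", ("b",), "c"): 3,
}

# Real-number weights with ordinary addition.
blocks = refine_backward(states, transitions)

# The same automaton over the Boolean semiring (addition is logical "or").
bool_transitions = {key: True for key in transitions}
bool_blocks = refine_backward(states, bool_transitions, add=operator.or_, zero=False)
```

With numeric weights the states `a` and `b` end up in the same block while `d` is separated by its different weight; over the Boolean semiring the weights collapse to "present or absent", so `d` joins `a` and `b`. Passing `add` and `zero` explicitly is a design choice that mirrors the abstract's claim that the approach applies to arbitrary semirings.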

### Citations

2429 | Computational complexity
- Papadimitriou
- 1994
Citation Context ...ree representation is sparse is briefly discussed in Section 4.3. We also assume that semiring addition can be performed in constant time. As our computation model we choose the random access machine [35], which supports indirect addressing, and thus allows the use of pointers. This means that we can represent each block in a partition of Q with respect to R as a record of two-way pointers to its elem... |

2235 | Building a Large Annotated Corpus for English: The Penn Treebank
- Marcus, Santorini, et al.
- 1993
Citation Context ...represent are subtrees. We prepared a data set by collecting 3-subtrees, i.e. all subtrees of height 3, from sentences taken from the Penn Treebank corpus of syntactically bracketed English news text [22], and collected observation statistics on these subtrees, which we stored as probabilities. In our experiments, we selected at random a subset of these subtrees and constructed an initial wta over the... |

1418 | A Calculus of Communicating Systems
- Milner
- 1980
Citation Context ...ring every hope of a non-trivial approximation bound. Algorithms that minimise with respect to a bisimulation are examples of the latter approach. The concept of bisimilarity was introduced by Milner [4] as a formal tool to investigate transition systems. Simply put, two transition systems are bisimulation equivalent if their behaviour, in response to a sequence of actions, cannot be distinguished by a... |

384 | Three partition refinement algorithms
- Paige, Tarjan
- 1987
Citation Context ...r the coarsest relation on the state space that meets the local conditions of the bisimulation relation that we are interested in. The O(n² log n) minimisation algorithm for nfa by Paige & Tarjan [15] could be called a forward bisimulation minimisation. Bisimulation minimisation of tree automata is discussed in [10]. The paper [10] presents two minimisation algorithms that are based on forward and... |

320 | Finite-State Transducers in Language and Speech Processing
- Mohri
- 1997
Citation Context ...nsformation of elaborate syntax-based natural language systems into simple sequences of tree acceptors and transducers [21]. And as the development of efficient general algorithms for string automata [31, 33] and their subsequent incorporation in toolkits [32] made the rapidly-constructed string-based systems possible, so too is there ongoing development of tree-based elevations of these efficient algorit... |

301 | An nlog(n) Algorithm for Minimizing the States in a Finite Automaton
- Hopcroft
- 1971
Citation Context ...hill-Nerode theorem there exists, for every regular string language L, a unique (up to isomorphism) minimal deterministic finite automaton (dfa) that recognises L. It was a breakthrough when Hopcroft [1] presented an O(n log n) minimisation algorithm for dfa where n is the number of states. This still up-to-date bound was obtained by partitioning the state space through a “process the smaller half” st... |

284 | A Syntax-based Statistical Translation Model
- Yamada, Knight
- 2001
Citation Context ... translation [20, 21]. We thus require a language model of trees, and the subsequences we will represent are subtrees. We prepared a data set by collecting 3-subtrees, i.e. all subtrees of height 3, from sentences taken f... |

261 | Tree automata techniques and applications. Available on the Web from 13ux02.univ-lille.fr in directory - Comon, Dauchet, et al. - 1998 |

246 | What’s in a translation rule
- Galley, Hopkins, et al.
- 2004
Citation Context ... translation [20, 21]. We thus require a language model of trees, and the subsequences we will represent are subtrees. We prepared a data set by collecting 3-subtrees, i.e. all subtrees of height 3, from sentences taken f... |

240 | Tree automata. Akademiai Kiado
- Gecseg, Steinby
- 1984
Citation Context ..., as interpreted for various devices, implies language equality, the opposite does not hold in general. We consider weighted tree automata (wta) [5], which are a joint generalisation of tree automata [6, 7] and weighted automata [8]. (⋆ This work was partially supported by NSF grant IIS-0428020.) Classical tree automata can then be seen as wta with weights in the Boolean semiring, i.e. a transition has wei... |

195 | Continuous speech recognition by statistical methods
- Jelinek
- 1976
Citation Context ...is section we present experimental results obtained by applying an implementation (written in Perl) of Alg. 1 and Alg. 2 to the problem of language modelling in the natural language processing domain [19]. A language model is a formalism for determining whether a given sentence is in a particular language. Language models are particularly useful in applications of natural language and speech processin... |

187 | The equivalence problem for regular expressions with squaring requires exponential space
- Meyer, Stockmeyer
- 1972
Citation Context ...ce through a “process the smaller half” strategy. However, in general there exists no unique minimal nondeterministic finite automaton (nfa) recognising a given regular language. Meyer and Stockmeyer [2] proved that minimisation of nfa is PSPACE-complete. The minimisation problem for nfa with n states cannot even be efficiently approximated within the factor o(n), unless P = PSPACE [3]. This meant th... |

157 | Weighted finite-state transducers in speech recognition
- Mohri, Pereira, et al.
- 2000
Citation Context ...nsformation of elaborate syntax-based natural language systems into simple sequences of tree acceptors and transducers [21]. And as the development of efficient general algorithms for string automata [31, 33] and their subsequent incorporation in toolkits [32] made the rapidly-constructed string-based systems possible, so too is there ongoing development of tree-based elevations of these efficient algorit... |

141 | Tree languages
- Gécseg, Steinby
- 1997
Citation Context ..., as interpreted for various devices, implies language equality, the opposite does not hold in general. We consider weighted tree automata (wta) [5], which are a joint generalisation of tree automata [6, 7] and weighted automata [8]. (⋆ This work was partially supported by NSF grant IIS-0428020.) Classical tree automata can then be seen as wta with weights in the Boolean semiring, i.e. a transition has wei... |

127 | Speech recognition by composition of weighted finite automata
- Pereira, Riley
- 1997
Citation Context ...ms that operate on words and phrases as chains of nondeterministic acceptors and transducers has been shown to be useful for rapid system construction and greater understanding and modularity (e.g. [24, 3, 38, 36]). In more recent work, however, there has been a desire to use models that consider syntactic structure as an elementary unit, and thus tree automata are a more natural fit than the previously exploi... |

67 | An overview of probabilistic tree transducers for natural language processing
- Knight, Graehl
- 2005
Citation Context ...t than the previously exploited string-based automata. Work has progressed on transformation of elaborate syntax-based natural language systems into simple sequences of tree acceptors and transducers [21]. And as the development of efficient general algorithms for string automata [31, 33] and their subsequent incorporation in toolkits [32] made the rapidly-constructed string-based systems possible, so... |

67 | A rational design for a weighted finite-state transducer library
- Mohri, Pereira, et al.
- 1998
Citation Context ...ystems into simple sequences of tree acceptors and transducers [21]. And as the development of efficient general algorithms for string automata [31, 33] and their subsequent incorporation in toolkits [32] made the rapidly-constructed string-based systems possible, so too is there ongoing development of tree-based elevations of these efficient algorithms [29] with the goal of constructing a tree automa... |

58 | Recognizable formal power series on trees
- Berstel, Reutenauer
- 1982
Citation Context ... an outside observer. Although bisimulation equivalence, as interpreted for various devices, implies language equality, the opposite does not hold in general. We consider weighted tree automata (wta) [5], which are a joint generalisation of tree automata [6, 7] and weighted automata [8]. (⋆ This work was partially supported by NSF grant IIS-0428020.) Classical tree automata can then be seen as wta with ... |

44 | Formal power series over trees
- Kuich
- 1997
Citation Context ...de observer. Although bisimulation equivalence, as interpreted for various devices, implies language equality, the opposite does not hold in general. 1.2 Weighted tree automata. Weighted tree automata [4, 23, 6] generalise both finite tree automata [14, 15] and weighted string automata: classical tree automata can be seen as weighted tree automata with weights in the Boolean semiring, i.e. a transition has w... |

34 | Tiburon: A Weighted Tree Automata Toolkit
- May, Knight
- 2006
Citation Context ...apidly-constructed string-based systems possible, so too is there ongoing development of tree-based elevations of these efficient algorithms [29] with the goal of constructing a tree automata toolkit [28] that will benefit from the contributions of this work. Both minimisation algorithms have been implemented as prototypes in Perl. There are advantages that support having two types of bisimulation min... |

33 | A weighted finite state transducer implementation of the alignment template model for statistical machine translation
- Kumar, Byrne
- 2002
Citation Context ...ms that operate on words and phrases as chains of nondeterministic acceptors and transducers has been shown to be useful for rapid system construction and greater understanding and modularity (e.g. [24, 3, 38, 36]). In more recent work, however, there has been a desire to use models that consider syntactic structure as an elementary unit, and thus tree automata are a more natural fit than the previously exploi... |

28 | Minimizing NFA’s and Regular Expressions
- Gramlich, Schnitger
- 2005
Citation Context ...and Stockmeyer [2] proved that minimisation of nfa is PSPACE-complete. The minimisation problem for nfa with n states cannot even be efficiently approximated within the factor o(n), unless P = PSPACE [3]. This meant that the problem had to be simplified; either by restricting the domain to a smaller class of devices, or by surrendering every hope of a non-trivial approximation bound. Algorithms that ... |

27 | Stochastic Finite-State Models for Spoken Language
- Bangalore, Riccardi
- 2000
Citation Context ...ms that operate on words and phrases as chains of nondeterministic acceptors and transducers has been shown to be useful for rapid system construction and greater understanding and modularity (e.g. [24, 3, 38, 36]). In more recent work, however, there has been a desire to use models that consider syntactic structure as an elementary unit, and thus tree automata are a more natural fit than the previously exploi... |

22 | A Better N-Best List: Practical Determinization of Weighted Finite Tree Automata
- May, Knight
- 2006
Citation Context ... their subsequent incorporation in toolkits [32] made the rapidly-constructed string-based systems possible, so too is there ongoing development of tree-based elevations of these efficient algorithms [29] with the goal of constructing a tree automata toolkit [28] that will benefit from the contributions of this work. Both minimisation algorithms have been implemented as prototypes in Perl. There are a... |

17 | Bisimulation relations for weighted automata
- Buchholz
- 2008
Citation Context ...utomata can then be seen as wta with weights in the Boolean semiring, i.e. a transition has weight true if it is present, and false otherwise. One type of bisimulation, called forward bisimulation in [9, 10], restricts bisimilar states to have identical futures. The future of a state q is the tree series of contexts that is recognised by the wta if the computation starts with the state q and weight 1 at ... |

17 | The Myhill-Nerode theorem for recognizable tree series
- Borchardt
Citation Context ...d weight 1 at the unique position of the special symbol ✷ in the context. A similar condition is found in the Myhill-Nerode congruence for a tree language [11] or even in the Myhill-Nerode congruence [12] for a tree series. Let us explain it on the latter. Two trees t and u are equal in the Myhill-Nerode congruence for a given tree series S over the field (A, +, ·, 0, 1), if there exist nonzero coeffic... |

17 | Determinization of finite state weighted tree automata
- Borchardt, Vogler
- 2003
Citation Context ...de observer. Although bisimulation equivalence, as interpreted for various devices, implies language equality, the opposite does not hold in general. 1.2 Weighted tree automata. Weighted tree automata [4, 23, 6] generalise both finite tree automata [14, 15] and weighted string automata: classical tree automata can be seen as weighted tree automata with weights in the Boolean semiring, i.e. a transition has w... |

15 | Bisimulation Minimization of Tree Automata
- Abdulla, Högberg, et al.
- 2006
Citation Context ...misation than forward. When presented with an unknown wta, we know no way to say for certain which method of minimisation is superior, so it is beneficial to have both. The bisimulation introduced in [16] can be seen as a combination of backward and forward bisimulation. Containing the restrictions of both, it is less efficient than backward bisimulation when applied to the minimisation of nondetermin... |

12 | Effective construction of the syntactic algebra of a recognizable series on trees
- Bozapalidis
- 1991
Citation Context ...t is ∑_{q∈Q} F(q) · h_µ(t)_q. A tree series that can be computed by a wta is called recognisable. The minimisation of the representation of a recognisable tree series over fields is already considered in [9, 8], but a formal complexity analysis is missing. In this report, we will consider procedures for arbitrary commutative semirings. Since minimisation is PSPACE-complete in certain commutative semirings, ... |

11 | Backward and Forward Bisimulation Minimisation of Tree Automata
- Högberg, Maletti, et al.
- 2007
Citation Context ...utomata can then be seen as wta with weights in the Boolean semiring, i.e. a transition has weight true if it is present, and false otherwise. One type of bisimulation, called forward bisimulation in [9, 10], restricts bisimilar states to have identical futures. The future of a state q is the tree series of contexts that is recognised by the wta if the computation starts with the state q and weight 1 at ... |

10 |
- Christopher Manning, Hinrich Schütze
- 1999
Citation Context .... Language models are typically formed by collecting subsequences of sentences over a large corpus of text and assigning probabilities to the subsequences based on their occurrence counts in the data [20, 26]. To obtain the probability of a sentence one multiplies the probabilities of the subsequences together. It is thus useful to have a data structure for efficiently looking up many subsequences. As effective... |

9 | On the Myhill-Nerode theorem for trees
- Kozen
- 1992
Citation Context ...if the computation starts with the state q and weight 1 at the unique position of the special symbol ✷ in the context. A similar condition is found in the Myhill-Nerode congruence for a tree language [11] or even in the Myhill-Nerode congruence [12] for a tree series. Let us explain it on the latter. Two trees t and u are equal in the Myhill-Nerode congruence for a given tree series S over the field (A... |

7 | Determinization of Finite State Weighted Tree Automata
- Borchardt, Vogler
- 2003
Citation Context ...de observer. Although bisimulation equivalence, as interpreted for various devices, implies language equality, the opposite does not hold in general. 1.2 Weighted tree automata. Weighted tree automata [4, 23, 6] generalise both finite tree automata [14, 15] and weighted string automata: classical tree automata can be seen as weighted tree automata with weights in the Boolean semiring, i.e. a transition has w... |

7 | Représentations matricielles des séries d’arbre reconnaissables
- Bozapalidis, Alexandrakis
- 1989
Citation Context ...t is ∑_{q∈Q} F(q) · h_µ(t)_q. A tree series that can be computed by a wta is called recognisable. The minimisation of the representation of a recognisable tree series over fields is already considered in [9, 8], but a formal complexity analysis is missing. In this report, we will consider procedures for arbitrary commutative semirings. Since minimisation is PSPACE-complete in certain commutative semirings, ... |

6 | Learning deterministically recognizable tree series
- Drewes, Vogler
Citation Context ...quires a local condition on the tree representation. The condition is strong enough to enforce equivalent futures, but not too strong, which is shown by the fact that, on a deterministic all-accepting [13] wta M over a field [14] or a wta M over the Boolean semiring [10], minimisation via forward bisimulation yields the unique (up to isomorphism) minimal deterministic wta that recognises the same tree ... |

4 | Bisimulation relations for weighted automata. unpublished
- Buchholz
- 2007
Citation Context ... a search for the coarsest relation on the state space that meets the local conditions of the bisimulation relation that we are interested in. One type of bisimulation, called forward bisimulation in [10, 17], restricts bisimilar states to have identical futures. The future of a state q is the tree series of contexts that is recognised by the wta if the computation starts with the state q and weight 1 at ... |

4 | A fast and memory-efficient phrase-based approach to statistical machine translation
- Folsom |

3 | The Theory of Recognizable Tree Series. Akademische Abhandlungen zur Informatik. Verlag für Wissenschaft und Forschung
- Borchardt
- 2005
Citation Context ... ·, 0, 1) is a mapping from T_Σ to A. The set of all tree series over Σ and A is denoted by A〈〈T_Σ〉〉. Let S ∈ A〈〈T_Σ〉〉. We write (S, t) with t ∈ T_Σ for S(t). A weighted tree automaton M (for short: wta) [17] is a tuple (Q, Σ, A, F, µ), where Q is a finite nonempty set of states; Σ is a ranked alphabet (of input symbols); A = (A, +, ·, 0, 1) is a semiring; F ∈ A^Q is a final weight distribution; and µ = (... |

3 | On the Myhill-Nerode theorem for trees. Bulletin of the EATCS - Kozen - 1992 |

2 | Myhill-Nerode theorem for recognizable tree series — revisited. unpublished
- Maletti
- 2007
Citation Context ...g [12, Section 3.2] wta over a multiplicatively cancellative and commutative semiring, there exists a unique (up to isomorphism) minimal deterministic and complete all-accepting wta that recognises S [12, 25]. First we need to recall the concept of a dead state. Let P be a subset of the state space Q. We call the states of P dead states if it holds that • F(p) = 0 for every p ∈ P; and • for every symbol σ... |

1 | Volume 59 of Pure and Applied Mathematics
- Eilenberg, Languages, et al.
- 1974
Citation Context ...devices, implies language equality, the opposite does not hold in general. We consider weighted tree automata (wta) [5], which are a joint generalisation of tree automata [6, 7] and weighted automata [8]. (⋆ This work was partially supported by NSF grant IIS-0428020.) Classical tree automata can then be seen as wta with weights in the Boolean semiring, i.e. a transition has weight true if it is present,... |

1 | Theoretical Aspects of Computer Science. Volume 3404 of LNCS., Springer Verlag (2005) 399-411 4. Milner, R.: A Calculus of Communicating Systems - Symp - 1982 |

1 | Developments in Language Theory. Volume 2710 of LNCS - Conf - 2003 |

1 | Implementation and Application of Automata. Volume 4094 of LNCS., Springer Verlag (2006) 173-185 17. Borchardt, B.: The Theory of Recognizable Tree Series. Akademische Abhandlungen zur Informatik. Verlag für Wissenschaft und Forschung (2005) - Conf

1 | Jelinek, F.: Continuous speech recognition by statistical methods - Galley, Hopkins, et al. - 2006 |