A generalized CYK algorithm for parsing stochastic CFG
1998
Abstract

We present a bottom-up parsing algorithm for stochastic context-free grammars that is able (1) to deal with multiple interpretations of sentences containing compound words; (2) to extract the N most probable parses in O(n^3) and compute at the same time all possible parses of any portion of the input sequence with their probabilities; (3) to deal with "out of vocabulary" words. The cost of explicitly extracting all the parse trees associated with a given input sentence depends on the complexity of the grammar, but even when the number of trees is exponential in n, the chart the algorithm uses to represent them requires only O(n^2) space.

1 Introduction

This article presents CYK+, a bottom-up parsing algorithm for stochastic context-free grammars that is able: 1. to deal with multiple interpretations of sentences containing compound words; 2. to extract the N most probable parses in O(n^3) and compute at the same time all possible parses of any portion of the input sequence with their p...
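
As a point of reference for the complexity figures quoted in the abstract, here is a minimal sketch of the classic probabilistic CYK procedure for a grammar in Chomsky normal form. It is not the CYK+ algorithm presented in the article: it handles neither compound words, nor out-of-vocabulary words, nor N-best extraction. The function name `viterbi_cyk`, the rule encodings, and the toy grammar are all assumptions made for illustration. The sketch only shows where the bounds come from: the chart has one cell per span (i, j), hence O(n^2) cells, and each cell is filled by trying every split point, hence O(n^3) time for a fixed grammar.

```python
from collections import defaultdict


def viterbi_cyk(words, lexical, binary, start="S"):
    """Return the probability of the best parse of `words` (0.0 if none).

    lexical: dict mapping a word to a list of (nonterminal, prob) pairs
    binary:  list of (parent, left, right, prob) rules
    Both encodings are hypothetical, chosen only for this sketch.
    """
    n = len(words)
    # chart[(i, j)] maps a nonterminal to the best probability with which
    # it derives words[i:j]; there are O(n^2) such cells.
    chart = defaultdict(dict)

    # Spans of length 1: apply the lexical rules.
    for i, word in enumerate(words):
        cell = chart[(i, i + 1)]
        for nonterminal, prob in lexical.get(word, []):
            cell[nonterminal] = max(cell.get(nonterminal, 0.0), prob)

    # Longer spans: combine two adjacent sub-spans at every split point k.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            cell = chart[(i, j)]
            for k in range(i + 1, j):
                left, right = chart[(i, k)], chart[(k, j)]
                for parent, l_sym, r_sym, prob in binary:
                    if l_sym in left and r_sym in right:
                        score = prob * left[l_sym] * right[r_sym]
                        if score > cell.get(parent, 0.0):
                            cell[parent] = score

    return chart[(0, n)].get(start, 0.0)


# Toy grammar (illustrative only).
LEXICAL = {
    "she": [("NP", 0.5)],
    "fish": [("NP", 0.5)],
    "eats": [("V", 1.0)],
}
BINARY = [
    ("S", "NP", "VP", 1.0),
    ("VP", "V", "NP", 1.0),
]

best = viterbi_cyk(["she", "eats", "fish"], LEXICAL, BINARY)
```

Every intermediate cell `chart[(i, j)]` holds the best score of each nonterminal over that portion of the input. This is a simplified analogue of the abstract's statement that parses of any portion of the input sequence are computed alongside the parse of the whole sentence.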