Results 1 - 3 of 3
Acronym-meaning extraction from corpora using multi-tape weighted finite-state machines
 CoRR
Cited by 1 (0 self)
The automatic extraction of acronyms and their meaning from corpora is an important subtask of text mining. It can be seen as a special case of string alignment, where a text chunk is aligned with an acronym. Alternative alignments have different costs, and ideally the least costly one should give the correct meaning of the acronym. We show how this approach can be implemented by means of a 3-tape weighted finite-state machine (3-WFSM) which reads a text chunk on tape 1 and an acronym on tape 2, and generates all alternative alignments on tape 3. The 3-WFSM can be automatically generated from a simple regular expression. No additional algorithms are required at any stage. Our 3-WFSM has a size of 27 states and 64 transitions, and finds the best analysis of an acronym in a few milliseconds.
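The least-cost alignment idea can be illustrated with a small sketch. This is not the 3-WFSM from the abstract: it swaps in ordinary dynamic programming, and the function name `align_acronym` and both cost values are assumptions chosen only for illustration. Matching a word-initial letter is free, matching a word-internal letter costs 1, and skipping a word-initial letter costs 2.

```python
def align_acronym(chunk, acronym, internal_cost=1, skipped_initial_cost=2):
    """Return (cost, alignment) for the cheapest alignment, or None.

    Hypothetical cost model (an assumption, not taken from the paper).
    The alignment string marks matched letters in upper case.
    """
    inf = float("inf")
    n, m = len(chunk), len(acronym)

    def is_initial(i):
        # A letter is word-initial if it starts the chunk or follows a non-letter.
        return chunk[i].isalpha() and (i == 0 or not chunk[i - 1].isalpha())

    # cost[i][j]: cheapest way to read i chunk characters and j acronym letters.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    for i in range(n):
        for j in range(m + 1):
            c = cost[i][j]
            if c == inf:
                continue
            # Option 1: skip the chunk character.
            skip = c + (skipped_initial_cost if is_initial(i) else 0)
            if skip < cost[i + 1][j]:
                cost[i + 1][j], back[i + 1][j] = skip, "skip"
            # Option 2: match the chunk character with the next acronym letter.
            if j < m and chunk[i].lower() == acronym[j].lower():
                match = c + (0 if is_initial(i) else internal_cost)
                if match < cost[i + 1][j + 1]:
                    cost[i + 1][j + 1], back[i + 1][j + 1] = match, "match"
    if cost[n][m] == inf:
        return None

    # Trace back from the end to recover which characters were matched.
    out, j = [], m
    for i in range(n, 0, -1):
        ch = chunk[i - 1]
        if back[i][j] == "match":
            out.append(ch.upper())
            j -= 1
        else:
            out.append(ch.lower())
    return cost[n][m], "".join(reversed(out))


print(align_acronym("weighted finite-state machine", "WFSM"))
print(align_acronym("extensible markup language", "XML"))
```

With these assumed costs, an expansion whose word initials spell the acronym is preferred over one that reaches into the middle of words.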
A class of rational n-WFSM auto-intersections
 in Proc. Conf. Impl. and Appl. of Automata, Sophia Antipolis
, 2005
Cited by 1 (0 self)
Weighted finite-state machines with n tapes describe n-ary rational string relations. The join of n-ary relations is very important in applications. It is shown how to compute it via a simpler operation, the auto-intersection. Join and auto-intersection generally do not preserve rationality. We define a class of triples 〈A, i, j〉 such that the auto-intersection of the machine A w.r.t. tapes i and j can be computed by a delay-based algorithm. We point out how to extend this class and hope that it is sufficient for many practical applications.
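The reduction of join to auto-intersection can be sketched on finite, unweighted relations, represented here as Python sets of string tuples. This is only an illustration under that assumption: it works on explicit tuple sets rather than on machines, so the rationality question raised in the abstract does not arise, and the delay-based algorithm is not shown. All function names are hypothetical, and tape indices are 0-based.

```python
from itertools import product


def cross_product(r1, r2):
    # Concatenate every tuple of r1 with every tuple of r2.
    return {t1 + t2 for t1, t2 in product(r1, r2)}


def auto_intersection(rel, i, j):
    # Keep only the tuples whose strings on tapes i and j are equal.
    return {t for t in rel if t[i] == t[j]}


def project_out(rel, k):
    # Remove tape k from every tuple.
    return {t[:k] + t[k + 1:] for t in rel}


def join(r1, n, r2, i, j):
    """Join r1 (arity n) with r2 on tape i of r1 and tape j of r2."""
    paired = cross_product(r1, r2)
    # Tape j of r2 sits at index n + j in the concatenated tuples.
    matched = auto_intersection(paired, i, n + j)
    # The two joined tapes are now identical, so drop the second copy.
    return project_out(matched, n + j)


r1 = {("a", "x"), ("b", "y")}
r2 = {("x", "1"), ("z", "2")}
print(join(r1, 2, r2, 1, 0))
```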
Viterbi Algorithm Generalized for n-Tape Best-Path Search
, 2006
We present a generalization of the Viterbi algorithm for identifying the path with minimal (resp. maximal) weight in an n-tape weighted finite-state machine (n-WFSM) that accepts a given n-tuple of input strings 〈s1, ..., sn〉. It also allows us to compile the best transduction of a given input n-tuple by a weighted (n+m)-WFSM (transducer) with n input and m output tapes. Our algorithm has a worst-case time complexity of $O(|s|^n |E| \log(|s|^n |Q|))$, where n and |s| are the number and average length of the strings in the n-tuple, and |Q| and |E| the number of states and transitions in the n-WFSM, respectively. A straightforward alternative, consisting of intersection followed by classical shortest-distance search, operates in $O(|s|^n (|E| + |Q|) \log(|s|^n |Q|))$ time.
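To make the search space concrete, here is a minimal sketch of a best-path search in an n-tape machine. It is not the generalized Viterbi algorithm of the abstract: it swaps in a plain Dijkstra search over (state, position tuple) pairs and therefore assumes non-negative weights. The machine encoding (a dict from a state to a list of `(labels, weight, next_state)` arcs, with `""` standing for epsilon on a tape) and the name `best_path` are assumptions made for illustration.

```python
import heapq


def best_path(transitions, start, finals, inputs):
    """Return the minimal weight of a path accepting `inputs`, or None."""
    n = len(inputs)
    goal = tuple(len(s) for s in inputs)
    origin = (start, (0,) * n)
    dist = {origin: 0.0}
    heap = [(0.0, origin)]
    while heap:
        d, (q, pos) = heapq.heappop(heap)
        if d > dist.get((q, pos), float("inf")):
            continue  # stale queue entry
        if q in finals and pos == goal:
            return d
        for labels, weight, q2 in transitions.get(q, []):
            new_pos = []
            for k in range(n):
                if labels[k] == "":
                    new_pos.append(pos[k])      # epsilon: tape k does not advance
                elif pos[k] < len(inputs[k]) and inputs[k][pos[k]] == labels[k]:
                    new_pos.append(pos[k] + 1)  # symbol matches the input on tape k
                else:
                    break                       # arc is not applicable here
            else:
                node = (q2, tuple(new_pos))
                nd = d + weight
                if nd < dist.get(node, float("inf")):
                    dist[node] = nd
                    heapq.heappush(heap, (nd, node))
    return None


# Hypothetical 2-tape, one-state machine that scores edit operations over {a, b}.
arcs = []
for x in "ab":
    arcs.append(((x, ""), 1.0, 0))   # deletion
    arcs.append((("", x), 1.0, 0))   # insertion
    for y in "ab":
        arcs.append(((x, y), 0.0 if x == y else 1.0, 0))  # match or substitution
machine = {0: arcs}
print(best_path(machine, 0, {0}, ("aab", "b")))
```

The number of search nodes is bounded by the number of states times the number of position tuples, which is where the $|s|^n |Q|$ factor inside the logarithm comes from.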