Results 1 - 3 of 3
Advances in Domain Independent Linear Text Segmentation, 2000
Abstract - Cited by 100 (1 self)
This paper describes a method for linear text segmentation which is twice as accurate and over seven times as fast as the state-of-the-art (Reynar, 1998). Inter-sentence similarity is replaced by rank in the local context. Boundary locations are discovered by divisive clustering.
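As an illustration only, one possible reading of "rank in the local context" from the abstract above can be sketched in Python: each entry of an inter-sentence similarity matrix is replaced by the proportion of its neighbours with a lower value. The function name `rank_transform`, the square neighbourhood, and the `radius` parameter are assumptions made for this sketch and are not details given in this listing.

```python
def rank_transform(sim, radius=1):
    """Replace each similarity value by the proportion of its neighbours
    (cells within `radius` rows and columns, excluding the cell itself)
    that hold a strictly lower value.

    `sim` is a square list-of-lists similarity matrix (assumed input format).
    """
    n = len(sim)
    ranked = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            lower = 0   # neighbours with a strictly lower similarity
            total = 0   # neighbours actually inside the matrix
            for a in range(max(0, i - radius), min(n, i + radius + 1)):
                for b in range(max(0, j - radius), min(n, j + radius + 1)):
                    if (a, b) == (i, j):
                        continue
                    total += 1
                    if sim[a][b] < sim[i][j]:
                        lower += 1
            # A 1x1 matrix has no neighbours; use 0.0 in that case.
            ranked[i][j] = lower / total if total else 0.0
    return ranked


if __name__ == "__main__":
    toy = [
        [1.0, 0.2],
        [0.2, 1.0],
    ]
    print(rank_transform(toy))
```

Using a proportion rather than a raw count keeps values comparable near the matrix edges, where fewer neighbours are available; this normalisation is likewise a choice of the sketch.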
Statistical Models for Text Segmentation - Machine Learning, 1999
Abstract - Cited by 1 (0 self)
This paper introduces a new statistical approach to automatically partitioning text into coherent segments. The approach is based on a technique that incrementally builds an exponential model to extract features that are correlated with the presence of boundaries in labeled training text. The models use two classes of features: topicality features that use adaptive language models in a novel way to detect broad changes of topic, and cue-word features that detect occurrences of specific words, which may be domain-specific, that tend to be used near segment boundaries. Assessment of our approach on quantitative and qualitative grounds demonstrates its effectiveness in two very different domains, Wall Street Journal news articles and television broadcast news story transcripts. Quantitative results on these domains are presented using a new probabilistically motivated error metric, which combines precision and recall in a natural and flexible way. This metric is used to make a quantitative ...

