Measuring Historical Word Sense Variation
BibTeX
@MISC{Bamman_measuringhistorical,
author = {David Bamman and Gregory Crane},
title = {Measuring Historical Word Sense Variation},
year = {}
}
Abstract
We describe a method for automatically identifying word sense variation in a dated collection of historical books in a large digital library. By leveraging a small set of known translation book pairs to induce a bilingual sense inventory and labeled training data for a WSD classifier, we are able to automatically classify the Latin word senses in a 389-million-word corpus and track the rise and fall of those senses over a span of two thousand years. We evaluate the performance of seven different classifiers both in a tenfold test on 83,892 words from the aligned parallel corpus and on a smaller, manually annotated sample of 525 words, measuring both the overall accuracy of each system and how well that accuracy correlates (via mean square error) with the observed historical variation.







