Results 1 - 4 of 4
UMass at TDT 2004
- In Working Notes of the TDT-2004 Evaluation, 2004
Cited by 6 (0 self)
... submitted runs for all four tasks, namely, Hierarchical Topic Detection, ...
An Information-Theoretic Approach to Automatic Evaluation of Summaries
Cited by 5 (0 self)
Until recently there were no common, convenient, and repeatable evaluation methods that could be easily applied to support fast turn-around development of automatic text summarization systems. In this paper, we introduce an information-theoretic approach to automatic evaluation of summaries based on the Jensen-Shannon divergence of distributions between an automatic summary and a set of reference summaries. Several variants of the approach are also considered and compared. The results indicate that the JS divergence-based evaluation method achieves performance comparable to the common automatic evaluation method ROUGE on the single-document summarization task, and better performance than ROUGE on the multi-document summarization task.
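The Jensen-Shannon divergence mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: whitespace tokenization, unsmoothed unigram distributions, pooling all reference summaries into one distribution, and base-2 logarithms are all assumptions made here for illustration, and the function names are hypothetical.

```python
import math
from collections import Counter


def unigram_dist(tokens):
    """Maximum-likelihood unigram distribution (assumption: no smoothing)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def kl(p, m):
    """KL(p || m) in bits; m must be nonzero wherever p is nonzero."""
    return sum(pw * math.log2(pw / m[w]) for w, pw in p.items() if pw > 0)


def js_divergence(p, q):
    """JS(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m), with m = (p + q) / 2."""
    vocab = set(p) | set(q)
    m = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in vocab}
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def js_score(summary, references):
    """Divergence between a summary and the pooled reference summaries.

    Lower values mean the summary's word distribution is closer to the
    references. Pooling the references is an assumption of this sketch.
    """
    p = unigram_dist(summary.lower().split())
    ref_tokens = [t for r in references for t in r.lower().split()]
    q = unigram_dist(ref_tokens)
    return js_divergence(p, q)
```

With base-2 logarithms the divergence lies between 0 (identical distributions) and 1 (no shared vocabulary), so candidate summaries can be ranked by ascending `js_score`.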
ARBEITEN ZUR MEHRSPRACHIGKEIT WORKING PAPERS IN MULTILINGUALISM
"... Combining various text analysis tools for multilingual media monitoring ..."
An Arabic Lemma-Based Stemmer for Latent Topic Modeling
- IAJIT First Online Publication, 2010
Developments in Arabic information retrieval have not kept pace with the increasing use of the Arabic Web during the last decade. Semantic indexing in a language with high inflectional morphology, such as Arabic, is not a trivial task and requires text analysis in the original language. Apart from cross-language retrieval methods and a few limited studies, the main efforts to develop semantic analysis methods and topic modeling have not included Arabic text. This paper describes our approach to analyzing semantics in Arabic texts. A new lemma-based stemmer is developed and compared to a root-based one for characterizing Arabic text. The Latent Dirichlet Allocation (LDA) model is adapted to extract Arabic latent topics from various real-world corpora. In addition to the interesting subjects discovered in press articles from the 2007-2009 period, experiments show that classification performance in the topic space improves with lemma-based stemming compared to root-based stemming.

