The Measurement of Textual Coherence with Latent Semantic Analysis
, 1998
Cited by 107 (8 self)
Latent Semantic Analysis is used as a technique for measuring the coherence of texts. By comparing the vectors for two adjoining segments of text in a high-dimensional semantic space, the method provides a characterization of the degree of semantic relatedness between the segments. We illustrate the approach for predicting coherence through re-analyzing sets of texts from two studies that manipulated the coherence of texts and assessed readers' comprehension. The results indicate that the method is able to predict the effect of text coherence on comprehension and is more effective than simple term-term overlap measures. In this manner, LSA can be applied as an automated method that produces coherence predictions similar to propositional modeling. We describe additional studies investigating the application of LSA to analyzing discourse structure and examine the potential of LSA as a psychological model of coherence effects in text comprehension.
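The comparison of adjoining segment vectors described in this abstract can be illustrated with a minimal sketch. It assumes each segment has already been mapped to a vector in some semantic space and that relatedness is measured as the cosine between neighbouring vectors. The function names and the 3-dimensional toy vectors are hypothetical and do not come from the paper.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an empty segment as unrelated to its neighbour
    return dot / (norm_u * norm_v)

def adjacent_coherence(segment_vectors):
    """Cosine between each segment vector and the one that follows it."""
    return [cosine(u, v) for u, v in zip(segment_vectors, segment_vectors[1:])]

def mean_coherence(segment_vectors):
    """Average adjacent-segment cosine, used here as one text-level score."""
    scores = adjacent_coherence(segment_vectors)
    return sum(scores) / len(scores) if scores else 0.0

# Hypothetical toy vectors; a real LSA space has far more dimensions.
segments = [
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
scores = adjacent_coherence(segments)
```

In this toy input the first two segments share a dimension and score well above zero, while the last pair is orthogonal and scores zero, which is the kind of drop a coherence break would produce.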
Stylistic Experiments For Information Retrieval
, 2000
Cited by 47 (8 self)
Information retrieval systems are built to handle texts as topical items: texts are tabulated by occurrence frequencies of content words in them, under the assumption that text topic is reasonably well modeled by content word occurrence. But texts have several interesting characteristics beyond topic. The experiments described in this text investigate stylistic variation. Roughly put, style is the difference between two ways of saying the same thing -- and systematic stylistic variation can be used to characterize the genre of documents. These experiments investigate if stylistic information is distinguishable using simple language engineering methods, and if in that case this type of information can be used to improve information retrieval systems.
Weight functions impact on LSA performance
- EuroConference RANLP'2001 (Recent Advances in NLP)
, 2001
Cited by 13 (3 self)
This paper presents experimental results of usage of LSA for analysis of English literature texts. Several preliminary transformations of the frequency text-document matrix with different weight functions are tested on the basis of control subsets. Additional clustering based on correlation matrix is applied in order to reveal the latent structure. The algorithm creates a shaded form matrix via singular values and vectors. The results are interpreted as a quality of the transformations and compared to the control set tests.
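The abstract does not say which weight functions were tested. As an illustration only, the sketch below applies log-entropy weighting, a transformation commonly used before LSA, to a small frequency matrix. Choosing log-entropy is an assumption, and the function name and toy counts are hypothetical.

```python
import math

def log_entropy_weight(counts):
    """Apply log-entropy weighting to a term-by-document frequency matrix.

    counts: list of rows, one per term; each row holds raw frequencies
    of that term in every document.
    """
    n_docs = len(counts[0])
    weighted = []
    for row in counts:
        total = sum(row)
        # Sum of p * log(p) over documents where the term occurs (<= 0).
        entropy = 0.0
        for tf in row:
            if tf > 0:
                p = tf / total
                entropy += p * math.log(p)
        # Global weight: 1 for a term concentrated in one document,
        # 0 for a term spread evenly over all documents.
        global_w = 1.0 + entropy / math.log(n_docs) if n_docs > 1 else 1.0
        # Local weight: log(1 + tf) dampens high raw frequencies.
        weighted.append([math.log(1 + tf) * global_w for tf in row])
    return weighted

# Hypothetical counts: term 0 is spread evenly, term 1 is concentrated.
counts = [
    [2, 2, 2],
    [3, 0, 0],
]
weights = log_entropy_weight(counts)
```

The evenly spread term is weighted down to roughly zero, while the concentrated term keeps its full local weight, so the later singular value decomposition is driven by terms that discriminate between documents.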
The Analysis of Reading Tasks and Texts
- Aspects of Code-Switching in the Discourse of Bilingual Mexican-American Children
, 1977
Cited by 8 (3 self)
which is/are unavailable. 12-1-84 CENTER FOR THE STUDY OF READING Technical Report No. 43
The Principles of Readability
- Costa Mesa, CA: Impact Information
, 2004
Cited by 7 (0 self)
The principles of readability are in every style manual. Readability formulas are in every word processor. What is missing is the research and theory on which they stand.
Producing More Readable Extracts by Revising Them.
, 2000
Cited by 6 (0 self)
In this paper, we first experimentally investigated the factors that make extracts hard to read. We did this by having human subjects try to revise extracts to produce more readable ones. We then classified the factors into five, most of which are related to cohesion, after which we devised revision rules for each factor, and partially implemented a system that revises extracts.
Analyzing writing styles with Coh-Metrix
- In Proceedings of the Florida Artificial Intelligence Research Society International Conference (FLAIRS)
, 2006
Cited by 6 (2 self)
Computer scientists, linguists, stylometricians, and cognitive scientists have successfully divided corpora into modes, domains, genres, registers, and authors. The limitations for these successes, however, often result from insufficient indices with which their corpora are analyzed. In this paper, we use Coh-Metrix, a computational tool that analyzes text on over 200 indices of cohesion and difficulty. We demonstrate how, with the benefit of statistical analysis, texts can be analyzed for subtle, yet meaningful differences. In this paper, we report evidence that authors within the same register can be computationally distinguished despite evidence that stylistic markers can also shift significantly over time.
An Analysis of Statistical Models and Features for Reading Difficulty Prediction
Cited by 6 (1 self)
A reading difficulty measure can be described as a function or model that maps a text to a numerical value corresponding to a difficulty or grade level. We describe a measure of readability that uses a combination of lexical features and grammatical features that are derived from subtrees of syntactic parses. We also tested statistical models for nominal, ordinal, and interval scales of measurement. The results indicate that a model for ordinal regression, such as the proportional odds model, using a combination of grammatical and lexical features is most effective at predicting reading difficulty.
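To make the proportional odds model mentioned in this abstract concrete, here is a minimal sketch of how it turns one linear feature score into a probability for each ordered grade level. In its standard form the model sets logit P(Y <= j | x) = theta_j - beta·x with increasing thresholds theta_j. The threshold values, the score, and the function names below are hypothetical, not values from the paper.

```python
import math

def sigmoid(z):
    """Logistic function, the inverse of the logit."""
    return 1.0 / (1.0 + math.exp(-z))

def proportional_odds_probs(score, thresholds):
    """Probability of each ordered level under a proportional odds model.

    score: the linear predictor beta·x for one text.
    thresholds: increasing cut points theta_1 < ... < theta_{K-1}.
    Returns K probabilities, one per level, that sum to 1.
    """
    # Cumulative probabilities P(Y <= j); the last level always reaches 1.
    cumulative = [sigmoid(t - score) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(Y = j) = P(Y <= j) - P(Y <= j-1)
        previous = c
    return probs

# Hypothetical example: four grade levels separated by three thresholds.
probs = proportional_odds_probs(score=0.5, thresholds=[-1.0, 0.5, 2.0])
```

Because every level shares the same coefficients and only the thresholds differ, a higher score shifts probability mass toward higher levels without reordering them, which is what makes the model suited to an ordinal scale.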
Readers' Comprehension And Strategies In Linear Text And Hypertext
, 1993
Cited by 4 (0 self)
Hypertexts present methods to read online texts that are different from those available when reading standard linear texts. Hypertexts give readers more flexibility in choosing paths through the text and in finding relevant information. However, research in hypertext has often shown little or no advantages over the equivalent linear text.
Information Retrieval for Education: Making Search Engines Language Aware
- Themes in Science and Technology Education. Special issue on computer-aided language analysis, teaching and learning: Approaches, perspectives and applications 3(1–2), 9–30
, 2010
Cited by 4 (3 self)
Search engines have been a major factor in making the web the successful and widely used information source it is today. Generally speaking, they make it possible to retrieve web pages on a topic specified by the keywords entered by the user. Yet web searching currently does not take into account which of the search results are comprehensible for a given user – an issue of particular relevance when considering students in an educational setting. And current search engines do not support teachers in searching for language properties relevant for selecting texts appropriate for language students at different stages in the second language acquisition process. At the same time, raising language awareness is a major focus in second language acquisition research and foreign language teaching practice, and research since the 20s has tried to identify indicators predicting which texts are comprehensible for readers at a particular level of ability. For example, the military has been interested in ensuring that workers at a given level of education can understand the manuals they need to read in order to perform their job. We present a new search engine approach which makes it possible for teachers to search for texts both in terms of contents and in terms of their reading difficulty and other language properties. The implemented prototype builds on state-of-the-art information retrieval technology and exemplifies how a range of readability measures can be integrated in a modular fashion.

