Results 1 - 10 of 1,124,800
Managing Gigabytes: Compressing and Indexing Documents and Images - Errata, 1996
Cited by 985 (48 self)
> → "GZip" page 64, Table 2.5, line "progp": "43,379" → "49,379" page 68, Table 2.6: "Mbyte/sec" → "Mbyte/min" twice in the body of the table, and in the caption "Mbyte/second" → "Mbyte/minute" page 70, para 4, line 5: "Santos" → "Santis" page 71, line 11: "Fiala and Greene (1989)" → "Fiala and Green (1989)" Chapter Three page 89, para starting "Using this method", line 2: "hapax legomena " → "hapax legomenon " page 96, line 5: "a such a" → "such a" page 98, line 6: "shows that in fact none is an answer to this query" → "shows that only document 2 is an answer to this query" page 106, para 3, line 9: "the bitstring in Figure 3.7b" → "the bitstring in Figure 3.7c" page 107, Figure 3.7: The coding shown in part (c) cannot be decoded ambiguously. For example, the sequence "1010 0000 0001 0000
Machine Learning in Automated Text Categorization
- ACM COMPUTING SURVEYS, 2002
"... The automated categorization (or classification) of texts into predefined categories has witnessed a booming interest in the last ten years, due to the increased availability of documents in digital form and the ensuing need to organize them. In the research community the dominant approach to this p ..."
Cited by 1658 (22 self)
Parallel Networks that Learn to Pronounce English Text
- COMPLEX SYSTEMS, 1987
"... This paper describes NETtalk, a class of massively-parallel network systems that learn to convert English text to speech. The memory representations for pronunciations are learned by practice and are shared among many processing units. The performance of NETtalk has some similarities with observed h ..."
Cited by 548 (5 self)
The JPEG still picture compression standard
- Communications of the ACM, 1991
"... This paper is a revised version of an article by the same title and author which appeared in the April 1991 issue of Communications of the ACM. For the past few years, a joint ISO/CCITT committee known as JPEG (Joint Photographic Experts Group) has been working to establish the first international compression standard for continuous-tone still images, both grayscale and color. JPEG’s proposed standard aims to be generic, to support a wide variety of applications for continuous-tone images. To meet the differing needs of many applications, the JPEG standard includes two basic compression methods, each ..."
Cited by 1128 (0 self)
An evaluation of statistical approaches to text categorization
- Journal of Information Retrieval, 1999
"... Abstract. This paper focuses on a comparative evaluation of a wide-range of text categorization methods, including previously published results on the Reuters corpus and new results of additional experiments. A controlled study using three classifiers, kNN, LLSF and WORD, was conducted to examine th ..."
Cited by 664 (23 self)
Inductive Learning Algorithms and Representations for Text Categorization, 1998
"... Text categorization – the assignment of natural language texts to one or more predefined categories based on their content – is an important component in many information organization and management tasks. We compare the effectiveness of five different automatic learning algorithms for text categori ..."
Cited by 641 (8 self)
Toward a model of text comprehension and production
- Psychological Review, 1978
"... The semantic structure of texts can be described both at the local microlevel and at a more global macrolevel. A model for text comprehension based on this notion accounts for the formation of a coherent semantic text base in terms of a cyclical process constrained by limitations of working memory. ..."
Cited by 540 (12 self)
A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge
- PSYCHOLOGICAL REVIEW, 1997
"... How do people know as much as they do with as little information as they get? The problem takes many forms; learning vocabulary from text is an especially dramatic and convenient case for research. A new general theory of acquired similarity and knowledge representation, latent semantic analysis (LS ..."
Cited by 1772 (10 self)
Extracting Relations from Large Plain-Text Collections, 2000
"... Text documents often contain valuable structured data that is hidden in regular English sentences. This data is best exploited if available as a relational table that we could use for answering precise queries or for running data mining tasks. We explore a technique for extracting such tables fr ..."
Cited by 480 (25 self)
BoosTexter: A Boosting-based System for Text Categorization
"... This work focuses on algorithms which learn from examples to perform multiclass text and speech categorization tasks. Our approach is based on a new and improved family of boosting algorithms. We describe in detail an implementation, called BoosTexter, of the new boosting algorithms for text catego ..."
Cited by 658 (20 self)