Lightweight Collaborative Filtering Method for Binary Encoded Data
- Proceedings of the Fifth European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD
, 2001
Cited by 3 (1 self)
A lightweight method for collaborative filtering is described that processes binary encoded data. Examples of transactions that can be described in this manner are items purchased by customers or web pages visited by individuals. As with all collaborative filtering, the objective is to match a person's records to customers with similar records. For example, based on prior purchases of a customer, one might recommend new items for purchase by examining stored records of other customers who made similar purchases. Because the data are binary (true-or-false) encoded, and not ranked preferences on a numerical scale, efficient and lightweight schemes are described for compactly storing data, computing similarities between new and stored records, and making recommendations tailored to an individual.
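The abstract names three steps: compact storage, similarity between a new record and stored records, and recommendation. A minimal sketch of how these could fit together follows. The paper's actual scheme is not given here, so the integer bitmask encoding, the Jaccard similarity, the similarity-weighted voting, and all function and parameter names are assumptions made for illustration.

```python
from collections import Counter


def encode(items, index):
    """Pack a set of item names into one integer bitmask.

    `index` maps each item name to a bit position (hypothetical layout).
    """
    mask = 0
    for item in items:
        mask |= 1 << index[item]
    return mask


def jaccard(a, b):
    """Shared items divided by all items in either record (assumed measure)."""
    union = bin(a | b).count("1")
    return bin(a & b).count("1") / union if union else 0.0


def recommend(new_mask, stored, index, top_n=2, n_items=3):
    """Suggest items held by the most similar stored records.

    Items already present in the new record are never suggested.
    """
    neighbours = sorted(stored, key=lambda m: jaccard(new_mask, m), reverse=True)
    votes = Counter()
    for mask in neighbours[:top_n]:
        weight = jaccard(new_mask, mask)
        candidates = mask & ~new_mask  # items the neighbour has, the customer lacks
        for item, bit in index.items():
            if (candidates >> bit) & 1:
                votes[item] += weight
    return [item for item, _ in votes.most_common(n_items)]


# Toy data, invented for this example.
index = {"bread": 0, "milk": 1, "eggs": 2, "tea": 3, "jam": 4}
stored = [
    encode({"bread", "milk", "eggs"}, index),
    encode({"bread", "milk", "jam"}, index),
    encode({"tea"}, index),
]
suggestions = recommend(encode({"bread", "milk"}, index), stored, index)
```

Because each record is a single integer, comparing two records costs one bitwise AND, one OR, and two bit counts, which is one way to read "lightweight" in this setting.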
Lightweight Document Clustering
, 2000
Cited by 2 (1 self)
A lightweight document clustering method is described that operates in high dimensions, processes tens of thousands of documents and groups them into several thousand clusters, or, by varying a single parameter, into a few dozen clusters. The method uses a reduced indexing view of the original documents, where only the k best keywords of each document are indexed. An efficient procedure for clustering is specified in two parts: (a) compute the k most similar documents for each document in the collection and (b) group the documents into clusters using these similarity scores. The method has been evaluated on a database of over 50,000 customer service problem reports that are reduced to 3,000 clusters and 5,000 exemplar documents. Results demonstrate efficient clustering performance with excellent group similarity measures. Keywords: text clustering, structuring information to aid search and navigation, automated presentation of information, text data mining
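The two-part procedure in this abstract can be sketched roughly as below. The abstract does not say how keywords are scored, how similarity is measured, or how groups are formed, so the rarity-weighted keyword score, the shared-keyword count, the union-find grouping, and the `min_shared` threshold are all assumptions, as are the function names.

```python
from collections import Counter, defaultdict


def top_keywords(docs, k):
    """Reduced indexing view: keep k keywords per document.

    Score = term count / document frequency (an assumed stand-in for "best").
    """
    df = Counter(w for d in docs for w in set(d.split()))
    reduced = []
    for d in docs:
        tf = Counter(d.split())
        ranked = sorted(tf, key=lambda w: (-tf[w] / df[w], w))
        reduced.append(set(ranked[:k]))
    return reduced


def nearest(reduced, k):
    """Part (a): for each document, the k documents sharing the most keywords."""
    inverted = defaultdict(set)
    for i, keywords in enumerate(reduced):
        for w in keywords:
            inverted[w].add(i)
    result = []
    for i, keywords in enumerate(reduced):
        shared = Counter(j for w in keywords for j in inverted[w] if j != i)
        result.append(shared.most_common(k))
    return result


def cluster(neighbours, min_shared):
    """Part (b): merge documents linked by at least min_shared keywords."""
    parent = list(range(len(neighbours)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, links in enumerate(neighbours):
        for j, score in links:
            if score >= min_shared:
                parent[find(i)] = find(j)
    groups = defaultdict(list)
    for i in range(len(neighbours)):
        groups[find(i)].append(i)
    return sorted(groups.values())


# Toy problem reports, invented for this example.
docs = [
    "printer jam paper tray",
    "paper jam in printer",
    "password reset email",
    "reset password link email",
]
links = nearest(top_keywords(docs, 3), 2)
loose = cluster(links, 2)
strict = cluster(links, 3)
```

In this sketch a lower `min_shared` merges more documents into fewer clusters and a higher one yields more, smaller clusters. It plays a role similar to the single tuning parameter the abstract mentions, though the paper's own parameter may differ.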
Automated Generation of Model Cases for Help-Desk Applications
- Research Report RC-22061, IBM Thomas J. Watson Research
, 2001
Cited by 2 (1 self)
Document databases may be ill-formed, containing redundant and poorly organized documents. For example, a database of customers' descriptions of problems with products and the vendor's descriptions of their resolution may contain many descriptions of the same problem. A highly desirable goal is to transform the database into a concise set of summarized reports, model cases, which in turn are more amenable to search and problem resolution without expert intervention. In this paper, we describe techniques that try to automate the procedures for reducing a database to its essential components. Our initial application is self-help for resolution of product problems. A lightweight document clustering method is described that operates in high dimensions, processes tens of thousands of documents and groups them into several thousand clusters. Techniques for summarization and exemplar selection are described to further refine the database contents. The method has been evaluated on a database of over 100,000 customer service problem reports that are reduced to 3,000 clusters and 5,000 exemplar documents. Preliminary results are promising and demonstrate efficient clustering performance with excellent group similarity measures, reducing the original database size by several orders of magnitude.
Automated Generation
In this paper, we describe techniques for attempting to automate the procedures for reducing a database to its essential components. Our initial application is self-help for resolution of product problems. A lightweight document clustering method is described that operates in high dimensionality, processing tens of thousands of documents and grouping them into several thousand clusters. Techniques are described for summarization and exemplar selection to further refine the database contents. The method has been evaluated on a database of over 100 000 customer-service problem reports that are reduced to 3000 clusters and 5000 exemplar documents. Preliminary results are promising and demonstrate efficient clustering performance with excellent group similarity measures, reducing the original database size by several orders of magnitude.

