Results 1 - 10 of 2,620
Reexamining the Cluster Hypothesis: Scatter/Gather on Retrieval Results

by Marti A. Hearst, Jan O. Pedersen, 1996
"... We present Scatter/Gather, a cluster-based document browsing method, as an alternative to ranked titles for the organization and viewing of retrieval results. We systematically evaluate Scatter/Gather in this context and find significant improvements over similarity search ranking alone. This resul ..."
Cited by 480 (5 self)

Survey of clustering data mining techniques

by Pavel Berkhin, 2002
"... Accrue Software, Inc. Clustering is a division of data into groups of similar objects. Representing the data by fewer clusters necessarily loses certain fine details, but achieves simplification. It models data by its clusters. Data modeling puts clustering in a historical perspective rooted in math ..."
Cited by 408 (0 self)
applications such as scientific data exploration, information retrieval and text mining, spatial database applications, Web analysis, CRM, marketing, medical diagnostics, computational biology, and many others. Clustering is the subject of active research in several fields such as statistics, pattern

Blobworld: A System for Region-Based Image Indexing and Retrieval

by Chad Carson, Megan Thomas, Serge Belongie, Joseph M. Hellerstein, Jitendra Malik - In Third International Conference on Visual Information Systems, 1999
"... Blobworld is a system for image retrieval based on finding coherent image regions which roughly correspond to objects. Each image is automatically segmented into regions ("blobs") with associated color and texture descriptors. Querying is based on the attributes of one or two regions of ..."
Cited by 375 (4 self)

Aggregating inconsistent information: ranking and clustering

by Nir Ailon, Moses Charikar, Alantha Newman, 2005
"... We address optimization problems in which we are given contradictory pieces of input information and the goal is to find a globally consistent solution that minimizes the number of disagreements with the respective inputs. Specifically, the problems we address are rank aggregation, the feedback arc ..."
Cited by 226 (17 self)

Data Clustering: 50 Years Beyond K-Means

by Anil K. Jain, 2008
"... Organizing data into sensible groupings is one of the most fundamental modes of understanding and learning. As an example, a common scheme of scientific classification puts organisms into taxonomic ranks: domain, kingdom, phylum, class, etc. Cluster analysis is the formal study of algorithms and m ..."
Cited by 294 (7 self)

The Universal Protein Resource (UniProt): an expanding universe of protein information

by Cathy H. Wu, Rolf Apweiler, Amos Bairoch, Darren A. Natale, Winona C. Barker, Brigitte Boeckmann, Serenella Ferro, Elisabeth Gasteiger, Hongzhan Huang, Rodrigo Lopez, Michele Magrane, Maria J. Martin, Raja Mazumder, Nicole Redaschi, Baris Suzek - Nucleic Acids Res, 2006
"... The Universal Protein Resource (UniProt) provides a central resource on protein sequences and functional annotation with three database components, each addressing a key need in protein bioinformatics. The UniProt Knowledgebase (UniProtKB), comprising the manually annotated UniProtKB/Swiss-Prot sec ..."
Cited by 302 (20 self)
ProtKB/Swiss-Prot section and the automatically annotated UniProtKB/TrEMBL section, is the preeminent storehouse of protein annotation. The extensive cross-references, functional and feature annotations and literature-based evidence attribution enable scientists to analyse proteins and query across databases. The Uni

Learning to cluster web search results

by Gaojie He, Robert Neumayer (co-supervisor), Kjetil Norvag - In Proc. of SIGIR ’04, 2004
"... In web search, surfers are often faced with the problem of selecting their most wanted information from the potential huge amount of search results. The clustering of web search results is the possible solution, but the traditional content based clustering is not sufficient since it ignores many uni ..."
Cited by 195 (7 self)
of this project is to integrate the authoritative information such as PageRank, link structure (e.g. in-links and out-links) into the K-Means clustering of web search results. The PageRank, inlinks and out-links can be used to extend the vector representation of web pages, and the PageRank can also be considered

Multi-way clustering on relation graphs

by Arindam Banerjee, Sugato Basu, Srujana Merugu - In Proc. of the 7th SIAM Intl. Conf. on Data Mining, 2006
"... A number of real-world domains such as social networks and e-commerce involve heterogeneous data that describes relations between multiple classes of entities. Understanding the natural structure of this type of heterogeneous relational data is essential both for exploratory analysis and for perform ..."
Cited by 36 (3 self)
data. We accomplish the above generalizations by extending a recently proposed key theoretical result, namely the minimum Bregman information principle [1], to the relation graph setting. We also describe an efficient multi-way clustering algorithm based on alternate minimization that generalizes a

Ranking-based clustering of heterogeneous information networks with star network schema

by Yizhou Sun, Yintao Yu, Jiawei Han - In Proc. 2009 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD 2009), 2009
"... A heterogeneous information network is an information network composed of multiple types of objects. Clustering on such a network may lead to better understanding of both hidden structures of the network and the individual role played by every object in each cluster. However, although clustering on ..."
Cited by 85 (30 self)
and the recently proposed algorithm, RankClus. Further, NetClus generates informative clusters, presenting good ranking and cluster membership information for each attribute object in each net-cluster.

Efficient IR-Style Keyword Search over Relational Databases

by Vagelis Hristidis, Luis Gravano, Yannis Papakonstantinou - In VLDB, 2003
"... Applications in which plain text coexists with structured data are pervasive. Commercial relational database management systems (RDBMSs) generally provide querying capabilities for text attributes that incorporate state-of-the-art information retrieval (IR) relevance ranking strategies, but this sea ..."
Cited by 211 (10 self)