Results 1 - 5 of 5
Semisupervised learning of hierarchical latent trait models for data visualisation
- IEEE Transactions on Knowledge and Data Engineering, 2005
Cited by 6 (0 self)
Recently, we have developed the hierarchical Generative Topographic Mapping (HGTM), an interactive method for visualisation of large high-dimensional real-valued data sets. In this paper, we propose a more general visualisation system by extending HGTM in three ways, which allow the user to visualise a wider range of datasets and better support the model development process. (i) We integrate HGTM with noise models from the exponential family of distributions. The basic building block is the Latent Trait Model (LTM). This enables us to visualise data of inherently discrete nature, e.g. collections of documents, in a hierarchical manner. (ii) We give the user a choice of initialising the child plots of the current plot in either interactive or automatic mode. In the interactive mode the user selects “regions of interest”, whereas in the automatic mode an unsupervised minimum message length (MML)-inspired construction of a mixture of LTMs is employed. The unsupervised construction is particularly useful when high-level plots are covered with dense clusters of highly overlapping data projections, making it difficult to use the interactive mode. Such a situation often arises when visualising large data sets. (iii) We derive general formulas for magnification factors in latent trait models. Magnification factors are a useful tool to improve our understanding of the visualisation plots, since they can highlight the boundaries between data clusters. We illustrate our approach on a toy example and evaluate it on three more complex real data sets.
Visual mining of powersets with large alphabets
- University of British Columbia, 2006
Cited by 1 (0 self)
We present the PowerSetViewer visualization system for the lattice-based mining of powersets. Searching for items within the powerset of a universe occurs in many large dataset knowledge discovery contexts. Using a spatial layout based on a powerset provides a unified visual framework at three different levels: data mining on the filtered dataset, browsing the entire dataset, and comparing multiple datasets sharing the same alphabet. The features of our system allow users to find appropriate parameter settings for data mining algorithms through lightweight visual experimentation showing partial results. We use dynamic constrained frequent set mining as a concrete case study to showcase the utility of the system. The key challenge for spatial layouts based on powerset structure is handling large alphabets, because the size of the powerset grows exponentially with the size of the alphabet. We present scalable algorithms for enumerating and displaying datasets containing between 1.5 and 7 million itemsets, and alphabet sizes of over 40,000.
One Dimensional Layout Optimization, with Applications to Graph Drawing by Axis Separation
- Computational Geometry: Theory and Applications
Cited by 1 (1 self)
In this paper we discuss a useful family of graph drawing algorithms, characterized by their ability to draw graphs in one dimension. We define the special requirements from such algorithms and show how several graph drawing techniques can be extended to handle this task. In particular, we suggest a novel optimization algorithm that facilitates using the Kamada and Kawai model [17] for producing one-dimensional layouts. The most important application of the algorithms seems to be in achieving graph drawing by axis separation, where each axis of the drawing addresses different aspects of aesthetics.
FpVAT: A Visual Analytic Tool for Supporting Frequent Pattern Mining
Cited by 1 (0 self)
As frequent pattern mining plays an essential role in many knowledge discovery and data mining (KDD) tasks, numerous algorithms for finding frequent patterns have been proposed over the past 15 years. However, most of these algorithms return the mining results in the form of textual lists containing frequent patterns showing those frequently occurring sets of items. It is well known that “a picture is worth a thousand words”. The use of visual representation can enhance the user’s understanding of the inherent relations in a collection of frequent patterns. In this paper, we develop a simple yet useful visual analytic tool for supporting frequent pattern mining called FpVAT. Such a visual analytic tool consists of two modules: one module gives users an overview so that they can derive insight from a massive amount of raw data; another module enables users to perform analytical reasoning on the mining results via interactive visual interfaces so that users can detect the expected frequent patterns and discover the unexpected frequent patterns. As a visual analytic tool, our FpVAT is equipped with several interactive features for effective visual support in the data analysis and KDD process for various real-life applications.
A Two-Way Visualization Method for Clustered Data (Extended Abstract)
Yehuda Koren and David Harel, Dept. of Computer Science and Applied Mathematics, The Weizmann Institute of Science, Rehovot, Israel, {yehuda,dharel}@wisdom.weizmann.ac.il
We describe a novel approach to the visualization of hierarchical clustering that superimposes the classical dendrogram over a fully synchronized low-dimensional embedding, thereby gaining the benefits of both approaches. In a single image one can view all the clusters, examine the relations between them and study many of their properties. The method is based on an algorithm for low-dimensional embedding of clustered data, with the property that separation between all clusters is guaranteed, regardless of their nature. In particular, the algorithm was designed to produce embeddings that strictly adhere to a given hierarchical clustering of the data, so that every two disjoint clusters in the hierarchy are drawn separately.

