Results 1 -
4 of
4
Visualizing High-Dimensional Structures by Dimension Ordering and Filtering using Subspace Analysis
"... High-dimensional data visualization is receiving increasing interest because of the growing abundance of highdimensional datasets. To understand such datasets, visualization of the structures present in the data, such as clusters, can be an invaluable tool. Structures may be present in the full high ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
High-dimensional data visualization is receiving increasing interest because of the growing abundance of highdimensional datasets. To understand such datasets, visualization of the structures present in the data, such as clusters, can be an invaluable tool. Structures may be present in the full high-dimensional space, as well as in its subspaces. Two widely used methods to visualize high-dimensional data are the scatter plot matrix (SPM) and the parallel coordinate plot (PCP). SPM allows a quick overview of the structures present in pairwise combinations of dimensions. On the other hand, PCP has the potential to visualize not only bi-dimensional structures but also higher dimensional ones. A problem with SPM is that it suffers from crowding and clutter which makes interpretation hard. Approaches to reduce clutter are available in the literature, based on changing the order of the dimensions. However, usually this reordering has a high computational complexity. For effective visualization of high-dimensional structures, also PCP requires a proper ordering of the dimensions. In this paper, we propose methods for reordering dimensions in PCP in such a way that high-dimensional structures (if present) become easier to perceive. We also present a method for dimension reordering in SPM which yields results that are comparable to those of existing approaches, but at a much lower computational cost. Our approach is based on finding relevant subspaces for clustering using a quality criterion and cluster information.
Perception-based Visual Quality Measures Georgia Albuquerque ∗
"... In recent years diverse quality measures to support the exploration of high-dimensional data sets have been proposed. Such measures can be very useful to rank and select information-bearing projections of very high dimensional data, when the visual exploration of all possible projections becomes unf ..."
Abstract
- Add to MetaCart
In recent years diverse quality measures to support the exploration of high-dimensional data sets have been proposed. Such measures can be very useful to rank and select information-bearing projections of very high dimensional data, when the visual exploration of all possible projections becomes unfeasible. But even though a ranking of the low dimensional projections may support the user in the visual exploration task, different measures deliver different distances between the views that do not necessarily match the expectations of human perception. As an alternative solution, we propose a perception-based approach that, similar to the existing measures, can be used to select information bearing projections of the data. Specifically, we construct a perceptual embedding for the different projections based on the data from a psychophysics study and multi-dimensional scaling. This embedding together with a ranking function is then used to estimate the value of the projections for a specific user task in a perceptual sense.
Synthetic Generation of High-Dimensional Datasets
"... Abstract—Generation of synthetic datasets is a common practice in many research areas. Such data is often generated to meet specific needs or certain conditions that may not be easily found in the original, real data. The nature of the data varies according to the application area and includes text, ..."
Abstract
- Add to MetaCart
Abstract—Generation of synthetic datasets is a common practice in many research areas. Such data is often generated to meet specific needs or certain conditions that may not be easily found in the original, real data. The nature of the data varies according to the application area and includes text, graphs, social or weather data, among many others. The common process to create such synthetic datasets is to implement small scripts or programs, restricted to small problems or to a specific application. In this paper we propose a framework designed to generate high dimensional datasets. Users can interactively create and navigate through multi dimensional datasets using a suitable graphical user-interface. The data creation is driven by statistical distributions based on a few user-defined parameters. First, a grounding dataset is created according to given inputs, and then structures and trends are included in selected dimensions and orthogonal projection planes. Furthermore, our framework supports the creation of complex non-orthogonal trends and classified datasets. It can successfully be used to create synthetic datasets simulating important trends as multidimensional clusters, correlations and outliers. Index Terms—Synthetic data generation, multivariate data, high-dimensional data, interaction. 1
Selecting Coherent and Relevant Plots in Large Scatterplot Matrices
, 2012
"... The scatterplot matrix (SPLOM) is a well-established technique to visually explore high-dimensional data sets. It is characterized by the number of scatterplots (plots) of which it consists of. Unfortunately, this number quadratically grows with the number of the data set’s dimensions. Thus, a SPLOM ..."
Abstract
- Add to MetaCart
The scatterplot matrix (SPLOM) is a well-established technique to visually explore high-dimensional data sets. It is characterized by the number of scatterplots (plots) of which it consists of. Unfortunately, this number quadratically grows with the number of the data set’s dimensions. Thus, a SPLOM scales very poorly. Consequently, the usefulness of SPLOMs is restricted to a small number of dimensions. For this, several approaches already exist to explore such “small ” SPLOMs. Those approaches address the scalability problem just indirectly and without solving it. Therefore, we introduce a new greedy approach to manage “large ” SPLOMs with more than hundred dimensions. We establish a combined visualization and interaction scheme that produces intuitively interpretable SPLOMs by combining known quality measures, a pre-process reordering, and a perception-based abstraction. With this scheme, the user can interactively find large amounts of relevant plots in large SPLOMs.

