An event-based framework for characterizing the evolution of interaction graphs
, 2007
Cited by 95 (3 self)
Interaction graphs are ubiquitous in many fields such as bioinformatics, sociology and physical sciences. There have been many studies in the literature targeted at studying and mining these graphs. However, almost all of them have studied these graphs from a static point of view. The study of the evolution of these graphs over time can provide tremendous insight on the behavior of entities, communities and the flow of information among them. In this work, we present an event-based characterization of critical behavioral patterns for temporally varying interaction graphs. We use non-overlapping snapshots of interaction graphs and develop a framework for capturing and identifying interesting events from them. We use these events to characterize complex behavioral patterns of individuals and communities over time. We show how semantic information can be incorporated to reason about community-behavior events. We also demonstrate the application of behavioral patterns for the purposes of modeling evolution, link prediction and influence maximization. Finally, we present a diffusion model for evolving networks, based on our framework.
CSV: Visualizing and Mining Cohesive Subgraphs
Cited by 28 (7 self)
Extracting dense sub-components from graphs efficiently is an important objective in a wide range of application domains ranging from social network analysis to biological network analysis, from the World Wide Web to stock market analysis. Motivated by this need, we have recently seen several new algorithms to tackle this problem based on the (frequent) pattern mining paradigm. A limitation of most of these methods is that they are highly sensitive to parameter settings, rely on exhaustive enumeration with exponential time complexity, and often fail to help the users understand the underlying distribution of components embedded within the host graph. In this article we propose an approximate algorithm to mine and visualize cohesive subgraphs (dense sub-components) within a large graph. The approach, referred to as Cohesive Subgraph Visualization (CSV), relies on a novel mapping strategy that maps edges and nodes to a multidimensional space wherein dense areas in the mapped space correspond to cohesive subgraphs. The algorithm then walks through the dense regions in the mapped space to output a visual plot that effectively captures the overall dense sub-component distribution of the graph. Unlike extant algorithms with exponential complexity, CSV has a complexity of O(V^2 log V) when fixing the parameter mapping dimension, where V corresponds to the number of vertices in the graph, although for many real datasets the performance is typically sub-quadratic. We demonstrate the utility of CSV as a stand-alone tool for visual graph exploration and as a pre-filtering step to significantly scale up exact subgraph mining algorithms such
Recent advances in clustering methods for protein interaction networks
- BMC Genomics, 11(Suppl 3):S10
, 2010
Cited by 20 (0 self)
Complex Networks as Unified Framework for Descriptive Analysis and Predictive Modeling
Cited by 12 (8 self)
The analysis of climate data has relied heavily on hypothesis-driven statistical methods, while projections of future climate are based primarily on physics-based computational models. However, in recent years a wealth of new datasets has become available. Therefore, we take a more data-centric approach and propose a unified framework for studying climate, with an aim toward characterizing observed phenomena as well as discovering new knowledge in climate science. Specifically, we posit that complex networks are well suited for both descriptive analysis and predictive modeling tasks. We show that the structural properties of 'climate networks' have a useful interpretation within the domain. Further, we extract clusters from these networks and demonstrate their predictive power as climate indices. Our experimental results establish that the network clusters are statistically significantly better predictors than clusters derived using a more traditional clustering approach. Using complex networks as data representation thus enables the unique opportunity for descriptive and predictive modeling to inform each
An Exploration of Climate Data Using Complex Networks
Cited by 11 (5 self)
To discover patterns in historical data, climate scientists have applied various clustering methods with the goal of identifying regions that share some common climatological behavior. However, past approaches are limited by the fact that they either consider only a single time period (snapshot) of multivariate data, or they consider only a single variable by using the time series data as a multi-dimensional feature vector. In both cases, potentially useful information may be lost. Moreover, clusters in high-dimensional data space can be difficult to interpret, prompting the need for a more effective data representation. We address both of these issues by employing a complex network (graph) to represent climate data, a more intuitive model that can be used for analysis while also having a direct mapping to the physical world for interpretation. A cross-correlation function is used to weight network edges, thus respecting the temporal nature of the data, and a community detection algorithm identifies multivariate clusters. Examining networks for consecutive periods allows us to study structural changes over time. We show that communities have a climatological interpretation and that disturbances in structure can be an indicator of climate events (or lack thereof). Finally, we discuss how this model can be applied for the discovery of more complex concepts such as unknown teleconnections or the development of multivariate climate indices and predictive insights.
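As a rough illustration of weighting network edges with a cross-correlation function, the sketch below scores a pair of equal-length time series by the largest absolute Pearson correlation over a small window of lags. The function names, the lag window and the use of the absolute value are assumptions made for this example, not details taken from the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def cross_correlation_weight(x, y, max_lag):
    """Largest absolute correlation between x and y over lags -max_lag..max_lag.

    Assumes len(x) == len(y). Hypothetical edge-weight function for this sketch.
    """
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            xs, ys = x[lag:], y[:len(y) - lag]
        else:
            xs, ys = x[:len(x) + lag], y[-lag:]
        if len(xs) >= 2:
            best = max(best, abs(pearson(xs, ys)))
    return best


def build_network(series, max_lag, threshold):
    """Return {(u, v): weight} for every pair whose weight reaches the threshold."""
    names = sorted(series)
    edges = {}
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            weight = cross_correlation_weight(series[u], series[v], max_lag)
            if weight >= threshold:
                edges[(u, v)] = weight
    return edges
```

Allowing a lag lets two locations whose signals are shifted in time still receive a strong edge, which a plain lag-zero correlation would miss.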
Community Detection in a Large Real-World Social Network
Cited by 10 (1 self)
Identifying meaningful community structure in social networks is a hard problem, and extreme network size or sparseness of the network compound the difficulty of the task. With a proliferation of real-world network datasets there has been an increasing demand for algorithms that work effectively and efficiently. Existing methods are limited by their computational requirements and rely heavily on the network topology, which fails in scale-free networks. Yet, in addition to the network connectivity, many datasets also include attributes of individual nodes, but current methods are unable to incorporate this data. Cognizant of these requirements, we propose a simple approach that steers away from complex algorithms, focusing instead on the edge weights; more specifically, we leverage the node attributes to compute better weights. Our experimental results on a real-world social network show that a simple thresholding method with edge weights based on node attributes is sufficient to identify a very strong community structure.
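A minimal sketch of the thresholding idea, under stated assumptions: edge weights are computed here as the Jaccard similarity of the endpoints' attribute sets (the abstract does not say which weight function the authors use), edges below a threshold are dropped, and the connected components that remain are reported as communities. All names are hypothetical.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity of two attribute collections (assumed weight function)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def threshold_communities(edges, attributes, threshold):
    """Drop edges whose attribute-based weight is below the threshold, then
    return the connected components of what is left as a list of sets."""
    adjacency = defaultdict(set)
    nodes = set(attributes)
    for u, v in edges:
        nodes.update((u, v))
        weight = jaccard(attributes.get(u, ()), attributes.get(v, ()))
        if weight >= threshold:
            adjacency[u].add(v)
            adjacency[v].add(u)

    seen, communities = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        # Depth-first traversal collecting one component.
        stack, component = [start], set()
        seen.add(start)
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        communities.append(component)
    return communities
```

The whole procedure is linear in the number of edges once the weights are known, which is the appeal of thresholding on very large networks.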
Multi-functional Protein Clustering in PPI Networks
Cited by 4 (3 self)
Protein-Protein Interaction (PPI) networks contain valuable information for the isolation of groups of proteins that participate in the same biological function. Many proteins play different roles in the cell by taking part in several processes, but isolating the different processes in which a protein is involved is often a difficult task. In this paper we present a method based on a greedy local search technique to detect functional modules in PPI graphs. The approach is conceived as a generalization of the algorithm PINCoC to generate overlapping clusters of the input interaction graph. Because clusters may overlap, multi-faceted proteins are allowed to belong to different groups corresponding to different biological processes. A comparison of the results obtained by our method with those of other well-known clustering algorithms shows the capability of our approach to detect different and meaningful functional modules.
PINCoC: a Co-Clustering based Approach to Analyze Protein-Protein Interaction Networks
Cited by 4 (4 self)
A novel technique to search for functional modules in a protein-protein interaction network is presented. The network is represented by the adjacency matrix associated with the undirected graph modelling it. The algorithm introduces the concept of quality of a sub-matrix of the adjacency matrix, and applies a greedy search technique for finding locally optimal solutions made of dense submatrices containing the maximum number of ones. An initial random solution, constituted by a single protein, is evolved to search for a locally optimal solution by adding/removing connected proteins that best contribute to improve the quality function. Experimental evaluations carried out on Saccharomyces cerevisiae proteins show that the algorithm is able to efficiently isolate groups of biologically meaningful proteins corresponding to the most compact sets of interactions.
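The greedy add/remove search described above can be sketched as follows. The quality function here is an assumption made only for illustration (internal edges minus a penalty for each missing internal pair); the abstract does not define the quality measure PINCoC actually uses, so this is not the authors' method.

```python
from itertools import combinations


def quality(cluster, adjacency, penalty=1.0):
    """Assumed score: internal edges minus penalty times missing internal pairs."""
    edges = sum(1 for u, v in combinations(sorted(cluster), 2) if v in adjacency[u])
    pairs = len(cluster) * (len(cluster) - 1) // 2
    return edges - penalty * (pairs - edges)


def greedy_module(seed, adjacency, penalty=1.0):
    """Grow a module from a single seed protein by repeatedly applying the
    single add or remove move that most improves the score, stopping at a
    local optimum. `adjacency` maps every node to the set of its neighbours."""
    cluster = {seed}
    current = quality(cluster, adjacency, penalty)
    while True:
        # Candidate moves: add a protein connected to the cluster, or remove a member.
        neighbours = set().union(*(adjacency[u] for u in cluster)) - cluster
        candidates = [cluster | {n} for n in neighbours]
        if len(cluster) > 1:
            candidates += [cluster - {m} for m in cluster]
        best, best_score = None, current
        for candidate in candidates:
            score = quality(candidate, adjacency, penalty)
            if score > best_score:
                best, best_score = candidate, score
        if best is None:
            return cluster
        cluster, current = best, best_score
```

Because every accepted move strictly increases the score and the number of possible clusters is finite, the loop always terminates, though only at a local optimum that depends on the seed.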
Extending Consensus Clustering to Explore Multiple Clustering Views
Cited by 4 (1 self)
Consensus clustering has emerged as an important extension of the classical clustering problem. Given a set of input clusterings of a given dataset, consensus clustering aims to find a single final clustering which is a better fit in some sense than the existing clusterings. There is a significant drawback in generating a single consensus clustering since different input clusterings could differ significantly. In this paper, we develop a new framework, called Multiple Consensus Clustering (MCC), to explore multiple clustering views of a given dataset from a set of input clusterings. Instead of generating a single consensus, MCC organizes the different input clusterings into a hierarchical tree structure and allows for interactive exploration of multiple clustering solutions. A dynamic programming algorithm is proposed to obtain a flat partition from the hierarchical tree using the modularity measure. Multiple consensuses are finally obtained by applying consensus clustering algorithms to each cluster of the partition. Extensive experimental results on 11 real-world data sets and a case study on a Protein-Protein Interaction (PPI) data set demonstrate the effectiveness of our proposed method.
G: Weighted Consensus Clustering for Identifying Functional Modules in Protein-Protein Interaction Networks
- Proceedings of the 2009 International Conference on Machine Learning and Applications, ICMLA '09
Cited by 4 (1 self)
In this article we present a new approach, weighted consensus clustering, to identify the clusters in protein-protein interaction (PPI) networks, where each cluster corresponds to a group of functionally similar proteins. In weighted consensus clustering, different input clustering results are weighted differently, i.e., a weight for each input clustering is introduced and the weights are automatically determined by an optimization process. We evaluate our proposed method with standard measures such as modularity, normalized mutual information (NMI) and the Gene Ontology (GO) consortium database and compare the performance of our approach with other consensus clustering methods. Experimental results demonstrate the effectiveness of our proposed approach.
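To make the role of per-clustering weights concrete, the sketch below combines several input clusterings through a weighted co-association score and merges items whose score exceeds a cutoff. It is only an illustration: the weights are supplied by hand rather than learned by the optimization process the abstract mentions, and the co-association scheme, the cutoff and all names are assumptions.

```python
from itertools import combinations


def weighted_coassociation(clusterings, weights):
    """For each pair of items, the weighted fraction of input clusterings
    that place both items in the same cluster."""
    total = sum(weights)
    items = sorted(clusterings[0])
    scores = {}
    for u, v in combinations(items, 2):
        agree = sum(w for labels, w in zip(clusterings, weights)
                    if labels[u] == labels[v])
        scores[(u, v)] = agree / total
    return scores


def consensus_clusters(clusterings, weights, cutoff=0.5):
    """Merge every pair whose weighted co-association exceeds the cutoff
    (union-find), and return the resulting groups as a sorted list of sets."""
    scores = weighted_coassociation(clusterings, weights)
    parent = {item: item for item in clusterings[0]}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (u, v), score in scores.items():
        if score > cutoff:
            parent[find(u)] = find(v)

    groups = {}
    for item in parent:
        groups.setdefault(find(item), set()).add(item)
    return sorted(groups.values(), key=lambda group: sorted(group))
```

Each input clustering is a dictionary from item to cluster label; giving one clustering a larger weight makes its pairings count for more in the final grouping.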