Complex networks as a unified framework for descriptive analysis and predictive modeling in climate science. Stat Anal Data Mining (2011)

by K Steinhaeuser, Chawla NV, Ganguly AR

Comparing predictive power in climate data: Clustering matters

by Karsten Steinhaeuser, Nitesh V. Chawla, Auroop R. Ganguly - in Advances in Spatial and Temporal Databases
Cited by 3 (1 self)
Abstract. Various clustering methods have been applied to climate, ecological, and other environmental datasets, for example to define climate zones, automate land-use classification, and similar tasks. Measuring the “goodness” of such clusters is generally application-dependent and highly subjective, often requiring domain expertise and/or validation with field data (which can be costly or even impossible to acquire). Here we focus on one particular task: the extraction of ocean climate indices from observed climatological data. In this case, it is possible to quantify the relative performance of different methods. Specifically, we propose to extract indices with complex networks constructed from climate data, which have been shown to effectively capture the dynamical behavior of the global climate system, and compare their predictive power to candidate indices obtained using other popular clustering methods. Our results demonstrate that network-based clusters are statistically significantly better predictors of land climate than any other clustering method, which could lead to a deeper understanding of climate processes and complement physics-based climate models.

Citation Context

...at is the best clustering method for climate datasets? In response we posit that deriving clusters from complex networks, which have been shown to capture the dynamical behavior of the climate system [8, 29, 31, 33, 37], may be an effective approach. This raises the issue of evaluation and validity of discovered clusters as climate indices. As typical of any clustering task, evaluation is highly subjective, relying ...

Complex Networks in Climate Science: Progress, Opportunities and Challenges

by Karsten Steinhaeuser, Nitesh V. Chawla, Auroop R. Ganguly - In Proceedings of the 2010 Conference on Intelligent Data Understanding, CIDU 2010, Mountain View, 2013
Cited by 3 (0 self)
Abstract. Networks have been used to describe and model a wide range of complex systems, both natural as well as man-made. One particularly interesting application in the earth sciences is the use of complex networks to represent and study the global climate system. In this paper, we motivate this general approach, explain the basic methodology, report on the state of the art (including our contributions), and outline open questions and opportunities for future research.

Citation Context

...ng sections, we describe the characteristics of the data and the network construction process in more detail. 2.1. Gridded Climate Data. The most commonly used data in climate network studies to date [3, 4, 18, 19, 20, 21, 23, 24, 25, 32, 33] stems from the NCEP/NCAR Reanalysis Project [9] (available for download at [27]). This dataset is created by assimilating remote and in-situ sensor measurements covering the entire globe and is widel...

Community detection in large-scale networks: a survey and empirical evaluation. WIREs Comput Stat

by Steve Harenberg, Gonzalo Bello, L Gjeltema, Stephen Ranshous, Jitendra Harlalka, Ramona Seay, Kanchana Padmanabhan, Nagiza Samatova, 2014
Cited by 2 (1 self)
Community detection is a common problem in graph data analytics that consists of finding groups of densely connected nodes with few connections to nodes outside of the group. In particular, identifying communities in large-scale networks is an important task in many scientific domains. In this review, we evaluated eight state-of-the-art and five traditional algorithms for overlapping and disjoint community detection on large-scale real-world networks with known ground-truth communities. These 13 algorithms were empirically compared using goodness metrics that measure the structural properties of the identified communities, as well as performance metrics that evaluate these communities against the ground-truth. Our results show that these two types of metrics are not equivalent. That is, an algorithm may perform well in terms of goodness metrics, but poorly in terms of performance metrics, or vice versa.

Anomaly detection in dynamic networks: a survey

by Stephen Ranshous, Shitian Shen, Danai Koutra, Steve Harenberg, Christos Faloutsos, Nagiza F Samatova - Wiley Interdisciplinary Reviews: Computational Statistics, 2015
Cited by 1 (0 self)
Anomaly detection is an important problem with multiple applications, and thus has been studied for decades in various research domains. In the past decade there has been a growing interest in anomaly detection in data represented as networks, or graphs, largely because of their robust expressiveness and their natural ability to represent complex relationships. Originally, techniques focused on anomaly detection in static graphs, which do not change and are capable of representing only a single snapshot of data. As real-world networks are constantly changing, there has been a shift in focus to dynamic graphs, which evolve over time. In this survey, we aim to provide a comprehensive overview of anomaly detection in dynamic networks, concentrating on the state-of-the-art methods. We first describe four types of anomalies that arise in dynamic networks, providing an intuitive explanation, applications, and a concrete example for each. Having established an idea for what constitutes an anomaly, a general two-stage approach to anomaly detection in dynamic networks that is common among the methods is presented. We then construct a two-tiered taxonomy, first partitioning the methods based on the intuition behind their approach, and subsequently subdividing them based on the types of anomalies they detect. Within each of the tier-one categories (community, compression, decomposition, distance, and probabilistic model based), we highlight the major similarities and differences, showing the wealth of techniques derived from similar conceptual approaches. © 2015 The Authors.

... financial systems connecting banks across the world, electric power grids connecting geographically distributed areas, and social networks that connect users, businesses, or customers using relationships such as friendship, collaboration, or transactional interactions. These are examples of dynamic networks, which, unlike static networks, are constantly undergoing changes to their structure or attributes. Possible changes include insertion and deletion of vertices (objects), insertion and deletion of edges (relationships), and modification of attributes (e.g., vertex or edge labels). An important problem over dynamic networks is anomaly detection: finding objects, relationships, or ...

Spatially Penalized Regression for Extremes Dependence Analysis and Prediction: Case of Precipitation Extremes

by Debasish Das, Auroop R. Ganguly, Snigdhansu Chatterjee, Vipin Kumar, Zoran Obradovic
The inability to predict precipitation extremes under nonstationary climate remains a crucial science gap. Precipitation is not a state-variable within climate models, exhibits space-time heterogeneities, and is subject to thresholds and intermittences. Atmospheric variables in the spatiotemporal neighborhood, like temperature, humidity and updraft velocity, are often better predicted than precipitation from these models, and may have information relevant for precipitation extremes. Model-simulated atmospheric variables have been used to enhance model-predicted precipitation extremes in two ways: statistical downscaling routinely uses regression methods including neural networks and recently physics-based formulations have been developed. The former may not generalize under non-stationary climate while the latter is more interpretable but may not be able to discover or

Citation Context

...s decisions. The data mining literature in climate applications have tended to focus on teleconnections (long-range spatial dependence), especially on oceanic influence over regional land climatology [18, 19, 20]. However, while teleconnections are important, local and regional atmospheric conditions typically tend to dominate in the context of climate-related extremes. This is an area where novel data mining...

Towards understanding dominant processes in complex dynamical systems: Case of precipitation extremes

by Debasish Das, Auroop Ganguly, Arindam Banerjee, Zoran Obradovic

Citation Context

...works [17,18,19,21]. The dominant processes concept has been introduced in hydrology domain recently [20]. Climate has emerged as a new field of application for data mining methods. Complex networks [10,11] and sparse regression [5,6,7,12] were shown to be two useful tools for estimating dependence structures between different climate variables and indices. 2.1 Scope of Work and Main Contributions The m...

Descriptive Analysis of the Global Climate System and Predictive Modeling for Uncertainty Reduction in Climate Projections using Complex Networks

by Karsten Steinhaeuser
As evidence in support of anthropogenic climate change continues to mount [1], the study of climate has become a focus of scientific research, political attention, and socioeconomic concern. There are many different aspects to studying climate, from the collection and analysis of historical/observed data to the understanding of current climate and its underlying physical processes and the development of models for

Empirical Comparison of Correlation Measures and Pruning Levels in Complex Networks Representing the Global Climate System

by Alex Pelan, Karsten Steinhaeuser, Nitesh V. Chawla, Dilkushi A. De Alwis Pitts, Auroop R. Ganguly
Abstract. Climate change is an issue of growing economic, social, and political concern. Continued rise in the average temperatures of the Earth could lead to drastic climate change or an increased frequency of extreme events, which would negatively affect agriculture, population, and global health. One way of studying the dynamics of the Earth’s changing climate is by attempting to identify regions that exhibit similar climatic behavior in terms of long-term variability. Climate networks have emerged as a strong analytics framework for both descriptive analysis and predictive modeling of the emergent phenomena. Previously, the networks were constructed using only one measure of similarity, namely the (linear) Pearson cross correlation, and were then clustered using a community detection algorithm. However, nonlinear dependencies are known to exist in climate, which begs the question whether more complex correlation measures are able to capture any such relationships. In this paper, we present a systematic study of different univariate measures of similarity and compare how each affects both the network structure as well as the predictive power of the clusters.

Citation Context

...ables that lead to observed climate phenomena. Complex networks have already been established as an effective means of representation of the climate [1]–[4], both for descriptive and predictive tasks [5]. These networks are constructed from gridded climate data, wherein each vertex represents a grid point (physical location in space) and weighted edges represent the climatic similarity between them (...

Learning Hierarchical Multi-label Classification Trees from Network Data

by Daniela Stojanova, Michelangelo Ceci, Donato Malerba
Abstract. We present an algorithm for hierarchical multi-label classification (HMC) in a network context. It is able to classify instances that may belong to multiple classes at the same time and consider the hierarchical organization of the classes. It assumes that the instances are placed in a network and uses information on the network connections during the learning of the predictive model. Many real world prediction problems have classes that are organized hierarchically and instances that can have pairwise connections. One example is web document classification, where topics (classes) are typically organized into a hierarchy and documents are connected by hyperlinks. Another example, which is considered in this paper, is gene/protein function prediction, where genes/proteins are connected and form protein-to-protein interaction (PPI) networks. Network datasets are characterized by a form of autocorrelation, where the value of a variable at a given node depends on the values of variables at the nodes it is connected with. Combining the hierarchical multi-label classification task with network prediction is thus not trivial and requires the introduction of the new concept of network autocorrelation for HMC. The proposed algorithm is able to profitably exploit network autocorrelation when learning a tree-based prediction model for HMC. The learned model is in the form of a Predictive Clustering Tree (PCT) and predicts multiple (hierarchically organized) labels at the leaves. Experiments show the effectiveness of the proposed approach for different problems of gene function prediction, considering different PPI networks. The results show that different networks introduce different benefits in different problems of gene function prediction.

Citation Context

...a partial order which represents the superclass relationship. Note that each $y_i$ satisfies the hierarchical constraint: $c \in y_i \Rightarrow \forall c' \le_h c : c' \in y_i$ (1). 3.2 Network HMC Following Steinhaeuser et al. [28], we view a training set as a single network of labeled nodes. Formally, the network is defined as an undirected edge-weighted graph G=(V,E), where V is the set of labeled nodes, while E ⊆ {〈u, v, w〉|...
