## 2012a: Causal discovery for climate research using graphical models

Venue: | J. Climate |

Citations: | 5 - 1 self |

### Citations

3035 | The NCEP/NCAR 40-Year Reanalysis Project - Kalnay, Coauthors - 1996 |

1856 |
Investigating causal relations by econometric models and cross-spectral methods
- Granger
- 1969
Citation Context ...provide the basis for formulating new hypotheses regarding the time scale and temporal sequencing of dynamical processes responsible for these connections. Last, the authors propose to use structure learning for climate networks, which are currently based primarily on correlation analysis. While correlation-based climate networks focus on similarity between nodes, independence graphs would provide an alternative viewpoint by focusing on information flow in the network. 1. Introduction One of the best known computational approaches to causality is the concept of Granger causality introduced by Granger (1969). A time series, X, Granger causes a second time series, Y, if past values of X contain information that helps predict future values of Y above and beyond the information contained in the past values of Y alone. Granger causality is implemented by first performing linear regression of the time series and then applying statistical tests on the regression coefficients. Granger causality is thus a measure for predictability based on a linear model and applies only to time series data. Reasoning about causality was put on a more general footing starting in the late 1980s through the introduction o... |
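The Granger causality test described in the context above (linear regression followed by a statistical test on the regression coefficients) can be illustrated with a minimal sketch. It assumes a single lag, synthetic data, and a plain F statistic. The helper names `solve`, `rss`, and `granger_f` are hypothetical and are not taken from the cited papers.

```python
import random


def solve(A, b):
    """Solve the small linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def rss(rows, target):
    """Residual sum of squares of an ordinary least-squares fit of target on rows."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, target))


def granger_f(x, y):
    """F statistic for 'x Granger-causes y' with one lag (an illustrative assumption)."""
    target = y[1:]
    restricted = [[1.0, y[t]] for t in range(len(y) - 1)]   # past of y only
    full = [[1.0, y[t], x[t]] for t in range(len(y) - 1)]   # past of y and x
    rss_r, rss_f = rss(restricted, target), rss(full, target)
    dof = len(target) - 3  # observations minus parameters of the full model
    return (rss_r - rss_f) / (rss_f / dof)


# Synthetic example: x drives y with a one-step delay, but not the reverse.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(500)]
y = [0.0]
for t in range(1, 500):
    y.append(0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.gauss(0, 0.5))

f_xy = granger_f(x, y)  # large: past x improves the prediction of y
f_yx = granger_f(y, x)  # small: past y does not help predict x
```

A large `f_xy` next to a small `f_yx` is the asymmetry the definition describes. A real analysis would choose the lag order and significance threshold with care.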

1394 | A Bayesian method for the induction of probabilistic networks from data
- Cooper, Herskovits
- 1992
Citation Context ...t depends on the sample size. The more samples are available the more reliable the result. Finally, for the CI tests calculated without a VAR model, reliability declines rapidly with increasing number k of conditioning variables Z1, . . . , Zk, so large conditioning sets should be avoided. 3. Structure learning through CI tests There are two primary methods for structure learning. The first method is a score-based search that learns the graphs along with probabilities and uses some type of optimization routine to maximize the fit of the model. The most popular algorithm is the K2 algorithm by Cooper and Herskovits (1992). Numerous other score-based algorithms exist; see Neapolitan (2003). The second method, constraint-based learning, breaks the learning process of a graphical model up into two steps. First CI tests are used to learn as much as possible about the structure of the underlying graph. Once a graph structure is established the probability parameters are learned in the second step. To discover causal hypotheses we only care about the graph structure, so we can simply stop the learning process after the first step and thus never deal with any probability parameters. Both methods have been used success... |

1088 | Using Bayesian networks to analyze expression data - Friedman, Linial, et al. - 2000 |

404 |
Teleconnections in the geopotential height field during the Northern Hemisphere
- Wallace, Gutzler
- 1981
Citation Context ...ausal relationships among four prominent modes of atmospheric low-frequency variability in boreal winter—namely, the Western Pacific Oscillation (WPO), Eastern Pacific Oscillation (EPO), Pacific–North America (PNA) pattern, and North Atlantic Oscillation (NAO). These modes, also known as “atmospheric teleconnections,” are characterized by synchronized low-frequency (longer than typical synoptic time scale of a week) fluctuations in the sea level pressure (SLP) or geopotential height fields at different geographical locations (e.g., Wallace and Gutzler 1981; Barnston and Livezey 1987). [Page header: 1 September 2012, Ebert-Uphoff and Deng, p. 5655.] Some of these modes, for example, NAO and WPO are largely eddy driven (e.g., Benedict et al. 2004; Franzke et al. 2004; Martius et al. 2007; Riviere and Orlanski 2007; Woollings et al. 2008; Riviere 2010; Deng and Jiang 2011), while others such as PNA are partly eddy driven and partly associated with anomalous tropical convective heating, which is often tied to tropical sea surface temperature (SST) variations (e.g., Franzke et al. 2011). To improve the skill of extended-range weather forecasting, it is crucial to identify external factors (e.g., tropical SST an... |

312 |
Correlation and causation
- Wright
- 1921
Citation Context ...es of Y alone. Granger causality is implemented by first performing linear regression of the time series and then applying statistical tests on the regression coefficients. Granger causality is thus a measure for predictability based on a linear model and applies only to time series data. Reasoning about causality was put on a more general footing starting in the late 1980s through the introduction of causal calculus (Rebane and Pearl 1987) and the use of probabilistic graphical models to represent causal relationships. The idea of representing causal structure in a graphical way goes back to Wright (1921, 1934) who defined path diagrams for structural equation models, a concept commonly used in economics to date. Pearl (1988) proposed the use of graphical models to represent probabilistic independence relationships between variables. This approach does not rely on temporal information, so it applies equally to nontemporal and time series data. Spirtes, Glymour, and Scheines (Spirtes et al. 1991, 1993) addressed the problem of detecting hidden common causes, which in turn allowed for causal interpretation of the graphs. These contributions by Pearl and Spirtes et al. laid the foundation for th... |

301 | Bayesian networks without tears
- Charniak
- 1991
Citation Context ... and Y are neighbors in a directed graph and the arrow points from X to Y, then X is called a parent of Y and Y is called a child of X. Probabilistic graphical models combine tools from graph theory with probability theory. Such models are popular for systems containing uncertainty. The most common type is the Bayesian Network, also known as Bayes Net or Belief Network. A Bayesian Network model consists of a directed acyclic graph (DAG) and a probability distribution assigned to each node that defines the probability of the node’s state based on the states of its parents (for more details see Charniak 1991; Jensen and Nielsen 2007; Neapolitan 2003). The Markov Network, also known as Markov Random Field, is a probabilistic graphical model based on an undirected graph. A Markov network can represent certain dependencies that a Bayesian network cannot (such as bidirectional and cyclic dependencies); on the other hand, it cannot represent certain dependencies that a Bayesian network can—such as the v structures that will be discussed later (see Koller and Friedman 2009 for more details). Probabilistic graphical models provide an efficient way to represent joint probabilities, in particular if the l... |

270 |
Equivalence and synthesis of causal models
- Verma, Pearl
- 1990
Citation Context ... temporal constraints, reduces algorithm complexity and increases the chances of obtaining a valid model. The more complex the model, the more important is it to incorporate any available expert knowledge. Many structure-learning algorithms, including most implementations of the PC algorithm, thus provide the capability of entering preknowledge, such as forced edges or forbidden edges. e. Markov equivalence and faithfulness of directed graphs Structure learning from observed data is only able to determine directed graphs up to an equivalence class, namely, the set of Markov equivalent graphs (Verma and Pearl 1990). This equivalence class may contain one or more graphs. Only an intervention analysis—where we actively manipulate the states of some variables in targeted experiments—can reveal additional causal relationships (see Pearl 2000; Murphy 2001). Two directed graphs are called Markov equivalent if they represent the same set of independence relationships. As it turns out, this equivalence can also be expressed as follows. [Fig. 2. Lung cancer example. Page header: Journal of Climate, Volume 25, p. 5654.] Two directed graphs are Markov equivalent if they have the same set of edges (ignoring the edge direct... |
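The Markov equivalence criterion quoted above (same edges ignoring direction, plus the same v structures, as the companion context for Murphy 2001 completes it) can be checked mechanically. The sketch below is a minimal illustration and assumes each graph is given as a set of `(parent, child)` pairs over the same nodes. The function names are hypothetical.

```python
from itertools import combinations


def skeleton(edges):
    """Undirected version of the graph: each edge with its direction dropped."""
    return {frozenset(edge) for edge in edges}


def v_structures(edges):
    """All triples a -> child <- b in which a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, pa in parents.items():
        for a, b in combinations(sorted(pa), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found


def markov_equivalent(g1, g2):
    """True if both graphs share the same skeleton and the same v structures."""
    return skeleton(g1) == skeleton(g2) and v_structures(g1) == v_structures(g2)


# Three graphs over X, Y, Z that encode the same independence relationships.
chain_forward = {("X", "Y"), ("Y", "Z")}   # X -> Y -> Z
chain_backward = {("Y", "X"), ("Z", "Y")}  # X <- Y <- Z
fork = {("Y", "X"), ("Y", "Z")}            # X <- Y -> Z
# A collider has the same skeleton but adds a v structure at Y.
collider = {("X", "Y"), ("Z", "Y")}        # X -> Y <- Z
```

Here `markov_equivalent(chain_forward, fork)` holds, while the collider falls into a different class. This is why observational data alone cannot orient the edges of the first three graphs.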

195 | ARACNE: An algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context
- Margolin, Nemenman, et al.
- 2006
Citation Context ...tations that describe the potential causal pathways in the system. The most common type of graph used is a Bayesian network (Pearl 1988), which consists of two parts, a graph structure and probabilities, and all causal relationships are encoded in the graph structure. Causal discovery has already been applied with great success in disciplines ranging from the social sciences to computer science, engineering, medical diagnosis and bioinformatics (Spirtes et al. 2000; Neapolitan 2003). Many of the most successful examples in recent years come from the area of computational biology. For example, Margolin et al. (2006) and Friedman et al. (2000) trained Bayesian networks on expression data to identify protein/gene interaction, applying causal discovery to networks containing tens of thousands of nodes (Margolin et al. 2006). In climate science, Bayesian networks have been primarily used for purposes such as forecasting or as risk assessment or decision-making tools, not to generate causal hypotheses. Since here we are more interested in learning potential causal relationships (i.e., graph structure of Bayesian networks) than quantifying probabilities, we categorize the following discussion of the relevant l... |

117 | An algorithm for fast recovery of sparse causal graphs
- Spirtes, Glymour
- 1991
Citation Context ...orks for precipitation forecasting and all of them use modifications of the K2 algorithm, a score-based structure learning algorithm. The third and final category uses causal discovery methods for structure learning. For example, Chu et al. (2005) apply structure learning to find the causal structure among time series of remote geospatial indices of ocean surface temperatures and pressures. Chu and Glymour (2008) apply similar methods to study the relationships between four ocean climate indices. Both studies focus on extending standard causal discovery algorithms [such as the PC algorithm by Spirtes and Glymour (1991) used here] to develop causal models based on nonlinear time series. Other work in the third category includes Kennett (2000) (see also Kennett et al. 2001), which derives models for sea breeze prediction using some of the same causal discovery algorithms applied in this paper, although the end product of Kennett’s research is again a model for prediction, not causal hypotheses. While the work discussed above—with the exception of Chu et al. (2005) and Chu and Glymour (2008)—considers static models, Cossentino et al. (2001) develop a temporal (a.k.a. dynamic) Bayesian network for air pollution... |

101 |
A Nonlinear Dynamical Perspective on Climate Prediction
- Palmer
- 1999
Citation Context ... driven and partly associated with anomalous tropical convective heating, which is often tied to tropical sea surface temperature (SST) variations (e.g., Franzke et al. 2011). To improve the skill of extended-range weather forecasting, it is crucial to identify external factors (e.g., tropical SST anomalies) that excite these teleconnections and also to understand dynamical/physical processes that determine their life cycle characteristics (e.g., feedback from synoptic-eddy momentum and heat flux; for an excellent review of this topic, please see Dole 2008). Additionally, it is pointed out by Palmer (1999) that to obtain correct time-mean response to enhanced CO2 forcing in a climate model, the model should have quasi-stationary regimes (i.e., modes of low-frequency variability) that share structural similarity with those in the real atmosphere. Here we take a different perspective and explore the potential causal relationships among these four modes. These relationships, if confirmed, would serve as basis for formulating hypotheses regarding the dynamics that connect these modes, and these hypotheses can be further tested with general circulation models (GCMs). Specifically, we developed two t... |

95 |
Impulse Response Functions Based on a Causal Approach to Residual Orthogonalization in Vector Autoregressions
- Granger, Swanson
- 1997
Citation Context ...hat detect cause–effect relationships from observational data (Spirtes et al. 2000; Pearl 2000; Neapolitan 2003; Koller and Friedman 2009). [Corresponding author address: Yi Deng, School of Earth and Atmospheric Sciences, Georgia Institute of Technology, 311 Ferst Drive, Atlanta, GA 30332-0340. E-mail: yi.deng@eas.gatech.edu. Journal of Climate, Volume 25, p. 5648. DOI: 10.1175/JCLI-D-11-00387.1. 2012 American Meteorological Society.] Even Granger later incorporated Pearl’s graph approach, calculating graphs based on Granger causality tests for multivariate time series regression models (Swanson and Granger 1997; Eichler 2007). These models are also known as Graphical Granger models (Arnold et al. 2007). The intent of this paper is to provide an introduction to causal discovery using graphical models for researchers in climate science and to demonstrate their use for an example in climate science. Causal discovery algorithms generate one or more graph representations that describe the potential causal pathways in the system. The most common type of graph used is a Bayesian network (Pearl 1988), which consists of two parts, a graph structure and probabilities, and all causal relationships are encoded ... |

60 | Active learning of causal bayes net structure
- Murphy
- 2001
Citation Context ...g most implementations of the PC algorithm, thus provide the capability of entering preknowledge, such as forced edges or forbidden edges. e. Markov equivalence and faithfulness of directed graphs Structure learning from observed data is only able to determine directed graphs up to an equivalence class, namely, the set of Markov equivalent graphs (Verma and Pearl 1990). This equivalence class may contain one or more graphs. Only an intervention analysis—where we actively manipulate the states of some variables in targeted experiments—can reveal additional causal relationships (see Pearl 2000; Murphy 2001). Two directed graphs are called Markov equivalent if they represent the same set of independence relationships. As it turns out, this equivalence can also be expressed as follows. Two directed graphs are Markov equivalent if they have the same set of edges (ignoring the edge direction) and the same set of v structures. For example, the three directed graphs in Fig. 1d form a Markov equivalence class, and it is not possible to further narrow down which graph is the correct one without performing intervention experiment... |

55 |
The recovery of causal polytrees from statistical data.
- Rebane, Pearl
- 1987
Citation Context ... Granger causes a second time series, Y, if past values of X contain information that helps predict future values of Y above and beyond the information contained in the past values of Y alone. Granger causality is implemented by first performing linear regression of the time series and then applying statistical tests on the regression coefficients. Granger causality is thus a measure for predictability based on a linear model and applies only to time series data. Reasoning about causality was put on a more general footing starting in the late 1980s through the introduction of causal calculus (Rebane and Pearl 1987) and the use of probabilistic graphical models to represent causal relationships. The idea of representing causal structure in a graphical way goes back to Wright (1921, 1934) who defined path diagrams for structural equation models, a concept commonly used in economics to date. Pearl (1988) proposed the use of graphical models to represent probabilistic independence relationships between variables. This approach does not rely on temporal information, so it applies equally to nontemporal and time series data. Spirtes, Glymour, and Scheines (Spirtes et al. 1991, 1993) addressed the problem of d... |

53 |
Hailfinder: A Bayesian System for Forecasting Severe Weather.
- Abramson, Brown, et al.
- 1996
Citation Context ...been primarily used for purposes such as forecasting or as risk assessment or decision-making tools, not to generate causal hypotheses. Since here we are more interested in learning potential causal relationships (i.e., graph structure of Bayesian networks) than quantifying probabilities, we categorize the following discussion of the relevant literature by the level of structure learning taking place. Work in the first category derives the structure of the Bayesian network directly from expert knowledge, and only probabilities are learned from data. A good example is the Hailfinder project by Abramson et al. (1996), which was one of the first applications of Bayesian networks related to climate science. Hailfinder is a Bayesian network for the prediction of severe weather events in northern Colorado. Catenacci and Giupponi (2009) review the use of Bayesian networks to model and express uncertainty in climate change to aid policy development. Peter et al. (2009) develop a Bayesian network that links the impacts of projected climate change in southern Africa to irrigated agriculture, water storage planning, and biofuel production. All of the above belong to the first category of learning. Furthermore, Bay... |

47 |
A synoptic view of the North Atlantic Oscillation.
- Benedict, Lee, et al.
- 2004
Citation Context ...c Oscillation (WPO), Eastern Pacific Oscillation (EPO), Pacific–North America (PNA) pattern, and North Atlantic Oscillation (NAO). These modes, also known as “atmospheric teleconnections,” are characterized by synchronized low-frequency (longer than typical synoptic time scale of a week) fluctuations in the sea level pressure (SLP) or geopotential height fields at different geographical locations (e.g., Wallace and Gutzler 1981; Barnston and Livezey 1987). Some of these modes, for example, NAO and WPO are largely eddy driven (e.g., Benedict et al. 2004; Franzke et al. 2004; Martius et al. 2007; Riviere and Orlanski 2007; Woollings et al. 2008; Riviere 2010; Deng and Jiang 2011), while others such as PNA are partly eddy driven and partly associated with anomalous tropical convective heating, which is often tied to tropical sea surface temperature (SST) variations (e.g., Franzke et al. 2011). To improve the skill of extended-range weather forecasting, it is crucial to identify external factors (e.g., tropical SST anomalies) that excite these teleconnections and also to understand dynamical/physical processes that determine their life cycle ... |

35 |
Is the North Atlantic Oscillation a breaking wave?
- Franzke, Lee, et al.
- 2004
Citation Context ...astern Pacific Oscillation (EPO), Pacific–North America (PNA) pattern, and North Atlantic Oscillation (NAO). These modes, also known as “atmospheric teleconnections,” are characterized by synchronized low-frequency (longer than typical synoptic time scale of a week) fluctuations in the sea level pressure (SLP) or geopotential height fields at different geographical locations (e.g., Wallace and Gutzler 1981; Barnston and Livezey 1987). Some of these modes, for example, NAO and WPO are largely eddy driven (e.g., Benedict et al. 2004; Franzke et al. 2004; Martius et al. 2007; Riviere and Orlanski 2007; Woollings et al. 2008; Riviere 2010; Deng and Jiang 2011), while others such as PNA are partly eddy driven and partly associated with anomalous tropical convective heating, which is often tied to tropical sea surface temperature (SST) variations (e.g., Franzke et al. 2011). To improve the skill of extended-range weather forecasting, it is crucial to identify external factors (e.g., tropical SST anomalies) that excite these teleconnections and also to understand dynamical/physical processes that determine their life cycle characteristics (e.g.... |

28 |
Granger causality and path diagrams for multivariate time series.
- Eichler
- 2007
Citation Context ...relationships from observational data (Spirtes et al. 2000; Pearl 2000; Neapolitan 2003; Koller and Friedman 2009). Even Granger later incorporated Pearl’s graph approach, calculating graphs based on Granger causality tests for multivariate time series regression models (Swanson and Granger 1997; Eichler 2007). These models are also known as Graphical Granger models (Arnold et al. 2007). The intent of this paper is to provide an introduction to causal discovery using graphical models for researchers in climate science and to demonstrate their use for an example in climate science. Causal discovery algorithms generate one or more graph representations that describe the potential causal pathways in the system. The most common type of graph used is a Bayesian network (Pearl 1988), which consists of two parts, a graph structure and probabilities, and all causal relationships are encoded in the graph st... |

23 |
Breaking waves at the tropopause in the wintertime Northern Hemisphere: Climatological analyses of the orientation and the theoretical LC1/2 classification.
- Martius, Schwierz, et al.
- 2007
Citation Context ...ation (EPO), Pacific–North America (PNA) pattern, and North Atlantic Oscillation (NAO). These modes, also known as “atmospheric teleconnections,” are characterized by synchronized low-frequency (longer than typical synoptic time scale of a week) fluctuations in the sea level pressure (SLP) or geopotential height fields at different geographical locations (e.g., Wallace and Gutzler 1981; Barnston and Livezey 1987). Some of these modes, for example, NAO and WPO are largely eddy driven (e.g., Benedict et al. 2004; Franzke et al. 2004; Martius et al. 2007; Riviere and Orlanski 2007; Woollings et al. 2008; Riviere 2010; Deng and Jiang 2011), while others such as PNA are partly eddy driven and partly associated with anomalous tropical convective heating, which is often tied to tropical sea surface temperature (SST) variations (e.g., Franzke et al. 2011). To improve the skill of extended-range weather forecasting, it is crucial to identify external factors (e.g., tropical SST anomalies) that excite these teleconnections and also to understand dynamical/physical processes that determine their life cycle characteristics (e.g., feedback from synop... |

22 |
The backbone of the climate network.
- Donges, Zou, et al.
- 2009
Citation Context ...ools from network analysis to the field of climate science. Their basic idea is to use atmospheric fields—or other physical quantities—to define a correlation network of nodes, where each node represents a point on a global grid. Any two nodes are connected if the cross correlation of the data associated with those two nodes is beyond a threshold ccmin. Since these correlation networks were introduced to climate science in 2004, there has been a flurry of research activity in this area, discussing definition, calculation, evaluation, and interpretation of climate networks (Tsonis et al. 2006; Donges et al. 2009). Several research groups related global network changes over a long time scale to El Niño activity (Tsonis et al. 2007; Tsonis and Swanson 2008; Gozolchiani et al. 2008; Yamasaki et al. 2008, 2009). A summary of the progress, opportunities, and challenges of networks in climate science was presented by Steinhaeuser et al. (2010). While most climate networks are defined as correlation networks, two other definitions have recently been proposed, mutual information (MI) networks (Donges et al. 2009) and phase synchronization networks (Yamasaki et al. 2009). All three network definitions, however... |
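The correlation-network construction described in the context above (connect two nodes when their cross correlation is beyond a threshold `ccmin`) can be sketched in a few lines. The sketch assumes zero-lag Pearson correlation compared by absolute value, which the quoted passage does not specify. The node names, toy series, and function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Zero-lag Pearson correlation coefficient of two equally long series."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    norm_a = sqrt(sum((u - mean_a) ** 2 for u in a))
    norm_b = sqrt(sum((v - mean_b) ** 2 for v in b))
    return cov / (norm_a * norm_b)


def correlation_network(series, ccmin):
    """Return the set of node pairs whose absolute correlation exceeds ccmin."""
    edges = set()
    for n1, n2 in combinations(sorted(series), 2):
        if abs(pearson(series[n1], series[n2])) > ccmin:
            edges.add((n1, n2))
    return edges


# Toy data: one short time series per grid point (values are made up).
series = {
    "grid_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "grid_b": [2.1, 3.9, 6.2, 7.8, 10.0],   # follows grid_a closely
    "grid_c": [1.0, -1.0, 1.0, -1.0, 1.0],  # unrelated oscillation
}
edges = correlation_network(series, ccmin=0.5)
```

Each edge decision uses only the two nodes involved, which is the property the passage contrasts with independence graphs built from conditional tests.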

22 |
The architecture of the climate network.
- Tsonis, Roebber
- 2004
Citation Context ...e personally prefer the second method because we find its decision-making process more transparent, and we never have to deal with the probabilities. Thus in the remainder of this paper we focus on constraint-based learning as the method for structure learning. We denote the directed and undirected graphs obtained through structure learning as independence graphs because they represent the (conditional) independence relationships. In the four-mode example discussed in section 4 we are most interested in directed graphs, while for other types of climate applications such as climate networks (e.g., Tsonis and Roebber 2004; Tsonis et al. 2006) we may be more interested in undirected graphs. Thus structure learning for both directed and undirected graphs is reviewed here. a. Footprints of causal relationships in data To recover potential causal relationships from data we need to learn to read their footprints, that is, the traces they leave in the data. There are two main concepts to understand: (i) the difference between direct and indirect connections and (ii) so-called v structures. Section 3b illustrates the first of these concepts, and section 3c illustrates the second. b. Testing for direct connections To u... |

21 |
Topology and Predictability of El Niño and La Niña Networks
- Tsonis, Swanson
Citation Context ...elation network of nodes, where each node represents a point on a global grid. Any two nodes are connected if the cross correlation of the data associated with those two nodes is beyond a threshold ccmin. Since these correlation networks were introduced to climate science in 2004, there has been a flurry of research activity in this area, discussing definition, calculation, evaluation, and interpretation of climate networks (Tsonis et al. 2006; Donges et al. 2009). Several research groups related global network changes over a long time scale to El Niño activity (Tsonis et al. 2007; Tsonis and Swanson 2008; Gozolchiani et al. 2008; Yamasaki et al. 2008, 2009). A summary of the progress, opportunities, and challenges of networks in climate science was presented by Steinhaeuser et al. (2010). While most climate networks are defined as correlation networks, two other definitions have recently been proposed, mutual information (MI) networks (Donges et al. 2009) and phase synchronization networks (Yamasaki et al. 2009). All three network definitions, however, decide whether an edge exists between two nodes in the network based only on a test involving those two nodes and the results are fairly simil... |

21 |
Climate Networks around the Globe are Significantly Affected by El Niño
- Yamasaki, Gozolchiani, et al.
Citation Context ...de represents a point on a global grid. Any two nodes are connected if the cross correlation of the data associated with those two nodes is beyond a threshold ccmin. Since these correlation networks were introduced to climate science in 2004, there has been a flurry of research activity in this area, discussing definition, calculation, evaluation, and interpretation of climate networks (Tsonis et al. 2006; Donges et al. 2009). Several research groups related global network changes over a long time scale to El Niño activity (Tsonis et al. 2007; Tsonis and Swanson 2008; Gozolchiani et al. 2008; Yamasaki et al. 2008, 2009). A summary of the progress, opportunities, and challenges of networks in climate science was presented by Steinhaeuser et al. (2010). While most climate networks are defined as correlation networks, two other definitions have recently been proposed, mutual information (MI) networks (Donges et al. 2009) and phase synchronization networks (Yamasaki et al. 2009). All three network definitions, however, decide whether an edge exists between two nodes in the network based only on a test involving those two nodes and the results are fairly similar for all three. We believe that using indepen... |

20 | A new Rossby wave–breaking interpretation of the North Atlantic Oscillation.
- Woollings, Hoskins, et al.
- 2008
Citation Context ... and North Atlantic Oscillation (NAO). These modes, also known as “atmospheric teleconnections,” are characterized by synchronized low-frequency (longer than typical synoptic time scale of a week) fluctuations in the sea level pressure (SLP) or geopotential height fields at different geographical locations (e.g., Wallace and Gutzler 1981; Barnston and Livezey 1987). Some of these modes, for example, NAO and WPO are largely eddy driven (e.g., Benedict et al. 2004; Franzke et al. 2004; Martius et al. 2007; Riviere and Orlanski 2007; Woollings et al. 2008; Riviere 2010; Deng and Jiang 2011), while others such as PNA are partly eddy driven and partly associated with anomalous tropical convective heating, which is often tied to tropical sea surface temperature (SST) variations (e.g., Franzke et al. 2011). To improve the skill of extended-range weather forecasting, it is crucial to identify external factors (e.g., tropical SST anomalies) that excite these teleconnections and also to understand dynamical/physical processes that determine their life cycle characteristics (e.g., feedback from synoptic-eddy momentum and heat flux; for an excellent r... |

18 | Learning high-dimensional directed acyclic graphs with latent and selection variables
- Colombo, Maathuis, et al.
Citation Context ...ondition is sometimes hard to meet in practice because there are often many variables, from ENSO to solar flares, that can have a common influence on variables under consideration. It may be impossible to include them all because of complexity and because some of them cannot even be observed. Algorithms such as the fast causal inference (FCI) algorithm developed by Spirtes and Glymour (1991) can identify the presence of these latent variables under certain conditions but are of high computational complexity and currently not yet feasible for large graphs. Improvements have been suggested, see Colombo et al. (2012), and may help in the future. For now we take the pragmatic approach of using the PC algorithm and interpreting the results accordingly. Namely, we need to consider the possibility that any link detected by the PC algorithm may either present a direct causal connection, be due to a common cause, or a combination of the two. That is why we call the results from the analysis “causal hypotheses,” and they must be tested one by one by a domain expert. The contribution of the causal discovery process as described here is therefore to reduce the number of causal hypotheses to a manageable set that... |

13 |
Characteristics of the Atlantic storm-track eddy activity and its relation with the North Atlantic Oscillation.
- Riviere, Orlanski
- 2007
Citation Context ... (PNA) pattern, and North Atlantic Oscillation (NAO). These modes, also known as “atmospheric teleconnections,” are characterized by synchronized low-frequency (longer than typical synoptic time scale of a week) fluctuations in the sea level pressure (SLP) or geopotential height fields at different geographical locations (e.g., Wallace and Gutzler 1981; Barnston and Livezey 1987). Some of these modes, for example, NAO and WPO are largely eddy driven (e.g., Benedict et al. 2004; Franzke et al. 2004; Martius et al. 2007; Riviere and Orlanski 2007; Woollings et al. 2008; Riviere 2010; Deng and Jiang 2011), while others such as PNA are partly eddy driven and partly associated with anomalous tropical convective heating, which is often tied to tropical sea surface temperature (SST) variations (e.g., Franzke et al. 2011). To improve the skill of extended-range weather forecasting, it is crucial to identify external factors (e.g., tropical SST anomalies) that excite these teleconnections and also to understand dynamical/physical processes that determine their life cycle characteristics (e.g., feedback from synoptic-eddy momentum and heat f... |

11 | Bayesian networks for probabilistic weather prediction.
- Cofino, Cano, et al.
- 2002
Citation Context ...a Bayesian network that links the impacts of projected climate change in southern Africa to irrigated agriculture, water storage planning, and biofuel production. All of the above belong to the first category of learning. Furthermore, Bayesian networks are used in these cases to represent and use known causal relationships rather than to discover causal relationships. Work in the second category learns the structure of the Bayesian networks from data using score-based learning algorithms for forecasting purposes and does not focus on discovering causal relationships. The works of Cofino et al. (2002), Cano et al. (2004), and Lee and Joseph (2006) fall into this category. All three of them develop Bayesian networks for precipitation forecasting and all of them use modifications of the K2 algorithm, a score-based structure learning algorithm. The third and final category uses causal discovery methods for structure learning. For example, Chu et al. (2005) apply structure learning to find the causal structure among time series of remote geospatial indices of ocean surface temperatures and pressures. Chu and Glymour (2008) apply similar methods to study the relationships between four ocean cli... |

10 |
Pattern of climate network blinking links follows El Niño events.
- Gozolchiani, Yamasaki, et al.
- 2008
Citation Context ...k of nodes, where each node represents a point on a global grid. Any two nodes are connected if the cross correlation of the data associated with those two nodes is beyond a threshold ccmin. Since these correlation networks were introduced to climate science in 2004, there has been a flurry of research activity in this area, discussing definition, calculation, evaluation, and interpretation of climate networks (Tsonis et al. 2006; Donges et al. 2009). Several research groups related global network changes over a long time scale to El Niño activity (Tsonis et al. 2007; Tsonis and Swanson 2008; Gozolchiani et al. 2008; Yamasaki et al. 2008, 2009). A summary of the progress, opportunities, and challenges of networks in climate science was presented by Steinhaeuser et al. (2010). While most climate networks are defined as correlation networks, two other definitions have recently been proposed, mutual information (MI) networks (Donges et al. 2009) and phase synchronization networks (Yamasaki et al. 2009). All three network definitions, however, decide whether an edge exists between two nodes in the network based only on a test involving those two nodes and the results are fairly similar for all three. We beli... |
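
The edge rule quoted in this context (connect two nodes when their cross correlation exceeds the threshold ccmin) can be illustrated with a minimal Python sketch. It is not the cited authors' code: the function names, the toy series, the use of zero-lag Pearson correlation, and the use of the absolute value are all assumptions made for illustration.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Zero-lag Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_network(series, cc_min):
    """Return undirected edges (a, b) whose |correlation| exceeds cc_min.

    `series` maps a (hypothetical) node name to its list of values.
    Each edge decision uses only the two nodes involved.
    """
    edges = set()
    for a, b in combinations(sorted(series), 2):
        if abs(pearson(series[a], series[b])) > cc_min:
            edges.add((a, b))
    return edges


# Toy data (made up): B closely tracks A, C is unrelated to both.
toy = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "B": [2.1, 3.9, 6.2, 8.0, 9.9],
    "C": [1.0, -1.0, 1.0, -1.0, 1.0],
}
edges = correlation_network(toy, cc_min=0.5)
print(edges)
```

Because each edge is decided from one pair of nodes in isolation, such a network cannot tell a direct connection from one that runs through a third node, which is the limitation the context goes on to discuss.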

9 | Search for additive nonlinear time series causal models.
- Chu, Glymour
- 2008
Citation Context ...rposes and do not focus on discovering causal relationships. The works of Cofino et al. (2002), Cano et al. (2004), and Lee and Joseph (2006) fall into this category. All three of them develop Bayesian networks for precipitation forecasting and all of them use modifications of the K2 algorithm, a score-based structure learning algorithm. The third and final category uses causal discovery methods for structure learning. For example, Chu et al. (2005) apply structure learning to find the causal structure among time series of remote geospatial indices of ocean surface temperatures and pressures. Chu and Glymour (2008) apply similar methods to study the relationships between four ocean climate indices. Both studies focus on extending standard causal discovery algorithms [such as the PC algorithm by Spirtes and Glymour (1991) used here] to develop causal models based on nonlinear time series. Other work in the third category includes Kennett (2000) (see also Kennett et al. 2001), which derives models for sea breeze prediction using some of the same causal discovery algorithms applied in this paper, although the end product of Kennett’s research is again a model for prediction, not causal hypotheses. While th... |

7 |
An idealized model study relevant to the dynamics of the midwinter minimum of the Pacific storm track.
- Deng, Mak
- 2005
Citation Context ...s that would explain those connections and thus support the hypotheses. For example, the following chain of events can be envisioned as a plausible explanation for the WPO → EPO connection: 1) phase transition in WPO (induced either by anomalous tropical SST forcing or high-latitude blocking, e.g., Woollings et al. 2008; Dole 2008) is closely coupled to changes in the intensity/location of the subtropical jet; 2) variability in the subtropical jet leads to changes in the property (track, strength, etc.) of synoptic eddies of the Pacific storm track that is located downstream of the jet (e.g., Deng and Mak 2005, 2006); 3) anomalous eddy forcing (in terms of vorticity and/or heat flux) drives the geopotential height tendency characteristic of phase transition in EPO. On the other hand, the even stronger EPO → WPO connection with a 3-day delay could be reflecting the fact that WPO is largely eddy-driven with forcing mostly originating in the central-eastern Pacific where synoptic eddies attain their maximum intensity, break, and trigger first a phase transition in EPO. The above hypothesis regarding the WPO–EPO connection can be readily tested through controlled experiments with an idealized atmospheri... |

7 |
Different ENSO teleconnections and their effects on the stratospheric polar vortex.
- Garfinkel, Hartmann
- 2008
Citation Context ...static and temporal model, might be a demonstration of the role of transient eddy forcing (especially over the North American continent) in bridging the variability of two eddy-driven modes over the North Pacific and the North Atlantic (e.g., Li and Lau 2012). The last of the strong links identified in the temporal model, NAO to PNA with a delay of 3 to 6 days, is a new discovery. Previous studies focusing on dynamical processes linking ENSO variability and strength of stratospheric polar vortex have hinted at a connection between the two but with an opposite direction, that is, PNA → NAO (e.g., Garfinkel and Hartmann 2008; Hegyi and Deng 2011). The link found here through causal discovery methods, including the time lag, however, is consistent with the result from a recent and independent study that utilized rotated empirical orthogonal function (REOF) analysis (Baxter and Nigam 2012). Whether this connection reflects a downstream, circum-hemispheric modulation of NAO variability on PNA remains to be investigated with a dynamical model. f. Comparison to correlation graphs As correlation graphs are much more common in climate science (they are the standard model for climate networks), it is a legitimate question wh... |

5 | Data Driven Methods for Nonlinear Granger Causality: Climate Teleconnection Mechanisms. - Chu, Danks, et al. - 2005 |

3 | Role of Rossby wave breaking in the west Pacific teleconnection. - Riviere - 2010 |

3 | Complex networks in climate science: Progress, opportunities and challenges.
- Steinhaeuser, Chawla, et al.
- 2010
Citation Context ...es is beyond a threshold ccmin. Since these correlation networks were introduced to climate science in 2004, there has been a flurry of research activity in this area, discussing definition, calculation, evaluation, and interpretation of climate networks (Tsonis et al. 2006; Donges et al. 2009). Several research groups related global network changes over a long time scale to El Niño activity (Tsonis et al. 2007; Tsonis and Swanson 2008; Gozolchiani et al. 2008; Yamasaki et al. 2008, 2009). A summary of the progress, opportunities, and challenges of networks in climate science was presented by Steinhaeuser et al. (2010). While most climate networks are defined as correlation networks, two other definitions have recently been proposed, mutual information (MI) networks (Donges et al. 2009) and phase synchronization networks (Yamasaki et al. 2009). All three network definitions, however, decide whether an edge exists between two nodes in the network based only on a test involving those two nodes and the results are fairly similar for all three. We believe that using independence graphs based on structure learning for climate networks would yield networks with significantly fewer edges by eliminating indirect co... |

3 | Linking Granger causality and the Pearl causal model with settable systems.
- White, Chalak, et al.
- 2011
Citation Context ...ason partial correlation is used in the case study in section 4, which deals with continuous variables. For a definition of partial correlation, see for example Kachigan (1991). A special case is as follows: if only time series data is considered and a temporal causal model is desired and no significant preknowledge is available, then the CI test using partial correlation becomes quite similar to the Granger causality test for multivariate time series. (For a discussion of the subtle differences between the concept of Granger causality and Pearl's causal model applied for time series data, see White et al. 2011.) In fact one can use the approach by Swanson and Granger (1997) as a shortcut to evaluate the CI tests in this case. Their approach is to first calculate a vector autoregression (VAR) model from the data, which describes the current state of all variables in terms of the past evolution of all variables. The coefficients of the VAR model can be used to calculate the partial correlation of each node pair with the linear influence of all other variables removed. The process involves inverting the covariance matrix so care must be taken that it is not close to singular, especially if there are ... |
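
The step described at the end of this context (partial correlation of each node pair with the linear influence of all other variables removed, obtained by inverting a covariance matrix) can be sketched as follows. This is only an illustration under stated assumptions: it skips the VAR fitting entirely and starts from a covariance matrix that is simply given, it uses the standard identity that the partial correlation equals the negated, normalized off-diagonal entry of the inverse covariance matrix, and the function names, tolerance, and toy matrix are hypothetical.

```python
import math


def invert(matrix, tol=1e-12):
    """Gauss-Jordan inverse with partial pivoting.

    Raises ValueError when a pivot is smaller than `tol`, i.e. when the
    matrix is singular or close to singular (tol is an arbitrary choice).
    """
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < tol:
            raise ValueError("covariance matrix is (close to) singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(cov):
    """Partial correlation of every pair given all remaining variables."""
    prec = invert(cov)
    n = len(cov)
    return [[1.0 if i == j
             else -prec[i][j] / math.sqrt(prec[i][i] * prec[j][j])
             for j in range(n)] for i in range(n)]


# Toy covariance for a chain X - Z - Y, variable order [X, Z, Y] (made up).
cov = [[1.25, 1.0, 1.0],
       [1.0, 1.0, 1.0],
       [1.0, 1.0, 1.25]]
pc = partial_correlations(cov)
print(pc[0][2])  # X and Y given Z
print(pc[0][1])  # X and Z given Y
```

In this toy matrix X and Y have a nonzero covariance, yet their partial correlation given Z comes out at essentially zero, while the X and Z entry stays clearly nonzero. The explicit pivot check is one simple way to honor the warning about near-singular covariance matrices.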

2 | A conditional independence algorithm for learning undirected graphical models.
- Borgelt
- 2010
Citation Context ...les for the sake of simplicity, the above definitions generalize to continuous variables. We saw an example of a conditional independence relationship in the match example above (Fire is conditionally independent of SPaper given Temp). In this example the conditional independence was concluded from our understanding of the physical problem. However, in structure learning we want to learn unknown conditional independencies in a system based on data. For that we need tests for independence and CI. A great variety of measures can be used to test for independence and conditional independence, see Borgelt (2010) for a review. Ideally, any such measure is supposed to yield a value of zero if the variables are (conditionally) independent and nonzero otherwise. In statistics the traditional choice is cross correlation as measure for independence and partial correlation for conditional independence. [FIG. 1. Match Example.] In theory, partial correlation is an ideal CI measure only if all variables involved are multivariate Gaussian, but in practice it seems to provide a decent approximation in most cases. Partial correlation has the important adv... |
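
The contrast drawn in this context, cross correlation as the independence measure and partial correlation as the conditional independence measure, can be made concrete with a small Python sketch. It uses the textbook first-order partial correlation formula for three variables; the variable names and the constructed toy data are assumptions for illustration and are not taken from the paper's match example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y given a single z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data (made up): x and y each equal z plus their own disturbance.
# The three base patterns are mutually orthogonal with zero mean, so x and y
# are related only through z.
z = [1, 1, 1, 1, -1, -1, -1, -1]
e1 = [1, 1, -1, -1, 1, 1, -1, -1]
e2 = [1, -1, 1, -1, 1, -1, 1, -1]
x = [zi + 0.5 * a for zi, a in zip(z, e1)]
y = [zi + 0.5 * b for zi, b in zip(z, e2)]

r_xy = pearson(x, y)                  # plain correlation of x and y
r_xy_given_z = partial_corr(x, y, z)  # correlation once z is accounted for
print(r_xy, r_xy_given_z)
```

Here the plain correlation between x and y is large, while the partial correlation given z is zero up to rounding. In practice the value would be compared against a significance threshold rather than exact zero, and larger conditioning sets need a more general computation than this three-variable formula.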

2 | Applications of Bayesian networks in meteorology.
- Cano, Sordo, et al.
- 2004
Citation Context ...t links the impacts of projected climate change in southern Africa to irrigated agriculture, water storage planning, and biofuel production. All of the above belong to the first category of learning. Furthermore, Bayesian networks are used in these cases to represent and use known causal relationships rather than to discover causal relationships. Work in the second category learns the structure of the Bayesian networks from data using score-based learning algorithms for forecasting purposes and does not focus on discovering causal relationships. The works of Cofino et al. (2002), Cano et al. (2004), and Lee and Joseph (2006) fall into this category. All three of them develop Bayesian networks for precipitation forecasting and all of them use modifications of the K2 algorithm, a score-based structure learning algorithm. The third and final category uses causal discovery methods for structure learning. For example, Chu et al. (2005) apply structure learning to find the causal structure among time series of remote geospatial indices of ocean surface temperatures and pressures. Chu and Glymour (2008) apply similar methods to study the relationships between four ocean climate indices. Both s... |

2 | Bayesian models of the PM10 atmospheric urban pollution.
- Cossentino, Raimondi, et al.
- 2001
Citation Context ...ing standard causal discovery algorithms [such as the PC algorithm by Spirtes and Glymour (1991) used here] to develop causal models based on nonlinear time series. Other work in the third category includes Kennett (2000) (see also Kennett et al. 2001), which derives models for sea breeze prediction using some of the same causal discovery algorithms applied in this paper, although the end product of Kennett’s research is again a model for prediction, not causal hypotheses. While the work discussed above, with the exception of Chu et al. (2005) and Chu and Glymour (2008), considers static models, Cossentino et al. (2001) develop a temporal (a.k.a. dynamic) Bayesian network for air pollution prediction for the city of Palermo, using expert knowledge and trial-and-error to develop the structure. Since the early 1980s, the amount of meteorological and climate data collected has been growing every year, probably exponentially (Kenward 2011). In addition to traditional meteorological measurements of local pressure, wind, temperature, and humidity, ground- and space-based remote sensing instruments such as Doppler radar and satellites monitor the states of clouds, precipitation, sea ice coverage, aerosol concentra... |

2 |
A dynamical fingerprint of tropical Pacific sea surface temperatures on the decadal-scale variability of cool-season arctic precipitation.
- Hegyi, Deng
- 2011
Citation Context ...ght be a demonstration of the role of transient eddy forcing (especially over the North American continent) in bridging the variability of two eddy-driven modes over the North Pacific and the North Atlantic (e.g., Li and Lau 2012). The last of the strong links identified in the temporal model, NAO to PNA with a delay of 3 to 6 days, is a new discovery. Previous studies focusing on dynamical processes linking ENSO variability and strength of stratospheric polar vortex have hinted at a connection between the two but with an opposite direction, that is, PNA → NAO (e.g., Garfinkel and Hartmann 2008; Hegyi and Deng 2011). The link found here through causal discovery methods, including the time lag, however, is consistent with the result from a recent and independent study that utilized rotated empirical orthogonal function (REOF) analysis (Baxter and Nigam 2012). Whether this connection reflects a downstream, circum-hemispheric modulation of NAO variability on PNA remains to be investigated with a dynamical model. f. Comparison to correlation graphs As correlation graphs are much more common in climate science (they are the standard model for climate networks), it is a legitimate question whether similar informat... |

2 | Applying Bayesian modelling to assess climate change effects on biofuel production. - Peter, Lange, et al. - 2009 |

1 |
Pentad analysis of wintertime PNA development and its relationship to the NAO.
- Baxter, Nigam
- 2012
Citation Context ...the strong links identified in the temporal model, NAO to PNA with a delay of 3 to 6 days, is a new discovery. Previous studies focusing on dynamical processes linking ENSO variability and strength of stratospheric polar vortex have hinted at a connection between the two but with an opposite direction, that is, PNA → NAO (e.g., Garfinkel and Hartmann 2008; Hegyi and Deng 2011). The link found here through causal discovery methods, including the time lag, however, is consistent with the result from a recent and independent study that utilized rotated empirical orthogonal function (REOF) analysis (Baxter and Nigam 2012). Whether this connection reflects a downstream, circum-hemispheric modulation of NAO variability on PNA remains to be investigated with a dynamical model. f. Comparison to correlation graphs As correlation graphs are much more common in climate science (they are the standard model for climate networks), it is a legitimate question whether similar information could have been obtained for this application using a correlation graph. Thus we constructed correlation graphs corresponding to the temporal independence graphs for D = 3 and time slices -15 to 15. We use the same nodes, but any pair of node... |

1 | Potentials of Bayesian networks to deal with uncertainty in climate change adaptation policies. Centro Euro-Mediterraneo per i Cambiamenti Climatici (CMCC)
- Catenacci, Giupponi
- 2009
Citation Context ... (i.e., graph structure of Bayesian networks) than quantifying probabilities, we categorize the following discussion of the relevant literature by the level of structure learning taking place. Work in the first category derives the structure of the Bayesian network directly from expert knowledge, and only probabilities are learned from data. A good example is the Hailfinder project by Abramson et al. (1996), which was one of the first applications of Bayesian networks related to climate science. Hailfinder is a Bayesian network for the prediction of severe weather events in northern Colorado. Catenacci and Giupponi (2009) review the use of Bayesian networks to model and express uncertainty in climate change to aid policy development. Peter et al. (2009) develop a Bayesian network that links the impacts of projected climate change in southern Africa to irrigated agriculture, water storage planning, and biofuel production. All of the above belong to the first category of learning. Furthermore, Bayesian networks are used in these cases to represent and use known causal relationships rather than to discover causal relationships. Work in the second category learns the structure of the Bayesian networks from data us... |

1 |
Intraseasonal modulation of the North Pacific storm track by tropical convection in boreal winter.
- Deng, Jiang
- 2011
Citation Context ...des, also known as “atmospheric teleconnections,” are characterized by synchronized low-frequency (longer than typical synoptic time scale of a week) fluctuations in the sea level pressure (SLP) or geopotential height fields at different geographical locations (e.g., Wallace and Gutzler 1981; Barnston and Livezey 1987). Some of these modes, for example, NAO and WPO are largely eddy driven (e.g., Benedict et al. 2004; Franzke et al. 2004; Martius et al. 2007; Riviere and Orlanski 2007; Woollings et al. 2008; Riviere 2010; Deng and Jiang 2011), while others such as PNA are partly eddy driven and partly associated with anomalous tropical convective heating, which is often tied to tropical sea surface temperature (SST) variations (e.g., Franzke et al. 2011). To improve the skill of extended-range weather forecasting, it is crucial to identify external factors (e.g., tropical SST anomalies) that excite these teleconnections and also to understand dynamical/physical processes that determine their life cycle characteristics (e.g., feedback from synoptic-eddy momentum and heat flux; for an excellent review of this topic, please see Dole ... |

1 |
Linking weather and climate. Synoptic-Dynamic Meteorology and Weather Analysis and Forecasting: A Tribute to Fred Sanders,
- Dole
- 2008
Citation Context ...2011), while others such as PNA are partly eddy driven and partly associated with anomalous tropical convective heating, which is often tied to tropical sea surface temperature (SST) variations (e.g., Franzke et al. 2011). To improve the skill of extended-range weather forecasting, it is crucial to identify external factors (e.g., tropical SST anomalies) that excite these teleconnections and also to understand dynamical/physical processes that determine their life cycle characteristics (e.g., feedback from synoptic-eddy momentum and heat flux; for an excellent review of this topic, please see Dole 2008). Additionally, it is pointed out by Palmer (1999) that to obtain correct time-mean response to enhanced CO2 forcing in a climate model, the model should have quasi-stationary regimes (i.e., modes of low-frequency variability) that share structural similarity with those in the real atmosphere. Here we take a different perspective and explore the potential causal relationships among these four modes. These relationships, if confirmed, would serve as basis for formulating hypotheses regarding the dynamics that connect these modes, and these hypotheses can be further tested with general circulati... |

1 | Synoptic analysis of the Pacific–North American teleconnection pattern. - Feldstein, Lee - 2011 |

1 |
Multivariate Statistical Analysis. 3rd ed.
- Kachigan
- 1991
Citation Context ...hoice is mutual information as measure for independence and conditional mutual information for conditional independence. Those measures do not rely on any assumptions on the variables and tend to be a good choice for variables that are discrete by nature. However, they do not readily apply to continuous-valued variables and often do not work well if a variable must be discretized first, especially for coarse discretizations. For this reason partial correlation is used in the case study in section 4, which deals with continuous variables. For a definition of partial correlation, see for example Kachigan (1991). A special case is as follows: if only time series data is considered and a temporal causal model is desired and no significant preknowledge is available, then the CI test using partial correlation becomes quite similar to the Granger causality test for multivariate time series. (For a discussion of the subtle differences between the concept of Granger causality and Pearl's causal model applied for time series data, see White et al. 2011.) In fact one can use the approach by Swanson and Granger (1997) as a shortcut to evaluate the CI tests in this case. Their approach is to first calculate a ... |

1 | cited 2011: Data storm: What to do with all this climate information? [Available online at http://www.climatecentral.org/blogs/data-storm-what-to-do-with-allthis-climate-information/.] - Kenward - 2011 |

1 |
Diagnostic and dynamical analyses of two outstanding aspects of storm tracks.
- Mak, Deng
- 2006
Citation Context ...nomalous eddy forcing (in terms of vorticity and/or heat flux) drives the geopotential height tendency characteristic of phase transition in EPO. On the other hand, the even stronger EPO → WPO connection with a 3-day delay could be reflecting the fact that WPO is largely eddy-driven with forcing mostly originating in the central-eastern Pacific where synoptic eddies attain their maximum intensity, break, and trigger first a phase transition in EPO. The above hypothesis regarding the WPO–EPO connection can be readily tested through controlled experiments with an idealized atmospheric GCM (e.g., Mak and Deng 2006). The EPO to NAO connection, as seen in both the static and temporal model, might be a demonstration of the role of transient eddy forcing (especially over the North American continent) in bridging the variability of two eddy-driven modes over the North Pacific and the North Atlantic (e.g., Li and Lau 2012). The last of the strong links identified in the temporal model, NAO to PNA with a delay of 3 to 6 days, is a new discovery. Previous studies focusing on dynamical processes linking ENSO variability and strength of stratospheric polar vortex have hinted a connection between the two but with ... |

1 |
Available online at http://www.cs.
- Neapolitan
- 2003
Citation Context ...al interpretation of the graphs. These contributions by Pearl and Spirtes et al. laid the foundation for the field of causal discovery and thus jump started the development of a myriad of algorithms that detect cause–effect [Corresponding author address: Yi Deng, School of Earth and Atmospheric Sciences, Georgia Institute of Technology, 311 Ferst Drive, Atlanta, GA 30332-0340. E-mail: yi.deng@eas.gatech.edu. 5648 JOURNAL OF CLIMATE VOLUME 25 DOI: 10.1175/JCLI-D-11-00387.1 2012 American Meteorological Society] relationships from observational data (Spirtes et al. 2000; Pearl 2000; Neapolitan 2003; Koller and Friedman 2009). Even Granger later incorporated Pearl’s graph approach, calculating graphs based on Granger causality tests for multivariate time series regression models (Swanson and Granger 1997; Eichler 2007). These models are also known as Graphical Granger models (Arnold et al. 2007). The intent of this paper is to provide an introduction to causal discovery using graphical models for researchers in climate science and to demonstrate their use for an example in climate science. Causal discovery algorithms generate one or more graph representations that describe the potential ... |

1 | From probability to causality. - Scheines - 1991 |

1 | What do networks have to do with climate? - Roebber - 2006 |

1 | A new dynamical mechanism for major climate shifts. - Kravtsov - 2007 |