## Theory and Evidence in International Conflict: A Response

### BibTeX

@MISC{Marchi_theoryand,
    author = {To De Marchi and Langche Zeng},
    title = {Theory and Evidence in International Conflict: A Response},
    year = {}
}

### Abstract

In this article, we show that de Marchi, Gelpi, and Grynaviski’s substantive analyses are fully consistent with our prior theoretical conjecture about international conflict. We note that they also agree with our main methodological point that out-of-sample forecasting performance should be a primary standard used to evaluate international conflict studies. However, we demonstrate that all other methodological conclusions drawn by de Marchi, Gelpi, and Grynaviski are false. For example, by using the same evaluative criterion for both models, it is easy to see that their claim that properly specified logit models outperform neural network models is incorrect. Finally, we show that flexible neural network models are able to identify important empirical relationships between democracy and conflict that the logit model excludes a priori; this should not be surprising since the logit model is merely a limiting special case of the neural network model.

We thank Scott de Marchi, Christopher Gelpi, and Jeffrey Grynaviski (2004; hereafter dGG) for their careful attention to our work (Beck, King, and Zeng 2000; hereafter BKZ) and for raising some important methodological issues that we agree

### Citations

9816 | Statistical Learning Theory
- Vapnik
- 1998
Citation Context ...r prediction, these include neural networks, models of intermediate flexibility like generalized additive models (Beck and Jackman 1998), those that have unique optima such as support vector machines (Vapnik 1995, 1998), and other types of models and methods such as boosting, regression and classification trees, kernel methods, and mixture models (Hastie, Tibshirani, and Friedman 2001). For estimating causal ...

5299 | Neural Networks for Pattern Recognition
- Bishop
- 1995
Citation Context ...l networks typically have more parameters than logit models, but Bayesian regularization reduces this nominal number of parameters to an “effective number of parameters” that is usually much smaller (Bishop 1995, 377, 410). Of course, the number of parameters per se is not the right criterion by which to judge a model, for the complexity of a model should match that of the data. A model too simple to extract...

2292 | The Elements of Statistical Learning - Hastie - 2001

812 | Cross-validatory choice and assessment of statistical predictions
- Stone
- 1974
Citation Context ...e conventional in-sample criterion, we would think that this variable is an important predictor of peace. (Footnote 3: Out-of-sample forecasting and cross-validation in general are routinely used techniques in modern statistical modeling. For an early reference on this topic, see Stone 1974.) However, when we compare the out-of-sample forecasting performance of the dGG logit with and without this variable, we find ...

156 | Forecasting Output and Inflation: The Role of Asset Prices
- Stock, Watson
- 2003
Citation Context ...election procedures and the better are the forecasts on average. This result is unlikely to have arisen by chance, as it is consistent with findings from other fields in unrelated applications (e.g., Stock and Watson 2003). [Running header, p. 386: American Political Science Review, Vol. 98, No. 2] [Figure 2. Surface and Contour Plots for the Probability of War. Note: Surface and contour plots for the probability of war given differing levels...]

66 | Artificial Neural Networks: Approximation and Learning Theory
- White
- 1992
Citation Context ...ough it has a parametric form that is almost as straightforward as logit, neural network models do have advantages over logit. They, but not logit models, have “arbitrary approximation capabilities” (White 1992). This means that at least one member of the neural network family of models (or a neural network model with a sufficient number of hidden neurons) can approximate any functional form suggested by th...

44 | Strategic Interaction and the Statistical Analysis
- Signorino
- 1999
Citation Context ...formal theories of international conflict are massively violated by the restrictions of logit models, especially that logit probabilities are usually monotonic functions of the explanatory variables (Signorino 1999). Moreover, Signorino and Yilmaz (2003) prove that if even the simplest form of strategic interaction exists among the dyads, then the restrictions inherent in logit models make its estimates “biased...

28 | Beyond Linearity by Default: Generalized Additive Models
- Beck, Jackman
- 1998
Citation Context ...tions inherent in logit and the other techniques commonly used in political science. For prediction, these include neural networks, models of intermediate flexibility like generalized additive models (Beck and Jackman 1998), those that have unique optima such as support vector machines (Vapnik 1995, 1998), and other types of models and methods such as boosting, regression and classification trees, kernel methods, and m...

19 | When can history be our guide? The pitfalls of counterfactual inference
- King, Zeng
Citation Context ...ie, Tibshirani, and Friedman 2001). For estimating causal effects, in areas where pre- and posttreatment controls are clearly distinguishable, the techniques tend to be matching and related approaches (King and Zeng 2003). Which of these techniques is appropriate will depend on the application, but in almost all situations these techniques will dominate those with restrictive functional forms like logit, except in th...

16 | Improving Forecasts of State Failure, World Politics 53(4): 623–658
- King, Zeng
- 2001
Citation Context ...efficients directly and instead report predicted values, first differences, or other quantities of interest (Andreou and Zombanakis 2001; Bearce 2000; Beck, King, and Zeng 2000; Borisyuk et al. 2001; King and Zeng 2002; Lagazio and Russett 2002; Zeng 1999, 2000). We see no reason to think that a predicted probability of war is any harder to interpret, no matter how it is calculated. Neural networks are certainly les...

13 | Improving Quantitative - Beck, King, et al. - 2000

5 | The Joint Democracy–Dyadic Conflict Nexus: A
- Reuveny, Li
- 2003
Citation Context ...transformations or anything close to it. In fact, no other published or unpublished work we could find, whether or not it was cited in dGG, used this specification. Even the most recent publications (Reuveny and Li 2003; Russett, Oneal, and Berbaum 2003) and the most recent working paper by one of the authors (Gelpi and Grieco 2000) chose more traditional specifications, very much unlike the one in dGG. Consider dem...

4 | A Neural Network Analysis of Militarized International Disputes, 1885-1992: Temporal Stability and Causal Complexity - Lagazio, Russett - 2004

4 | The Effects of Political Similarity on the Onset
- Werner
- 2000
Citation Context ..., and Sanchez-Terry 2002). A plausible alternative theory we now provide some evidence for is that “likes don’t fight,” in that dyads with different levels of democracy are those likely to go to war (Werner 2000). Because these issues remain highly controversial in the literature and remain the subject of a considerable research program, we think that dGG’s specification, which makes some of these results imp...

3 | A neural network measurement of relative military security: The case of Greece and Cyprus
- Andreou, Zombanakis
- 2001
Citation Context ... logit models, political scientists who use neural network models do not interpret their coefficients directly and instead report predicted values, first differences, or other quantities of interest (Andreou and Zombanakis 2001; Bearce 2000; Beck, King, and Zeng 2000; Borisyuk et al. 2001; King and Zeng 2002; Lagazio and Russett 2002; Zeng 1999, 2000). We see no reason to think that a predicted probability of war is any hard...

1 | Forecasting the 2001 General Election Result: A Neural Network Approach. http://www.psa.ac.uk/spgrp/ epop/forecasting genelect2001.htm
- Borisyuk, Borisyuk, et al.
- 2001
Citation Context ...not interpret their coefficients directly and instead report predicted values, first differences, or other quantities of interest (Andreou and Zombanakis 2001; Bearce 2000; Beck, King, and Zeng 2000; Borisyuk et al. 2001; King and Zeng 2002; Lagazio and Russett 2002; Zeng 1999, 2000). We see no reason to think that a predicted probability of war is any harder to interpret, no matter how it is calculated. Neural networ...

1 | Neural Network Toolbox for MATLAB, User’s Guide. http://www.mathworks.com
- Demuth, Beale
- 2002
Citation Context ...data by the complete estimation procedure. Even the user’s manual for the software dGG use warns against naive parameter counting like this (Demuth and Beale 2002, 5–54). [Running header, p. 381: Theory and Evidence in International Conflict, May 2004] By the appropriate use of Bayesian prior densities, and most importantly the use of test sets during the training stage rather than merely in-sample t-tests, neural network models enable resear...

1 | Classification and Prediction with Neural Network Models
- Zeng
- 1999
Citation Context ...ted values, first differences, or other quantities of interest (Andreou and Zombanakis 2001; Bearce 2000; Beck, King, and Zeng 2000; Borisyuk et al. 2001; King and Zeng 2002; Lagazio and Russett 2002; Zeng 1999, 2000). We see no reason to think that a predicted probability of war is any harder to interpret, no matter how it is calculated. Neural networks are certainly less familiar to political scientists, ...