Forecasting Papers

These papers deal primarily with the methodological and practical applications of events data to various forecasting models.

Forecasting Civil Conflict with Zero-Inflated Count Models

Benjamin E. Bagozzi (Penn State)

This article demonstrates the advantages of using zero-inflated count models to forecast civil conflict. To do so, negative binomial (NB) and zero-inflated negative binomial (ZINB) count models are applied to a novel country-month event-count dataset of rebel- and government-initiated violent conflicts. Using out-of-sample forecasts to compare model predictions of the conflicts occurring within these data, we find that moving from an NB model to a zero-inflated count model can produce up to a 13% improvement in civil conflict forecasting accuracy. We also find that including (1-3 month) lagged values of monthly conflict frequency in the inflation stage of our zero-inflated conflict models can lead to as much as a 12% improvement in conflict forecasting accuracy. Substantively, our findings suggest that, while past values of government- and rebel-initiated conflict are indeed positively related to present values, the magnitude of this positive relationship tends to be overstated when zero-inflation is not accounted for.
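As a rough illustration of the difference between the two model families, the sketch below evaluates an NB probability mass function and its zero-inflated counterpart. It is a minimal sketch, not the paper's specification: the mean/dispersion parameterization, the constant inflation probability `pi`, and all parameter values are assumptions made for illustration. In the paper the inflation stage includes lagged conflict counts, so the inflation probability would vary with covariates rather than being fixed.

```python
import math

def nb_pmf(y, mu, alpha):
    """Negative binomial pmf with mean mu and dispersion alpha
    (variance = mu + alpha * mu**2)."""
    r = 1.0 / alpha
    log_p = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
             + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))
    return math.exp(log_p)

def zinb_pmf(y, mu, alpha, pi):
    """Zero-inflated NB: with probability pi the count is a structural zero,
    otherwise it is drawn from the NB distribution."""
    base = (1.0 - pi) * nb_pmf(y, mu, alpha)
    return pi + base if y == 0 else base

# Hypothetical parameter values, chosen only for illustration.
mu, alpha, pi = 2.0, 0.5, 0.3

p0_nb = nb_pmf(0, mu, alpha)          # probability of a zero count under NB
p0_zinb = zinb_pmf(0, mu, alpha, pi)  # probability of a zero count under ZINB
total = sum(zinb_pmf(y, mu, alpha, pi) for y in range(500))

print(p0_nb, p0_zinb, total)
```

With these assumed values the zero-inflated model puts noticeably more mass on zero-conflict months than the plain NB model, which is the feature the article exploits.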

Link to Adobe .pdf file of the paper

Racing Horses: Constructing and Evaluating Forecasts in Political Science

Patrick Brandt (University of Texas, Dallas), John Freeman (University of Minnesota) and Philip Schrodt (Penn State)

This is an extensive review of various metrics used to evaluate forecasts, as well as a more general discussion of the issues involved in comparing forecast models. The paper illustrates how to use the new suite of designs and tools to improve competitions between forecasting models, first in a stylized Monte Carlo analysis and then in a horse race between conflict early warning models for the China-Taiwan "Cross-Straits" dyad. The Monte Carlo analyses highlight the pitfalls of relying on point forecasts and conventional metrics like root-mean-squared-error to pick winners. The conflict early warning investigation compares the performance of a collection of time series models including univariate autoregressions (AR) and multiequation models including Bayesian, and Markov-Switching Bayesian vector autoregressive models (VAR, BVAR, MS-BVAR models, respectively).

Paper presented at the 28th Summer Meeting of the Society for Political Methodology, Princeton University, July 2011.

Link to Adobe .pdf file of the paper

Forecasting Political Conflict in Asia using Latent Dirichlet Allocation Models

Philip Schrodt

Latent Dirichlet allocation models are a relatively new computational classification algorithm. In its standard application to document classification, the model assumes each document to be composed of a mixture of multiple, overlapping topics each with a typical set of words, and classification is done by associating words in a document with the latent topics most likely to have generated the observed distribution of those words. I apply this technique to the problem of political forecasting by assuming that the stream of events observed between a dyad of actors is a mixture of a variety of different political strategies and standard operating procedures (for example, escalation of repressive measures against a minority group while simultaneously making efforts to co-opt the elites of that group). By identifying the dominant strategies being pursued at time t, one gets information that can be used to forecast likely patterns of interaction at a later time t + k. This approach is applied to event data generated for 29 Asian countries in the Integrated Conflict Early Warning System project for 1998-2010 to forecast the ICEWS conflict measures for rebellion, insurgency, ethno-religious violence, domestic political conflict and international conflict at a six-month lead time. In random samples balancing the occurrence of negative and positive outcomes on the dependent variable, LDA combined with a logistic model predicts with around 60% to 70% accuracy in in-sample evaluation, and improves very substantially on the sensitivity of the classification compared with simple logistic models in full samples. A supervised version of LDA, however, does not provide much improvement over the unsupervised version, and shows some pathological behaviors. Some structure can be found in the factors

Paper presented at the European Political Science Association, Dublin, May 2011.

Link to Adobe .pdf file of the paper

Link to version presented at the NSF Conference on New Horizons in the Analysis of the Middle East, University of South Carolina, October 2011, which adds an analysis of the Levant data set.

Predicting Intra-State Conflict Onset: An Event Data Approach Using Euclidean And Levenshtein Distance Measures

Vito D'Orazio, James E. Yonamine and Philip Schrodt (Penn State)

In this paper, we utilize both a new event dataset and a novel empirical approach to test for the extent of similarities between event sequences preceding conflict onset. The event data were compiled as part of the Defense Advanced Research Projects Agency (DARPA) funded Integrated Conflict Early Warning Systems program (ICEWS; O'Brien [2010]) and consist of machine-coded events between politically relevant actors in 29 Asian countries from 1998-2010, derived from electronic articles reported in 27 regional and international electronic news sources. Our methodological innovation is the use of Levenshtein and Euclidean distance measures to assess the level of similarity between two 12-week sequences of event counts aggregated at the state-week and state-month level. Distances are calculated between all possible pairs of the 53 sequences within our dataset that precede a conflict onset and approximately 100 sequences that precede peaceful periods with no domestic conflict. We find that sequences preceding peace exhibit highly similar event count structures while those preceding a conflict onset vary considerably from each other and, importantly, from periods preceding peace. Based on this, we calculate a Baseline Peace Archetype (BPA) that reflects the average event count structure of a sequence preceding peace. We then calculate the distance measures between all sequences preceding conflict and a random sample of those preceding peace against the BPA sequence. These distances allow us to generate out-of-sample forecasts one and two months into the future with greater than 75 percent accuracy.
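The two distance measures and the archetype idea can be sketched in a few lines. This is a minimal sketch under stated assumptions: the function names, the short four-period sequences, and the element-wise mean used for the archetype are illustrative choices, and the abstract does not say how the authors prepared count sequences for the Levenshtein comparison.

```python
import math

def levenshtein(a, b):
    """Edit distance between two sequences; insertion, deletion and
    substitution each cost 1."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def euclidean(a, b):
    """Euclidean distance between two equal-length count sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def archetype(sequences):
    """Element-wise mean of equal-length sequences (a stand-in for the BPA)."""
    n = len(sequences)
    return [sum(col) / n for col in zip(*sequences)]

# Hypothetical event-count sequences, far shorter than the paper's 12 weeks.
peace = [[2, 2, 2, 2], [4, 4, 4, 4]]
pre_conflict = [3, 3, 6, 7]

bpa = archetype(peace)                        # [3.0, 3.0, 3.0, 3.0]
dist_to_bpa = euclidean(bpa, pre_conflict)    # sqrt(0 + 0 + 9 + 16)
edit_dist = levenshtein([1, 2, 3], [1, 3, 3]) # one substitution

print(bpa, dist_to_bpa, edit_dist)
```

A sequence far from the archetype under either measure would, in the spirit of the paper, be treated as a warning sign; any threshold for "far" is left out here because the abstract does not report one.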

Paper prepared for delivery at the Midwest Political Science Association, Chicago, April 2011.

Link to Adobe .pdf file of the paper

The Effects of Domestic Conflict on Interstate Conflict: An Event Data Analysis of Monthly Level Onset and Intensity

James E. Yonamine (Penn State)

Existing event data studies all suggest that temporally nuanced, continuous measures of both domestic and interstate conflict are needed in order to appropriately test for the range of potential effects that domestic conflict may have on interstate conflict. This paper addresses this problem by generating monthly level, continuous data reflecting the number of conflictual events at both the state-month (for domestic events) and the dyad-month (for interstate events) level based on the Integrated Conflict Early Warning System (ICEWS) event dataset, which contains over 2 million domestic and interstate events for 29 Asian countries from 1997 to 2010. Using these data, I perform numerous linear and non-linear tests for the effects of domestic conflict onset and intensity on the onset and intensity of interstate conflict across a range of operationalizations of onset and intensity at the monthly level.

M.A. thesis, Penn State, May 2011.

Link to Adobe .pdf file of the paper

Real Time, Time Series Forecasting of Inter- and Intra-State Political Conflict

Patrick Brandt (University of Texas, Dallas), John Freeman (University of Minnesota) and Philip Schrodt (Penn State)

We propose a framework for forecasting and analyzing regional and international conflicts. It generates forecasts that 1) are accurate but account for uncertainty, 2) are produced in (near) real time, 3) address actors' simultaneous behaviors, 4) incorporate prior beliefs, and 5) are policy contingent. We combine the CAMEO event-coding framework with Markov-switching and Bayesian vector autoregression models to meet these goals. Our example produces a series of forecasts for material conflict between the Israelis and Palestinians for 2010. Our forecast is that the level of material conflict between these belligerents will increase in 2010, compared to 2009.

Paper originally prepared for delivery at the Annual Meeting of the International Studies Association, New York, March 2009.

Link to Adobe .pdf file of the paper

Predicting Risk Factors Associated with Forced Migration: An Early Warning Model of Haitian Flight

Stephen Shellman (University of Georgia) and Brandon Stewart (William and Mary)

While most forced migration studies focus on explanation, this study focuses on prediction. The study predicts forced migration events by predicting the civil violence, poor economic conditions, and foreign interventions known to cause individuals to flee their homes in search of refuge. By accounting for the interaction between civil conflict intensity levels, the ebb and flow of origin and potential host countries' economies, and impinging foreign policy pressures on countries' governments and dissidents, the model can better predict the occurrence and magnitude of forced migration events. Policy makers can use these predictions to aid their planning for humanitarian crises. If we can predict forced migration, we can better plan for humanitarian crises. While the study is limited to predicting Haitian flight to the United States, its strength is its ability to predict weekly flows as opposed to annual flows, providing a greater level of predictive detail than its "country-year" counterparts. Given the model's performance, the study calls for the collection of disaggregated data in additional countries to provide more precise and useful early warning models of forced migrant events.

Paper prepared for delivery at the Annual Meeting of the International Studies Association, San Diego, March 2006.

Link to Adobe .pdf file of the paper

Forecasting Israeli-Palestinian Conflict with Hidden Markov Models

Robert Shearer (Center for Army Analysis)

This paper presents research into conflict analysis, utilizing Hidden Markov models to capture the patterns of escalation in a conflict and Markov chains to forecast future escalations. Hidden Markov models have an extensive history in a wide variety of pattern classification applications. In these models, an unobserved finite state Markov chain generates observed symbols whose distribution is conditioned on the current state of the chain. Training algorithms estimate model parameters based upon known patterns of symbols. Assignment rules classify unknown patterns according to the likelihood of known models generating the observed symbols. The research presented here utilized much of the Hidden Markov model methodology, but not for pattern classification, rather to identify the underlying finite state Markov chain for a symbol realization. Machine coded newswire story leads provided event data that served as the symbol realization for the Hidden Markov model. Fundamental matrices derived from the Markov chain led to forecasts that provide insight into the dynamic behavior of the conflict and describe potential futures of the conflict in probabilistic terms, to include the likelihood of conflict, the time to conflict, and the time in conflict.
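One standard way a fundamental matrix yields a "time to conflict" figure is through an absorbing Markov chain: if Q holds the transition probabilities among the non-conflict states, the expected number of steps before entering conflict is the row sum of N = (I - Q)^-1. The sketch below computes those row sums by solving (I - Q) t = 1. Treating conflict as an absorbing state, the two state labels, and the transition probabilities are all assumptions for illustration; the abstract does not describe the paper's chain.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination
    with partial pivoting (small dense systems only)."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [m[i][n] for i in range(n)]

def expected_steps_to_absorption(q):
    """Row sums of the fundamental matrix N = (I - Q)^-1,
    obtained by solving (I - Q) t = 1."""
    n = len(q)
    i_minus_q = [[(1.0 if i == j else 0.0) - q[i][j] for j in range(n)]
                 for i in range(n)]
    return solve(i_minus_q, [1.0] * n)

# Hypothetical transitions among two non-conflict states ("calm", "tension").
# Each row sums to 0.8; the remaining 0.2 is the probability of moving
# into the absorbing "conflict" state.
q = [[0.5, 0.3],
     [0.2, 0.6]]

steps = expected_steps_to_absorption(q)
print(steps)  # expected periods until conflict, starting from each state
```

Because every state in this toy chain has the same 0.2 chance of entering conflict each period, both expected times equal 1 / 0.2 = 5 periods, which makes the result easy to check by hand.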

Link to Adobe .pdf file of the paper

A New Kind of Social Science: The Path Beyond Current (IR) Methodologies May Lie Beneath Them

Valerie M. Hudson, Philip A. Schrodt, and Ray D. Whitmer

Existing formal models of political behavior have followed the lead of the natural sciences and generally focused on methods that use continuous-variable mathematics. Stephen Wolfram has recently produced an extended critique of that approach in the natural sciences, and suggested that a great deal of natural behavior can be accounted for using rules that involve discrete patterns. Wolfram's work generally does not consider models in the social sciences but given the similarity between many of the techniques for modeling in the natural and social sciences, his critique can readily be applied to models of social behavior as well. We argue further that pattern-based models are particularly relevant to modeling human behavior because human cognitive abilities are far more developed in the domain of pattern recognition than in the domain of continuous-variable mathematics. We test the possibility of finding pattern-based behavior in international behavior by looking at event data for the Israel-Palestine conflict for the period 1979-2003. Using a new web-based tool explicitly designed for the analysis of event data patterns, we experiment with three general patterns: the classic tit-for-tat, an "olive branch" pattern designed to detect attempts at de-escalation, and four "meta-rules" that look at the relationship between prior conflict and the propensity of the actors to engage in reciprocal behavior. Our analysis shows that these patterns can be found repeatedly in the data, and their frequency corresponds to changes in the qualitative characteristics of the conflict.

Paper prepared for delivery at the Annual Meeting of the International Studies Association, Montreal, Quebec, Canada, March 2004.

Link to Adobe .pdf file of the paper

The analytical web site for this project (formerly at, which includes the graphic tools for analyzing event patterns is


Using Event Data to Monitor Contemporary Conflict in the Israel-Palestine Dyad

Philip A. Schrodt, Deborah J. Gerner, and Ömür Yilmaz

For the past eighteen months, the Kansas Event Data System (KEDS) project has been using event data and other web-based sources to produce quarterly reports on the Israel-Palestine conflict for the swisspeace (Swiss Peace Foundation) FAST Project, which is sponsored by the Swiss Agency for Development and Cooperation and a number of non-governmental organizations. This paper describes the indicators that we are monitoring, the process we have developed to generate the reports, and the supplemental sources we are using. We address the issue of the differences between newspaper and news wire reports with respect to "media fatigue" effects and also analyze some of the strengths and weaknesses of this approach to conflict monitoring.

Paper prepared for delivery at the Annual Meeting of the International Studies Association, Montreal, Quebec, Canada, March 2004.

forthcoming, 2005. International Studies Perspectives

Link to Adobe .pdf file of the paper

Link to detailed list of steps used to update the FAST data

Forecasts and Contingencies: From Methodology to Policy

Philip A. Schrodt

A "folk criticism" in political science maintains that the discipline should confine its efforts to explanation and avoid venturing down the dark, dirty, and dangerous path to forecasting and prediction. I argue that not only is this position inconsistent with the experiences of other sciences, but in fact the questions involved in making robust and valid predictions invoke many core methodological issues in political analysis. Those issues include, among others, the question of the level of predictability in political behavior, the problem of case selection in small-N situations, and the various alternative models that could be used to formalize predictions. This essay focuses on the problem of forecasting in international politics, and concludes by noting some of the problems of institutional culture - bureaucratic and academic - that have inhibited greater use of systematic forecasting methods in foreign policy.

Paper presented at the theme panel "Political Utility and Fundamental Research: The Problem of Pasteur's Quadrant" at the American Political Science Association meetings, Boston, 29 August - 1 September 2002

Link to Adobe .pdf file of the paper

Forecasting Conflict in the Balkans using Hidden Markov Models

Philip A. Schrodt

This study uses hidden Markov models (HMM) to forecast conflict in the former Yugoslavia for the period January 1991 through January 1999. The political and military events reported in the lead sentences of Reuters news service stories were coded into the World Events Interaction Survey (WEIS) event data scheme. The forecasting scheme involved randomly selecting eight 100-event "templates" taken at a 1-, 3- or 6-month forecasting lag for high-conflict and low-conflict weeks. A separate HMM is developed for the high-conflict-week sequences and the low-conflict-week sequences. Forecasting is done by determining whether a sequence of observed events fits the high-conflict or low-conflict model with higher probability.
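The comparison step can be sketched with the forward algorithm, which gives the log-likelihood of an observed symbol sequence under a discrete HMM. This is a minimal sketch, not the study's code: the two-state models, the three-symbol coding (0 = cooperation, 1 = verbal conflict, 2 = material conflict) and every probability below are hypothetical, whereas the study estimates its models from WEIS-coded templates.

```python
import math

def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    total = sum(alpha)
    log_l = math.log(total)
    alpha = [a / total for a in alpha]
    for symbol in obs[1:]:
        alpha = [sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][symbol]
                 for s in range(n)]
        total = sum(alpha)
        log_l += math.log(total)          # accumulate the scaling factors
        alpha = [a / total for a in alpha]
    return log_l

# Hypothetical models: (initial probabilities, transitions, emissions).
HIGH = ([0.5, 0.5],
        [[0.7, 0.3], [0.3, 0.7]],
        [[0.2, 0.4, 0.4], [0.1, 0.3, 0.6]])
LOW = ([0.5, 0.5],
       [[0.7, 0.3], [0.3, 0.7]],
       [[0.7, 0.2, 0.1], [0.5, 0.4, 0.1]])

def classify(obs):
    """Label a sequence by whichever model assigns it higher likelihood."""
    return "high" if log_likelihood(obs, *HIGH) > log_likelihood(obs, *LOW) else "low"

tense = [2, 2, 1, 2, 0, 2]  # mostly material conflict
calm = [0, 0, 1, 0, 0, 0]   # mostly cooperation

print(classify(tense), classify(calm))
```

Scaling the forward variables at each step avoids numerical underflow on long sequences such as 100-event templates.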

Models were selected to maximize the difference between correct and incorrect predictions, evaluated by week. Three weighting schemes were used: unweighted (U), penalize false positives (P) and penalize false negatives (N). There is a relatively high level of convergence in the estimates -- the best and worst models of a given type vary in accuracy by only about 15% to 20%. In full-sample tests, the U and P models produce an overall accuracy of around 80%. However, these models correctly forecast only about 25% of the high-conflict weeks, although about 60% of the cases where a high-conflict week has been forecast turn out to have high conflict. In contrast, the N model has an overall accuracy of only about 50% in full-sample tests, but it correctly forecasts high-conflict weeks with 85% accuracy in the 3- and 6-month horizon and 92% accuracy in the 1-month horizon. However, this is achieved by excessive predictions of high-conflict weeks: only about 30% of the cases where a high-conflict week has been forecast are high-conflict. Models that use templates from only the previous year usually do about as well as models based on the entire sample.
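The three figures quoted for the U and P models are different ratios of the same confusion matrix, which is why a model can be about 80% accurate overall while finding only a quarter of the high-conflict weeks. The counts below are hypothetical, chosen only so that the ratios match the approximate percentages in the text; they are not the paper's data.

```python
def rates(tp, fp, fn, tn):
    """Overall accuracy, sensitivity (share of actual high-conflict weeks
    that were forecast) and precision (share of high-conflict forecasts
    that were correct) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "precision": tp / (tp + fp),
    }

# Hypothetical weekly counts: 60 high-conflict weeks out of 275.
r = rates(tp=15, fp=10, fn=45, tn=205)
print(r)
```

Here accuracy is 220/275, sensitivity is 15/60 and precision is 15/25, so a large number of correctly forecast low-conflict weeks dominates the overall accuracy.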

The models are remarkably insensitive to the length of the forecasting horizon -- the drop-off in accuracy at longer forecasting horizons is very small, typically around 2%-4%. There is also no clear difference in the estimated coefficients for the 1-month and 6-month models. An extensive analysis was done of the coefficient estimates in the full-sample model to determine what the model was "looking at" in order to make predictions. While a number of statistically significant differences exist between the high and low conflict models, these do not fall into any neat patterns. This is probably due to a combination of the large number of parameters being estimated, the multiple local maxima in the estimation surface, and the complications introduced by the presence of a number of very low probability event categories. Some experiments with simplified models indicate that it is possible to use models with substantially fewer parameters without markedly decreasing the accuracy of the predictions; in fact predictions of the high conflict periods actually increase in accuracy quite substantially.

Paper presented at the American Political Science Association meetings, 31 August - 3 September, 2000

Link to Adobe .pdf version of this paper

The Impact of Early Warning on Institutional Responses to Complex Humanitarian Crises

Philip A. Schrodt and Deborah J. Gerner

This paper considers the problems of institutional response to the early warning of complex humanitarian crises (CHCs). We start with a typology of six different modes of early warning failure: strategic deception, conventional concealment, institutional ignorance, reflexive response, exogenous shifts, and systemic complexity. We discuss the extent to which each of these can affect the early warning of CHCs. We then consider the problems of cognitive, bureaucratic, and political constraints to effective early warning. The paper concludes that the early warning of CHCs is likely to remain decentralized in academic, nongovernmental (NGO) and intergovernmental (IGO) projects, but that because of increases in the availability of information this decentralization does not necessarily preclude effective early warning, and may in fact enhance it. There is, however, a need to augment the credibility, visibility, and efficacy of these efforts, as is being done through efforts such as the Forum for Early Warning and Emergency Response (FEWER) and ReliefWeb.

Paper presented at the Third Pan-European International Relations Conference and Joint Meeting with the International Studies Association, Vienna, 16 - 19 September 1998

Link to Adobe .pdf version of this paper

Early Warning of Conflict in Southern Lebanon using Hidden Markov Models

Philip A. Schrodt
in Harvey Starr, ed. The Understanding and Management of Global Violence: New Approaches to Theory and Research on Protracted Conflict, pp. 131-162.
New York: St. Martin's Press, 1999.

This paper extends earlier work on the use of hidden Markov models (HMMs) to the problem of forecasting international conflict. HMMs are a sequence comparison method that is widely used in computerized speech recognition; they are easily adapted to work with sequences of international event data. The HMM is a computationally efficient method of generalizing a set of example sequences observed in a noisy environment. The paper provides a theoretical "micro-foundation" for the sequence comparison approach based on co-adaptation of standard operating procedures.

The left-right (LR) HMM used in speech recognition problems is first extended to a left-right-left (LRL) model that allows a crisis to escalate and de-escalate. This model is tested for its ability to correctly discriminate between BCOW crises that do and do not involve war. The LRL model provides slightly more accurate classification than the LR model. The interpretation of the hidden states in the LRL models, however, is more ambiguous than that found in the LR model.

The HMM is then applied to the problem of forecasting the outbreak of armed violence between Israel and Arab forces in south Lebanon during the period 1979 to 1997 (excluding 1982-1985). An HMM first is estimated using six cases of "tit-for-tat" escalation, then fitted to the entire time period. The model identifies about half of the TFT conflicts -- including all of the training cases -- that occur in the full sequence, with only one false positive. This result suggests that HMMs could be used in an event-driven continuous monitoring system. However, the fit of the model is very sensitive to the number of nonevents found in a sequence, and consequently that measure is ineffective as an early warning indicator.

In a subset of models the maximum likelihood estimate of the sequence of hidden Markov states characterizing a sequence provides a robust early warning indicator with a three to six-month lead. These models are valid in a split-sample test, and the patterns of cross-correlation of the individual states of the model are consistent with the theoretical expectations. While this approach clearly needs further validation, it appears promising.

The paper concludes with observations on the extent to which the HMM approach can be generalized to various categories of conflict, some suggestions on how the method of estimation can be improved, and the implications that sequence-based forecasting techniques have for the theoretical understanding of the causes of conflict.

This paper was presented at the annual meetings of the American Political Science Association, Washington, DC, August 1997.

Link to supplementary graphics

Link to Adobe .pdf version of "A Landscape Model of Rule-Based Co-Adaptation in International Behavior"

Left-right-left hidden Markov model source code and data files (.sit)
Left-right-left hidden Markov model source code and data files (.zip)

Pattern Recognition of International Crises using Hidden Markov Models

Philip A. Schrodt
in Diana Richards, ed. Political Complexity: Nonlinear Models of Politics, pp. 296-328.
Ann Arbor: University of Michigan Press, 2000.

Event data are one of the most widely used indicators in quantitative international relations research. To date, most of the models using event data have constructed numerical indicators based on the characteristics of the events measured in isolation and then aggregated. An alternative approach is to use quantitative pattern recognition techniques to compare an existing sequence of behaviors to a set of similar historical cases. This has much in common with human reasoning by historical analogy while providing the advantages of systematic and replicable analysis possible using machine-coded event data and statistical models.

This chapter uses "hidden Markov models" -- a recently developed sequence-comparison technique widely used in computational speech recognition -- to measure similarities among international crises. The models are first estimated using the Behavioral Correlates of War data set of historical crises, then applied to an event data set covering political behavior in the contemporary Middle East for the period April 1979 through February 1997.

A split-sample test of the hidden Markov models perfectly differentiates crises involving war from those not involving war in the cases used to estimate the models. The models also provide a high level of discrimination in a set of test cases not used in the estimation, and most of the erroneously-classified cases have plausible distinguishing features. The difference between the war and nonwar models also correlates significantly with a scaled measure of conflict in the contemporary Middle East. This suggests that hidden Markov models could be used to develop conflict measures based on event similarities to historical conflicts rather than on aggregated event scores.

The file includes the paper in Postscript, MS-Word 5.1a (Macintosh) and Adobe Acrobat (.pdf) formats. It also includes the C source code for the program used to estimate the model, and pointers to the data sets used in the analysis.

An earlier version of the paper was presented in March 1997 at the annual meeting of the International Studies Association and at the "Synergy in Early Warning" Conference, Centre for Refugee Studies, York University, Toronto.

Link to Adobe .pdf version of this paper

Using Cluster Analysis to Derive Early Warning Indicators for Political Change in the Middle East, 1979-1996

Philip A. Schrodt and Deborah J. Gerner
American Political Science Review 94,4: 803-818 (December 2000)

This paper uses event data to develop an early warning model of major political change in the Levant for the period April 1979 to December 1998. Following a general review of statistical early warning research, the analysis focuses on the behavior of eight Middle Eastern actors -- Egypt, Israel, Jordan, Lebanon, the Palestinians, Syria, the United States and USSR/Russia -- using WEIS-coded event data generated from Reuters news service lead sentences with the KEDS machine-coding system.

The analysis extends earlier work (Schrodt and Gerner 1995) demonstrating that clusters of behavior identified by conventional statistical methods correspond well with changes in political behavior identified a priori. We employ a new clustering algorithm that uses the correlation between the dyadic behaviors at two points in time as a measure of distance, and identifies cluster breaks as those time points that are closer to later points than to preceding points. We also demonstrate that these data clusters begin to "stretch" prior to breaking apart; this characteristic can be used as an early-warning indicator. A Monte-Carlo analysis shows that the clustering and early warning measures perform very differently in simulated data sets having the same mean, variance, and autocorrelation as the observed data (but no cross-correlation) which reduces the likelihood that the observed clustering patterns are due to chance.
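The break-detection idea can be illustrated with a loose sketch. Everything here is an assumption layered on the one-sentence description above: distance is taken as one minus the Pearson correlation between behavior vectors at two time points, a window of `k` neighbors on each side is compared, and the score is the mean distance to preceding points minus the mean distance to following points. The published algorithm may differ in all of these details, and the data are invented.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def distance(x, y):
    """Correlation distance: 0 for perfectly correlated vectors, 2 for opposite."""
    return 1.0 - pearson(x, y)

def break_scores(series, k):
    """For each time point with k neighbors on both sides, return the mean
    distance to the k preceding points minus the mean distance to the k
    following points. A large positive score marks a point that is closer
    to what follows than to what came before."""
    scores = {}
    for t in range(k, len(series) - k):
        before = sum(distance(series[t], series[s]) for s in range(t - k, t)) / k
        after = sum(distance(series[t], series[s]) for s in range(t + 1, t + k + 1)) / k
        scores[t] = before - after
    return scores

# Hypothetical dyadic behavior vectors: three periods of one pattern,
# then three periods of the reverse pattern.
series = [[1, 2, 3, 4], [2, 4, 6, 8], [1, 3, 5, 7],
          [4, 3, 2, 1], [8, 6, 4, 2], [7, 5, 3, 1]]

scores = break_scores(series, k=2)
likely_break = max(scores, key=scores.get)
print(scores, likely_break)
```

In this toy series the highest score falls on the first period of the reversed pattern, the point where behavior starts to resemble the later periods rather than the earlier ones.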

The initial analysis uses Goldstein's (1992) weighting system to aggregate the WEIS-coded data. In an attempt to improve on the Goldstein scale, we use a genetic algorithm to optimize the weighting of the WEIS event categories for the purpose of clustering. This does not prove very successful and only differentiates clusters in the first half of the data set, a result similar to one we obtained using the cross-sectional K-Means clustering procedure. Correlating the frequency of events in the twenty-two 2-digit WEIS categories, on the other hand, gives clustering and early warning results similar to those produced by the Goldstein scale. The paper concludes with some general remarks on the role of quantitative early warning and directions for further research.

Paper originally presented at the American Political Science Association, San Francisco, 28 August - 1 September 1996

Link to Adobe .pdf file of the original paper

The Statistical Characteristics of Event Data

Philip A. Schrodt
International Interactions 20,1-2: 35-53

This paper explores event data as an abstract statistical object. It briefly traces the historical development of event data, with particular attention to how nominal events have come to be used primarily in interval-level studies. A formal definition of event data and its stochastic error structure is presented. From this definition, some concrete suggestions are made for statistically compensating for misclassification and censoring errors in frequency-based studies. The paper argues for returning to the analysis of events as discrete structures. This type of analysis was not possible when event data were initially developed, but electronic information processing capabilities have improved dramatically in recent years and many new techniques for generating and analyzing event data may soon be practical.

Paper originally presented at the International Studies Association, St. Louis, March 1988

Link to Adobe .pdf file of the original paper