International Peace Research Institute, Oslo (PRIO)
Michigan State University
Paper prepared for delivery at the annual meeting of the International Studies Association, New Orleans, March 24-27, 2002. An earlier version of this paper was presented at the Economic Research Seminar, ISS, the Hague, Netherlands, March 7, 2002. This paper is a product of the UNU/WIDER project 'Why Some Countries Avoid Conflict While Others Fail'. In addition, I thank the Research Council of Norway and the Development Research Group at the World Bank for their support. I also thank Arjun Bedi, Nils Petter Gleditsch, Håvard Hegre, John Mueller, Michael McGinnis, Mansoob Murshed, and Håvard Strand for their useful comments.
Research on civil war and armed civil conflict has exploded in the last few years. Quantitative analyses of civil wars, in particular, have flourished. This essay provides an overview of the recent econometric research on civil war and the data that are used to study intrastate conflict. It reviews works that have examined the causes of civil war onset and duration. In addition to the problem of simply defining a civil war, researchers in this area must deal with five fundamental problems affecting data on armed civil conflict: non-independence, unmeasured heterogeneity (a general problem related to omitted-variable bias), endogeneity, and the rareness of the outcome variable. This essay also identifies controversies in the field and suggests ways to improve this research, through the development of better datasets and the use of better econometric techniques.
The end of the Cold War led some to optimistically predict an end to war. A decade later we find this optimism was premature. War persists. Since the fall of the Berlin Wall, there have been well over 100 armed conflicts, 33 of which were still active in 2000 (Wallensteen & Sollenberg, 2001). Most armed conflicts today are conducted within the boundaries of existing states, though there is often external participation and spillover into neighboring states. Civil war is the predominant form of war in the contemporary age.1 Yet civil war is both under-theorized and lacking in systematic empirical study in comparison to interstate war. Fortunately, recent work has started to address these problems. This essay provides an overview of the recent econometric research on civil war and the data that are used to study intrastate conflict. My overview focuses on recent research, but this only reflects the fact that most econometric research on civil war is quite recent. I have intentionally left out a considerable number of works, most of which are purely theoretical or case studies.2 The objective of this survey is to show someone who knows some econometrics how to go about doing quantitative research on civil conflict.
Civil war is an important problem. Indeed, civil war constitutes the most common form of war. Of the 220 armed conflicts involving at least 25 battle casualties fought between 1945 and 2000, 157 were intrastate compared to 42 that were interstate (Gleditsch et al., 2001).3 Over this period, intrastate conflicts have grown as a percentage of all conflicts, peaking in 1993 and 1994. The largest number of intrastate conflicts was in 1992. Since the end of the Cold War, well over 90% of all armed conflicts have been intrastate. See Figure 1.
[Figure 1 about here]
Civil war also causes horrible suffering. The consequences of civil war since World War II have been staggering. Casualty figures number in the millions. Most of these deaths have been civilian non-combatants. Several tens of millions more have been displaced by civil war. Refugees often flee across immediate borders seeking a more peaceful setting only to find that their presence in large numbers is destabilizing, which in turn can lead to more conflict and more displacement. Such humanitarian crises often pressure international organizations or Western states to become involved. The large influx of refugees from Haiti resulted in the US intervention in 1994, while NATO intervention in the wars in the former Yugoslavia was designed to stem the flow of refugees and limit the scope of the war (de Soysa & Gleditsch, 1999: 9).
Tremendous economic costs are also associated with civil war. Armed combat destroys capital. Buildings and bridges are literally blown up. Roads are mined or made impassable. Agricultural land and crops are destroyed. Civil war drives out investment capital, and few investors venture into war zones. Warfare also chases away labor, especially workers with the best skills and training. In sum, the economic consequences of civil war are dire. Given that most civil wars occur in relatively poor countries, civil war has particularly important consequences for economic development.
This essay is organized as follows. First, I review the data used to study intrastate conflict, discussing how variables are conceptualized and operationalized. Next, I provide an overview of the econometric research on civil war. The corpus of empirical study of intrastate conflict helps answer the following questions about civil war: What do we know? What might we know, but are not really certain of? And what is being debated? I then examine a variety of econometric issues affecting the quantitative analysis of civil war. Finally, I explore areas where further work is needed and conclude with a data wish list.
Defining a civil war
Defining a civil war is not as straightforward as one would imagine. Different views as to what distinguishes civil war from other forms of organized violence have resulted in the creation of competing datasets. Nearly all researchers agree that civil war is different from other forms of violent activity such as crime, genocide, or interstate war. A widely accepted definition of civil war is an armed conflict between two domestic parties over a contested incompatibility resulting in a number of casualties exceeding a certain threshold. The issue then becomes, what threshold?
The Uppsala University Conflict Data project distinguishes between large conflicts or wars (with annual battle-deaths exceeding 1,000) and minor conflicts (with a minimum of 25 battle-deaths) (Wallensteen & Sollenberg, 2000: 648–649). The Correlates of War (COW) project sets the threshold for a civil war at 1,000 battle deaths per year. Most statistical analyses of civil conflict have used the COW dataset or some minor variation of it.
For over three decades, the most used civil war dataset has been the one developed by the Correlates of War (COW) project. Their dataset includes 214 civil wars between 1816 and 1997. The COW criteria for intrastate wars, as specified by Small & Singer in Resort to Arms, include armed conflicts involving two intrastate combatants that resulted in at least 1,000 deaths in a single year, including civilian as well as military deaths (1982: 213).4 In the 1992 COW dataset update, the threshold was lowered to 1,000 battle deaths for the entire civil war (Singer & Small, 1994: 2; Sarkees, 2000: 129; 2001: 13). The 1,000 battle-deaths criterion has become the accepted threshold for distinguishing a war from other forms of armed violent conflict, but the framework for counting has been a subject of considerable debate. Indeed, the COW community is still debating whose deaths to count when tallying battle deaths.5 It is clear from this debate that there is movement to push the criteria for civil war back to the strict threshold of 1,000 battle deaths per year. Two issues are at stake: Should civilian deaths be counted along with battle deaths, or should only battle deaths be counted? And should the 1,000-death criterion apply to the entire war period or to a single year? Researchers outside the COW community have developed datasets that are based on the COW civil war dataset, but vary in some key way. Three noteworthy examples are Doyle & Sambanis (2000), Collier & Hoeffler (2001), and Fearon & Laitin (2001).
The State Failure Project (Gurr, Harff & Marshall 2000, 2001; Gurr et al., 2001) has also developed a conflict dataset connected to its study of political systems and political change (Polity). They classify conflict by type, but ascribe similar thresholds for inclusion. Each belligerent party must mobilize 1,000 or more people, and an average of 100 or more fatalities per year must occur during the episode. Outside of the State Failure Task Force, few have used these data to study civil war. More research should be done comparing analyses across datasets.6
A new dataset that promises to be a major competitor with the COW project is the Armed Conflict 1946-2000 dataset, or what is better known as the Uppsala dataset (Gleditsch et al., 2001).7 This dataset extends the work of the Uppsala Conflict Data Project, updated annually since 1990, back to the end of the Second World War, thereby encompassing the period, 1946-2000.8 The threshold for inclusion in this dataset is 25 battle-related deaths per annum and it includes interstate and intrastate conflict. They also distinguish between wars and armed conflict by defining a war as a conflict that results in 1,000 battle deaths in a year. Of the 220 intrastate conflicts between 1946 and 2000, 95 have been civil wars according to the Uppsala project (Gleditsch et al., 2001).
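The two-tier coding rule described above can be illustrated with a minimal sketch. It assumes that annual battle-related deaths are available as a single integer per conflict-year; the function name and category labels are hypothetical, and treating exactly 1,000 deaths as a war is an assumption about the boundary case rather than a statement of the Uppsala project's coding practice.

```python
# Illustrative sketch only: names and labels are hypothetical.
WAR_THRESHOLD = 1000       # battle-related deaths in a year (assumed inclusive)
INCLUSION_THRESHOLD = 25   # battle-related deaths in a year (assumed inclusive)


def classify_conflict_year(battle_deaths: int) -> str:
    """Classify one conflict-year by its annual battle-related deaths."""
    if battle_deaths < 0:
        raise ValueError("battle_deaths must be non-negative")
    if battle_deaths >= WAR_THRESHOLD:
        return "war"
    if battle_deaths >= INCLUSION_THRESHOLD:
        return "armed conflict"
    return "below threshold"


if __name__ == "__main__":
    for deaths in (10, 25, 400, 1000):
        print(deaths, classify_conflict_year(deaths))
```

A researcher who prefers the stricter war-only definition could simply filter on the "war" label, which is the flexibility the lower inclusion threshold is meant to provide.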
The lower threshold of 25 has several advantages. The strict 1,000 battle deaths criterion excludes several well-known armed conflicts, most notably the Northern Ireland conflict. The lower threshold does not exclude such enduring conflicts that never cross the 1,000 deaths criterion. Another advantage the 25 battle-death criterion has over the 1,000 deaths threshold is that it helps guard against a selection bias against small countries. Indeed, ceteris paribus, small countries are less likely to incur 1,000 casualties than countries with large populations. To make this point clearer, consider two countries, one with a population of less than 100,000 and one with 100 million. The two countries may exhibit the same propensity for rebellion, but due to the differences in population, the smaller country is less likely to cross the 1,000 battle death threshold. To estimate the severity of civil war without a selection bias, it may be better to measure it in terms of the proportion of a country’s population killed in civil war.
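The two-country comparison can be worked through numerically. The sketch below is illustrative only: the death rate of 100 per 100,000 inhabitants is a hypothetical figure chosen for the arithmetic, the small country is given a round population of 100,000, and the function name is an assumption.

```python
# Illustrative arithmetic only: the death rate is a hypothetical value.
def battle_deaths(population: int, deaths_per_100k: int) -> int:
    """Battle deaths implied by a per-capita death rate (integer arithmetic)."""
    return population * deaths_per_100k // 100_000


RATE = 100  # hypothetical: 100 battle deaths per 100,000 inhabitants

small_country = battle_deaths(100_000, RATE)       # 100 deaths
large_country = battle_deaths(100_000_000, RATE)   # 100,000 deaths

for name, deaths in (("small", small_country), ("large", large_country)):
    print(name, deaths,
          "crosses 25:", deaths >= 25,
          "crosses 1,000:", deaths >= 1000)
```

With the same per-capita intensity, the small country clears the 25-death threshold but not the 1,000-death threshold, while the large country clears both. Measured as a proportion of population, the two conflicts are equally severe.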
Yet another advantage of using a 25 death threshold is that it allows a researcher to differentiate large and small armed conflicts, using the 1,000 battle deaths threshold to distinguish between the two. The 25 battle-death criterion is also high enough to guard against under-reporting of armed intrastate conflicts. Conflicts at lower thresholds, say 5 battle deaths, are more likely to go unreported in certain countries, especially if the Western press is preoccupied with news elsewhere in the world. In general, casualty figures are extremely unreliable. Reports from the battlefield are often wildly inflated or deflated (depending on how the belligerents want their role to be perceived). Given this lack of reliability, researchers have tended to play it safe and rely strictly on a set threshold. For example, if the Uppsala project faces a conflict candidate that is reported to have between 10 and 100 casualties, it will not list it as a minor armed conflict because it does not strictly meet the 25 battle-death criterion. Ideally, the actual range of casualties would be presented in these datasets, allowing the researcher to decide how to use the data.
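If casualty ranges were reported, the inclusion decision could be left to the researcher. The following sketch shows one way this might look; it is not drawn from any existing dataset. The class, the function, and the rule names "strict" and "lenient" are all hypothetical.

```python
# Hypothetical sketch: no existing dataset is claimed to use this structure.
from dataclasses import dataclass


@dataclass(frozen=True)
class CasualtyEstimate:
    low: int    # lowest credible report of battle deaths
    high: int   # highest credible report of battle deaths


def meets_threshold(est: CasualtyEstimate, threshold: int = 25,
                    rule: str = "strict") -> bool:
    """Decide inclusion from a casualty range.

    "strict":  even the lowest estimate must reach the threshold.
    "lenient": it is enough that the highest estimate reaches it.
    """
    if rule == "strict":
        return est.low >= threshold
    if rule == "lenient":
        return est.high >= threshold
    raise ValueError(f"unknown rule: {rule!r}")


if __name__ == "__main__":
    candidate = CasualtyEstimate(low=10, high=100)
    print(meets_threshold(candidate, rule="strict"))   # excluded
    print(meets_threshold(candidate, rule="lenient"))  # included
```

The candidate with between 10 and 100 reported casualties is excluded under the strict rule and included under the lenient one, so results can be checked for sensitivity to the coding decision.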
In addition to the issue of a battle death threshold, there is the issue of whose deaths to count. The criterion of battle deaths is consistent with interstate conflict datasets. But with interstate wars one is dealing with the deaths of soldiers of government armies. The problem with intrastate armed conflict is that one of the sides does not represent the government. Identifying who is or is not a member of a rebel army is often difficult. Moreover, civilians often tend to be the targets of violence in civil wars (Azam, 2001; Kalyvas, 2001). Accounting for battle deaths alone underestimates the scale of violence in a civil war. The COW data include civilian casualties (though as mentioned above, this issue is being debated) and the Uppsala data do not.
Several other issues are also relevant for distinguishing armed civil conflict from other forms of intrastate violence. In particular, this means excluding cases of armed repression of non-violent groups or unorganized groups even if it involves considerable numbers of civilian casualties as in genocide (Rummel, 1994). According to the Uppsala conflict data project:
An armed conflict is a contested incompatibility that concerns government and/or territory where the use of armed force between the two parties, of which at least one is the government of a state, results in at least 25 battle-related deaths (Gleditsch et al., 2001: Appendix 1: 22).
This definition thereby excludes armed conflict between two non-governmental armies, as well as genocides such as the massacre in Rwanda in 1994. Neither the COW data nor the Uppsala data include armed conflicts between two rival rebel groups. With regard to genocides, as long as the 1,000 death threshold is crossed, the COW project includes these conflicts. The Uppsala project does not.
Another controversy regarding data on armed conflict concerns distinguishing between different types of war. The COW project distinguishes between three kinds of war: interstate, intrastate, and extrasystemic (defined as wars between nation-states and political units not identified as being part of the international system, e.g. colonies, dependent territories). Fearon & Laitin (2001) choose to integrate aspects of both the COW civil war and extrasystemic datasets. They argue that decolonization wars are civil wars and not substantially different from secessionist civil wars. Their data on civil war therefore include what the COW project defines as civil wars as well as extrasystemic wars. The Uppsala dataset, of course, puts all three types in one dataset, but allows the researcher to distinguish between the types of conflict. (See Figure 1).
The problem of distinguishing different types of war becomes especially thorny when the sides fighting disagree as to whether the war is intrastate or interstate. The conflicts between Serbia and Croatia/Bosnia or between the US and the Confederate States of America serve as examples. While neither the US nor Serbia recognized these seceding states as legitimate, there were international actors that did. Such cases lead one to question whether the definition of civil war is determined by successful secession. The internationalization of a civil war due to external intervention can also create problems. Indeed, the prospect of a third party (an external nation-state) intervening in a civil war may alter the decision of a potential rebel leader to actually initiate conflict.
When is the war over? War termination is another particularly difficult problem. The Uppsala dataset defines the conflict on the basis of an annual 25 battle-death criterion, but supplements this with a precise date of war onset and termination. War termination is thus defined in terms of the resolution of the incompatibility that served as the basis for the conflict. Such precise dating is particularly important for duration analysis of civil conflict and for the use of proportional hazard models to assess the probabilities of war onset or termination/settlement. The COW project also provides a date associated with the end of a war. The particular date of conflict onset and termination can be quite subjective. The signing of a peace treaty or the defeat of one of the armies can serve as a date, but not all wars end with such clear markers.
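To see why precise dates matter for duration analysis, consider how a conflict spell might be constructed from onset and termination dates. The sketch below is a minimal illustration under stated assumptions: the function name, the dates, and the end-of-observation cutoff are all hypothetical, and conflicts with no recorded termination within the observation window are treated as right-censored, which is the usual input format for hazard models.

```python
# Hypothetical sketch: dates and names are illustrative, not taken from any dataset.
from datetime import date
from typing import Optional, Tuple


def conflict_spell(onset: date, termination: Optional[date],
                   end_of_observation: date) -> Tuple[int, bool]:
    """Return (duration_in_days, ended).

    ended is False when the conflict is still ongoing at the end of the
    observation window, i.e. the spell is right-censored.
    """
    if end_of_observation < onset:
        raise ValueError("observation window ends before onset")
    if termination is not None and termination < onset:
        raise ValueError("termination precedes onset")
    if termination is not None and termination <= end_of_observation:
        return (termination - onset).days, True
    return (end_of_observation - onset).days, False


if __name__ == "__main__":
    cutoff = date(2000, 12, 31)
    # A conflict with a recorded end date.
    print(conflict_spell(date(1990, 1, 1), date(1990, 1, 31), cutoff))
    # A conflict still active at the end of the observation window.
    print(conflict_spell(date(2000, 1, 1), None, cutoff))
```

With only annual coding, both the duration and the censoring indicator would be measured with error of up to a year at each end, which is why the choice of onset and termination dates feeds directly into the estimated hazard.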
Where is the war located? Most conflict datasets only indicate what nation-state or nation-states are involved in conflict. The COW data are organized around the concept of a country being at war or not. As such, coterminous wars are not distinguished. One of the chief innovations of the Uppsala conflict dataset is that the conflict itself serves as the unit of analysis so that two conflicts being fought in one country are identified as such. A further innovation associated with the PRIO partnership with the Uppsala conflict data project has been to provide data regarding the geographic location of each case of armed conflict (Buhaug & Gates, 2002). Such data may be valuable for understanding the diffusion of conflict and war.