Chapter 2 Data sources
Our project uses the UCDP Georeferenced Event Dataset (GED) Global version 19.1, available from the UCDP download page (https://ucdp.uu.se/downloads/index.html#ged_global). This dataset covers “individual events of organized violence (phenomena of lethal violence occurring at a given time and place)” worldwide [source].
The dataset contains 152,616 records, each of which is one event of a specific conflict. An event of a conflict is defined as “An incident where armed force was used by an organized actor against another organized actor, or against civilians.” [source].
In our project, we simply use the term “conflict” for what the original terminology calls an “event of a conflict” or “event”. The dataset covers armed conflict events worldwide from 1989 to 2018 and includes the time, geographic location, type, casualties, and other information about each event.
The original dataset has 43 variables in total; for their full definitions and explanations, please refer to the dataset’s codebook [source]. Some of these variables are redundant, however. For example, several variables represent the geo-location of a single conflict: they refer to the same location but use different formats to suit different geographic databases. For the location variables, our study uses only the plain latitude and longitude variables.
To analyze conflict data patterns in this project, we focus mostly on a subset of the variables, listed below with their explanations. The explanations are taken from the dataset’s codebook [source].
Variable | Explanation |
---|---|
id | A unique ID identifying an event of the conflict |
date_start | The earliest possible date on which the conflict took place |
date_end | The latest possible date on which the conflict took place |
year | The year in which the conflict happened |
type_of_violence | 1: state-based conflict (involving the government of a state or any opposition organization or alliance of organizations); 2: non-state conflict (between organized armed groups); 3: one-sided violence (against civilians) |
side_a/side_b | The names of the two sides |
longitude/latitude | The actual or best-estimated location where the conflict took place |
country | Name of the country where the conflict took place |
region | Africa, Americas, Asia, Europe, Middle East |
death_* | Best estimate of fatalities for the specific group |
best | Best estimate of fatalities resulting from the conflict |
high/low | The highest/lowest estimate of the total fatalities |
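
The column selection described by the table above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the procedure used in the project: the helper `load_subset`, the list `KEEP`, and the two-row `SAMPLE` are hypothetical, the sample values are made up, and the extra column `geom_wkt` merely stands in for one of the redundant location formats. The per-group fatality columns are left out because their exact names should be checked against the codebook.

```python
import csv
import io

# Columns kept for the analysis, following the variable table.
# The per-group fatality columns are omitted here on purpose; their
# exact names should be taken from the codebook.
KEEP = [
    "id", "date_start", "date_end", "year", "type_of_violence",
    "side_a", "side_b", "latitude", "longitude", "country", "region",
    "best", "high", "low",
]


def load_subset(handle, keep=KEEP):
    """Read a GED-style CSV and keep only the columns listed in `keep`."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in keep if name not in header]
    if missing:
        raise KeyError(f"columns not found in file: {missing}")
    # All values stay as strings; type conversion is left to later steps.
    return [{name: row[name] for name in keep} for row in reader]


# Made-up two-row sample standing in for the real file (illustrative only).
SAMPLE = """\
id,year,type_of_violence,side_a,side_b,latitude,longitude,geom_wkt,country,region,date_start,date_end,best,high,low
1,1990,1,Government of X,Rebel group Y,1.5,30.25,POINT (30.25 1.5),Country X,Africa,1990-01-01,1990-01-03,5,8,5
2,2005,3,Group Z,Civilians,10.0,-70.0,POINT (-70 10),Country W,Americas,2005-06-10,2005-06-10,2,2,1
"""

events = load_subset(io.StringIO(SAMPLE))
print(len(events), sorted(events[0]))
```

With the real file, the same function could be called on an opened CSV file instead of the in-memory sample; a missing column raises an error early rather than producing silently incomplete records.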
One caveat about this dataset is that “[the] [d]ata for Syria is not included in the current version” (source, p. 3). As the dataset’s codebook explains, “the data collection for Syria is ongoing but the final product is not releasable at this time with the same level of consistency and clarity as other UCDP GED data.” Thus, the actual total numbers of conflicts and deaths in recent years are likely larger than those recorded in the dataset.
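
A coverage gap of this kind can be confirmed with a simple per-country count. The sketch below is hypothetical: the helper `events_per_country` and the three sample records are made up for illustration, and the spelling "Syria" is an assumption about how the country would be named in the `country` variable.

```python
from collections import Counter


def events_per_country(records):
    """Count how many event records each country has."""
    return Counter(rec["country"] for rec in records)


# Made-up records for illustration only; these are not real GED rows.
sample = [
    {"country": "Country X", "year": 2016, "best": 4},
    {"country": "Country X", "year": 2017, "best": 1},
    {"country": "Country W", "year": 2017, "best": 7},
]

counts = events_per_country(sample)
# A Counter returns 0 for keys it has never seen, so a country with no
# records shows up as a count of zero instead of raising an error.
syria_events = counts["Syria"]
print(syria_events)
```

A zero count for a country known to have experienced conflict indicates missing coverage, not an absence of events, so totals computed from the dataset are best read as lower bounds.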