Change in Syria data source for PITF Atrocities Data

For most of the past four years we've been using http://syrianshuhada.com/ as a source for daily event counts for Syria. Over the past few months, updates to this site have been increasingly delayed: as of 13-June-2016 the data had been updated only to 31-March-2016, which is still more than two months behind the rest of our data.

Consequently, we have transitioned to a new source:

http://www.vdc-sy.info/index.php/en/

which is used, for example, by The Economist and is one of the two data sources integrated into the http://syrianshuhada.com/ data. However, the definitions (and presumably methodology) of the two sources are somewhat different and, if I'm reading correctly from the VDC's earlier summary reports (see for example the May-2015 report at http://www.vdc-sy.info/index.php/en/reports/1433810787), the VDC counts don't include fatalities due to the anti-regime forces except for ISIS and "the coalition forces against ISIS". Nonetheless, a test with the data for July-2015 to April-2016 showed that the daily counts from the two sources correlate quite well (r = 0.800). The VDC counts are about 60% lower and this lower number is probably a combination

Note that both of the sources are explicitly anti-regime and syrianshuhada notes on their home page that their data is a combination of the VDC data and data from the Damascus Center for Human Rights Studies. The use of these sources does not imply any sort of endorsement by the US government.
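
For anyone who wants to repeat the comparison of the two sources on their own extract, the following is a minimal sketch of the two quantities mentioned above: the correlation between daily counts and the proportion by which one series falls below the other. The function names and the sample numbers are made up for illustration; they are not the actual counts from either source, and this is not necessarily how the original test was run.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length series of daily counts."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series covering at least 2 days")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def relative_shortfall(reference, other):
    """Fraction by which the total of `other` falls below the total of `reference`."""
    return 1 - sum(other) / sum(reference)


# Hypothetical daily counts for the same five days (NOT real data).
shuhada_daily = [100, 150, 120, 200, 180]
vdc_daily = [40, 60, 48, 80, 72]

r = pearson_r(shuhada_daily, vdc_daily)
shortfall = relative_shortfall(shuhada_daily, vdc_daily)
print(f"r = {r:.3f}, VDC lower by {shortfall:.0%}")
```

Both series need to be aligned on the same dates before they are compared; days missing from one source would have to be dropped or filled in first.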

In addition to providing information on the civilian/military status of every victim, the VDC data also provide geographical information at the province level, and causes of death (shelling, firearms, torture, etc.). These codings are preserved in the new data, so we now have geographically disaggregated data. Provincial coordinates are the geographical centroids given in geonames.com; the "Other" and "Unknown" locations have been assigned to points roughly near a population centroid. The 5-death threshold used in this data set has not been applied at the individual region level (it is always exceeded at the country level), and all entries are labelled as being part of a "campaign" even if there was only a single incident for a given region/day.
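
The threshold rule described above can be sketched as follows. This is only an illustration under stated assumptions: the record layout (date, province, deaths), the function name, and the sample rows are hypothetical, and the threshold is assumed to mean "5 or more". The point is that the test is made on the country-wide total for a day, so small province-level entries are retained.

```python
from collections import defaultdict

THRESHOLD = 5  # assumed to mean "5 or more deaths"


def filter_by_country_total(records, threshold=THRESHOLD):
    """Keep every province-level record on days whose country-wide total
    meets the threshold; no per-province threshold is applied."""
    totals = defaultdict(int)
    for day, _province, deaths in records:
        totals[day] += deaths
    return [
        (day, province, deaths)
        for day, province, deaths in records
        if totals[day] >= threshold
    ]


# Hypothetical rows (NOT real data): (date, province, deaths)
records = [
    ("2016-04-01", "Aleppo", 3),
    ("2016-04-01", "Idlib", 2),
    ("2016-04-01", "Unknown", 1),
    ("2016-04-02", "Homs", 2),
]

kept = filter_by_country_total(records)
for row in kept:
    print(row)
```

In the sample, the three entries for the first day are all kept because together they reach the threshold, even though none does individually. The second day is included only to show the filter working; per the note above, the country-level total in the actual data always exceeds the threshold.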

In addition to making the substitution of the source, the default values on some of the fields have been updated to reflect the new data:

Two final notes: