# Statisfaction

## Daily casualties in Syria

Posted in Dataset, R by Julyan Arbel on 9 February 2012

Every day brings new figures on deaths in Syria… Here is an attempt to learn about the Syrian uprising from the numbers. Figures vary among sources; the data used here come from the Syrian opposition, which reports the number of casualties per day (here on Dropbox), last updated on 8 February 2012, with a total exceeding 8,000.

We first note that the attacks are accelerating, as the cumulative curve is mostly convex:
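To make the convexity argument concrete: the day-to-day increments of a cumulative curve are the daily counts themselves, so the curve is locally convex wherever the daily count rises. Here is a minimal sketch on made-up numbers (the vector `daily` is hypothetical and is not the Syrian data):

```r
# Hypothetical daily counts, for illustration only (not the real data)
daily <- c(2, 3, 5, 4, 8, 12, 15)

# Cumulative total, the quantity shown in the cumulative graph
cumulative <- cumsum(daily)

# Second differences of the cumulative curve: these are the
# day-to-day changes in the daily counts, from the second day on
second_diff <- diff(cumulative, differences = 2)

# Share of days on which the curve is locally convex,
# i.e. the daily count rose compared with the previous day
share_convex <- mean(second_diff > 0)
```

On the real data, the same computation would be applied to the `Number` column; a share well above one half would match the "mostly convex" reading of the graph.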

Plotting the numbers by day shows how bloody Fridays are, Friday being a day of gathering in the Muslim calendar. This was especially true at the beginning of the uprising, but lately any other day can be equally deadly:
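If the data file had no day-of-week column, a Friday indicator could be derived from the dates alone. The sketch below assumes dates stored as day/month/year strings; the three dates are hypothetical examples, not rows of the data set:

```r
# Hypothetical dates in day/month/year format, for illustration only
history <- c("03/02/2012", "04/02/2012", "10/02/2012")
dates <- as.Date(history, format = "%d/%m/%Y")

# as.POSIXlt()$wday counts days from Sunday = 0, so Friday is 5;
# unlike weekdays(), this does not depend on the locale
is_friday <- as.POSIXlt(dates)$wday == 5
```

Using the numeric `wday` field rather than day names avoids surprises when the code runs under a non-English locale.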

On average, there are almost twice as many deaths on Fridays as on any other day:
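A comparison of this kind can be computed with `tapply`. The sketch below uses a small made-up data frame (`toy`, with hypothetical counts), so the resulting ratio is illustrative and says nothing about the real figures:

```r
# Hypothetical counts, for illustration only (not the real data)
toy <- data.frame(
  Number = c(10, 12, 18, 8, 10, 20),
  Friday = c(FALSE, FALSE, TRUE, FALSE, FALSE, TRUE)
)

# Average daily count on Fridays and on the other days
means <- tapply(toy$Number, toy$Friday, mean)

# Ratio of the Friday average to the average of the other days
ratio <- unname(means["TRUE"] / means["FALSE"])
```

With the data frame from the post, the same call would presumably read `tapply(input$Number, input$LogicalFriday, mean)`.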
Here are boxplots for the logarithm of daily casualties by day of the week:

and their density estimates, first coloured by day of the week, then by Friday vs rest of the week:

Here is the code (some of the steps that prepare the data frame for ggplot are clumsy; do not hesitate to comment on it):

```r
library(ggplot2)
# `input` is the data frame of daily casualties, with columns
# History (date as dd/mm/yyyy), WeekDay and Number
input$LogicalFriday=factor(input$WeekDay =="Friday",levels = c(FALSE, TRUE),
  labels = c("Not Friday", "Friday"))
input$Date=as.Date(input$History,"%d/%m/%Y")
input$WeekDays=factor(input$WeekDay,
  levels=unique(as.character(input$WeekDay[7:13]))) # trick to sort the legend
# cumulative number of casualties
qplot(x=Date,y=cumsum(Number), data=input, geom="line",color=I("red"),xlab="",ylab="",lwd=I(1))
# daily numbers, Fridays highlighted
qplot(x=as.factor(Date),y=Number, data=input, geom="bar",fill=LogicalFriday,xlab="",ylab="")
# density estimates of the log counts: Friday vs rest, then by day of the week
qplot(log(Number+1), data=input, geom="density",fill=LogicalFriday,xlab="",ylab="",alpha=I(.2))
qplot(log(Number+1), data=input, geom="density",fill=WeekDay,xlab="",ylab="",alpha=I(.2))
# boxplots of the log counts by day of the week
qplot(WeekDays,log(Number+1),data=input,geom="boxplot",xlab="",ylab="",colour=WeekDays)
```

Created by Pretty R at inside-R.org


### 14 Responses

1. Pierre Jacob said, on 10 February 2012 at 03:00

“fill” would have been better than “colour” for the bar plots, don’t you think?

• Julyan Arbel said, on 10 February 2012 at 10:59

You’re right, I updated the first bar plot. I meant the use of “colour” for the seven-day bar plot, as “fill” gives a far too rainbow-like graph!

• Julyan Arbel said, on 10 February 2012 at 11:19

2. Chris Skedgel said, on 10 February 2012 at 16:46

Very nice presentation of a horrifying subject.

Would you mind posting the code for the graphs you used?

3. Niall Bolger said, on 10 February 2012 at 20:10

I agree with the previous comment–this is a superb job of visual presentation. It would be very nice to see how you produced the results.

4. Achim Zeileis said, on 11 February 2012 at 14:39

Interesting data. To track the evolution of the average number of deaths over time as well as the extent of this “Friday effect” over time, I tested and dated structural breaks using a geometric count data model. It appears that on average the daily number of deaths over time increases (with a short calmer period in summer). But the Friday effect decreases and turns non-significant towards the end of the series. It’s of course just a quick first analysis but my naive interpretation would be that there was a shift from a “pulsing” deterrence strategy with violent acts only on prominent days to a situation of constant violence.
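For readers unfamiliar with the term, a geometric count data model can be seen as a negative binomial model whose dispersion parameter is fixed at 1, which is what `negative.binomial(1)` specifies in the replication code. The sketch below checks this equivalence in base R; the mean `mu` is an arbitrary value chosen for illustration, not an estimate from the data:

```r
# Arbitrary mean, chosen for illustration only
mu <- 20
counts <- 0:10

# Negative binomial probabilities with size (theta) fixed at 1
p_nb <- dnbinom(counts, size = 1, mu = mu)

# Geometric probabilities with success probability 1 / (1 + mu)
p_geom <- dgeom(counts, prob = 1 / (1 + mu))

# The two sets of probabilities agree up to floating-point error
same <- isTRUE(all.equal(p_nb, p_geom))
```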

The replication code is shared below in case anyone’s interested. Best, Z

```r
## data
x <- scan("syria.txt", what = "character")
ix <- grep("/", x, fixed = TRUE)
library("zoo")
syria <- zoo(as.numeric(x[ix + 1]), as.Date(strptime(x[ix], format = "%d/%m/%Y")))
friday <- factor(as.POSIXlt(time(syria))$wday == 5,
  levels = c(FALSE, TRUE), labels = c("no", "yes"))
wday
## at least two breaks, possibly more
library("strucchange")
scus
## BIC selection yields four breakpoints
library("fxregime")
gbp
## average number of deaths increases, Friday effect
## decreases (non-significant on last two segments)
bf <- breakfactor(gbp, breaks = 4)
m <- glm(coredata(syria) ~ 0 + bf/friday, family = negative.binomial(1))
summary(m, dispersion = 1)
coef(m)

## visualization on log scale
plot(syria + 0.5, log = "y")
points(time(syria), fitted(m), pch = 19, col = 2, cex = 0.5)
abline(v = breakdates(gbp, breaks = 4), lty = 2)
```

• Julyan Arbel said, on 13 February 2012 at 18:23

Thanks Achim. Isn’t the definition of gbp missing?

5. Achim Zeileis said, on 15 February 2012 at 10:52

Yes, quite a few lines of code are missing, including the definitions of wday, scus, and gbp. Either I was completely confused or something happened when entering the code into the web form in the browser. Below is another attempt (with small touch-ups). If it doesn’t work again, a copy of the code can also be obtained from http://eeecon.uibk.ac.at/~zeileis/SyriaCasualties.R

```r
## data from http://statisfaction.wordpress.com/2012/02/09/daily-casualties-in-syria/
## (the start of the read.table() call was lost in the comment form;
## a local copy of the data file named "syria.txt" is assumed here)
x <- read.table("syria.txt",
  sep = "\t", header = TRUE, stringsAsFactors = FALSE)
library("zoo")
syria <- zoo(as.numeric(x$Number), as.Date(strptime(x$History, format = "%d/%m/%Y")))
friday <- factor(as.POSIXlt(time(syria))$wday == 5,
  levels = c(FALSE, TRUE), labels = c("no", "yes"))
wday <- factor(as.POSIXlt(time(syria))$wday + 1)

## visualization of week day effects
plot(log(coredata(syria) + 0.5) ~ wday)
plot(log(coredata(syria) + 0.5) ~ friday)

## geometric count data model may be useful
library("MASS")
summary(glm.nb(coredata(syria) ~ friday))

## structural change tests ->
## at least two breaks, possibly more
library("strucchange")
scus <- gefp(coredata(syria) ~ friday,
  family = negative.binomial(1), order.by = time(syria))
plot(scus, aggregate = FALSE)

## breakpoint estimation ->
## BIC selection yields four breakpoints
library("fxregime")
gbp <- fxregime:::gbreakpoints(syria ~ friday,
  data = data.frame(syria = coredata(syria), friday = friday),
  fit = glm, family = negative.binomial(1),
  order.by = time(syria), h = 20, ic = "BIC")
plot(gbp)

## fit segmented model ->
## average number of deaths increases, Friday effect
## decreases (non-significant on last two segments)
bf <- breakfactor(gbp, breaks = 4)
m <- glm(coredata(syria) ~ 0 + bf/friday, family = negative.binomial(1))
summary(m, dispersion = 1)
coef(m)

## visualization on log scale
plot(syria + 0.5, log = "y")
points(time(syria), fitted(m), pch = 19, col = 2, cex = 0.5)
abline(v = breakdates(gbp, breaks = 4), lty = 2)
```


• Julyan Arbel said, on 16 February 2012 at 16:42

The problem might be that the text in comments interprets html tags (which I do not know at all). When I copy/paste the html text from Pretty R Code (http://www.inside-r.org/pretty-r/tool), it works.
Thanks for the code. Your interpretation of a shift towards a constant violence state seems right.
Julyan

6. AP Stuck at 5,400 | Farid Ghadry said, on 25 February 2012 at 17:44

[...] to counting casualties by relying on outside reliable data from different official sources. In its Daily Casualties in Syria section, it points to various data published by all the sources keeping count. For example, the [...]

7. Craig Marsden said, on 14 May 2012 at 15:05

A distressing subject – well presented – I think the use of statistics in analysing current world issues is often lacking, good to see some redressing of the balance.

• Julyan Arbel said, on 14 May 2012 at 15:31

Thanks! As Hans Rosling puts it: “Unveiling the beauty of statistics for a fact based world view”.