Valentine's Day and lonely people in France

Insee recently published a paper (in French), well timed for Valentine's Day, which characterizes people living alone or as a couple by socio-professional category, along with the data. Between 1990 and 2008 (two population surveys), the proportion of people living alone increased mainly among people under 60. After 60, 38% of women live alone, …

Daily casualties in Syria

Every new day brings its statistics of new deaths in Syria… Here is an attempt to learn about the Syrian uprising through the figures. Data vary among sources: the Syrian opposition provides the number of casualties by day (here on Dropbox), updated on 8 February 2012, with a total exceeding 8,000. We note first …

Create maps with maptools R package

Baptiste Coulmont explains on his blog how to use the R package maptools. It works with shapefiles, for example those offered by the French geography agency IGN (at the département and commune levels). Some additional material, such as roads and railways, is provided by the OpenStreetMap project, here. For the above map, you need …

Google Fusion Tables

A quick post about another Google service that I recently discovered, called Fusion Tables. There you can store, share and visualize up to 250 MB of data, in the cloud of course. Along with Google Docs, Google Trends and Google Public Data Explorer, it is another example of Google’s efforts to gain ground in data management. Has …

France open data at

Today sees the launch of the (beta version of the) brand new French website for open data, at (do not be misled by the URL, it is in French!). On the prime minister’s initiative, it collects data from various sources, among which the statistics institute INSEE, most of the ministries (Finance, Culture, etc.), several big companies (like the …

Triathlon data with ggplot2

As Jérôme and I so enjoy playing with triathlon data, it is a pleasure to see that we are not alone. Christophe Ladroue, from the University of Bristol, wrote this post yesterday: An exercise in plyr and ggplot2 using triathlon results, followed by part II, way better than ours, here and here. For example, …

World Tourism Day, and Google Public Data Explorer

Today is World Tourism Day! So let’s talk about some tourism-related datasets, and others. Among other nice features, Google offers Public Data Explorer, in beta, which provides a collection of datasets from the OECD, IMF, Eurostat, World Bank, US Census Bureau, etc. (cf. our datasets page as well). It is possible …

Power of running world records

Following a few entries on sports here and there, I was wondering what kind of law running records follow with respect to distance. The data are available on Wikipedia, or here in a tidied version. They cover 18 distances, from 100 meters to 100 kilometers. A log-log scale is in order: It is …
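
As a rough illustration of why a log-log scale helps here: if record time follows a power law T = a·D^b in the distance D, then log T = log a + b·log D is a straight line, and b can be estimated by ordinary least squares on the logs. The sketch below is in Python, uses only the standard library, and runs on synthetic numbers generated from an assumed law (a = 0.1, b = 1.1), not on the actual records. The function name `fit_power_law` is ours, and this is not necessarily the method used in the post.

```python
import math

def fit_power_law(distances, times):
    """Least-squares fit of log(time) = log(a) + b * log(distance)."""
    xs = [math.log(d) for d in distances]
    ys = [math.log(t) for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope of the regression line in log-log coordinates.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    # Intercept, mapped back from the log scale.
    a = math.exp(mean_y - b * mean_x)
    return a, b

# Synthetic illustration only, NOT real world records:
# times are generated from the assumed law T = 0.1 * D**1.1.
distances = [100, 1_000, 10_000, 100_000]
times = [0.1 * d ** 1.1 for d in distances]

a, b = fit_power_law(distances, times)
print(f"a = {a:.3f}, b = {b:.3f}")
```

On real data the points would not fall exactly on the line, and the fitted exponent b would summarize how quickly record times grow with distance.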

Triathlon in three colors

Jérôme Lê and I are planning to swim/bike/run the Paris triathlon next July. Before beginning the training, we want to know where to concentrate our efforts. Let us look at some data. The race distance is known as Intermediate, Standard, or Olympic distance, with a 1.5 km swim, a 40 km ride and a 10 km run. Data for 2010 Open race …

Random Colours (part 2)

In this previous post, Julyan presented the paintings of Gerhard Richter, and asked whether the colours were really “randomly chosen”, as claimed by the painter. To answer the question from a statistical point of view (i.e. whether the colours are uniformly distributed in the (r, g, b) space or in the (x, y, r, g, b) space …
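
One simple way to probe uniformity in the (r, g, b) space, which is not necessarily the test used in the post, is a chi-square goodness-of-fit statistic: cut the colour cube into equal cells, count the colours in each cell, and compare the counts with the equal counts expected under uniformity. The Python sketch below assumes 8-bit channels (values 0 to 255) and uses simulated colours, since the painting data are not reproduced here. The function name `chi_square_uniform_rgb` and the choice of 2 bins per channel are ours.

```python
import random

def chi_square_uniform_rgb(colours, bins_per_channel=2):
    """Chi-square statistic for uniformity of (r, g, b) over [0, 256)^3."""
    width = 256 / bins_per_channel
    n_cells = bins_per_channel ** 3
    counts = [0] * n_cells
    for r, g, b in colours:
        # Index of the cell of the colour cube containing this colour.
        i, j, k = int(r // width), int(g // width), int(b // width)
        counts[(i * bins_per_channel + j) * bins_per_channel + k] += 1
    expected = len(colours) / n_cells
    return sum((c - expected) ** 2 / expected for c in counts)

# Simulated colours (assumption: stand-in for the real painting data).
rng = random.Random(0)
sample = [tuple(rng.randrange(256) for _ in range(3)) for _ in range(800)]

stat = chi_square_uniform_rgb(sample)
# Under uniformity the statistic is approximately chi-square distributed
# with n_cells - 1 = 7 degrees of freedom (5% critical value about 14.07).
print(stat)
```

A statistic well above the critical value would suggest that the colours are not spread uniformly over the cube. Extending the check to the (x, y, r, g, b) space would also require binning the positions.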