Statisfaction

Datasets

This page links to websites where you can find free datasets. We use some of these datasets ourselves, for testing methods and for teaching.

Organizations

United Nations

World Bank

French public datasets (French government, INSEE)

Google Public Data Explorer

General

Comprehensive Knowledge Archiving Network

Infochimps (and this blog post on the Infochimps API)

DataMarket (and this blog post about rdatamarket)

datamob.org

KDnuggets: large datasets for data mining projects

Amazon’s cloud

BuzzData

reddit datasets

Rdatasets: an archive of datasets distributed with R

Geodata

Data and Maps at GeoCommons

COW: Correlates of War

Country codes: package on CRAN

Country files

France communes polygons (IGN) and this post

Population of France communes since 1962

osmar package: geographic elements of OpenStreetMap via its API

London transport data for this kind of map with ggplot2

Social network

twitteR: Twitter API within R

Unbiased samples of Facebook users

Funny

Eurovision

priceofweed

Global Terrorism Database (OK, it’s not funny)

Sports

Football

Tennis

All time athletics

Datasport.com

Ipitos: general races, running, triathlon…

Orienteering

Blogs

Guardian

Information is Beautiful

Open Knowledge Foundation

Washington Post

Prediction competitions

Kaggle

Scientific collaborations

DBLP Bibliography

R packages for dealing with data

Stack Exchange: data APIs/feeds available as packages in R

This interesting blog post about various methods to get data directly from R

And this one too

RGoogleTrends, RGoogleDocs and this blog post: How to use a Google Spreadsheet as data in R

New York Times: RNYTimes R package

4 Responses


  1. [...] a collection of datasets from OECD, IMF, Eurostat, World Bank, US Census Bureau, etc (cf. our datasets page as well). It is possible to plot these data directly online, with the following (limited) types: [...]

  2. Jeremy_G said, on 28 September 2011 at 10:44

    There is a small error in the link to the INSEE website: http://statisfaction.wordpress.com/datasets/www.insee.fr

  3. [...] have added the link in our datasets page. [...]

