Public Data Sets

For your data analysis pleasure, I give you a giant list of super cool publicly available data. If you’re looking at the data sets and wondering “now what?” – you can find this list AND tutorials on how to use H2O for analysis at the H2O docs page.

You can also get detailed hands-on experience analyzing any of this data, random numbers you might have lying around, stuff you made up, or whatever you want by coming to any of our upcoming meetups and hanging out with the 0xdata math team.

Open City Datasets

Palo Alto Open Data


20 years of crime data


Rents & Neighborhoods

Transportation and Travel

Airlines Dataset – so far it contains the years 1987-2007 (based on

Data source:

Open Flights Database

Capital Bikeshare Data

Sciences and Engineering

NASA Open Data

Seismic Data

Weather Public Data

Diverse Data Sets

Many Eyes Community Datasets

Kaggle Competitions

UCI Machine Learning Repository

Human Activity Recognition Using Smartphones

MLData repository

GitHub Challenge

Yelp Dataset Challenge

Netflix Prize



Stanford Dataset Library

Million Song Dataset

Public Policy Data

European Open Data

US Open Data



World Bank Data

Guardian Data

Statistics Netherlands

Quandl: 6M Financial, Economic, and Social Datasets