Implement a Machine Learning Algorithm in 2hrs
by H20.ai July 20, 2013 Uncategorized

We will take a simple yet popular and powerful math algorithm, Linear Regression, and implement a distributed version in 2 hours. Prerequisites: knowledge of Java or R. See: http://h2o.0xdata.com/ Warning: Only software programmers ignore warnings! 🙂 That said, this seriously is a very hands-on, Java-intensive exercise. Extinguished engineers may not enjoy the proceedings. […]

GLM Bells and Whistles Part 2: Analysis and Results from Million Songs Data
by H20.ai July 15, 2013 Uncategorized

Using the Million Songs Data, we want to characterize a subset of the songs. To do this, we’re going to run a binomial regression in H2O’s GLM. The approach to characterizing songs from the 90s is the same method you can apply to your own data to characterize your customers relative to some larger group. In […]

GLM and K means to find Social Response Bias – Dating and Fibbers
by H20.ai July 12, 2013 Uncategorized

In any field where data collection depends on what your clients, customers, the public, or whoever else tell you, there’s the risk that people are big fat fibbers. This often happens because people respond the way they think they SHOULD rather than with their own personal truths. Social sciences and marketing people call this phenomenon social […]

The MillionSongs Data Part 1: Bells and Whistles of GLM in H2O
by H20.ai July 9, 2013 Uncategorized

Using the Million Songs Data Set, I want to go from beginning to end through H2O's GLM tool. Note that the original data are large, so downloading and fiddling with the full data set can be quite painful if you just do it from your desktop. That said, you can find it here. It’s a good […]

Running analysis on the right data!
by H20.ai July 9, 2013 Uncategorized

All in the day: Anqi Fu, our wickedly smart Math & Data Science hacker-intern from Stanford this summer, was characterizing GLMNet in R on sparse data and comparing it with other tools. We were using a data set for predicting two-bedroom median rent based on neighborhoods, from huduser.org. DATA: http://www.huduser.org/portal/datasets/fmr/CensusRentData/index.html She found the analysis brisk and […]

Building A TB-Scale Math Platform @ Uberconf 2013, Denver
by H20.ai July 9, 2013 Uncategorized

Datasets have gotten to PB-scale, but the modeling you can do has been limited to a single node (e.g. R, SAS), is stuck inside the database, or takes hours on Hadoop-like technologies. We have built a simple clustering package, and are using it to do distributed analytics on the sum of […]

Hands-on Data Science with H2O at GlobalBigDataConference
by H20.ai July 9, 2013 Uncategorized

Experience a hands-on data hack session using H2O & R at BigDataBootCamp by GlobalBigDataConference. Every few months, Sridhar puts together a content-rich conference filled with a highly engaged audience. This weekend GlobalBigDataConference is holding a BigDataBootCamp. Tickets are on sale. Sri brings H2O & R to this audience, munging a couple of datasets for insight. […]

Data Science is NOT Rocket Science – H2O at Big Data Cloud
by H20.ai July 9, 2013 Uncategorized

DJ Das brings Sri to the Big Data Cloud Meetup on July 10, 2013 to talk about H2O by 0xdata. Venue: 3200 Coronado Drive, Santa Clara
