Focus

———- Forwarded message ———
From: SriSatish Ambati
Date: Thu, Sep 15, 2016 at 10:17 PM
Subject: changes and all hands tomorrow.
To: team

Team,

Our focus has changed towards larger fewer deals & deeper engagements with handful of finance and insurance customers.

We took a hard look at our marketing spend, pr programs and personnel. We let go most of our amazing inside sales talent. And two of our account executives. We are not building a vertical in IOT. In all nine business folks were affected. No further changes are anticipated or necessary.

These were heroic partners in our journey. I spoke to most all of them today to personally convey the message. Some were with me for a short time, many for years – all great humans who diligently served me and h2o well. I’m grateful for their support and partnership towards my vision. I learnt a lot from each one of them and will not hesitate to assist them any ways possible personally.

Thank you. heroes in bcc. may you find fulfillment & love in your path. It’s a small world and we will all meet very soon.

Our goal as a startup is…

Distracted Driving

Last week, we started to examine the 7.2% increase in traffic fatalities from 2014 to 2015, the reversal of a near decade-long downward trend. We then broke out the data by various accident classifications, such as “speeding” or “driving with a positive BAC,” and identified those classifications that had the greatest increase. One label that showed promise for improvement was “involving a distracted driver.” According to Pew Research, the number of Americans who own a mobile device has pretty consistently risen over the past decade, as has the number of Americans who own a smartphone. Moreover, apps like Pokemon Go have built-in features that incentivize driving while playing, and these types of augmented reality games are only going to become more common.

The National Highway Traffic Safety Association (NHTSA) defines distracted driving as “any activity that could divert a person’s attention away from the primary task of driving.” This includes several activities, from texting while driving, to using one hand to place a call. The Governor’s Highway Safety Administration (GHSA), an organization that “provides leadership and representation for the states and territories to improve traffic safety,” notes that states can even collect data on distracted driving…

Introducing H2O Community & Support Portals

At H2O, we enjoy serving our customers and the community, and we take pride in making them successful while using H2O products. Today, we are very excited to announce two great platforms for our customers and for the community to better communicate with H2O. Let’s start with our community first:

Community Badge

The success of every open source project depends on a vibrant community, and having an active community helps to convert an average product into a successful product. So to maintain our commitment to our H2O community, we are releasing an updated community platform at https://community.h2o.ai. This community platform is available for everyone, whether you are new to machine intelligence or are a seasoned veteran. If you are new to machine intelligence or H2O, you have an opportunity to learn from great minds, and if you are a seasoned industry veteran, you can not only enhance your skillset, you can also help others to achieve success.

Our objective is to develop this community in a way where every community member has the opportunity to establish himself or herself as a technology leader or expert by helping others. Every moment you spend here in the community,…

Fatal Traffic Accidents Rise in 2015

On Tuesday, August 30th, the National Highway Traffic Safety Administration released their annual dataset of traffic fatalities asking interested parties to use the dataset to identify the causes of an increase of 7.2% in fatalities from 2014 to 2015. As part of H2O.ai‘s vision of using artificial intelligence for the betterment of society we were excited to tackle this problem.

screen-shot-2016-09-14-at-2-29-27-pm

This post is the first in our series on the Department of Transportation dataset and driving fatalities which will hopefully culminate in a hackathon in late September, where we’ll invite community members to join forces with the talented engineers and scientists at H2O.ai to find a solution to this problem and prescribe policy changes.


To begin, we started by reading some literature and getting familiar with the data. These documents served as excellent inspiration for possible paths of analysis and guided our thinking. Our introductory investigation was based around asking a series of questions, paving the way for detailed analysis down the road. The dataset includes every (reported) accident along with several labels, from…

IoT – Take Charge of Your Business and IT Insights Starting at the Edge

Instead of just being hype, the Internet of Things (IoT) is now becoming a reality. Gartner forecasts that 6.4 billion connected devices will be in use worldwide, and 5.5 million new devices will get connected every day, in 2016. These devices range from wearables, to sensors in vehicles the can detect surrounding obstacles, to sensors in pipelines that detect their own wear-and-tear. Huge volumes of data are collected from these connected devices, and yet companies struggle to get optimal business and IT outcomes from it.

Why is this the case?
Rule-based data models limit insights. Industry experts have a wealth of knowledge manually driving business rules, which in turn drive the data models. Many current IoT practices simply run large volumes of data through these rule-based models, but the business insights are limited by what rule-based models allow. Machine Learning/Artificial Intelligence allows new patterns to be found within stored data without human intervention. These new patterns can be applied to data models, allowing new insights to be generated for better business results.
Analytics in the backend data center delay insights. In current IoT practice, data is collected and analyzed in the backend data center (e.g. OLAP/MPP…

H2O + TensorFlow on AWS GPU

TensorFlow on AWS GPU instance
In this tutorial, we show how to setup TensorFlow on AWS GPU instance and run H2O Tensorflow Deep learning demo.

Pre-requisites:
To get started, request an AWS EC2 instance with GPU support. We used a single g2.2xlarge instance running Ubuntu 14.04.To setup TensorFlow with GPU support, following softwares should be installed:

  1. Java 1.8
  2. Python pip
  3. Unzip utility
  4. CUDA Toolkit (>= v7.0)
  5. cuDNN (v4.0)
  6. Bazel (>= v0.2)
  7. TensorFlow (v0.9)

To run H2O Tensorflow Deep learning demo, following softwares should be installed:

  1. IPython notebook
  2. Scala
  3. Spark
  4. Sparkling water

Software Installation:
Java:

 #To install Java follow below steps: Type ‘Y’ on installation prompt sudo add-apt-repository ppa:webupd8team/java sudo apt-get update sudo apt-get install oracle-java8-installer Update JAVA_HOME in ~/.bashrc #Add JAVA_HOME to PATH: export PATH=$PATH:$JAVA_HOME/bin # Execute following command to update current session: source ~/.bashrc #Verify version and path: java -version echo $JAVA_HOME 

Python:

 #AWS EC2 instance has Python installed by default. Verify if Python 2.7 is installed already: python -V #Install pip sudo apt-get install python-pip #Install IPython notebook sudo pip install "ipython[notebook]" #To run H2O example notebooks, execute following commands: sudo pip...

H2O GBM Tuning Tutorial for R

In this tutorial, we show how to build a well-tuned H2O GBM model for a supervised classification task. We specifically don’t focus on feature engineering and use a small dataset to allow you to reproduce these results in a few minutes on a laptop. This script can be directly transferred to datasets that are hundreds of GBs large and H2O clusters with dozens of compute nodes.

This tutorial is written in R Markdown. You can download the source from H2O’s github repository.

A port to a Python Jupyter Notebook version is available as well.

Installation of the H2O R Package

Either download H2O from H2O.ai’s website or install the latest version of H2O into R with the following R code:

# The following two commands remove any previously installed H2O packages for R. if ("package:h2o" %in% search()) { detach("package:h2o", unload=TRUE) } if ("h2o" %in% rownames(installed.packages())) { remove.packages("h2o") } # Next, we download packages that H2O depends on. pkgs <- c("methods","statmod","stats","graphics","RCurl","jsonlite","tools","utils") for (pkg in pkgs) { if (! (pkg %in% rownames(installed.packages()))) { install.packages(pkg) } } # Now we download, install and initialize the H2O package for R. install.packages("h2o", type="source", repos=(c("http://h2o-release.s3.amazonaws.com/h2o/rel-turchin/8/R"))) 

Launch an H2O…

Hyperparameter Optimization in H2O: Grid Search, Random Search and the Future

“Good, better, best. Never let it rest. ‘Til your good is better and your better is best.” – St. Jerome

tl;dr

H2O now has random hyperparameter search with time- and metric-based early stopping. Bergstra and Bengio[1] write on p. 281:

Compared with neural networks configured by a pure grid search, we find that random search over the same domain is able to find models that are as good or better within a small fraction of the computation time.

Even smarter means of searching the hyperparameter space are in the pipeline, but for most use cases random search does as well.

What Are Hyperparameters?

Nearly all model algorithms used in machine learning have a set of tuning “knobs” which affect how the learning algorithm fits the model to the data. Examples are the regularization settings alpha and lambda for Generalized Linear Modeling or ntrees and max_depth for Gradient Boosted Models. These knobs are called hyperparameters to distinguish them from internal model parameters, such as GLM’s beta coefficients or Deep Learning’s weights, which get learned from the data during the model training process.

What Is Hyperparameter Optimization?

The set…

Spam Detection with Sparkling Water and Spark Machine Learning Pipelines

This short post presents the “ham or spam” demo, which has already been posted earlier by Michal Malohlava, using our new API in latest Sparkling Water for Spark 1.6 and earlier versions, unifying Spark and H2O Machine Learning pipelines. It shows how to create a simple Spark Machine Learning pipeline and a model based on the fitted pipeline, which can be later used for prediction whether a particular message is spam or not.

Before diving into the demo steps, we would like to provide some details about the new features in the upcoming Sparkling Water 2.0:

  • Support for Apache Spark 2.0 and backwards compatibility with all previous versions.
  • The ability to run Apache Spark and Scala through H2O’s Flow UI.
  • H2O feature improvements and visualizations for MLlib algorithms, including the ability to score feature importance.
  • Visual intelligence for Apache Spark.
  • The ability to build Ensembles using H2O plus MLlib algorithms.
  • The power to export MLlib models as POJOs (Plain Old Java Objects), which can be easily run on commodity hardware.
  • A toolchain for ML pipelines.
  • Debugging support for Spark pipelines.
  • Model and data governance through Steam.
  • Bringing H2O’s powerful data munging capabilities to Apache Spark.
  • In order to run the code…

    Interview with Carolyn Phillips, Sr. Data Scientist, Neurensic

    During Open Tour Chicago we conducted a series of interviews with data scientists attending the conference. This is the second of a multipart series recapping our conversations.

    Be sure to keep an eye out for updates by checking our website or following us on Twitter @h2oai.

    AAEAAQAAAAAAAAeRAAAAJGZmMWZiMGE1LTVlMDgtNGQwZi05NzYyLTEwMTMxNDhmODcwMw

    H2O.ai: How did you become a data scientist?

    Phillips: Until very close to two months ago I was a computational scientist working at Argonne National Laboratory.

    H2O.ai: Okay.

    Phillips: I was working in material science, physics, mathematics, etc., but I was getting bored with that and looking for new opportunities, and I got hired by a startup in Chicago.

    H2O.ai: Yes.

    Phillips: When they hired me they said, “We’re hiring you, and your title is Senior Quantitative Analyst,” but the very first day I showed up, they said, “Scratch that. We’ve changed your title. Your title is now Senior Data Scientist.” And I said, “Yes, all right.” It has senior in it, so I’m okay going with that.

    H2O.ai: Nice. I like it.

    Phillips:…