Happy Holidays from H2O.ai

Dear Community,

Your intelligence, support and love have been the strength behind an incredible year of growth, product innovation, partnerships, investments and customer wins for H2O and AI in 2017. Thank you for answering our rallying call to democratize AI with our maker culture.

Our mission to make AI ubiquitous is still fresh as dawn and our creativity new as spring. We are only getting started, learning, rising from each fall. H2O and Driverless AI are just the beginning.

As we look into 2018, we see prolific innovation to make AI accessible to everyone: simplicity that opens up scale, and a focus on making experiments faster, easier and cheaper. We are so happy that you will be at the center of our journey, and we look forward to delivering many more magical customer experiences.

On behalf of the team and management at H2O, I wish you all a wonderful holiday: deep, meaningful time with yourself and your loved ones. Come back refreshed for a winning 2018!

Gratitude for your partnership in our beautiful journey – it’s just begun!

this will be fun,


Sri Ambati
CEO & Co-Founder

P.S. #H2OWorld was an amazing experience. I invite you to watch the keynote and more than 40 talks and conversations.

H2O + TensorFlow on AWS GPU

TensorFlow on AWS GPU instance
In this tutorial, we show how to set up TensorFlow on an AWS GPU instance and run the H2O TensorFlow deep learning demo.

Pre-requisites:
To get started, request an AWS EC2 instance with GPU support; a minimal launch sketch follows the list below. We used a single g2.2xlarge instance running Ubuntu 14.04. To set up TensorFlow with GPU support, the following software should be installed:

  1. Java 1.8
  2. Python pip
  3. Unzip utility
  4. CUDA Toolkit (>= v7.0)
  5. cuDNN (v4.0)
  6. Bazel (>= v0.2)
  7. TensorFlow (v0.9)
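
For reference, here is a minimal, hedged sketch of launching such an instance with the AWS CLI; the AMI ID, key pair name, and security group ID are hypothetical placeholders, not values from this tutorial, so substitute your own:

#Launch a g2.2xlarge Ubuntu 14.04 instance.
#ami-xxxxxxxx, my-key-pair and sg-xxxxxxxx below are hypothetical placeholders.
aws ec2 run-instances \
  --image-id ami-xxxxxxxx \
  --instance-type g2.2xlarge \
  --key-name my-key-pair \
  --security-group-ids sg-xxxxxxxx \
  --count 1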

To run the H2O TensorFlow deep learning demo, the following software should be installed:

  1. IPython notebook
  2. Scala
  3. Spark
  4. Sparkling Water

Software Installation:
Java:


#To install Java, follow the steps below; type 'Y' at the installation prompt:
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer

#Set JAVA_HOME in ~/.bashrc, e.g.:
export JAVA_HOME=/usr/lib/jvm/java-8-oracle

#Add JAVA_HOME to PATH: 
export PATH=$PATH:$JAVA_HOME/bin 

# Execute following command to update current session: 
source ~/.bashrc 

#Verify version and path: 
java -version 
echo $JAVA_HOME

Python:


#An AWS EC2 instance has Python installed by default. Verify that Python 2.7 is installed:
python -V 

#Install pip 
sudo apt-get install python-pip 

#Install IPython notebook 
sudo pip install "ipython[notebook]" 

#To run the H2O example notebooks, execute the following commands:
sudo pip install requests 
sudo pip install tabulate 
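
As a quick sanity check that the notebook dependencies installed cleanly (a small sketch, not part of the original steps), verify that the modules import:

#Verify the notebook dependencies import without errors
python -c "import requests, tabulate; print('OK')"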

Unzip utility:


#Execute the following command to install unzip:
sudo apt-get install unzip

Scala:


#To install Scala, follow the steps below; type 'Y' at the installation prompt:
sudo apt-get install scala 

#Set SCALA_HOME in ~/.bashrc (e.g. export SCALA_HOME=/usr/share/java), then execute the following command to update the current session:
source ~/.bashrc 

#Verify version and path: 
scala -version 
echo $SCALA_HOME 

Spark:


#Java and Scala should be installed before installing Spark. 
#Get the Spark 1.6.1 binary pre-built for Hadoop 2.6:
wget http://apache.cs.utah.edu/spark/spark-1.6.1/spark-1.6.1-bin-hadoop2.6.tgz

#Extract the file: 
tar xvzf spark-1.6.1-bin-hadoop2.6.tgz 

#Set SPARK_HOME in ~/.bashrc (e.g. export SPARK_HOME=/home/ubuntu/spark-1.6.1-bin-hadoop2.6), then execute the following command to update the current session:
source ~/.bashrc 

#Add SPARK_HOME to PATH: 
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin 

#Verify the variables: 
echo $SPARK_HOME
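
Optionally, smoke-test the installation with the SparkPi example that ships in the Spark binary distribution (a quick check, assuming the default layout of the 1.6.1 package):

#Run the bundled SparkPi example to verify Spark works
$SPARK_HOME/bin/run-example SparkPi 10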

Sparkling Water:


#Spark pre-built for Hadoop should already be installed, with SPARK_HOME pointing to it:
export SPARK_HOME="/path/to/spark/installation"

#To launch a local Spark cluster with 3 worker nodes, 2 cores and 1g of memory per node, export the MASTER variable:
export MASTER="local-cluster[3,2,1024]"

#Download and run Sparkling Water
wget http://h2o-release.s3.amazonaws.com/sparkling-water/rel-1.6/5/sparkling-water-1.6.5.zip
unzip sparkling-water-1.6.5.zip
cd sparkling-water-1.6.5
bin/sparkling-shell --conf "spark.executor.memory=1g"

CUDA Toolkit:


#To build or run TensorFlow with GPU support, both NVIDIA's CUDA Toolkit (>= v7.0) and cuDNN (>= v2) need to be installed.
#To install the CUDA Toolkit, run:
wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1410/x86_64/cuda-repo-ubuntu1410_7.0-28_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1410_7.0-28_amd64.deb
sudo apt-get update
sudo apt-get install cuda

cuDNN:

 
#To install cuDNN, download cudnn-7.0-linux-x64-v4.0-prod.tgz after completing the NVIDIA developer questionnaire,
#then transfer it to your EC2 instance's home directory.
tar -zxf cudnn-7.0-linux-x64-v4.0-prod.tgz
rm cudnn-7.0-linux-x64-v4.0-prod.tgz
sudo cp -R cuda/lib64 /usr/local/cuda/lib64
sudo cp cuda/include/cudnn.h /usr/local/cuda/include

#Reboot the system 
sudo reboot

#Update environment variables as shown below:
export CUDA_HOME=/usr/local/cuda 
export CUDA_ROOT=/usr/local/cuda 
export PATH=$PATH:$CUDA_ROOT/bin 
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CUDA_ROOT/lib64
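
At this point it is worth confirming that the driver, toolkit, and cuDNN files are all visible; a quick check, assuming the reboot above has completed:

#Verify the NVIDIA driver sees the GPU (a GRID K520 on g2.2xlarge)
nvidia-smi
#Should report the CUDA release version
nvcc --version
#The cuDNN libraries copied earlier should be present
ls /usr/local/cuda/lib64/libcudnn*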

Bazel:


#To install Bazel (>= v0.2), run:
sudo apt-get install pkg-config zip g++ zlib1g-dev
wget https://github.com/bazelbuild/bazel/releases/download/0.3.0/bazel-0.3.0-installer-linux-x86_64.sh
chmod +x bazel-0.3.0-installer-linux-x86_64.sh
./bazel-0.3.0-installer-linux-x86_64.sh --user
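
Note that the --user flag installs Bazel under $HOME/bin, so that directory must be on your PATH; a quick verification:

#Bazel's --user install lands in $HOME/bin; add it to PATH and verify
export PATH="$PATH:$HOME/bin"
bazel version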

TensorFlow:


#Download and install the pre-built TensorFlow GPU wheel:
wget https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.9.0rc0-cp27-none-linux_x86_64.whl
sudo pip install --upgrade tensorflow-0.9.0rc0-cp27-none-linux_x86_64.whl

#To configure TensorFlow with GPU support enabled, run ./configure from a TensorFlow
#source checkout (the build step below also needs the source tree):
git clone https://github.com/tensorflow/tensorflow
cd tensorflow
./configure

To build TensorFlow, run:


bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer
bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
#The wheel filename depends on the version of the source you built:
sudo pip install --upgrade /tmp/tensorflow_pkg/tensorflow-0.8.0-py2-none-any.whl
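
To confirm the installed build can actually see the GPU, creating a session is enough; TensorFlow logs the devices it finds (including the GRID K520 on a g2.2xlarge) at startup:

#Creating a session logs device placement info, including the GPU
python -c "import tensorflow as tf; tf.Session()"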

Run the H2O TensorFlow deep learning demo:


#Since we want to open the IPython notebook remotely, we use the IP and port options.
#To start the TensorFlow notebook:
cd sparkling-water-1.6.5/
IPYTHON_OPTS="notebook --no-browser --ip='*' --port=54321" bin/pysparkling

#Note that the port specified in the command above should be open on the system;
#a sketch for opening it in the EC2 security group is shown below.
#Open http://PublicIP:54321 in a browser (the port matches the --port option above)
#to start the IPython notebook console, then click on TensorFlowDeepLearning.ipynb.
#Refer to the demo video for details.

#Sample .bashrc contents:
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export SCALA_HOME=/usr/share/java
export SPARK_HOME=/home/ubuntu/spark-1.6.1-bin-hadoop2.6
export MASTER="local-cluster[3,2,1024]"
export PATH=$PATH:$JAVA_HOME/bin:$SPARK_HOME/bin:$SPARK_HOME/sbin
export CUDA_HOME=/usr/local/cuda
export CUDA_ROOT=/usr/local/cuda
export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/lib/jvm/java-8-oracle/bin:/home/ubuntu/spark-1.6.1-bin-hadoop2.6/bin:/home/ubuntu/spark-1.6.1-bin-hadoop2.6/sbin:/usr/local/cuda/bin:/home/ubuntu/bin
export LD_LIBRARY_PATH=:/usr/local/cuda/lib64
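
If the notebook page does not load, the instance's security group probably does not allow inbound traffic on that port. A hedged sketch of opening it with the AWS CLI; the group ID is a placeholder:

#Allow inbound TCP on the notebook port (sg-xxxxxxxx is a hypothetical placeholder)
aws ec2 authorize-security-group-ingress \
  --group-id sg-xxxxxxxx \
  --protocol tcp \
  --port 54321 \
  --cidr 0.0.0.0/0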

Troubleshooting:
1) ERROR: Getting java.net.UnknownHostException while starting spark-shell
Solution:
Make sure /etc/hosts has an entry for the hostname, e.g.:
127.0.0.1 hostname

2) ERROR: Getting “Could not find .egg-info directory in install record” during IPython installation
Solution:

sudo pip install --upgrade setuptools pip

3) ERROR: Can’t find swig while configuring TF
Solution:

sudo apt-get install swig

4) ERROR: “Ignoring gpu device (device: 0, name: GRID K520, pci bus id: 0000:00:03.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5”
Solution:
Specify 3.0 when ./configure prompts for the CUDA compute capabilities to build with.
Note that each additional compute capability significantly increases your build time and binary size.

5) ERROR: Could not insert ‘nvidia_352’: Unknown symbol in module, or unknown parameter (see dmesg)
Solution:

sudo apt-get install linux-image-extra-virtual

6) ERROR: Cannot find ‘./util/python/python_include’
Solution:

sudo apt-get install python-dev

7) Find Public IP address of system
Solution:

curl http://169.254.169.254/latest/meta-data/public-ipv4

Demo Videos