Makers in Action: Community, Partners and Team Members at #GTC18

NVIDIA’s GPU Technology Conference (GTC) has been incredible! Folks from all over the world are exploring the latest breakthroughs in self-driving cars, smart cities, healthcare, high performance computing, virtual reality, and more, all propelled by the AI movement.

If you’re attending GTC and would like to see our solutions in action (recently named a leader by Gartner) or chat with one of the Makers, come visit us at booth #725.

We’re excited to be joined at #GTC18 by a number of community members, partners, and team members including:

Venkatesh Ramanathan (Software Engineer, PayPal) and Ajay Gopal (Chief Data Scientist, Deserve) joined speakers from KickView and Memorial Sloan Kettering Cancer Center on the “Deep Learning Institute Executive Workshop” panel.

Arpit Mehta (Data Scientist, Product Owner: Big Data Architectures, BMW Group) presented “Now I C U: Analyzing Data Flow Inside an Autonomous Driving Car” and “Beyond Autonomous Driving: Unleashing Value via Machine Learning Applications in Automotive Industry”.

Supermicro (Booth #111) – Supermicro provides customers around the world with application-optimized server, workstation, blade, storage and GPU systems. Stop by their booth for a demo of Driverless AI. Driverless AI speeds up data science workflows by automating feature engineering, model tuning, ensembling, and model deployment.

MapD (Booth #602) – MapD’s mission is to redefine the limits of scale and speed in big data analytics. Along with us, NVIDIA, and other innovative companies, they are also one of the founding GPU Open Analytics Initiative (GOAI) members. Enjoy a recent post (“How VW Predicts Churn with GPU-Accelerated Machine Learning and Visual Analytics”) by Wamsi Viswanath (Data Scientist, MapD) in which he shares how H2O was used in concert with MapD.

Dell EMC (Booth #815) – Dell EMC enables digital transformation with trusted solutions for the modern data center.

Team Members
– Wednesday, March 28th, 9am – Ashrith Barthur (Security Scientist) presented “Network Security with Machine Learning”.
– Thursday, March 29th, 11am, Room 211A – Jon McKinney (Director of Research) will be presenting “World’s Fastest Machine Learning With GPUs”.
– Thursday, March 29th, 4pm, LL21C – Arno Candel (CTO) will host the “Hands-on with Driverless AI” workshop.

Enjoy GTC!


Director of Community

Come meet the Makers!

NVIDIA’s GPU Technology Conference (GTC) Silicon Valley, March 26-29, is the premier AI and deep learning event, providing you with training, insights, and direct access to the industry’s best and brightest. It’s where you will see the latest breakthroughs in self-driving cars, smart cities, healthcare, high-performance computing, virtual reality and more, all powered by AI. We will be there in full force to share how you can immediately gain value and insights from our industry-leading AI and ML platforms. In case you hadn’t heard, we were named a leader in the 2018 Gartner Magic Quadrant for Data Science and Machine Learning Platforms. You can get the report here.

Please visit us at booth #725 to see Driverless AI in action and talk to the Makers leading the AI movement! Our sessions will be leading-edge talks that you won’t want to miss.

  1. Ashrith Barthur – Network Security with Machine Learning

    Ashrith will speak about modeling different kinds of cyber attacks and building a model that is able to identify these different kinds of attacks using machine learning.

    Room 210F – Wednesday, March 28, 9 AM to 9:50 AM.

  2. Jonathan McKinney – World’s Fastest Machine Learning with GPUs

    Jonathan will introduce H2O4GPU, a fully featured machine learning library that is optimized for GPUs, with a robust Python API that is a drop-in replacement for scikit-learn. He will demonstrate benchmarks for the most common algorithms relevant to enterprise AI and will showcase performance gains compared to running on CPUs.

    Room 220B – Thursday, March 29, 11 AM to 11:50 AM.

  3. Arno Candel – Hands-on with Driverless AI

    In this lab, Arno will show how to install and start Driverless AI, the automated Kaggle Grandmaster in-a-box software, on a multi-GPU box. He will go through the full end-to-end workflow and showcase how Driverless AI uses the power of GPUs to achieve 40x speedups on algorithms, which in turn allow it to run thousands of iterations and find the best model.

    Room LL21C – Thursday, March 29, 4 PM to 6 PM.

Can’t make it to the event? Schedule a time to talk to one of our makers!

Thank You for an Incredible H2O World

#H2OWorld 2017 was an incredible experience!

It was wonderful to gather with community members from all over the world for more than 50 interesting presentations and so many great conversations.

H2O World kicked off at the Computer History Museum with a keynote by CEO and Co-Founder, Sri Ambati, on the Maryam-Curie stage.

Sri’s keynote was followed by more than 20 presentations on the first day from community members at innovative organizations like BeeswaxIO, Business Science, Change Healthcare, Comcast, Equifax, NVIDIA, PayPal, Stanford University, Wildbook and many others.

The second day started with a keynote from Professor Rob Tibshirani focused on “An Application of the Lasso in Biomedical data sciences”.

Professor Tibshirani’s keynote was followed by more than 25 presentations from leading organizations including Amazon’s A9, Capital One, Digitalist Group, IBM, MapD, NVIDIA, QQ Trend, Stanford Medicine and more.

I’d love to say thank you to everyone who joined us at H2O World. We are incredibly grateful for your continued encouragement and feedback.

Thank you also to our talented team and Shiloh Events for planning such an amazing event.

Looking forward to the next H2O World!

Happy Holidays,


Director of Community

P.S. Want to share how you’re using our products? I’d be thrilled to hear from you! Drop me a note.

Interview with Carolyn Phillips, Sr. Data Scientist, Neurensic

During Open Tour Chicago we conducted a series of interviews with data scientists attending the conference. This is the second of a multipart series recapping our conversations.

Be sure to keep an eye out for updates by checking our website or following us on Twitter @h2oai.

How did you become a data scientist?

Phillips: Until very close to two months ago I was a computational scientist working at Argonne National Laboratory.

Okay.

Phillips: I was working in material science, physics, mathematics, etc., but I was getting bored with that and looking for new opportunities, and I got hired by a startup in Chicago.

Yes.

Phillips: When they hired me they said, “We’re hiring you, and your title is Senior Quantitative Analyst,” but the very first day I showed up, they said, “Scratch that. We’ve changed your title. Your title is now Senior Data Scientist.” And I said, “Yes, all right.” It has senior in it, so I’m okay going with that.

Nice. I like it.

Phillips: So I’m a mathematician, physicist and computer scientist by training who likes to solve problems with data and algorithms, and so now I’m a data scientist.

That’s impressive. I don’t know if people have really wrapped their head around what it means to be a data scientist.

Phillips: I will say that one of the reasons why I started looking around for data scientist positions is that I come from an academic research background. I have a PhD in physics and computing, and a lot of my peers who have a very, very similar background to me – we did research together, we wrote papers together – became frustrated with academic research for various reasons. Many of them said, “Well, rats. I have a skill set that’s valuable,” and they’ve become data scientists. They work at places like Airbnb, they work at consulting firms, they work at startups. Each one of us has reached that point where we’ve said, “I’m frustrated with being an academic researcher.” I saw the direction that many of my peers had gone in saying, “I have a good skill set and it is valuable, and the place right now where that is being valued is in this area called data science, and I shall go into it,” and I said, “That’s a good idea. I’ll do that too.” There you go. That’s my story.

Wow, that’s really cool, yeah. I mean, I’m finding that the more people I talk to the greater number of paths towards becoming a data scientist I find. So what’s your biggest pain point as a data scientist?

Phillips: Data preparation. We want to get more data from our companies, and theoretically all this data is being generated by the same software everywhere. But different companies configure that software differently, and it’s a lot of work to make sure all the data you get is formatted in the same way.

Yes, I see.

Phillips: Everything I do has to have meaning. For example, I built this beautiful algorithm and I love it, and I applied it to the data, and we found this result in the data, and we said, “What is that? Look at that. Oh, my goodness, what is that? What is that? That’s crazy. That’s terrible, you know, we have to get right on that.”

Yes.

Phillips: And I thought, well, before we get too excited, let me dig down to the original raw data that generated this. Dig, dig, dig, dig, dig. Oh, we assumed that data would always come in this format, and this data came in that format, and at the end of the day it looked like something it wasn’t, so I feel like that’s actually the big challenge.

Oh, very interesting. Do you have methods of making your data more uniform?

Phillips: Well, I’m not responsible for that directly, but no. Every time we get in a new source of data it’s going to be this painful process of normalizing it so that it looks as much as possible like the other sources of data.

Thank you so much, Carolyn. That was really helpful information. It was a pleasure meeting you.

Phillips: You too.

Interview with Svetlana Kharlamova, Sr. Data Scientist, Grainger

During Open Tour Chicago we conducted a series of interviews with data scientists attending the conference. This is the first of a multipart series recapping our conversations.

Be sure to keep an eye out for updates by checking our website or following us on Twitter @h2oai.

How did you become a data scientist?

Kharlamova: I’m a physicist.

Okay.

Kharlamova: I came here from the academia of physics. I worked for seven years in academia for physics and math, and four years ago I switched to finance to be more of a math person than a physics person.

I see.

Kharlamova: And from finance I came to the data industry. At that time data science was booming.

Oh, okay.

Kharlamova: And I got excited with all the new stuff and technologies coming up, and here I am.

Okay, nice. So what business do you work for now?

Kharlamova: I work for Grainger. We’re focused on equipment distribution, serving as a connector between manufacturing plants, factories and consumers.

So what are some of the problems that you guys are looking to solve?

Kharlamova: Building recommendation engines for customers. For that you need to leverage natural language processing and positive logic.

What resources do you use to stay on top of the information in the data science world? Are there blogs that you read or like, or places that you go?

Kharlamova: Staff communities and data science communities are important sources of information.

Yes. That’s great. And is there any advice that you would have for someone who’s an up and coming data scientist, or someone who’s just generally interested in the field?

Kharlamova: Advice to somebody who’s generally interested in the field?

Yes, about becoming a data scientist.

Kharlamova: It’s a difficult question, because if a person takes a one year course on Coursera or somewhere else on data science, it doesn’t mean that they’re a data scientist yet, because you need to see the problem in the big picture.

Yes.

Kharlamova: You need to be able to identify the challenges, the problem and various solutions. You cannot explore everything. You need to narrow down your choice.

Yes, okay.

Kharlamova: You also need to have substantial knowledge of mathematics, statistics and computer science. But understand that you don’t need to immediately start using a sophisticated random forest model. Maybe you can just use simple algebra. Maybe it’s a question of two plus two.

Right.

Kharlamova: And then you don’t need all these assumptions and approximations. Because I’m a physicist, I like a defined correct answer much more than something fuzzy. To be successful as a data scientist you need to decide how best to approach a problem, then find a solution that’s as simple as possible.

Okay. I see. That’s great advice. So it’s not just about having the knowledge, but it’s also about having an approach that is, like you said, simple, that you can probably use more often to provide a clear answer. That’s great, great advice.

H2O Day at Capital One

One of our most important partners is Capital One, and we’re proud to have been working with them for over a year. One of the world’s leading financial services providers, Capital One has a strong reputation for being an extremely data- and technology-focused organization. That’s why, when the Capital One team invited us to their offices in McLean, Virginia for a full day of H2O talks and demos, we were delighted to accept. Many key members of Capital One’s technology team were among the 500+ attendees at the event, including Jeff Chapman, MVP of Shared Technology, Hiren Hiranandani, Lead Software Engineer, Mike Fulkerson, VP of Software Engineering and Adam Wenchel, VP of Data Engineering.

A major theme throughout the day was “vertical is the new horizontal,” an idea presented by our CEO Sri Ambati, about how every company is becoming a technology company. Sri pointed out that software is becoming increasingly ubiquitous at organizations at the same time that code is becoming a commodity. Today, the only assets that companies can defend are their community and brand. Airbnb is more valuable than most hospitality companies, despite owning no property, and Uber is more valuable than most transportation companies, despite owning no vehicles. And if “software is eating the world” then artificial intelligence (AI) is eating software, as traditional rules-based models no longer cut it in today’s rapidly changing world.

Our partnership started about a year ago, where we met in California, and learned about the value proposition of H2O. To be honest, I think we were all floored by what we saw. – Jeff Chapman

This was obviously an important message for attendees at Capital One, who were looking to learn more about AI and machine learning. Of particular interest was how machine learning and AI can help with use cases such as personalization and fraud detection and how the technology can drive future data-driven decision making. Attendees also had a chance to share their experiences using H2O to analyze and score models with their colleagues across business units. The event fit perfectly into our vision of a grassroots community that encourages cooperation and the sharing of information. We look forward to continuing to work with Capital One, and all of our partners, to promote the democratization of data science and the growth of open source communities.

Visit us online to find a local event where you can meet with the makers of H2O in person. Please also don’t forget to see the video of our time at Capital One here!

Drink in the Data with H2O at Strata SJ 2016

It’s about to rain data in San Jose when Strata + Hadoop World comes to town March 29 – March 31st.

H2O has a waterfall of action happening at the show. Here’s a rundown of what’s on tap.
Keep it handy so you have less chance of FOMO (fear of missing out).

Hang out with H2O at Booth #1225 to learn more about how machine learning can help transform your business and find us throughout the conference:

Tuesday, March 29th

Wednesday, March 30th

  • 12:45pm – 1:15pm Meet the Makers: The brains and innovation behind the leading machine learning solution are on hand to hack with you
    • #AskArno – Arno Candel, Chief Architect and H2O algorithm expert
    • #RuReady with Matt Dowle, H2O Hacker and author of R data.table
    • #SparkUp with Michal Malohlava, principal developer of Sparkling Water
    • #InWithErin – Erin LeDell, Machine Learning Scientist and H2O ensembles expert
  • 2:40pm – 3:20pm H2O highlighted in An introduction to Transamerica’s product recommendation platform
  • 5:50pm – 6:50pm Booth Crawl. Have a beer on us at Booth #1225
  • 7:00pm – 9:00pm Let it Flow with H2O – Drinks + Data at the Arcadia Lounge. Grab your invite at Booth #1225

Thursday, March 31st

  • 12:45pm – 1:15pm Ask Transamerica. Vishal Bamba and Nitin Prabhu of Transamerica join us at Booth #1225 for Q&A with you!

The Top 10 Most Watched Videos From H2O World 2015

Now that we’re a few months out from H2O World we wanted to share with you all what the most popular talks were by online viewership. The talks covered a variety of topics from introductions, to in-depth examinations of use cases, to wide-ranging panels.

Introduction to Data Science
Featuring Erin LeDell, Statistician and Machine Learning Scientist
An introductory talk for people new to the field of data science.

Intro to R, Python, Flow
Featuring Amy Wang, Math Hacker
A hands-on demonstration of how to run H2O in R and Python and an introduction to the Flow GUI.

Machine Learning at Comcast
Featuring Andrew Leamon, Director of Engineering Analysis, Comcast and Chushi Ren, Software Engineer, Comcast
An inside look at how Comcast leverages machine learning across its business units.

Migrating from Proprietary Analytics Stacks to Open Source H2O
Featuring Fonda Ingram, Technical Manager
A ten-year SAS veteran explains how to migrate from proprietary software to an open source environment.

Top 10 Data Science Pitfalls
Featuring Mark Landry, Product Manager
A Kaggle champion offers an overview of ten top pitfalls to avoid when performing data science.

Featuring Erin LeDell, Statistician and Machine Learning Scientist
Another popular talk from Erin, this time providing an overview specifically of ensemble learning.

Sparkling Water
Featuring Michal Malohlava, Software Engineer
An introduction to Sparkling Water, H2O’s Spark API, by one of its key architects.

Panel – Competitive Data Science
Featuring Arno Candel, Chief Architect; Phillip Adkins, Data Scientist, Banjo; Nick Kridler, Data Scientist, Stitch Fix; Mark Landry, Product Manager; John Park, Principal Data Scientist, Hewlett-Packard Enterprise; Lauren Savage, Data Scientist, AT&T; and Guocong Song, Data Scientist, Playground.Global
A panel discussion covering all aspects of competitive data science.

Survey of Available Machine Learning Frameworks
Featuring Brenden Herger, Data Scientist, Capital One
An overview of available machine learning frameworks and an analysis of why teams use specific ones.

Panel – Industrial Data Science – Practitioners’ Perspective
Featuring SriSatish Ambati, CEO & Cofounder; Xavier Amatriain, VP of Engineering, Quora; Scott Marsh, Research & Development Analyst, Progressive Insurance; Taposh Dutta Roy, Manager, Kaiser Permanente; Nachum Shacham, Principal Data Scientist, PayPal; and Daqing Zhao, Director of Advanced Analytics, Macy’s
A discussion of large data science deployments by the people most familiar with them.

A great selection of talks if we do say so ourselves! Is it too early to start counting the days to H2O World 2016?

H2O World from an Attendee’s Perspective

Data Science is like Rome, and all roads lead to Rome. H2O WORLD is the crossroad, pulling in a confluence of math, statistics, science and computer science and incorporating all avenues of business. From the academic, research oriented models to the business and computer science analytics implementations of those ideas, H2O WORLD informs attendees on H2O’s ability to help users and customers explore their data and produce a prediction or answer a question.

I came to H2O World hoping to gain a better understanding of H2O’s software and of Data Science in general. I thoroughly enjoyed attending the sessions, following along with the demos and playing with H2O myself. Learning from the hackers and Data Scientists about the algorithms and science behind H2O and seeing the community spirit at the Hackathons was enlightening. Listening to the keynote speakers, both women, describe our data-influenced future and hearing the customer’s point of view on how H2O has impacted their work has been inspirational. I especially appreciated learning about the potential influence on scientific and medical research and social issues and H2O’s ability to influence positive change.

Curiosity led me to delve into the world of Data Science and as a person with a background of science and math, I wasn’t sure how it applied to me. Now I realize that there is virtually no discipline which cannot benefit from the methods of Data Science and that there is great power in asking the right questions and telling a good story. H2O WORLD broadened my horizons and gave me a new perspective on the role of Data Science in the world. Data science can be harnessed as a force for social good where a few people from around the globe can change the world. H2O World 2015 was a great success and I truly enjoyed learning and being there.

H2O at ODSC SF 2015!

As promised, we’re here reporting from the floor of the Open Data Science Conference (ODSC). It’s been another wild day for us, with an early start at 7:30am to set up ahead of the show. However, the long days are all worth it for a chance to see you all in the field. While we thought bringing two boxes of booklets would be enough, we ended up running out again!

Located in the luxurious Marriott Waterfront hotel, ODSC is hosting 20 workshops, 50 speaking sessions and a thousand attendees. Speakers include Brian Granger, co-founder of Jupyter, Anthony Goldbloom, CEO & Founder of Kaggle, Andreas Mueller, Assistant Research Engineer at the NYU Center for Data Science, and Wes McKinney, creator of the Pandas Python Data Analysis Library. On Sunday, data scientist Hank Roark will be joining this list of prestigious speakers to give a talk on “Big Data Machine Learning and Data Products with Python and H2O”. Looking forward to seeing you there!

Questions? Tweet us @h2oai