
Google-scale Machine Learning & Deep Learning gets a principal platform in Apache Mahout with Spark and H2O

H2O's vision is direct and simple: scaling machine learning for powering intelligent applications. Our focus is distributed machine learning and a fully-featured set of industrial grade algorithms.

Apache Mahout is where people learn their chops in Machine Learning. Like R, it's the “hello world”: the first place many new users get exposed to algorithms on big data. Making that experience beautiful, accessible and value-driven will make machine learning ubiquitous and Mahout a movement to rival the success & utility of, say, Lucene and Hadoop.

Apache Spark has great developer momentum, and its in-memory design makes it ideal for implementing and extending algorithms.

Our vision and motivation is to re-ignite the community & double down on the identical founding visions of Mahout and H2O. Under one umbrella, Mahout can power intelligent applications for enterprises and users.

Creating great software is hard; creating passionate communities is harder. Our belief is that a product is not complete without its community. This convergence will make Mahout the principal platform for integrating multiple ways of mining insights from data.

These are exciting times for Mahout. These initiatives will drive momentum to Mahout as the umbrella platform for Machine Learning. Its success will drive wide-scale adoption of scalable machine learning algorithms in the enterprise & H2O is committed to that unified vision. Spark is a terrific in-memory platform for that. Stratosphere will be another. Scala, R, Python, JS, Java and the Matrix APIs make it a polyglot modeling & programming universe. This will be fun.

We are excited at the possibilities of this convergence. We are fans of Mahout's vision and how it has captured the imagination of machine learning enthusiasts over the years. (Still fondly recollect Isabel's spirited talk at ApacheCon years ago!) What is needed is a real product, hacker and open source developer culture. The R community has also been looking for a package that solves distributed (in-memory) frames & provides parallel packages for the algorithms behind them. Our team has executed on a lot of these inspirations fast & furiously in open source over the past two years. We hope to enrich & fulfill the day-to-day workflows of Machine Learning users worldwide through Apache Mahout.

It all starts with the end (ML) user experience and how we can make it better.