Productionizing Machine Learning Models: Lessons Learned in the Hadoop Ecosystem

Tuesday, June 18, 2019
11:30 - 12:30
Asam 1

Deploying machine learning models can be challenging, especially on distributed systems. Although Python is the dominant language among data scientists, it can create friction when integrating with JVM-based tools such as Spark, or when managing application dependencies on clusters of heterogeneous machines. Many data scientists developing on such systems struggle with the subtleties of these challenges.

This presentation will share lessons learned from working on large-scale Hadoop clusters and examine the most promising approaches to alleviating common issues. In particular, we will discuss our experience of using containerization to tackle the dependency management challenge from a data scientist’s point of view.
