
Scalable predictive analytics with Palladium and Marathon


Mar 31, 2015

Andreas Lattner

D2iQ

Editor's Note: The Business Intelligence Unit of the Otto Group (Otto Group BI) has released Palladium, an open source framework for easily setting up predictive analytics services. The Otto Group is a globally operating retail and services group and includes 123 major companies (among others, OTTO, Crate & Barrel, and Hermes). According to preliminary figures the company generated turnover of 12.057 billion euros in the 2014/15 financial year. This is a guest post by Dr. Andreas Lattner of Otto Group BI.

 

Palladium is a system for building predictive analytics services. Combined with Marathon and Mesos, it becomes a highly scalable and flexible platform for adding predictive intelligence to any business. Otto Group has made Palladium available as open source on GitHub under the Apache 2.0 license. The system supports tasks like fitting, evaluating, storing, distributing, and updating models. It ships with a script that automatically creates Docker images for Palladium services, and its documentation provides an example of how to set up Mesos and Marathon to manage a set of Palladium service instances.

 

The Mesos and Mesosphere stack is a perfect complement to Palladium on the datacenter layer and takes smooth deployment of machine learning models to a new level of ease. With these two technologies combined, setting up scalable and reliable predictive analytics services is extremely straightforward. Palladium tightly integrates the Python machine learning library scikit-learn. Moreover, it lets analysts expose R and Julia models for production. Many other features important for enterprises are included, such as authentication, central logging, and monitoring. The system also has support for pluggable decorators so deployment-specific custom features can be easily added.

 

Why We Built Palladium

 

The Data Science Team of Otto Group BI faced the challenge of setting up a scalable, high-performance solution for predicting parcel delivery time windows for the logistics provider Hermes Germany. While developing the time window prediction service in close collaboration with Hermes, the team realized that such a solution would also be helpful in many other projects, within the Otto Group and beyond, and the idea of developing Palladium emerged. The major motivation behind Palladium was to reduce the transition time from predictive analytics research prototypes to production services capable of meeting internal SLAs. Another major goal was to enable rapid scaling of these predictive analytics services and efficient resource utilization.

 

Palladium is now used for all new predictive analytics services that the corporate BI unit provides to the Otto Group (including services related to on-site search, product curation, and product demand intelligence). The team believes that the Palladium system will be helpful for other organizations to develop reliable and scalable predictive analytics services. Using Palladium should help organizations reduce the development and deployment costs of new predictive analytics services and let organizations focus on developing and fine-tuning the core machine learning models.

 

Setting up Palladium-Based Services

 

To develop a service with Palladium, first install it as described in its documentation, either from source or with pip install palladium. Palladium's tutorial describes how to create a sample service that classifies Iris flower species based on four features (sepal and petal width and length). In summary, the following steps are required:

 

1) Write a Palladium configuration file (e.g., config.py) and provide training data. This configuration file specifies where training (and testing) data can be found, what model should be learned, where to persist trained models, what features and parameters can be specified when using the service, and what schedule should be used to check for a new model. You can use the config.py and iris.data samples provided in Palladium's tutorial, also located in the examples/iris folder of Palladium's GitHub repo.

 

2) Fit a model: Run the pld-fit command after having specified what configuration to use, e.g.:

 

export PALLADIUM_CONFIG=config.py
pld-fit

 

3) Once the model has been fitted, it can be exposed as a web service. Palladium uses the Flask library for this. The service can be tested easily by running the following command, which starts Flask's built-in web server:

 

pld-devserver

 

4) If you send a request, e.g., http://localhost:5000/predict?sepal%20length=5.2&sepal%20width=3.5&petal%20length=1.5&petal%20width=0.2, you receive the classification result of this service:

 

{
    "result": "Iris-virginica",
    "metadata": {
        "service_name": "iris",
        "error_code": 0,
        "status": "OK",
        "service_version": "0.1"
    }
}
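The request above can also be issued from a script. The following is a minimal sketch using only Python's standard library. The helper names build_predict_url, parse_prediction, and fetch_prediction are hypothetical (they are not part of Palladium), and the sketch assumes the development server from step 3 is listening on localhost:5000.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def build_predict_url(base_url, features):
    """Build the /predict URL; spaces in feature names are encoded as %20."""
    query = urlencode(features, quote_via=quote)
    return f"{base_url}/predict?{query}"


def parse_prediction(response_text):
    """Return (result, status) from a JSON response shaped like the one above."""
    payload = json.loads(response_text)
    return payload["result"], payload["metadata"]["status"]


def fetch_prediction(url):
    """Send the GET request and parse the answer (needs a running service)."""
    with urlopen(url) as response:
        return parse_prediction(response.read().decode("utf-8"))
```

Calling fetch_prediction(build_predict_url("http://localhost:5000", {"sepal length": 5.2, "sepal width": 3.5, "petal length": 1.5, "petal width": 0.2})) sends the same request as the URL shown in step 4.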

 

There are further commands, among others, to test a model, to find good parameters via a grid search, and to list available models. Details can be found in Palladium's documentation.

 

The Deployment section of Palladium's documentation also describes how to automatically create Docker images for Palladium-based services. In summary, the following steps have to be done:

 

1) Create a Palladium base Docker image (if you have not done it before). Run the command in the folder where you have the create_base.sh script available:

 

sudo create_base.sh <path_to_palladium> <owner/palladium_base_name:version>

 

2) Create the service-specific Docker image by running the command in the folder where the create.sh script is located. Because we are creating a stand-alone version of the service without a dedicated server for training models, we first have to uncomment the line "#RUN pld-fit" in the create.sh script. Then, we run:

 

create.sh <path_to_app_folder> <owner/palladium_app_name:version> <owner/palladium_base_name:version>

 

3) By listing the existing images, we should see the Palladium base image as well as a specific image for the service:

 

sudo docker images

 

In the following section we describe how a Palladium service's Docker image can be deployed with Marathon.

 

Deploying Palladium Services Using Marathon

 

For the installation of Mesos and Marathon you can follow the guide on Mesosphere. If you want to try it out locally first, we recommend you set up a single node Mesosphere cluster.

 

Before adding a new application to Marathon you need to make sure that the Mesos nodes and Marathon are configured properly to work with Docker. To do so, follow the steps as described in the Marathon documentation.

 

Once you have Mesos and Marathon up and running, you can add the Palladium-based Iris service via Marathon's REST API. You have to do the following steps:

 

1) Create a JSON configuration file (using the Docker image name you specified earlier, with the suffix _predict), e.g.:

 

{
    "id": "palladium-iris",
    "container": {
        "docker": {
            "image": "user/palladium-iris_predict:0.1",
            "network": "BRIDGE",
            "parameters": [
            ],
            "portMappings": [
                { "containerPort": 8000, "hostPort": 0, "servicePort": 9000,
                  "protocol": "tcp" }
            ]
        },
        "type": "DOCKER",
        "volumes": [
        ]
    },
    "cpus": 0.2,
    "mem": 256.0,
    "instances": 3,
    "healthChecks": [
        {
            "protocol": "HTTP",
            "portIndex": 0,
            "path": "/alive",
            "gracePeriodSeconds": 5,
            "intervalSeconds": 20,
            "maxConsecutiveFailures": 3
        }
    ],
    "upgradeStrategy": {
        "minimumHealthCapacity": 0.5
    }
}
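If several Palladium services are deployed, writing each application file by hand becomes repetitive. As a minimal sketch, the definition above could be generated from a few parameters. The function name make_app_definition is hypothetical, and the fixed values (port 8000, the /alive health check, the resource limits) are simply copied from the example.

```python
import json


def make_app_definition(app_id, image, instances=3):
    """Return a Marathon app definition mirroring the example above."""
    return {
        "id": app_id,
        "container": {
            "type": "DOCKER",
            "docker": {
                "image": image,
                "network": "BRIDGE",
                "parameters": [],
                # hostPort 0 lets Marathon pick a free port on the host.
                "portMappings": [
                    {"containerPort": 8000, "hostPort": 0,
                     "servicePort": 9000, "protocol": "tcp"}
                ],
            },
            "volumes": [],
        },
        "cpus": 0.2,
        "mem": 256.0,
        "instances": instances,
        "healthChecks": [
            {"protocol": "HTTP", "portIndex": 0, "path": "/alive",
             "gracePeriodSeconds": 5, "intervalSeconds": 20,
             "maxConsecutiveFailures": 3}
        ],
        "upgradeStrategy": {"minimumHealthCapacity": 0.5},
    }


def to_json(definition):
    """Serialize the definition so it can be written to a file."""
    return json.dumps(definition, indent=4)
```

For example, to_json(make_app_definition("palladium-iris", "user/palladium-iris_predict:0.1")) produces a document equivalent to the one shown above.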

 

2) Send the JSON application file to Marathon via POST (assuming Marathon is available at localhost:8080):

 

curl -X POST -H "Content-Type: application/json" localhost:8080/v2/apps -d @<path-to-json-file>
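The same POST can be sent from Python instead of curl. The sketch below uses only the standard library; the function names build_marathon_request and submit_app are hypothetical, and the default address assumes Marathon runs at localhost:8080 as above.

```python
import json
from urllib.request import Request, urlopen


def build_marathon_request(app_json, marathon_url="http://localhost:8080"):
    """Build the POST request for Marathon's /v2/apps endpoint."""
    json.loads(app_json)  # fail early if the file content is not valid JSON
    return Request(
        url=f"{marathon_url}/v2/apps",
        data=app_json.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_app(app_json, marathon_url="http://localhost:8080"):
    """Send the request (needs a running Marathon); return status and body."""
    request = build_marathon_request(app_json, marathon_url)
    with urlopen(request) as response:
        return response.status, response.read().decode("utf-8")
```

Reading the application file into a string and passing it to submit_app is then equivalent to the curl call above.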

 

You can now see the status of your Palladium service instances using the Marathon web user interface (available at http://localhost:8080 if you run the single node installation mentioned above) and can scale the number of instances as desired.

 

Marathon keeps track of the Palladium instances. If a service instance fails, a new one is started automatically. If you click on the app, you can see the actual ports where the different service instances can be reached. You can try it out by sending a request to one of the service's instances (for example, one listening on port 31000): http://localhost:31000/predict?sepal%20length=5.2&sepal%20width=3.5&petal%20length=1.5&petal%20width=0.2.
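The host and port of each instance can also be looked up programmatically through Marathon's REST API, which lists an app's tasks under /v2/apps/<app id>/tasks. The following is a minimal sketch that parses such a response; the helper names are hypothetical, and it assumes the response is a JSON object with a "tasks" list whose entries carry "host" and "ports" fields.

```python
import json


def tasks_url(app_id, marathon_url="http://localhost:8080"):
    """Return the URL that lists the running tasks of an app."""
    return f"{marathon_url}/v2/apps/{app_id}/tasks"


def instance_endpoints(tasks_json):
    """Extract one host:port string per task from a tasks response.

    Only the first port of each task is used, matching the single
    port mapping (portIndex 0) in the app definition.
    """
    payload = json.loads(tasks_json)
    endpoints = []
    for task in payload.get("tasks", []):
        ports = task.get("ports", [])
        if ports:
            endpoints.append(f"{task['host']}:{ports[0]}")
    return endpoints
```

When fetching the URL returned by tasks_url("palladium-iris"), it is safest to request JSON explicitly with an Accept: application/json header. Each returned endpoint can then be used as the base of a /predict request.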
