
Bodywork + MLflow

This tutorial lets you experiment with Bodywork and MLflow combined into a single open-source MLOps stack. Bodywork is a tool that focuses on the deployment of machine learning projects to Kubernetes. MLflow is a tool for managing the machine learning lifecycle (tracking metrics and managing ML artefacts, such as trained models).

We have developed an example train-and-serve pipeline to demonstrate Bodywork and MLflow working side-by-side, which you can explore in this GitHub repository. The pipeline uses MLflow to track the training metrics and manage trained models. The pipeline consists of two stages, defined in two executable Python modules:

  1. train_model.py - runs a batch job to train a model, logging metrics and registering models to MLflow.
  2. serve_model.py - loads the latest 'production' model from MLflow and then starts a simple Flask app to handle requests for scoring data.

The details of the deployment are described in the bodywork.yaml configuration file. When a deployment is triggered, Bodywork instructs Kubernetes to start pre-built Bodywork containers that pull the code from the demo project's Git repo and run the executable Python modules. Each stage is associated with one Python module and is run, in isolation, in its own container.

Launch the test drive below and follow the steps to see this pipeline in action!


Step 0 - Launch the Test Drive

Note: the test drive doesn't work in Safari yet. Please use Chrome or Firefox for now! Also please note it won't work in Private/Incognito windows.

Use the following test drive to launch a temporary Kubernetes cluster with the tutorial running in it:

Launch Test Drive

At busy times, you may need to wait a few minutes for a test drive environment to become available.

Note that the environment will shut down automatically 1 hour after you start using it.

Step 1 - Deploy the Pipeline

To test the deployment using a local workflow-controller that streams logs to stdout, run,

$ bodywork workflow \
    --namespace=bodywork \
    https://github.com/bodywork-ml/bodywork-pipeline-with-mlflow \
    master

Once the deployment has completed, browse to the MLflow UI to check the model metrics that were logged to the iris-classification experiment during training, and to confirm that the trained model, iris-classifier--sklearn-decision-tree, was registered and promoted to 'production'.
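
The same check can also be scripted instead of done through the UI. The following is a minimal sketch, assuming the MLflow server exposes its standard REST API and that the model was promoted using MLflow's stage-based model registry (stage name "Production"). The tracking server address is not given in this tutorial, so any base URL passed to these functions is a placeholder, and the helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Model name taken from the tutorial text.
MODEL_NAME = "iris-classifier--sklearn-decision-tree"


def registered_model_url(tracking_uri, name=MODEL_NAME):
    """Build the URL of MLflow's 'get registered model' REST endpoint."""
    query = urllib.parse.urlencode({"name": name})
    base = tracking_uri.rstrip("/")
    return f"{base}/api/2.0/mlflow/registered-models/get?{query}"


def production_versions(response):
    """Return the version numbers whose current stage is 'Production'."""
    versions = response.get("registered_model", {}).get("latest_versions", [])
    return [v["version"] for v in versions if v.get("current_stage") == "Production"]


def fetch_production_versions(tracking_uri, name=MODEL_NAME, timeout=10):
    """Query the server (tracking_uri is an assumption) and filter the result."""
    url = registered_model_url(tracking_uri, name)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return production_versions(json.load(resp))
```

An empty list from fetch_production_versions would suggest that no version of the model has been promoted yet.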

Step 2 - Test the Scoring Service

Requests to score data can now be sent to the scoring service. Try the following in the shell,

$ curl http://localhost:31380/bodywork/bodywork-mlflow-demo--scoring-service/iris/v1/score \
    --request POST \
    --header "Content-Type: application/json" \
    --data '{"sepal_length": 5.1, "sepal_width": 3.5, "petal_length": 1.4, "petal_width": 0.2}'

Which should return,

{
    "species_prediction":"setosa",
    "probabilities":"setosa=1.0|versicolor=0.0|virginica=0.0",
    "model_info": "DecisionTreeClassifier(class_weight='balanced', random_state=42)"
}

The structure of this response is defined in the serve_model.py module.
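
The same request can be made from Python, which also makes it easy to unpack the pipe-delimited probabilities string. This is a minimal sketch using only the standard library. The URL is copied from the curl command above, the function names are hypothetical, and the parsing assumes the "label=value|label=value" format shown in the example response.

```python
import json
import urllib.request

# URL copied from the curl command; it only resolves inside the test drive.
SCORING_URL = (
    "http://localhost:31380/bodywork/"
    "bodywork-mlflow-demo--scoring-service/iris/v1/score"
)


def build_score_request(features, url=SCORING_URL):
    """Build (but do not send) a JSON POST request for the scoring service."""
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_probabilities(encoded):
    """Turn 'setosa=1.0|versicolor=0.0' into {'setosa': 1.0, 'versicolor': 0.0}."""
    pairs = (item.split("=", 1) for item in encoded.split("|"))
    return {label: float(value) for label, value in pairs}


def score(features, url=SCORING_URL, timeout=10):
    """Send the request and return the response with parsed probabilities."""
    request = build_score_request(features, url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        payload = json.load(resp)
    payload["probabilities"] = parse_probabilities(payload["probabilities"])
    return payload
```

Inside the test drive, calling score with the same four features as the curl example should yield the response shown above, with the probabilities field converted to a dictionary of floats.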

Running the ML Pipeline on a Schedule

If you're happy with the test results, you can schedule the workflow-controller to operate remotely on the cluster, on a pre-defined schedule. For example, to set up the workflow to run every hour, use the following command,

$ bodywork cronjob create \
    --namespace=bodywork \
    --name=train-and-deploy \
    --schedule="0 * * * *" \
    --git-repo-url=https://github.com/bodywork-ml/bodywork-pipeline-with-mlflow \
    --git-repo-branch=master

Each scheduled workflow will attempt to re-run the batch-job, as defined by the state of this repository's master branch at the time of execution.
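
The schedule string "0 * * * *" uses five-field cron syntax (minute, hour, day of month, month, day of week), so it fires at minute 0 of every hour. To make that concrete, here is a small illustrative sketch that lists upcoming run times for such an expression. It is not part of Bodywork: it is a hypothetical helper that only understands '*' and single integers, and it does not model ranges, steps, lists, or cron's special handling when both day fields are restricted.

```python
from datetime import datetime, timedelta


def cron_matches(expression, moment):
    """Check a datetime against a five-field cron expression.

    Only '*' and single integers are supported (day of week as 0-6),
    which is enough for a schedule such as "0 * * * *".
    """
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError("expected five space-separated cron fields")
    actual = (
        moment.minute,
        moment.hour,
        moment.day,
        moment.month,
        (moment.weekday() + 1) % 7,  # cron counts Sunday as 0
    )
    return all(f == "*" or int(f) == a for f, a in zip(fields, actual))


def next_runs(expression, start, count=3, horizon_days=366):
    """Return up to `count` whole minutes after `start` that match."""
    moment = start.replace(second=0, microsecond=0) + timedelta(minutes=1)
    deadline = moment + timedelta(days=horizon_days)
    runs = []
    while len(runs) < count and moment < deadline:
        if cron_matches(expression, moment):
            runs.append(moment)
        moment += timedelta(minutes=1)
    return runs
```

For a start time of 20:51:04, next_runs("0 * * * *", start) gives 21:00, 22:00 and 23:00 on the same day.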

To get the execution history for all train-and-deploy jobs use,

$ bodywork cronjob history \
    --namespace=bodywork \
    --name=train-and-deploy

Which should return output along the lines of,

JOB_NAME                                START_TIME                    COMPLETION_TIME               ACTIVE      SUCCEEDED       FAILED
train-and-deploy-1605214260             2020-11-12 20:51:04+00:00     2020-11-12 20:52:34+00:00     0           1               0
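
If you want to post-process this output, for example to flag failed runs or measure how long each run took, the table can be parsed with a few lines of Python. The sketch below assumes the layout shown above, where columns are separated by at least two spaces and timestamps contain only single spaces. The function names are hypothetical.

```python
import re
from datetime import datetime

# Sample copied from the example output above.
SAMPLE_OUTPUT = """\
JOB_NAME                                START_TIME                    COMPLETION_TIME               ACTIVE      SUCCEEDED       FAILED
train-and-deploy-1605214260             2020-11-12 20:51:04+00:00     2020-11-12 20:52:34+00:00     0           1               0
"""


def parse_history(text):
    """Parse the column-aligned history table into a list of dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    # Columns are split on runs of two or more spaces.
    header = re.split(r"\s{2,}", lines[0].strip())
    rows = []
    for line in lines[1:]:
        values = re.split(r"\s{2,}", line.strip())
        rows.append(dict(zip(header, values)))
    return rows


def run_duration_seconds(row):
    """Compute the elapsed time between START_TIME and COMPLETION_TIME."""
    start = datetime.fromisoformat(row["START_TIME"])
    end = datetime.fromisoformat(row["COMPLETION_TIME"])
    return (end - start).total_seconds()
```

Applied to the sample row above, run_duration_seconds reports 90.0 seconds.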

Cleaning Up

To clean up the deployment in its entirety, delete the namespace using kubectl, e.g. by running,

$ kubectl delete ns bodywork

Make this Project Your Own

This repository is a GitHub template repository that can be automatically copied into your own GitHub account by clicking the Use this template button above.

After you've cloned the template project, use the official Bodywork documentation to help modify the project to meet your own requirements.