
MLflow Made Easy: Your Beginner’s Guide

Simplify Your Machine Learning Workflow with MLflow: A Comprehensive Overview

Sagar Thacker · Published in Level Up Coding · Sep 5, 2023

Have you ever felt overwhelmed by the constant advice to ‘track your experiments’? If so, you’re not alone.

When I first started, terms like ‘experiment,’ ‘experiment run,’ and ‘artifacts’ seemed bewildering. In this blog, I’ll demystify these concepts with a hands-on approach.

We’ll kick off the article by running some code first. Then, we’ll delve into the concepts and explore more code examples.

If you haven’t got MLflow on your machine yet, no biggie! Just pop this into your command prompt (or terminal): pip install mlflow. Or, if you’re using conda, try this out: conda install -c conda-forge mlflow.
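For quick reference, here are both install commands together:

# Install MLflow with pip
pip install mlflow

# Or install it from conda-forge if you use conda
conda install -c conda-forge mlflow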

Also, we’ll be using the Iris plants dataset to build a simple classification model.

Let’s get started by importing the necessary libraries:

# Import necessary libraries
import mlflow
import pandas as pd

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score

What’s next? We’re diving into data now. We’ll grab the dataset and split it into two parts: one for training, one for testing:

# Load Iris dataset
dataset = load_iris()

# Split dataset into train and test
X_train, X_test, y_train, y_test = train_test_split(dataset.data, dataset.target, test_size=0.2, stratify=dataset.target)

# Check out the train and test datasets
print(f"Train set shape: {X_train.shape[0]} rows, {X_train.shape[1]} columns")
print(f"Test set shape: {X_test.shape[0]} rows, {X_test.shape[1]} columns")
print(f"Column names: {dataset.feature_names}")

# Output:
# Train set shape: 120 rows, 4 columns
# Test set shape: 30 rows, 4 columns
# Column names: ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']

Get ready, we’re building a neat classification model using Logistic Regression with the help of mlflow to track our experiments!

# Set the experiment for MLflow
mlflow.set_experiment('Baseline Model')

# Start an MLflow run context
with mlflow.start_run():
    # Initialize a LogisticRegression model
    model = LogisticRegression()

    # Train the model on the training data
    model.fit(X_train, y_train)

    # Make predictions on the test data
    y_pred = model.predict(X_test)

    # Calculate various evaluation metrics
    accuracy = accuracy_score(y_test, y_pred)
    precision = precision_score(y_test, y_pred, average='macro')
    recall = recall_score(y_test, y_pred, average='macro')

    # Log the evaluation metrics to the MLflow run
    mlflow.log_metric('accuracy', accuracy)
    mlflow.log_metric('precision', precision)
    mlflow.log_metric('recall', recall)

    # Log the trained model to the MLflow run
    mlflow.sklearn.log_model(model, 'model')

    # Set developer information as a tag
    mlflow.set_tag('developer', 'Sagar Thacker')

    # Set preprocessing details as a tag
    mlflow.set_tag('preprocessing', 'None')

    # Set the model type as a tag
    mlflow.set_tag('model', 'Logistic Regression')

Let’s understand what the above code does:

  1. The code starts by setting the MLflow experiment to “Baseline Model”. This means that the experiment’s runs and associated metadata will be organized under this experiment’s name in the MLflow tracking system.
  2. Inside a with mlflow.start_run(): block, a new MLflow run context is initiated. This context encapsulates the tracking of all information related to this specific run within the experiment.
  3. We train a Logistic Regression model, predict outcomes on the test data, and calculate metrics to evaluate model performance.
  4. The computed evaluation metrics (accuracy, precision, and recall) are logged as metrics within the current MLflow run using the mlflow.log_metric() function.
  5. The trained LogisticRegression model is logged as an artifact in the MLflow run using the mlflow.sklearn.log_model() function.
  6. Additional metadata is associated with the run by setting tags using mlflow.set_tag().

Don’t fret if the concept of an experiment or experiment run isn’t crystal clear just yet. We’ll delve into it shortly.

Here comes the cool part! Open up your command prompt (or terminal) and type this: mlflow ui. Magic! You’ll see the Tracking UI come alive at http://localhost:5000.
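By the way, if port 5000 is already taken on your machine, the mlflow ui command accepts a --port flag (the port number below is just an example):

# Serve the Tracking UI on a custom port
mlflow ui --port 5001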

MLflow UI
Screenshot by Author

Click on the Run Name to see the details of the experiment run. You’ll see a page like this:

MLlfow Experiment Run details page
Screenshot by Author

Voila! You’ve just tracked your first experiment run! I encourage you to go ahead and tinker with the UI to get the hang of it!

Alright, before we move ahead, let’s get friendly with the idea of experiment tracking.

Experiment

Imagine an experiment like a big, cool umbrella that holds together all your curious thoughts and tests. Like, say you start with an experiment named Baseline Model — that’s like your first big idea. Then comes another experiment, the Vanilla Models, where you tinker with stuff like Random Forest and SVM. And hey, you could even go deeper, trying things like tuning hyperparameters. Each of these is a new ‘experiment’ — a fresh adventure.

Experiment Run

By the way, in each of these adventures, when you take a shot at creating or training a model, it’s like a little experiment within the big one — that’s an ‘experiment run’. For example, when we worked on our baseline model, an experiment run called illustrious-mule-887 got created. And guess what? The computer is clever enough to come up with these names automatically! Or if you’re feeling imaginative, you can give your experiment run a custom name.
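For instance, mlflow.start_run() accepts a run_name argument if you’d rather pick the name yourself (the name below is just an example):

import mlflow

# Start a run with a custom name instead of an auto-generated one
with mlflow.start_run(run_name='baseline-logistic-regression'):
    # ... train, evaluate, and log as before
    pass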

So, here’s the deal: experiment tracking is like having a trusty notebook where you jot down everything. This includes details about which model you’re using, the results you’re getting, precise versions of packages you’re using, and so on.

There you have it — experiment tracking is your method of keeping track of all the awesome stuff you’re creating.

Taking a Closer Look at MLflow

MLflow is an open-source platform for managing the end-to-end machine learning lifecycle. The MLflow components we’ll explore in this article are:

  • MLflow Tracking: An API to log parameters, code, and outcomes of an experiment, and compare them with an interactive UI. (We saw this in our example above.)
  • MLflow Models: A format for packaging models, plus tools that make it simple to deploy the same model (from any ML library) for batch and real-time scoring on platforms like Docker, Apache Spark, Azure ML, and AWS SageMaker. (Check out the logged model format.)
  • MLflow Model Registry: A central model store for collaboratively managing model versioning, stage transitions, and annotations.

MLflow Tracking

Mhmm… if you asked yourself, “Where did MLflow store all that data?”, then you’re asking the right question.

You might have noticed an mlruns directory created in the same directory where your code ran. MLflow uses this directory to store artifacts and MLflow entities.

Below is the snapshot of my mlruns directory:

.
├── mlruns
│ ├── 0
│ │ └── meta.yaml
│ ├── 758224673514181683 <- (experiment id of Baseline Model)
│ │ ├── 33059883f56c4f208de0afd8eed42315 <- (run id of Logistic Regression run)
│ │ │ ├── artifacts <- (model stored here)
│ │ │ ├── meta.yaml
│ │ │ ├── metrics <- (metrics stored here)
│ │ │ ├── params <- (params stored here)
│ │ │ └── tags <- (tags stored here)
│ │ └── meta.yaml
│ └── models
└── notebook.ipynb <- (my code file)

# You can cross-check the IDs associated with the
# experiment & experiment run in the screenshots above.

How does MLflow record runs and artifacts?

MLflow uses two components for storage:

  1. Backend Store:
    - Persists MLflow entities such as runs, parameters, metrics, tags, notes, and other metadata.
    - The backend store can live in local files (as in our example above), in a SQLAlchemy-compatible database, or remotely on a tracking server.
  2. Artifact Store:
    - Persists artifacts such as files, models, images, in-memory objects, and model summaries.
    - The artifact store can be local files (as in our example above — the artifacts folder inside the experiment run) or one of a variety of remote storage options such as an S3 bucket. (A configuration sketch follows below.)
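As a minimal sketch of a non-default setup (the SQLite file name and artifact directory below are just examples), here’s how you could launch a tracking server with a database-backed backend store and a local artifact store:

# Backend store: a SQLite database; Artifact store: a local directory
mlflow server \
    --backend-store-uri sqlite:///mlflow.db \
    --default-artifact-root ./mlflow-artifacts \
    --host 127.0.0.1 \
    --port 5000

# Then point your code at the server:
# mlflow.set_tracking_uri('http://127.0.0.1:5000')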

If you noticed, artifacts aren’t limited to models; an artifact can be any output, such as models, intermediate data, visualizations, logs, or custom files.

There are six scenarios for configuring the backend and artifact stores.

The scenario in which both the backend and artifact stores are configured on the localhost (a local directory) is called MLflow on localhost. It is the first scenario on the MLflow Tracking list.

MLflow Tracking Server Scenario 1
MLflow Official Documentation (link)

Given the breadth of all six scenarios, delving into them all here would stretch this article a bit too far. To ensure clarity, I’ll be creating separate articles that dive into each scenario in detail.

In the meantime, I recommend checking out the official documentation page for a comprehensive overview of these six scenarios.

MLflow Models

An MLflow Model is a standard format for packaging machine learning models. It enables us to use the model in a variety of downstream tools. Think of it like putting your model in a box with different labels so that different tools know how to open it and use it. These different formats are called “flavors”.

For example, if you have a machine learning model, you might want to use it in different situations:

  1. Real-time Serving: You may want to serve your model in real-time through a REST API so that it can make predictions as soon as new data arrives.
  2. Batch Inference: Alternatively, you might want to use the same model in a batch processing system like Apache Spark, where it processes a large amount of data in batches.

These two scenarios require slightly different preparations or “flavors” of the same model. So, “flavors” here means different configurations or formats of the model to suit different use cases or tools.
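To make this concrete, here’s a small sketch: the same logged model can be loaded back through the sklearn flavor (for scikit-learn-aware code) or through the generic python_function flavor (for tools that only need to call predict). The runs:/ URI below is a placeholder for your own run ID:

import mlflow

# Load via the sklearn flavor -> a native scikit-learn object
sklearn_model = mlflow.sklearn.load_model("runs:/<RUN_ID>/model")

# Load via the generic python_function flavor -> library-agnostic
pyfunc_model = mlflow.pyfunc.load_model("runs:/<RUN_ID>/model")
predictions = pyfunc_model.predict(X_test)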

Storage Format

In our example, we logged our model with the mlflow.sklearn flavor.

# Log the trained model to the MLflow run (From above code snippet)
mlflow.sklearn.log_model(model, 'model')

MLflow Model Artifact
Screenshot by Author

A few key concepts about MLflow Models:

  1. MLflow Model Directory: Each MLflow Model is organized as a directory containing various files, including an MLmodel file in the root directory. It defines the multiple flavors in which the model can be viewed.
  2. MLmodel File: All the flavors that a specific model supports are outlined in its MLmodel file, which is typically written in YAML format. This file serves as a reference for understanding how to use the model and its different flavors in various deployment environments (see the sketch after this list).
  3. Flavors: Flavors are a central concept in MLflow Models. They are a convention that allows deployment tools to understand how to use a model. By defining different flavors, you can adapt the model to various deployment scenarios without needing to integrate each tool with every machine learning library.
  4. Standard Flavors: MLflow includes predefined or “standard” flavors that its built-in deployment tools support. For instance, there’s a “Python function” (python_function) flavor that explains how to run the model as a Python function. These standard flavors make it easier to deploy models consistently.
  5. Custom Flavors: While MLflow provides standard flavors, libraries can also define and use their own flavors. For example, the mlflow.sklearn library can load models as scikit-learn Pipeline objects for use in scikit-learn-aware code or as generic Python functions for tools that only need to apply the model.
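For intuition, here’s roughly what the MLmodel file of our logged scikit-learn model looks like (the exact fields and version numbers will differ on your machine; treat this as an illustrative sketch):

artifact_path: model
flavors:
  python_function:
    env: conda.yaml
    loader_module: mlflow.sklearn
    model_path: model.pkl
    python_version: 3.10.12
  sklearn:
    pickled_model: model.pkl
    serialization_format: cloudpickle
    sklearn_version: 1.3.0
run_id: 33059883f56c4f208de0afd8eed42315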

I also encourage you to explore concepts like Model Signature and Input Examples. MLflow Models let us include additional metadata about a model’s inputs, outputs, and params that downstream tools can use, along with an example of a valid input to the model, as the sketch below shows.
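Here’s a minimal sketch using mlflow.models.infer_signature, assuming the same training variables as in our earlier snippet:

import mlflow
from mlflow.models import infer_signature

# Infer the input/output schema from the training data and predictions
signature = infer_signature(X_train, model.predict(X_train))

# Log the model along with its signature and a sample input
# (inside a mlflow.start_run() context)
mlflow.sklearn.log_model(
    model,
    'model',
    signature=signature,
    input_example=X_train[:5]
)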

You can read more about these topics on the official documentation.

MLflow Model Registry

Perhaps you’re wondering, ‘I’ve already saved or logged my model as an artifact, so why do I need to know about the model registry?’ Let’s consider a scenario: you’ve run 5 experiments, and in each run, you’ve logged a model. After thorough evaluation, you’ve found that the 4th model performs the best, and you want to deploy it in your application.

Now, some essential questions arise:

  1. What stage is my model in? Is it in staging, production, or archived?
  2. If someone asks which experiment and run produced that model, can you provide an answer?
  3. In the future, when you’ve built a better model and want to replace the current one, how will you keep track of the old models?

This is where the Model Registry steps in to save the day. It provides an array of valuable features, including stage transitions, model lineage (for tracking the experiment and run that produced the model), model versioning, and much more.

Let’s see model registry in action!

Before you can add a model to the model registry, you must log the model using the log_model method of the corresponding model flavor.

There are four ways you can register your model:

  1. MLflow UI
  2. MLflow model flavor
  3. MLflow register model
  4. MLflow Client Tracking API

MLflow UI

MLflow Register Model
Screenshot by Author
  1. You’ll find a big blue button named Register Model under the Artifacts section on the MLflow Run details page. Click on it!
  2. If you are adding a new model, specify a Model Name to uniquely identify the model. If you are registering a new version to an existing model, you can pick the existing Model Name from the dropdown.
  3. Click on Register!

Congratulations, you registered your new model! It would look similar to:

MLflow Registered Model Details Page
Screenshot by Author

If you click on Version 1, you can navigate to the version details page. On the details page you can change the model stage, see the model lineage (source run), and view other metadata.

MLflow Registered Model Details Page
Screenshot by Author
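You can also change the stage programmatically instead of through the UI; here’s a minimal sketch with the client API, using the model name and version from our example:

from mlflow import MlflowClient

client = MlflowClient()

# Promote Version 1 of the registered model to the Staging stage
client.transition_model_version_stage(
    name='Logistic Regression',
    version=1,
    stage='Staging'
)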

MLflow model flavor

You can log and register your model during the experiment run using the mlflow.<model_flavor>.log_model() method.

# Import libraries
# Prepare data and perform preprocessing...

with mlflow.start_run():
    model = LogisticRegression()

    # Train the model
    # Predict on test set
    # Log metrics and set tags

    # Log the sklearn model and register it as version 1
    mlflow.sklearn.log_model(
        sk_model=model,
        artifact_path='model',
        registered_model_name='Logistic Regression'
    )

In the above code snippet we used the sklearn model flavor to log and register the model. If a registered model with that name doesn’t exist, the method registers a new model and creates Version 1. If it does exist, the method creates a new model version.

MLflow register model

We use the mlflow.register_model() method to register the model.

There are two required arguments: model_uri and name.

model_uri: URI referring to the MLmodel directory. Use a runs:/ URI if you want to record the run ID with the model in the model registry.

name: The name of the registered model.

# model_uri format: runs:/<RUN_ID>/<DIRECTORY_NAME>

result = mlflow.register_model(
    "runs:/f3f14056a49f48168af1b187f36e5aea/model", "Logistic Regression"
)

If a registered model with that name doesn’t exist, the method registers a new model and creates Version 1. If it does exist, the method creates a new model version.

MLflow Client Tracking API

With this method we first need to create the Model Name, otherwise it will throw an MlflowException.

from mlflow import MlflowClient

client = MlflowClient()

# Create the Model Name if it doesn't exist
client.create_registered_model('Logistic Regression')

# Register the model
client.create_model_version(
    name='Logistic Regression',
    source='file:///<hidden-privacy>/mlruns/758224673514181683/817ae04665574b2abe5384144a9be015/artifacts/model',
    run_id='817ae04665574b2abe5384144a9be015'
)

source: The path where the MLflow model is stored. You can copy the full path of the model artifact.
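Since create_registered_model raises an MlflowException when the name already exists, you may want to make that step idempotent. A small sketch:

from mlflow import MlflowClient
from mlflow.exceptions import MlflowException

client = MlflowClient()

# Create the registered model only if it doesn't already exist
try:
    client.create_registered_model('Logistic Regression')
except MlflowException:
    pass  # name already registered; reuse it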

As a bonus for reading till the end, let’s explore how you can put the Model Registry into action during deployment.

Bonus: Model Registry usage during Deployment

In an automated deployment pipeline, it’s essential to automatically select the most recent registered model for inference. But what if you need the latest model from a specific stage? How can we achieve this functionality?

The answer lies in the MLflow Client Tracking API.

import mlflow
from mlflow import MlflowClient

client = MlflowClient()

# Filter params
MODEL_NAME = 'Logistic Regression'
STAGE = 'Staging'

# Search for versions of MODEL_NAME, ordered by
# version number in descending order
mlflow_model = client.search_model_versions(
    filter_string=f"name = '{MODEL_NAME}'",
    order_by=["version_number DESC"]
)

# Out of all the versions, find the most recent one
# whose current stage is Staging and load that model
for model in mlflow_model:
    if model.current_stage == STAGE:
        clf = mlflow.sklearn.load_model(model_uri=f"models:/{model.name}/{model.version}")
        break
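As an aside, model URIs of the form models:/<name>/<stage> can resolve the latest version in a given stage directly, which shortens the loop above into a one-liner (a sketch, assuming the same model name and stage):

import mlflow

# Load the latest model version currently in the Staging stage
clf = mlflow.sklearn.load_model("models:/Logistic Regression/Staging")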

If you enjoyed this post, give it a clap! Share your thoughts and suggest topics you’d like me to explore in the comments.

If you like in-depth tutorials and guides, be sure to follow for more!
