Efficient Bulk Updates with Django Rest Framework
Using the ListSerializer with bulk_update to build efficient PUT API endpoints with Django Rest Framework

Generic REST framework endpoints are typically designed to modify one object at a time. However, this can become a huge performance bottleneck when you need to modify thousands of objects. In that case, instead of making thousands of calls to your endpoint, it is better to make one call that performs the operation in bulk. This tutorial shows you how to create efficient bulk updates for your PUT API endpoints.
In Part 1: Efficient Bulk Creates With Django Rest Framework we went over how to optimize the POST API using Django Rest Framework. In this article, I’m going to show you how to improve the performance of your PUT API calls using ListSerializers with the bulk_update method. A full working Django app with code and unit tests can be found on GitHub here.
Objectives
By the end of this tutorial, you should be able to:
- Implement a PUT API to update database models using standard Django Rest Framework workflow.
- Modify that API to perform batch updates using a ListSerializer.
- Profile and optimize the ListSerializer to achieve a 10x performance improvement by using bulk_update and reducing the number of calls to the database.
The toy application in this tutorial consists of a project that can have many tasks. Every time a task is added or updated, we also want to update our project's last modified date. Let us get started.
Django API Overview
You may feel that Django Rest Framework’s high level of abstraction makes it difficult to understand what is going on. I too found it to have a very steep learning curve at the beginning, but once you become familiar with the structure, using it to manage a large project with many APIs can lead to large gains in productivity. You will also find that it makes it easier to collaborate if everyone on your team follows the same patterns. Typically, there are 5 main parts to creating a Django Rest Framework API. They are as follows:
- Models which manage the relationship between a database table and Python.
- Serializers which validate and serialize incoming and outgoing data.
- Querysets which construct, execute, and store the results of database queries as model instances.
- Views which are classes that wrap up a model, serializer, and queryset for each endpoint.
- URLs which specify when to call the View.
We will now go over the code for these to create our REST API.
Model
The Django ORM uses models to manage interactions with the backend database. For this tutorial, we create two models, a Project model that has Task models associated with it. The models are defined in models.py as:
Additionally, whenever a Task is created, updated, or deleted, the business logic requires that the Project's last_modified date also be updated. To handle this, we use the signal API with post_save and post_delete handlers on the Task model. This way, after a save is called, the signal is also invoked, updating the Project's last_modified.
Views
The generic view classes are base classes from Django Rest Framework that implement handlers for the HTTP methods POST/PUT/GET/DELETE. For the update endpoint, we will use the generic UpdateAPIView, which provides a PUT method handler. To better understand the basic control flow of the Django update view, I’ve put together the following flow chart to use as a reference.

For the PUT API, we use the UpdateAPIView to perform updates on the Task model. For this, we need to define the serializer, the get_queryset method, and the URL of the API that invokes the function.
If you aren’t familiar with Django Rest Framework, one thing you may notice in our TaskUpdateView code is that there isn’t a lot of it. For basic APIs, almost all of the plumbing for making this work is completely abstracted away. In order to optimize this API, we will end up needing to override many of the inherited methods. For more in-depth documentation on views, check out the official tutorial here.
QuerySet
The queryset is part of the Django ORM, which facilitates interactions between your models and the database. Querysets are powerful abstractions that let you build complex database queries programmatically. In our get_queryset function, we filter the Task objects to return only the tasks that match the user-specified project (the foreign key) and task id. The queryset contains the instances that we can update with the data passed in by the user and then save back to the database.
Serializers
Serializers are responsible for taking the user input, validating it, and turning it into an object consumable by the database. They also handle converting objects from the database into something that can be returned to the user. Additionally, a serializer specifies which fields are required and what properties they have. To better demonstrate the basic control flow of a serializer, I created the following diagram for reference. For an in-depth overview of serializers, see the official tutorial here.

For this project, we create the TaskSerializer, which can update a Task object's name and description.
Since the id for the Project is specified in the URL, we use a HiddenField with a CurrentProjectDefault class to specify how to pull the project id from the request and retrieve the project object from the database. The CurrentProjectDefault class is defined as follows.
With all of these pieces in place, we have completed the basic implementation of our API. Let us go ahead and profile the performance. To do that we will use pytest to create a unit test where we update 10,000 Task objects by calling the API once for each update.
To get the run time of the test, we will set the duration flag when calling pytest.
>> py.test --durations=1
==================
101.21s call test_update_task
As you can see, it is very slow! Performing 10,000 updates takes around 100 seconds.
ListSerializer
Let us look at how to speed up the performance of the code. The first optimization we will do is to switch to using a ListSerializer. The list serializer will allow you to submit a single request for multiple updates. First, we will create the UpdateListSerializer class, which extends the ListSerializer.
By building the instance_hash lookup once, we avoid repeatedly indexing into the collection of instances, which is very inefficient. Then we will modify our TaskSerializer’s Meta properties to use the new list serializer class.
We will also need to modify our URL. It will now take only the project_id as part of the path, and the task_id will be included as part of the data object sent to the PUT API.
Finally, we will create a new view, TaskUpdateListView. Here we will override the base get_serializer method to check whether the input data is a list. When we detect that the user has passed in a list, we set kwargs["many"] = True. This tells the serializer that it should use the list_serializer_class before calling the individual updates on each Task.
We also override the base update method of the view. It will now perform a simple validation on the ids before calling a modified get_queryset method which takes the ids as input. The queryset will return the instances for all of the Task objects that the user has requested to update.
With all of these optimizations in place, we are now using a ListSerializer to do a bulk update of the Task models. We will create another unit test to profile the performance.
Let's go ahead and check the performance after that change.
>> py.test --durations=2
==================
101.21s call test_update_task
55.98s call test_update_list_serializer
So just by adding a ListSerializer, we get roughly a 2x performance improvement. Still, 55 seconds for 10,000 updates is slow. The key to further optimizing the performance of this API is going to be reducing the number of calls to the database.
Consolidate Logic
Currently, our serializer is calling CurrentProjectDefault to get the project that is associated with each Task instance object it is creating. Instead, we are going to modify the put function of our view to pull the project and insert it into the request.data object. This way, we only need to do a single database hit to get the project for all of our Tasks.
We will also need to replace the CurrentProjectDefault field in our serializer with a custom field. We create a custom field named ModelObjectidField, which simply returns the data passed into it.
bulk_update
Next, we will create a BulkUpdateListSerializer, which will use Django's bulk_update method, introduced in Django 2.2. This method allows you to perform bulk updates in the database by passing a list of instances to update. The following code describes the BulkUpdateListSerializer.
We also need to modify the serializer so that it no longer does a save on the update method, but only returns the new instances. Then after we have updated the values of all the instances, our BulkUpdateListSerializer will call the bulk_update method which will make a single database call to perform the update.
to_representation
The to_representation function is part of the serializer that handles how instances are turned into serialized objects that can be returned to the user. The default to_representation method of the ListSerializer is very inefficient when getting the instance.project id value to return. In the following to_representation code, we take advantage of the fact that all of the project ids are the same so we only need to fetch that property once.
What about our signals?
When doing a bulk_update, the signals for the models are no longer triggered. This is actually a good thing because, while convenient, signals can be incredibly inefficient. Instead, we create an update_project_last_modified function, which updates the last modified date of the Project after the update is performed.
One thing to note is that bulk_update does not refresh auto_now fields in the database, because it never calls save(). To overcome this, we explicitly set the last_modified field on all of our instances so that the bulk update writes the new value.
Finally, let us test the performance of our new function using the bulk_update.
And the results are:
>> py.test --durations=3
==================
101.21s call test_update_task
55.98s call test_update_list_serializer
12.06s call test_bulk_update_list_serializer
As you can see, the test now runs in around 12 seconds. That is roughly an 8x speedup over the original 101 seconds, and it comes without too much extra code complexity.
Summary
With that, we have looked at how you can improve the performance of your Django app using a ListSerializer and the bulk_update functionality introduced in Django 2.2. Those two methods, along with careful attention to minimizing the number of calls to your database, can give you close to a 10x performance improvement without a lot of extra work.
Again, a full working Django project with all of this code and unit tests can be found on GitHub in the dango_bulk_tutorial repository, and the previous post is Part 1: Efficient Bulk Creates with Django Rest Framework. I hope you enjoyed this post, and be sure to follow me for more articles on Django, Python, DevOps, machine learning and TinyML. Do you have tricks for optimizing Django that you like to use? Leave a comment letting me know or provide feedback on the article.