Pagination with Elasticsearch using NodeJS
Hey guys, today I'm going to talk about the strategy I used on the backend to paginate with Elasticsearch.
Reason:
I was assigned a task to develop a page that would show more than 500 products. Each product carries a lot of data in addition to images, and a single product can have more than one image.
In this application we were already using Elasticsearch, so I first wanted to check what it would be like to have the page load all the products without pagination.
For example, an item would look something like this object:
{
  "id": 123,
  "image": ["http://...", "http://...", "http://..."],
  "base_produto": "",
  "categoria": "",
  "cor_produto": "",
  "data_primeira_venda": "2100-12-31",
  "desc_cor_produto": "",
  "desc_produto": "",
  "distribuicao": false,
  "programacoes": [
    { "nome_programacao": "", "qtde_entregue": 0, "qtde_programada": 220 }
  ]
  ...
}
So I tested the page loading all 500 products at once, and this was the result:
As can be seen, the page performs very poorly and takes extremely long to load. That's because rendering 500 items and a lot of images is a very heavy load for the browser.
Ok, so we see that we REALLY need to paginate that data.
Now I'm going to present my strategy for paginating the data, looking only at the backend side.
Here we use NodeJS and Express to handle the routes.
Elasticsearch already has a way to deal with pagination, and it's called the scroll API.
The scroll basically has 2 main parameters:
scroll: tells Elasticsearch how long to keep the search context open (for example, 1m for one minute).
scroll_id: the reference that is used to fetch the next page of items.
Example of a scroll POST:
POST /_search/scroll
{
"scroll" : "1m",
"scroll_id" : "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAD4WYm9laVYtZndUQlNsdDcwakFMNjU1QQ=="
}
So, basically on the backend we create a function that searches our items and receives two parameters: indexName and query.
indexName: name of the index where Elasticsearch will search.
query: the query we are using to filter the items, etc.
What is this searchData?
Basically, it is our application's entry point to Elasticsearch: the function just described, which runs the search.
(To dive deeper into this, I recommend the official Elastic documentation: https://www.elastic.co/guide/en/elasticsearch/client/javascript-api/current/index.html )
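Since the function itself is not shown here, below is a minimal sketch of what searchData could look like. It assumes the 7.x version of the official @elastic/elasticsearch client, where results come back under resp.body. The factory name createSearchData, the injected client and the '1m' keep-alive are illustrative assumptions, not the original implementation.

```javascript
// Hypothetical factory: builds a searchData(indexName, query) function
// around any client that exposes a 7.x-style `search` method.
function createSearchData(client, keepAlive = '1m') {
  return async function searchData(indexName, query) {
    return client.search({
      index: indexName,  // index where Elasticsearch will search
      scroll: keepAlive, // keeps the search context open, so the response carries _scroll_id
      body: query,       // from / size / query, as built by the caller
    });
  };
}

// Possible wiring (assumes the @elastic/elasticsearch 7.x package is installed
// and a node is reachable at this hypothetical address):
// const { Client } = require('@elastic/elasticsearch');
// const searchData = createSearchData(new Client({ node: 'http://localhost:9200' }));
```

Passing the client in, instead of creating it inside the function, is just a design choice that makes the function easy to try out with a fake client.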
Now I'm going to show you a route that takes the first 10 items from the 500 products that I had.
async getSingleCollectionProducts(req, res) {
  const { collection, id_marca_estilo, next_page } = req.query;
  const payload = {
    // Here I'm asking Elasticsearch to return only 10 items
    from: 0,
    size: 10,
    query: {
      // here goes your query; you can try out your own query
      // on Kibana.
    },
  };
  try {
    //                            indexName      query
    const resp = await searchData('produto_cor', payload);
    // resp.body.hits.hits is an array with the first 10 items;
    // here I'm just building a more useful array of data
    const produtos = resp.body.hits.hits.map(product => ({
      id: product._id,
      data: product._source,
    }));
    const pro = {};
    // because I used the scroll parameter in my searchData function,
    // the response has the scroll_id value, so I store it on my object
    // as the value for the next page
    pro.next_page = resp.body._scroll_id;
    pro.data = produtos;
    return res.status(200).send(pro); // return the value
  } catch (err) {
    console.log('Error getting products from a single collection', err);
    return res.status(500).send(err);
  }
}
When we hit this route, the final result will be:
Note that our scroll_id is now called next_page.
Now the page is much faster, because we are only loading the first 10 items.
Our goal now is to load the next items. We now have the scroll_id, which I called next_page, holding the id for the next page of items that we want to render.
Now comes the simplest part: we just need to create one more route that receives the scroll_id as a parameter, and Elasticsearch will take care of returning the next items along with the scroll_id to use for the page after that.
It is virtually the same as the route created previously, with the only difference that instead of searchData we use a new function called scrollData, built on the scroll method that comes with the Elasticsearch client (https://github.com/elastic/elasticsearch-js)
( https://www.elastic.co/guide/en/elasticsearch/client/javascript-api/current/scroll_examples.html )
Basically the scroll function receives two parameters:
scroll_id: String (id of the next page containing the next 10 items)
scroll: String (how long the search context, and therefore the id, will be kept alive, e.g. "1m")
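To make this second route more concrete, here is a rough sketch of scrollData and of a handler that uses it. It assumes the 7.x client, whose scroll method accepts scroll_id and scroll. The names createScrollData, createNextPageHandler and getNextProducts, and the 400 response for a missing next_page, are my own illustrative assumptions rather than the original code.

```javascript
// Hypothetical factory, mirroring searchData: wraps a 7.x-style client.scroll call.
function createScrollData(client) {
  return async function scrollData(scrollId, keepAlive = '1m') {
    return client.scroll({
      scroll_id: scrollId, // id received from the previous page
      scroll: keepAlive,   // renew the search context for the next request
    });
  };
}

// Hypothetical Express-style handler: reads next_page from the query string
// and answers with the next items plus the id for the page after that.
function createNextPageHandler(scrollData) {
  return async function getNextProducts(req, res) {
    const { next_page } = req.query;
    if (!next_page) {
      return res.status(400).send({ error: 'next_page is required' });
    }
    try {
      const resp = await scrollData(next_page, '1m');
      // same shape as the first route: id + data for every hit
      const produtos = resp.body.hits.hits.map((hit) => ({
        id: hit._id,
        data: hit._source,
      }));
      return res.status(200).send({
        next_page: resp.body._scroll_id,
        data: produtos,
      });
    } catch (err) {
      console.log('Error getting the next page of products', err);
      return res.status(500).send(err);
    }
  };
}
```

When the scroll has no more items, hits comes back as an empty array, so the front end can treat an empty data array as the signal to stop asking for more pages.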
After changing a few things on the front end to use the routes I created, we have our result:
Elasticsearch is an amazing tool with a world of its own. Many questions may come up along the way, but if you need help you can reach out to me. I hope this helped!