LAG functionality on SQL Data
An intermediate topic in SQL with the cosmos
“Mars is there, waiting to be reached.” — BUZZ ALDRIN
Traveling and vacationing on Mars — it’s the stuff of science fiction. Mars, a neighboring plant to Earth, has deep valleys and the tall mountains. One mountain is Olympus Mons — with a staggering height of 72,000 ft (two and a half times Mount Everest’s height above sea level). It is a shield volcano, much like the the large volcanoes making up the Hawaiian Islands.
Although Mars’ environment is hostile to humans (one characteristic is the amount of radiation received on the surface could pose health risks), I’m going to create a scenario that allows us to imagine vacationing on this red planet. Suppose we spent 4 years on a spacecraft traveling to the planet. Among seeing many sites on the planet, we are visiting Olympus Mons this week. At the mountain base, a food stand is selling astronaut ice cream and other freeze dried foods.
Today’s SQL post uses fake data for the purpose of showcasing the LAG
window function. Let’s begin.
Creating the data
Since this is a hypothetical scenario, I am going to create the monthly sales imaginary data for the food stand.
Here, I create the table named sales
:
CREATE TABLE sales (
months INT,
net_sales FLOAT
);
And, then I insert data into the table:
INSERT INTO sales(months, net_sales)
VALUES (1, 8964),
(2, 4212),
(3, 5618),
(4, 7319),
(5, 9282);
I finally view my work with a SELECT
statement:
So, there are 5 records of net_sales
, each for a month that the food stand is open.
LAG window function
Suppose I wanted to look at the net sales from the previous month while at the same time viewing the current month. This is possible with the LAG
window function. This allows us to access a row at a specific offset that comes before the current row. In terms of our example, if the current row is the 4th month
and thenet_sales
value is 7319, then with an offset of 1 I expect to see 5618 in the next column for that row. Similarly for the other records, we can expect the following.
Coding this up:
SELECT *
, LAG(net_sales, 1) OVER (ORDER BY months) "previous_month_sales"
FROM sales;
Two comments about this:
- The
LAG
function arguments are the returned value (here,net_sales
) and the offset (here, 1). - The
OVER
function indicates the window of data theLAG
function is applied. Here, we are applying the function over the entire dataset in oder of months.
Looking at the result of the query:
That’s nice. But, let’s take this one step further. Suppose we want to compare two month as the percent change of the sales vales. This can be done as well:
WITH comparison AS (
SELECT *
, LAG(net_sales, 1) OVER (ORDER BY months) "previous_month_sales"
FROM sales
)SELECT *
, CAST ((net_sales - previous_month_sales) / previous_month_sales
* 100 AS INT) "vs_previous_month"
FROM comparison;
This query is a little more complex and calculates a value from the table. Let’s break it down.
- First I need to define a temporary table (called
comparison
) that exists just for one query. This table allowsprevious_month_sales
to be defined as a column we can query against. (Otherwise, it’s just present in a table’s view.) - In the next query, I calculate the percent change, but
CAST
it as anINT
so that I have a clean and concise percent in my final table view.
Result:
Reading and interpreting this table, the 2nd month of sales dropped ~53% in comparison with the previous month. The net sales in the 3rd month increased by 33%, and so on.
That’s it for this fake data, folks! Another topic for your SQL toolbox.
For your reference, I have a series of SQL tutorials that cover basic and intermediate queries. Take a look!
- Black holes, planets, and SQL
- Touring the Solar System with SQL
- Searching for moons with SQL
- Classifying black holes with SQL
- Joining Constellations with SQL
- Solar System classifications with SQL
- Many (to many) relations among the stars
- Creating tables in SQL
- A window to the solar system
Like these SQL posts? Have a question? Let’s chat in the comments!