Window Functions in SQL: Everything you need to know
What is Window Function in SQL ? Let's explore how to use window functions to conduct statistical analysis with just one SQL query .
The versatility of SQL as a DBMS query language has been increasingly used over the years. Its extensions and flexibility make it a favorite for every data analyst.
There are a number of advanced functions available in addition to the regular SQL functions. These functions are often called window functions. If you're dealing with complex data and want to perform advanced calculations, you can use them to get the best out of your data.
The importance of window function
Several window functions are available in SQL. Each window function will help you perform a series of calculations. From creating sections to ranking rows or assigning row numbers. These window functions can do a little bit of everything.
Window functions are useful when you apply aggregate functions on a specific set of data or collection of rows. These functions go beyond the aggregate functions that GROUP By provides. However, that is the main difference, unlike the group function, your data is not combined into a single row.
You cannot use window functions in WHERE, FROM and GROUP BY statements.
Window Function formula
When referencing any window function, you need to follow the default syntax structure for it to run correctly. If the command structure is wrong, you will get an error and cannot run the code.
This is the default syntax:
SELECT columnname1, {window_function}(columnname2) OVER([PARTITION BY columnname1] [ORDER BY columnname3]) AS new_column FROM table_name;
Specifically:
- coulmnname1 is the first column name you want to select.
- {window_function} is the name of an aggregate function such as sum, avg, count, row_number, rank, or dense_rank.
- columnname2 is the name of the column to which you apply the window function.
- columnname3 is the third column name, which will form the basis for the partition.
- new_column is the label for the new column that you can apply using the AS keyword.
- table_name is the name of the source table.
Window functions are different from some basic SQL commands. Unlike SQL aggregate functions, you can use window functions to implement advanced functions.
Prepare dataset
You can use the CREATE TABLE command to create a new table in SQL. Below is a sample dataset that this tutorial will use to define some window functions:
Order Date | Category | Color | Sale Price | Quantity |
---|---|---|---|---|
November 8, 2016 | Phones | Black | 907.152 | 6 |
June 12, 2016 | Binders | Green | 18,504 | 3 |
October 11, 2015 | Appliances | Yellow | 114.9 | 5 |
October 11, 2015 | Tables | Brown | 1706.184 | 9 |
June 9, 2014 | Phones | Red | 911.424 | 4 |
June 9, 2014 | Paper | White | 15,552 | 3 |
June 9, 2014 | Binders | Black | 407,976 | 3 |
June 9, 2014 | Appliances | Yellow | 68.81 | 5 |
June 9, 2014 | Binders | Green | 2,544 | 3 |
June 9, 2014 | Storage | Orange | 665.88 | 6 |
June 9, 2014 | Storage | Orange | 55.5 | 2 |
April 15, 2017 | Phones | Black | 213.48 | 3 |
December 5, 2016 | Binders | Green | 22.72 | 4 |
November 22, 2015 | Appliances | Green | 60.34 | 7 |
November 22, 2015 | Chairs | Dark Brown | 71,372 | 2 |
May 13, 2014 | Furniture | Orange | 190.92 | 5 |
Explain in detail the Sum function
Suppose you want to calculate the total sales for each value in a directory column. Here's how you can do this:
SELECT category, color, sum(sale_price) OVER (order by category) AS total_sales FROM sahil.sample;
In the code above, the SQL command retrieves categories and colors from the original dataset. The sum function adds the sale_price column. It does this by category because the OVER clause determines the order by category column. The final result is as follows:
How to use window function Avg()
Like the sum function, you can average each row of data using the avg function. Instead of a total, you'll have a column containing average revenue.
SELECT category, color, avg(sale_price) OVER (order by category) AS avg_sales FROM sahil.sample;
How to use window function Count()
Similar to the sum and avg functions, the window function count in SQL is quite simple and works like the other two functions. When you switch to the count function, you will get the total number of each value in the new column.
Here's how you can calculate the total:
SELECT category, color, count(category) OVER (order by category) AS item_count FROM sahil.sample;
Window Function Row_Number()
row_number() works a little differently than the window functions mentioned above. The row_number() function assigns a row number to each row, depending on the order of the clause. The starting row number is 1. row_number assigns a corresponding value to each row until the end.
Here is the basic structure of a row_number() function:
SELECT category, color, row_number() OVER (order by category) AS item_number FROM sahil.sample;
But what happens if you want to assign separate row numbers to each item in the catalog? The above syntax sets a rotating serial number, regardless of the items stored in the catalog. For example, equipment lists need to be numbered separately…
You can use the partition function to perform this simple but practical task. The partition keyword assigns a specified number of rows to each item in the catalog.
SELECT category, color, row_number() OVER (partition by category order by category) AS item_number FROM sahil.sample;
Rank() and Dense_Rank() functions
The rank() function works differently than the row_number() function. You need to specify the column name in order by function, to use it as the basis for determining the function value. For example, in the code below, you can use the color column in the order by function. This query then uses that order to assign a ranking value to each row.
You can use the code syntax below to pass a ranking function in SQL:
SELECT category, color, rank() OVER (order by color) AS item_rank FROM sahil.sample;
Result:
The order by function classifies the color folder, while the rank function ranks each color. However, all similar color values share the same rating, while other colors have their own ratings. Black appears 3 times in the dataset; Instead of assigning a rating value of 1, 2, 3, black items are rated 1.
However, Brown would be 4, not 2. The Rank function ignores the values and assigns chronological values to different items. If you want to assign a more meaningful ranking value, you can use the dense_rank() function.
The dense_rank function does not ignore any rank value in the order by function. For example, the first 3 color entries will have rank 1. However, the next color (Brown) will not have rank 4, but rank 2, which is the next chronological order in the numbered category. The dense_rank function is a more practical window function because it assigns a meaningful value to all list items.
Here's how you can use the dense_rank function in SQL:
SELECT category, color, dense_rank() OVER (order by color) AS item_rank FROM sahil.sample;
Result:
Above is everything you need to know about Windows Functions in SQL . Hope the article is useful to you.
You should read it
- Print pages in JavaScript
- Please download more than 90 beautiful windows wallpapers for your computer
- Standard window size
- Save time with these text formatting functions in Microsoft Excel
- Function in programming C
- DAY function in SQL Server
- MIN function in SQL Server
- MAX function in SQL Server
- How to open the CS 1.1 command window
- How to keep the working Windows window always on top, pin the window on the Desktop
- SUM function in SQL Server
- The ord () function in Python
Maybe you are interested
Difference between function and formula in Excel
8 little-known Excel functions that can save you a lot of work
How to use the NORMDIST function in Excel - Function that returns the distribution in Excel
Date functions in Excel, DAY, WEEKDAY, MONTH
How to use the SUMIF function in Excel to calculate the sum based on conditions
How to use the Round function in Excel to round numbers and process data