Rank vs Dense Rank:
Introduction
In this article, you will learn about RANK, DENSE_RANK and ROW_NUMBER functions in SQL Server.
Explanation
- The RANK, DENSE_RANK and ROW_NUMBER functions are used to get the increasing integer value, based on the ordering of rows by imposing ORDER BY clause in SELECT statement.
- When we use RANK, DENSE_RANK or ROW_NUMBER functions, the ORDER BY clause is required and PARTITION BY clause is optional.
- When we use PARTITION BY clause, the selected data will get partitioned, and the integer value is reset to 1 when the partition changes.
Let’s see an example.
For this example, I have created a table called Employees in which I have added three columns EMPID, Name, and Salary respectively. To create table and column, the script is as given below.
Creating a table
- CREATE TABLE [dbo].[Employee](
- [EMPID] [nvarchar](30) NOT NULL,
- [Name] [nvarchar](150) NULL,
- [Salary] [money] NULL
- )
Insert some data into this table, script as follows
- INSERT [dbo].[Employee] ([EMPID], [Name], [Salary]) VALUES ('EMP101', 'Vishal', 15000)
- INSERT [dbo].[Employee] ([EMPID], [Name], [Salary]) VALUES ('EMP102', 'Sam', 20000)
- INSERT [dbo].[Employee] ([EMPID], [Name], [Salary]) VALUES ('EMP105', 'Ravi', 10000)
- INSERT [dbo].[Employee] ([EMPID], [Name], [Salary]) VALUES ('EMP106', 'Mahesh', 18000)
The table is created and data is inserted. Now, let’s see how these about RANK, DENSE_RANK and ROW_NUMBER functions behave. For this, we can write Select statement as shown below.
- select EMPID, Name,Salary,
- RANK() over (order by Salary desc) as _Rank,
- DENSE_RANK () over (order by Salary desc) as DenseRank ,
- ROW_NUMBER() over (order by Salary desc) as RowNumber from Employee
You can see the below output that RANK, DENSE_RANK and ROW_NUMBER functions give the same output and returning integer values based on highest salary.
I am assuming you have got what RANK, DENSE_RANK and ROW_NUMBER functions does.
Now, let’s see what is the difference between those functions. For this, we need to insert some duplicate salary into this table. The script below will insert some more data with same salary.
- INSERT [dbo].[Employee] ([EMPID], [Name], [Salary]) VALUES ('EMP108', 'Rahul', 20000)
- INSERT [dbo].[Employee] ([EMPID], [Name], [Salary]) VALUES ('EMP109', 'menaka', 15000)
- INSERT [dbo].[Employee] ([EMPID], [Name], [Salary]) VALUES ('EMP111', 'akshay', 20000)
Now, you can see the table which has got some duplicate salary records.
Now, we can see the different between RANK, DENSE_RANK and ROW_NUMBER functions by executing the same SELECT statement which we executed before.
Here, in above figure, you can see that ROW_NUMBER function does not have any changes. It keeps increasing integer by one and it is not caring about duplicate values.
In RANK, DENSE_RANK function, it is looking for duplicate values. The integer value is increasing by one but if the same value (Salary) is present in the table, then the same integer value is given to all the rows having the same value(Salary), as marked in sky blue color.
In RANK function, the next row after the duplicate values (salary),marked in red color, will not give the integer value as next rank but instead of it, it skips those ranks and gives what is the next incremented rank. In the above case, the first three values are having same salary so it gives same rank to them but in next row, it gives as 4, It skips two and three as first three rows have same ranks.
In DENSE_RANK function, it will not skip any rank. This means the next row after the duplicate value (salary) rows will have the next rank in the sequence.
Some of the use of these functions in real-time
- Using ROW_NUMBER function, we can delete duplicate data from table. For this, we need to insert some duplicate records into this table.
In order to remove all duplicate rows, we need to keep one row and remaining rows we need to delete. For this, we are using ROW_NUMBER function as it will be easy.
- with empCTE as
- (
- select *, ROW_NUMBER() over(partition by EMPID order by EMPID) as rowno from Employee
- )
- delete from empCTE where rowno>1
In this code, I have written Select statement to get a rank and I made as a CTE. After that, I am deleting records from CTE where row number is greater than one so that one record remains.
As you can see in the above figure, the duplicate rows are deleted.
- Using RANK or DENSE_RANK function, we can find Nth highest salary very easily, however which function to use depends on what we need.
For this, let’s update Ravi’s salary same as Sam. The script is as follows
- update Employee set Salary=20000 where EMPID='EMP105'
Now, we have two first height salary records. Let’s write a Select statement to find second highest salary using RANK or DENSE_RANK function as shown below.
As you can see above, the DENSE_RANK function is giving an accurate result and RANK function is not returning any row. So, we can use DENSE_RANK Function even if there is a same salary .
Comments
Post a Comment