Overview of the SQL ROW_NUMBER function
ROW_NUMBER() is a function in SQL Server that assigns a unique number to each row within a result set, based on the order specified in the ORDER BY clause of the ROW_NUMBER() function. The function is typically used in conjunction with the SELECT statement and is commonly used for pagination, ranking, and other types of data analysis. The syntax for using the ROW_NUMBER() function is as follows:
SELECT ROW_NUMBER() OVER (ORDER BY column_name) as row_number, column1, column2, ...
FROM table_name
The OVER clause is used to define the window or partition of
the result set to which the ROW_NUMBER() function applies, and the ORDER BY
clause is used to specify the order in which the row numbers are assigned. The
ROW_NUMBER() function can also be used with the PARTITION BY clause to generate
unique numbers for each partition of the result set.
SELECT ROW_NUMBER() OVER (PARTITION BY column1 ORDER BY column2) as row_number, column1, column2, ...
FROM table_name
This will assign unique numbers to each rows within the same
partition.
Here are some examples of using ROW_NUMBER() in SQL Server:
1. Assign a unique number to each row in a result set:
SELECT ROW_NUMBER() OVER (ORDER BY column1) as RowNumber, column1, column2
FROM table_name
2. Assign a unique number to each row within a group:
SELECT ROW_NUMBER() OVER (PARTITION BY column1 ORDER BY column2) as RowNumber, column1, column2
FROM table_name
3. Assign a unique number to each row within a group, and also include a custom prefix:
SELECT 'PREFIX_' + CAST(ROW_NUMBER() OVER (PARTITION BY column1 ORDER BY column2) as varchar(10)) as RowNumber, column1, column2
FROM table_name
4. Assign a unique number to each row within a group, and also include a custom prefix and reset the numbering for each group:
SELECT column1, 'PREFIX_' + CAST(ROW_NUMBER() OVER (PARTITION BY column1 ORDER BY column2) as varchar(10)) as RowNumber, column2
FROM table_name
5. Finding duplicate rows: By assigning a unique number to each row based on the columns that you want to check for duplicates, you can filter out duplicate rows.
WITH CTE AS (
SELECT column1, column2, ROW_NUMBER() OVER (PARTITION BY column1, column2 ORDER BY column1) as row_number
FROM
table_name
)
SELECT column1, column2
FROM CTE
WHERE row_number = 1;
This query will return only one row for each set of
duplicate values of column1 and column2.
WITH CTE AS (
SELECT column1, column2, ROW_NUMBER() OVER (ORDER BY column1) as row_number,
SUM(column2) OVER (ORDER BY column1) as running_total
FROM
table_name
)
SELECT column1, column2, running_total
FROM CTE;
This query will return the column1,column2 and the running
total of column2 in the order of column1
SELECT column1, column2, ROW_NUMBER() OVER (PARTITION BY column1 ORDER BY column2) as row_number
FROM table_name
GROUP BY column1, column2;
This query will return the column1, column2 and the unique
row number for each group of column1
WITH CTE AS (
SELECT column1, column2, ROW_NUMBER() OVER (ORDER BY column1) as row_number
FROM
table_name
)
SELECT column1, column2
FROM CTE
WHERE row_number BETWEEN 5 AND 10;
9. Generating a running average: By using the ROW_NUMBER() function in conjunction with the AVG() function and a CTE, you can generate a running average for a specific column.
WITH CTE AS (
SELECT column1, column2, ROW_NUMBER() OVER (ORDER BY column1) as row_number,
AVG(column2) OVER (ORDER BY column1 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as running_average
FROM
table_name
)
SELECT column1, column2, running_average
FROM CTE;
10. Extracting a specific number of rows from a query: By using the ROW_NUMBER() function in conjunction with the TOP clause, you can extract a specific number of rows from a query.
WITH CTE AS (
SELECT TOP 100 ROW_NUMBER() OVER (ORDER BY column1) as row_number, column1, column2
FROM
table_name
)
SELECT column1, column2
FROM CTE;
11. Filtering out the top N rows based on a specific column: By using the ROW_NUMBER() function in conjunction with the TOP clause and the ORDER BY clause, you can filter out the top N rows based on the values of a specific column.
WITH CTE AS (
SELECT TOP 10 ROW_NUMBER() OVER (ORDER BY column1 DESC) as row_number, column1, column2
FROM
table_name
)
SELECT column1, column2
FROM CTE
WHERE row_number = 1;
12. Generating a cumulative sum: By using the ROW_NUMBER() function in conjunction with the SUM() function, you can generate a cumulative sum for a specific column.
WITH CTE AS (
SELECT column1, column2, ROW_NUMBER() OVER (ORDER BY column1) as row_number,
SUM(column2) OVER (ORDER BY column1 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as cumulative_sum
FROM
table_name
)
SELECT column1, column2, cumulative_sum
FROM CTE;
13. Generating a running maximum: By using the ROW_NUMBER() function in conjunction with the MAX() function, you can generate a running maximum for a specific column.
WITH CTE AS (
SELECT column1, column2, ROW_NUMBER() OVER (ORDER BY column1) as row_number,
MAX(column2) OVER (ORDER BY column1 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as running_max
FROM
table_name
)
SELECT column1, column2, running_max
FROM CTE;
14. Randomly selecting a specific number of rows from a table: By using the ROW_NUMBER() function in conjunction with the NEWID() function, you can randomly select a specific number of rows from a table.
WITH CTE AS (
SELECT column1, column2, ROW_NUMBER() OVER (ORDER BY NEWID()) as row_number
FROM
table_name
)
SELECT column1, column2
FROM CTE
WHERE row_number BETWEEN 1 AND 100;
15. Generating a running median: By using the ROW_NUMBER() function in conjunction with the NTILE() function, you can generate a running median for a specific column.
WITH CTE AS (
SELECT column1, column2, ROW_NUMBER() OVER (ORDER BY column1) as row_number,
)
SELECT column1, column2, median
FROM CTE;
NTILE(2) OVER (ORDER BY column2) as median
FROM table_name
16. Generating a running standard deviation: By using the ROW_NUMBER() function in conjunction with the STDEV() function, you can generate a running standard deviation for a specific column.
WITH CTE AS (
SELECT column1, column2, ROW_NUMBER() OVER (ORDER BY column1) as row_number,
STDEV(column2) OVER (ORDER BY column1 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as running_stdev
FROM
table_name
)
SELECT column1, column2, running_stdev
FROM CTE;