Some window functions

LAG definition

LAG(return_value [, offset[, default_value]]) returns the value offset steps before the current value in order, or the default Alice, Bob, Carol, David, Eve, Eve, Frank

  • LAG('David') - 'Carol'
  • LAG('David', 2) - 'Bob'
  • LAG('Bob', 2, 'Nobody') - 'Nobody'

LEAD definition

Mirror function of LAG

RANK definition

Returns the position of the current row in the order, starting from 1, same values get the same rank, ranks can be skipped, i.e. Alice=1, Bob=2, Carol=3, Devid=4, Eve=5, Eve=5, Frank=7

DENSE_RANK definition

Same as RANK but does not skip ranks

ROW_NUMBER definition

Self explanatory

NTILE definition

NTILE(n) divides the result set into approximately equal-sized buckets, useful for defining quartiles, quintiles, deciles, etc.


Views

View definition

View is a virtual relation created from an existing relation Using (virtual) views in FROM clause is convenient, but not efficient, it is translated into a subquery!

Virtual viewMaterialized view
Plug-in view into a querySave view result as a relation
Runtime efficiencySlowFast, but also depends on indices
Memory efficiencyNo data is savedDuplicates data, some views are larger than the original relations
Stale dataNo, always up-to-datePossible, requires offline updates
Where is it used?Most databasesData warehouses

Common table expressions (CTEs)

WITH clause (CTEs) definition

CTE is defined by WITH clause, and is similar to view, but there are some differences Syntax - WITH [RECURSIVE] name AS (result set)

CTEVritual viewMaterialized view
Use caseDefining and plugging-in query within queryPlugging-in view into querySave view result as a relation
ScopeQuerySchemaA different database, data warehouse
Similar toLocal function or closure”Regular” functionSeparate instance with copied fields
Given a relation
  • ParentChild(parent, child) Return, for each person, all of their ancestors. It is clear, that with an unknown number of ancestry depth, we must use recursion Is it possible in SQL? With CTEs - yes!
WITH RECURSIVE Ancestry(person, ancestor) AS (
	-- base case
	SELECT child AS person, parent AS ancestor
	FROM   ParentChild
	
	UNION ALL
	
	-- recursive call, including stopping condition (empty JOIN)
	SELECT Ancestry.person, ParentChild.parent
	FROM   Ancestry
		   JOIN ParentChild ON Ancestry.anscestor = Parentchild.child
)
SELECT *
FROM   Ancestry

Another example, Calculate

WITH RECURSIVE Factorial AS (
	SELECT 1 AS n, 1 AS value
	
	UNION ALL
	
	SELECT n+1 AS n, value*(n+1) AS value
	FROM   Factorial
	WHERE  n < 5
)
SELECT n, value AS 'n!'
FROM   Factorial

Another example,

  • Employee(id, name, manager_id) Compute the employee hierarchy level (boss is on the first level)
WITH RECURSIVE EmployeeHierarchy AS (
		SELECT id, name, manager_id, 1 AS level
		FROM   Employee
		WHERE  manager_id IS NULL
		
		UNION ALL
		
		SELECT e.id, e.name, e.manager_id, e.level+1 AS level
		FROM   Employee AS e
			   JOIN EmployeeHierarchy AS eh
				   ON e.manager_id = eh.id
)
SELECT id, name, level
FROM   EmployeeHierarchy

Another example,

  • Employee(id, name, manager_id) How many employees are under management of each employee?
WITH RECURSIVE EmployeeHierarchy AS (
	SELECT id, name, manager_id, 0 AS countEmployees
	FROM   Employee
	WHERE  id NOT IN (SELECT DISTINCT manager_id
					  FROM            Employee)
	
	UNION ALL
	
		SELECT   e.id, e.name, e.manager_id,
				 COUNT(*)+SUM(countEmployees) AS countEmployees
		FROM     Employee AS e
			     JOIN EmployeeHierarchy AS eh
				     ON eh.manager_id = e.id
		GROUP BY e.id, e.name, e.manager_id
)
SELECT id, name, countEmployees
FROM   EmployeeHierarchy