Delete from B+ tree
- Find correct leaf L
- Remove data entry from L
- If L has at least
entries remaining, we’re done - Else, L has exactly
entries - First, check if we can borrow from adjacent siblings
- If we can’t, merge two siblings, resulting in a node with
entries - Update parent node accordingly
- This can happen recursively, if so, we continue borrowing/merging
Cost of using B+ tree
Search
In the I/O model,
- If index is in memory, search costs 0
- Otherwise,
- One node is typically one page, reading/writing costs 1, searching inside the node is 0 once in memory
- If one node is not a page, we count number of pages per node read/write
- Nodes that cannot be fully loaded into memory are impractical
Updates
- Best/Average case - updating only one node, so 1
- Worst case - (read both siblings + merge with sibling)
(depth - 1) - There exists ‘bad’ sequences of updates where worst case delete and insert happen frequently, but these are rare in practice
Execution engine
flowchart TD user(App/User) compiler[Query compiler/optimizer] engine[Execution engine] index[Index/Record manager] buffer[Buffer manager] transaction[Transaction Manager] storage[(Storage)] user -- Transaction commands --> transaction user -- Query / DML / DDL --> compiler compiler -- Query execution plan --> engine engine -- Record, Index requests --> index index -- Page commands --> buffer buffer -- Read/write pages --> storage transaction --> engine transaction --> buffer
Input of the execution engine is a query plan:
- Logical tree
- Implementation choice of each node
- Scheduling, memory
Each operation is executed, its output is passed on
Relational algebra
Relational algebra is a formal way of creating relations from other relations There are two types of operators - Unary and Binary We can combine operators by using a Formula (defined recursively), which is in turn equivalent to a Logical tree
- There are 5 basic operators:
- Union:
- Difference:
- Selection:
- Projection:
- Cartesian product:
- Union:
- And derived or auxiliary operators:
- Intersection
- Joins (natural, equi-join, theta join, …)
- Renaming:
- …
In addition to that, there are two ways of treating relations formally - bag and set semantics Bag semantics treat relations as multisets, so they preserve duplicates Set semantics treat relations as sets, so they automatically eliminate any duplicates
Union, difference and intersection
Selection and Projection
- Selection
- return only rows of that fulfill the condition is the condition for selection, which is composed of atomic formulas with and logical expressions over such formulas using - e.g.
- Bag and set semantics are the same
- Projection
- return only columns of is a subset of columns of - e.g.
- Set semantics may lead to less tuples in the result
Cartesian product and joins
- Cartesian product
- return all pairs of tuples in relations - Theta join
- Equi-join - a theta join where
is an atomic formula in the form of , e.g. - Natural join - a theta join, where
, e.g.
- Natural join - a theta join, where
Renaming
- Renaming attributes, e.g.
renames all to - Renaming relations, e.g.
renames relation to
Operations on bags also lead to another useful operator
- Distinct (duplicate elimination) -
Executing operations
General scheme
For the relational algebra operators, for each implementation:
- We state the necessary assumptions
- Data is saved on external memory
- Indices
- Memory size
- Data ordering and/or distribution
- Then we estimate the cost of the operation in the I/O model
- Answers are usually up to
due to data organization
- Answers are usually up to
- We usually ignore the cost of writing the output
Notations
- size in pages (blocks) - number of records - number of distinct values in the column - relation - output - index
Distinct
- Cost of the following is the cost of the full scan
- If memory size
is large enough, hold the search structure in memory (e.g. hash table), for each tuple in relation add it to the structure and output only if it wasn’t seen before, write to disk when output page is full. - Otherwise, if data is sorted, read the relation and write only the first copy of each tuple.
- If memory size
- Otherwise, one can always sort the data, the cost is then the cost of sorting the data
For some operations, we can compute DISTINCT on-the-fly
Union, difference, intersection
- Bag union - cost of reading two relations
- Set union - cost of reading two relations + distinct
- Set difference
- If memory is large enough, keep
in a search structure, proceed with as in set union - Otherwise, if data is sorted, read both relations in parallel to determine which tuples should be ignored
- If memory is large enough, keep
- Set intersection - an exercise…
Selection
Without an index, cost of selection is clearly
- 0 if index is in memory
- 1 for a hash index
- 2-4 read for a B+ tree index
If the index is clustered, all relevant records are adjacent, we can read them in
Example
Execute
-
-
-
50% of records fulfill the selection condition
-
If there is no index on price column and the records are unordered, the cost is
-
If there is a sorted clustered index on price, size of the lowest level of
is 100, the cost is then index search + -
If the same index is not clustered, the cost is index search + entry reading +
- Using such an index is not helpful in the worst case
Reduction factor
The size of the output largely affects the cost of operations, the implementation choice can greatly benefit from knowing, or at least estimating this size
The DBMS keeps statistics about the values of columns/indices
Assuming values of some column are uniformly distributed, for
The reduction factor is determined by the operation and statistics available The same factor is used to compute the relevant fraction of relation pages, relation tuples and data entries in the index
Projection
- Bag projection - simply cost of the full scan
- Set projection - cost projection + distinct
- If using a search structure, keep only the remaining fields, cost of projection is then included in that of distinct
Using an index
- If index contains projected attributes and is dense - no need to read records at all, cost is the size of the index in pages,
Executing joins
Implementing cartesian product
A naive implementation is a nested loop, for each record in
Block nested loop join
Instead of reading one page at a time, we can read
Index nested loop join
Use an index to fetch only relevant pairs
Assume there is an index on
If index is not clustered, the cost is
Possible improvements are reading pages of
Sort-merge join
If
- Read a page from
and from - Output all the matches
- If the last
is smaller than last , read a page of , otherwise read a page of - Continue until reaching the end of
or Cost of a Sort-Merge join is then the cost of sorting + cost of merge We assume the cost of merge to be , then the total cost is where is the number of passes in the sort, which is usually 2 We can pass the outputs of the last sorting pass directly to the merging, saving one read + write of and , depending on available memory. This is called pipelining
Hash join
First, partition both
The only limitation is that
| Algorithm | Assumptions | Advantages |
|---|---|---|
| Nested loop join | No assumptions | Naive, efficient if one relation is very small or the result is close to Cartesian product |
| Block nested loop join | No assumptions | Better than nested loop join |
| Index nested loop join | Index on one relation | Usually better than nested loop if index is clustered or there are few matches per tuple |
| Sort-merge join | Each tuple has relatively few matches | Very efficient if data is sorted, result is sorted, usually more efficient than loops (linear) |
| Hash join | Data can be split into small enough partitions | Usually the most efficient (linear) |