Delete from B+ tree

  • Find correct leaf L
  • Remove data entry from L
  • If L has at least d entries remaining, we’re done (assuming the convention that each leaf holds between d and 2d entries)
  • Else, L has exactly d − 1 entries
    • First, check if we can borrow from adjacent siblings
    • If we can’t, merge two siblings, resulting in a node with at most 2d − 1 entries
    • Update parent node accordingly
    • This can happen recursively; if so, we continue borrowing/merging up the tree
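The steps above can be sketched in code. This is a minimal illustration, not a full B+ tree: a single root level over leaf lists, with `D` standing in for the minimum occupancy d; all names are illustrative.

```python
# Hedged sketch: deletion in a one-level B+ "tree" (a root over leaf lists).
# Assumed convention: each leaf holds between D and 2*D entries.

D = 2  # minimum entries per leaf (the "d" in the text)

def delete(leaves, keys, key):
    """leaves: list of sorted lists; keys: separator keys in the parent."""
    # Find correct leaf L (linear search stands in for a tree descent).
    i = next(j for j, k in enumerate(keys + [float("inf")]) if key < k)
    leaf = leaves[i]
    leaf.remove(key)                      # remove data entry from L
    if len(leaf) >= D:                    # still at least d entries: done
        return
    # Try to borrow from an adjacent sibling that has more than d entries.
    if i > 0 and len(leaves[i - 1]) > D:
        leaf.insert(0, leaves[i - 1].pop())
        keys[i - 1] = leaf[0]             # update parent separator
    elif i + 1 < len(leaves) and len(leaves[i + 1]) > D:
        leaf.append(leaves[i + 1].pop(0))
        keys[i] = leaves[i + 1][0]
    else:
        # Merge with a sibling: resulting node has at most 2d - 1 entries.
        j = i - 1 if i > 0 else i + 1
        lo, hi = min(i, j), max(i, j)
        leaves[lo].extend(leaves.pop(hi))
        keys.pop(lo)                      # update parent accordingly
```

For example, deleting 5 from leaves `[[1,2],[5,6],[9,10]]` (keys `[5,9]`) underflows the middle leaf; both siblings hold exactly D entries, so the leaf merges left, giving `[[1,2,6],[9,10]]` with keys `[9]`.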

Cost of using B+ tree

In the I/O model,

  • If index is in memory, search costs 0
  • Otherwise,
    • One node is typically one page, reading/writing costs 1, searching inside the node is 0 once in memory
    • If one node is not one page, we count the number of pages read/written per node
    • Nodes that cannot be fully loaded into memory are impractical

Updates

  • Best/Average case - updating only one node, so the cost is 1
  • Worst case - (read both siblings + merge with sibling) × (depth − 1)
    • There exist ‘bad’ sequences of updates where the worst-case delete and insert happen frequently, but these are rare in practice

Execution engine

flowchart TD

user(App/User)
compiler[Query compiler/optimizer]
engine[Execution engine]
index[Index/Record manager]
buffer[Buffer manager]
transaction[Transaction Manager]
storage[(Storage)]

user -- Transaction commands --> transaction
user -- Query / DML / DDL --> compiler
compiler -- Query execution plan --> engine
engine -- Record, Index requests --> index
index -- Page commands --> buffer
buffer -- Read/write pages --> storage

transaction --> engine
transaction --> buffer

Input of the execution engine is a query plan:

  • Logical tree
  • Implementation choice of each node
  • Scheduling, memory

Each operation is executed, and its output is passed on to the next operator in the tree
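The "output is passed on" idea can be sketched with the classic iterator model: each plan node pulls tuples from its child. A minimal sketch using Python generators; the operators and data are illustrative.

```python
# Hedged sketch: each plan node is a generator pulling from its child.

def scan(table):                  # leaf node: produce the stored tuples
    yield from table

def select(child, pred):          # filter node
    return (t for t in child if pred(t))

def project(child, cols):         # projection node
    return (tuple(t[c] for c in cols) for t in child)

# A small plan: project column 0 of rows whose column 1 exceeds 10.
table = [("a", 5), ("b", 20), ("c", 15)]
plan = project(select(scan(table), lambda t: t[1] > 10), [0])
print(list(plan))                 # -> [('b',), ('c',)]
```

Because each node is lazy, tuples flow through the whole plan one at a time, which is exactly what makes pipelined execution possible.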

Relational algebra

Relational algebra is a formal way of creating relations from other relations. There are two types of operators - unary and binary. We can combine operators by using a formula (defined recursively), which is in turn equivalent to a logical tree.

  • There are 5 basic operators:
    • Union: ∪
    • Difference: −
    • Selection: σ
    • Projection: π
    • Cartesian product: ×
  • And derived or auxiliary operators:
    • Intersection: ∩
    • Joins (natural, equi-join, theta join, …): ⋈
    • Renaming: ρ

In addition to that, there are two ways of treating relations formally - bag and set semantics. Bag semantics treat relations as multisets, so they preserve duplicates. Set semantics treat relations as sets, so they automatically eliminate any duplicates.

Union, difference and intersection

Selection and Projection

  • Selection σ_c(R) - return only rows of R that fulfill the condition c
    • c is the condition for selection, which is composed of atomic formulas with comparisons (=, ≠, <, ≤, >, ≥) and logical expressions over such formulas using ∧, ∨, ¬
    • e.g. σ_{price > 100}(R)
    • Bag and set semantics are the same
  • Projection π_A(R) - return only columns A of R
    • A is a subset of columns of R
    • e.g. π_{price}(R)
    • Set semantics may lead to fewer tuples in the result

Cartesian product and joins

  • Cartesian product R × S - return all pairs of tuples in relations R and S
  • Theta join R ⋈_θ S = σ_θ(R × S)
  • Equi-join - a theta join where θ is an atomic formula in the form of R.A = S.B, e.g. R ⋈_{R.id = S.id} S
    • Natural join R ⋈ S - a theta join where θ equates all attributes shared by R and S (each kept once in the output), e.g. R(A, B) ⋈ S(B, C) matches on B

Renaming

  • Renaming attributes, e.g. ρ_{A→B}(R) renames all attributes A to B
  • Renaming relations, e.g. ρ_S(R) renames relation R to S

Operations on bags also lead to another useful operator

  • Distinct (duplicate elimination) - δ(R)
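The basic operators can be sketched over lists of tuples. This is an illustrative in-memory model, not an execution strategy: lists give bag semantics, and `distinct` recovers set semantics.

```python
# Hedged sketch: relational-algebra operators over lists of tuples (bags).

def select(r, pred):          # sigma: keep rows satisfying the condition
    return [t for t in r if pred(t)]

def project(r, cols):         # pi: keep the listed columns (duplicates kept)
    return [tuple(t[c] for c in cols) for t in r]

def union(r, s):              # bag union: concatenation
    return r + s

def difference(r, s):         # bag difference: remove one copy per match
    out, pool = [], list(s)
    for t in r:
        if t in pool:
            pool.remove(t)
        else:
            out.append(t)
    return out

def product(r, s):            # cartesian product: all pairs
    return [t + u for t in r for u in s]

def distinct(r):              # delta: duplicate elimination (set semantics)
    seen, out = set(), []
    for t in r:
        if t not in seen:
            seen.add(t)
            out.append(t)
    return out

R = [(1, "a"), (2, "b"), (2, "b")]
print(project(R, [0]))            # bag projection: [(1,), (2,), (2,)]
print(distinct(project(R, [0])))  # set projection: [(1,), (2,)]
```

The last two lines show the bag/set distinction for projection: only `distinct` removes the duplicate `(2,)`.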

Executing operations

General scheme

For the relational algebra operators, for each implementation:

  • We state the necessary assumptions
    • Data is saved on external memory
    • Indices
    • Memory size
    • Data ordering and/or distribution
  • Then we estimate the cost of the operation in the I/O model
    • Answers are usually accurate only up to a small constant due to data organization
  • We usually ignore the cost of writing the output

Notations

  • B(R) - size of R in pages (blocks)
  • T(R) - number of records in R
  • V(R, A) - number of distinct values in the column A
  • R, S - relations
  • O - output
  • I - index

Distinct

  • Cost of the following is B(R), the cost of the full scan
    • If memory size is large enough, hold a search structure in memory (e.g. hash table); for each tuple in the relation add it to the structure and output it only if it wasn’t seen before, writing to disk when the output page is full.
    • Otherwise, if data is sorted, read the relation and write only the first copy of each tuple.
  • Otherwise, one can always sort the data first; the cost is then the cost of sorting the data
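The sorted case above is the simplest to sketch: with sorted input, duplicate elimination needs only the previous tuple, not a whole hash table. A minimal sketch; names are illustrative.

```python
# Hedged sketch: distinct over an already-sorted stream of tuples.
# One scan, constant memory: only the previously output tuple is kept.

def distinct_sorted(tuples):
    prev = object()            # sentinel unequal to any real tuple
    for t in tuples:           # single pass: cost is one full scan
        if t != prev:          # equal runs are adjacent in sorted order
            yield t
        prev = t

print(list(distinct_sorted([1, 1, 2, 2, 2, 3])))  # -> [1, 2, 3]
```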

For some operations, we can compute DISTINCT on-the-fly

Union, difference, intersection

  • Bag union - cost of reading two relations, B(R) + B(S)
  • Set union - cost of reading two relations + distinct
  • Set difference R − S
    • If memory is large enough, keep S in a search structure, then proceed as in set union
    • Otherwise, if data is sorted, read both relations in parallel to determine which tuples should be ignored
  • Set intersection - an exercise…

Selection

Without an index, the cost of selection is clearly B(R). The cost of searching the index is

  • 0 if index is in memory
  • 1 for a hash index
  • 2-4 read for a B+ tree index

If the index is clustered, all relevant records are adjacent, so we can read them in B(O). If the index is not clustered, it usually is a B+ tree; data entries are sorted, so we can read the relevant records in up to T(O) page reads if each output tuple is in a different page. An improvement is to sort the matching record ids by page and never read the same page twice, which gives us min(T(O), B(R)).

Example

Execute σ_{price > c}(R)

  • 50% of records fulfill the selection condition

  • If there is no index on the price column and the records are unordered, the cost is B(R)

  • If there is a sorted clustered index I on price, and the size of the lowest level of I is 100, the cost is then index search + B(R)/2

  • If the same index is not clustered, the cost is index search + entry reading (50) + up to T(R)/2 record reads

    • Using such an index is not helpful in the worst case

Reduction factor

The size of the output largely affects the cost of operations; the implementation choice can greatly benefit from knowing, or at least estimating, this size. The DBMS keeps statistics about the values of columns/indices. Assuming the values of some column A are uniformly distributed, for σ_{A=c}(R) the reduction factor is RF = 1/V(R, A), and similarly for other equality conditions. What can we do about conditions of the form A = B? We will see this in equi-joins. For σ_{A>c}(R), we assume min(A), max(A) are known, then RF = (max(A) − c) / (max(A) − min(A))

The reduction factor is determined by the operation and the statistics available. The same factor is used to compute the relevant fraction of relation pages, relation tuples and data entries in the index.
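The two estimates above are simple enough to compute directly. A minimal sketch under the stated uniformity assumption; the numbers in the usage lines are illustrative.

```python
# Hedged sketch: reduction-factor estimates assuming uniformly
# distributed column values. V = V(R, A); lo/hi = known min/max of A.

def rf_equality(V):
    """sigma_{A = c}: RF = 1 / V(R, A)."""
    return 1.0 / V

def rf_range_gt(c, lo, hi):
    """sigma_{A > c}: RF = (max(A) - c) / (max(A) - min(A))."""
    return (hi - c) / (hi - lo)

# Estimated matching pages of a 1000-page relation with V(R, A) = 50:
print(1000 * rf_equality(50))           # -> 20.0
# ... and with a range predicate A > 75 on values spanning [0, 100]:
print(1000 * rf_range_gt(75, 0, 100))   # -> 250.0
```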

Projection

  • Bag projection - simply the cost of the full scan, B(R)
  • Set projection - cost of projection + distinct
    • If using a search structure, keep only the remaining fields; the cost of projection is then included in that of distinct

Using an index

  • If the index contains all projected attributes and is dense - no need to read records at all, cost is the size of the index in pages, B(I) < B(R)

Executing joins

Implementing cartesian product

A naive implementation is a nested loop: for each record in R, read all records in S and write all pairs; the cost is then B(R) + T(R) · B(S). It is clear then that the outside loop should run over the smaller relation; here we assumed R is smaller. A nested loop join is then identical, except that only pairs that match the condition are written to the output. A key observation here is that we don’t use most of the available main memory, which leads to the following idea.

Block nested loop join

Instead of reading one page at a time, we can read M − 2 pages of R at once (where M is the number of available memory pages) and then compare each page of S to these pages. The cost then becomes B(R) + ⌈B(R)/(M − 2)⌉ · B(S). A key observation here is that the size of a join is usually much smaller than that of a cartesian product; maybe we can find relevant pairs faster? How?
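A minimal sketch of the block nested loop idea: here "pages" are just tuples in lists, and `block_size` stands in for the M − 2 memory pages; the data is illustrative.

```python
# Hedged sketch: block nested loop join. For each block of R (at most
# block_size items, standing in for M - 2 pages), make one pass over S.

def block_nested_loop_join(R, S, match, block_size):
    out = []
    for i in range(0, len(R), block_size):   # read a block of R
        block = R[i:i + block_size]
        for s in S:                          # one scan of S per block
            for r in block:                  # compare s against the block
                if match(r, s):
                    out.append(r + s)
    return out

R = [(1, "a"), (2, "b"), (3, "c")]
S = [(2, "x"), (3, "y")]
print(block_nested_loop_join(R, S, lambda r, s: r[0] == s[0], 2))
# -> [(2, 'b', 2, 'x'), (3, 'c', 3, 'y')]
```

With `block_size = 1` this degenerates to the plain nested loop join, which is exactly why larger blocks reduce the number of passes over S.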

Index nested loop join

Use an index to fetch only relevant pairs. Assume there is an index I on a (a column in S) and we want to compute R ⋈_{R.b = S.a} S. For each tuple of R, we find the matching tuples of S using the index, read them and write the pairs. The cost is then reading R + one index search per tuple of R + reading the matches. If the index is clustered, the cost is

B(R) + T(R) · (search + B(S)/V(S, a))

If the index is not clustered, the cost is

B(R) + T(R) · (search + T(S)/V(S, a))

Possible improvements are reading pages of S per page of R instead of “per tuple”, and a block index nested loop join
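A minimal sketch of the probing pattern: an in-memory dict on S's join column stands in for the index I, so the index search cost is hidden, but the per-tuple-of-R probe structure is the same.

```python
# Hedged sketch: index nested loop equi-join, with a dict as the "index".

from collections import defaultdict

def index_nested_loop_join(R, S, r_col, s_col):
    index = defaultdict(list)        # stand-in for the index I on S[s_col]
    for s in S:
        index[s[s_col]].append(s)
    out = []
    for r in R:                      # one index probe per tuple of R
        for s in index.get(r[r_col], []):
            out.append(r + s)        # write matching pairs
    return out

R = [(1, "a"), (2, "b")]
S = [(2, "x"), (2, "y")]
print(index_nested_loop_join(R, S, 0, 0))
# -> [(2, 'b', 2, 'x'), (2, 'b', 2, 'y')]
```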

Sort-merge join

If R and S are sorted by the relevant columns, finding matches for an equi-join is easy, similar to the merge in external merge sort. Execution of R ⋈_{R.a = S.b} S:

  • Read a page from R and from S
  • Output all the matches
  • If the last R tuple is smaller than the last S tuple, read a page of R, otherwise read a page of S
  • Continue until reaching the end of R or S

Cost of a sort-merge join is then the cost of sorting + the cost of the merge. We assume the cost of the merge to be B(R) + B(S); the total cost is then (2k + 1)(B(R) + B(S)), where k is the number of passes in the sort, which is usually 2. We can pass the outputs of the last sorting pass directly to the merging, saving one read + write of R and S, depending on available memory. This is called pipelining.
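The merge step can be sketched tuple-by-tuple (pages are elided). A minimal sketch assuming both inputs are sorted on the join column, here index 0, and that each R tuple has relatively few matches.

```python
# Hedged sketch: the merge step of a sort-merge equi-join on column 0.
# Both R and S are assumed already sorted on that column.

def merge_join(R, S):
    out, i, j = [], 0, 0
    while i < len(R) and j < len(S):
        if R[i][0] < S[j][0]:
            i += 1                   # advance the side with the smaller key
        elif R[i][0] > S[j][0]:
            j += 1
        else:
            k = j                    # output all S matches for R[i]
            while k < len(S) and S[k][0] == R[i][0]:
                out.append(R[i] + S[k])
                k += 1
            i += 1                   # next R tuple may match the same group
    return out

R = sorted([(2, "b"), (1, "a")])
S = sorted([(2, "y"), (2, "x")])
print(merge_join(R, S))  # -> [(2, 'b', 2, 'x'), (2, 'b', 2, 'y')]
```

Note that the output comes out sorted on the join column, which is one of the advantages listed for sort-merge joins below.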

Hash join

First, partition both R and S using a hash function h. Clearly, tuples in R_i can only match tuples in S_i. Here, in order to be able to do this in one pass, meaning that each page is read only once, there can be no more than M − 1 partitions. Then, for each partition i, for each page in S_i write the matching pairs. This step assumes B(R_i) ≤ M − 2, so that R_i fits in memory. If this is not fulfilled, partition larger partitions again! The cost of a hash join (assuming one partition step) is just 3(B(R) + B(S))

The only limitation is that B(R) ≤ (M − 1)(M − 2) ≈ M², which is usually the case, as just 1MB of memory (256 4KB pages) is enough to hash about 256MB of data in one partitioning pass, and far more with recursive partitioning

| Algorithm | Assumptions | Advantages |
| --- | --- | --- |
| Nested loop join | No assumptions | Naive; efficient if one relation is very small or the result is close to the Cartesian product |
| Block nested loop join | No assumptions | Better than nested loop join |
| Index nested loop join | Index on one relation | Usually better than nested loop if the index is clustered or there are few matches per tuple |
| Sort-merge join | Each tuple has relatively few matches | Very efficient if data is sorted; result is sorted; usually more efficient than loops (linear) |
| Hash join | Data can be split into small enough partitions | Usually the most efficient (linear) |