|
R = (X, Y)
S = (Y, Z)
S has an index on Y
|
while ( R has data blocks )
{
read a block of R in b;
for ( each tuple tr ∈ b ) do
{
Use tr(Y) to lookup in index S.Y;
// You get a list of record addresses
for ( each record address s ) do
{
read tuple at s;
output tr, ts; // Join result
}
}
}
|
|
B(S)
portion of S read per tuple of R = -------- blocks
V(S,Y)
|
because:
|
Index Join algorith using a clustering index:
# disk IOs = Scan R once
+ # tuples in R × # blocks of S read per tuple of R
B(S)
= B(R) + T(R) × --------
V(S,Y)
|
|
|
T(S)
portion of S read per tuple of R = -------- blocks
V(S,Y)
|
because:
|
Index Join algorith using a clustering index:
# disk IOs = Scan R once
+ # tuples in R × # blocks of S read per tuple of R
T(S)
= B(R) + T(R) × -------- (assuming different tuples
V(S,Y) are located in diff. blocks)
|
|
relation R(X,Y) relation S(Y,Z)
--------------------------------------------
B(R) = 1000 blks B(S) = 500 blocks
T(R) = 10000 tuples T(S) = 5000 tuples
V(S,Y) = 100
|
|
# disk IO used by the index Join algorithm:
B(S)
# disk IO ~= T(R) × -------
V(S,Y)
500
= 10000 × ------
100
= 50,000 blocks
|
# disk IO used by the one-pass join algorithm
# disk IO = B(R) + B(S) (See: click here) = 1000 + 500 = 1500 blocks |
|
relation R(X,Y) relation S(Y,Z)
--------------------------------------------
B(R) = 1 blks B(S) = 500 blocks
T(R) = 10 tuples T(S) = 5000 tuples
V(S,Y) = 100
|
|
# disk IO used by the index Join algorithm:
B(S)
# disk IO ~= T(R) × -------
V(S,Y)
500
= 10 × ------
100
= 50 blocks
|
# disk IO used by the one-pass join algorithm based on TPMMS
# disk IO = B(R) + B(S) (See: click here) = 1 + 500 = 501 blocks |
|
|