Tag Archives: Using join buffer (Block Nested Loop)

Using join buffer (Block Nested Loop)

Using join buffer (Block Nested Loop)

Table join algorithm of MSYQL

Nested loop join (NLJ) algorithm

NLJ algorithm: take the result set of the drive table/external table as the basic data of the loop, and then get data from the result set one at a time as the filter condition of the next table to query the data, and then merge the results. If there are multiple tables to join, the result set of the previous table will be taken as loop data, and each row will be taken to the next table of the join for loop matching, and the result set will be returned to the client

The pseudo algorithm of nested loop is as follows:

foreachrowint1matchingrange{
foreachrowint2matchingreferencekey{
foreachrowint3{
ifrowsatisfiesjoinconditions,
sendtoclient
}
}
}

Because the NLJ algorithm passes rows one at a time from outer loops to inner loops, tables processed in the inner loops typically are read many times

Because the ordinary nested loop only passes one row into the inner loop at a time, the number of rows in the outer loop (the result set) depends on the number of times the memory loop is executed. When there is an index on the inner table connection, the scanning cost is O (RN). If there is no index, the scanning cost is O (RN * SN). If there are many records in the internal table s, the simplenested loops join will scan the internal table many times, and the execution efficiency is very poor

Block nested loop join (BNL) algorithm

BNL algorithm: save the row/result set of the outer loop into the join buffer, and compare each row of the inner loop with the records in the whole buffer, so as to reduce the number of inner loops

For example, the result set of the outer loop is 100 rows. Using the NLJ algorithm, you need to scan the internal table 100 times. If you use the BNL algorithm, you first put the 10 rows of records read from the outer loop table (external table) into the join buffer, and then directly match the 10 rows of data in the innerloop table (internal table), and the memory loop can compare with the 10 rows at a time. In this way, you only need to compare 10 times, Scanning of internal tables is reduced by 9/10. So BNL algorithm can significantly reduce the number of inner loop table scanning

For the query described above, if you use the join buffer, the actual join diagram is as follows:

foreachrowint1matchingrange{
foreachrowint2matchingreferencekey{
storeusedcolumnsfromt1,t2injoinbuffer
ifbufferisfull{
foreachrowint3{
foreacht1,t2combinationinjoinbuffer{
ifrowsatisfiesjoinconditions,
sendtoclient
}
}
emptybuffer
}
}
}

ifbufferisnotempty{
foreachrowint3{
foreacht1,t2combinationinjoinbuffer{
ifrowsatisfiesjoinconditions,
sendtoclient
}
}
}

If the sum of the column lengths of T1 and T2 participating in the join is s, and C is the combination number of the two, then the number of times T3 table is scanned is

(S * C)/join_ buffer_ size + 1

The times of scanning T3 increased with the increase of join_ buffer_ If you increase the size of the join buffer, the query speed will not be faster until the join buffer can accommodate all the combinations of T1 and T2

There are the following points in using join buffer in MySQL:

1. join_ buffer_ The size variable determines the buffer size

2. The join buffer can only be used when the join type is all, index, range

3. Every join that can be buffered will be allocated a buffer, that is to say, a query may eventually use multiple join buffers

4. The first nonconst table does not allocate a join buffer, even if its scan type is all or index

5. The join buffer will be allocated before the join and released after the query is executed

6. Only the columns participating in the join will be saved in the join buffer, not the whole data row

How to use

After 5.6, the optimizer manages the parameter optimizer_ Block in switch_ nested_ The loop parameter controls whether BNL is used in the optimizer. It is on by default. If it is set to off, the optimizer will select NLJ algorithm when selecting join mode

==========END==========