Using multiple tables_ Hash hint, are you right?

oracle online documentri,

http://docs.oracle.com/cd/E16338_01/server.112/b56299/img_text/use_hash_hint.htm >>

/*+ USE_HASH ( [ @ queryblock ] tablespec [ tablespec ]... ) */

And most developers do write like this: use_ Hash (a, b), this is really no problem

When there are more than two associated tables, write as use_ Is there a problem with hash (a B C D)

Let’s first take a look at a test case. This case is adapted from a real case of a customer. It simulates that under complex table Association conditions, the optimizer overestimates the result set after table Association, and may use the wrong execution plan. We hope that by adding use_ Hash hint to optimize SQL

–Create 4 tables

create table tv as select rownum as id,a.* from dba_ objects a;

create table tt as select * from tv;

create table tw as select * from tv;

create table tu as select * from tv;

–Collect statistics

exec dbms_ stats.gather_ table_ stats(user,’tw’);

exec dbms_ stats.gather_ table_ stats(user,’tt’);

exec dbms_ stats.gather_ table_ stats(user,’tu’);

exec dbms_ stats.gather_ table_ stats(user,’tv’);

SQL:

select /*+ use_ hash(v t u w) */

count(*) from tv v,tu u,tw w,tt t

where

t.id=v.id and t.object_ name=upper(v.object_ name) and

w.id=u.id and

v.created between t.created and t.last_ ddl_ time and

v.created between u.created and u.last_ ddl_ time and

t.object_ id=w.object_ id and w.created=v.created;

The real case is: the normal execution time of SQL is 3.4 minutes. One day, after deleting some records from TV table, the execution has not been completed for nearly 20 minutes. The most important change is that one of the hash join in the execution plan has become nested loops, although hint has specified that all tables should be used_ hash。

The simulated SQL shows the abnormal situation in the real case. One of the steps uses nested loops, which takes about 4 minutes to complete (you can cancel the test). The execution plan of all hashes is less than 1 second

Execution plan generated by current hint:

In this execution plan, there are nested loops, and hash is not used according to hint_ It shows that there is something wrong with the writing of hint

So, what is the correct way to write it

The crux of the problem is:

Use of multiple tables_ Hash must be used with leading hint

According to the association condition of SQL, we add the hint of leading to test again

select /*+ leading(v t w u) use_ hash(u v t w) */

count(*) from tv v,tu u,tw w,tt t

where

t.id=v.id and t.object_ name=upper(v.object_ name) and

w.id=u.id and

v.created between t.created and t.last_ ddl_ time and

v.created between u.created and u.last_ ddl_ time and

t.object_ id=w.object_ id and w.created=v.created;

This time, SQL only needs less than one second to get the result, and the execution plan is exactly all the hash join we need

In the standard execution plan outline data generated inside the optimizer, the above hint is finally transformed into this:

(have you noticed that the first table of leading is not marked with use_ hash(V)?

This is because there is a second table_ The existence of hash (T), the use of T table_ Hash (T), with whom?And the first table V, of course

Conclusion

We’re writing multiple tables_ hash(use_ NL is the same as hint_ In the parentheses of hash, you can put multiple tables (in order independent), but you must combine the leading hint to ensure that the optimizer does not use other join methods. The order of the tables in leading is very important. If you make a mistake, it will take you to see Cartesian join

Share continuous updates, please pay attention to: Tiger Liu talks about SQL optimization

Script sharing in QQ group: 16778072

Welcome to share with more friends

For the convenience of communication, interested friends can join the wechat group with the same name:

this article shares the performance optimization of WeChat official account tiger Liu Tan Oracle (sql_ tigerliu)。

Similar Posts: