What is going on here, and how can the “join optimizer” result in such relatively poor performance?
STRAIGHT_JOIN forces the join order of the tables, so table1 is scanned in the outer loop and table2 in the inner loop.
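A minimal sketch of the difference in syntax. The column names (id, table1_id) are hypothetical, chosen only for illustration; only the table names come from the question.

```sql
-- Plain inner join: the optimizer is free to choose which table
-- to read first (the outer loop) based on its cost estimates.
SELECT t1.id, t2.id
FROM table1 AS t1
JOIN table2 AS t2 ON t2.table1_id = t1.id;  -- hypothetical columns

-- STRAIGHT_JOIN: same result set, but the left table (table1) is
-- always read before the right table (table2).
SELECT t1.id, t2.id
FROM table1 AS t1
STRAIGHT_JOIN table2 AS t2 ON t2.table1_id = t1.id;
```

Both queries return the same rows; only the order in which the tables are read may differ.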
The optimizer is not perfect (though still quite decent), and the most probable cause is outdated statistics.
Should I always use STRAIGHT_JOIN?
No, only when the optimizer is wrong. This may happen if your data distribution is severely skewed or cannot be calculated properly (say, for spatial or fulltext indexes).
How can I tell when to use it or not?
You should collect the statistics, build the plans for both versions of the query, and understand what these plans mean.
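One way to do this in MySQL is sketched below. It reuses the table names from the question with hypothetical columns (id, table1_id), so treat it as an outline rather than the exact queries involved.

```sql
-- Refresh the index statistics the optimizer bases its estimates on.
ANALYZE TABLE table1, table2;

-- Plan chosen by the optimizer.
EXPLAIN
SELECT t1.id, t2.id
FROM table1 AS t1
JOIN table2 AS t2 ON t2.table1_id = t1.id;  -- hypothetical columns

-- Plan with the join order forced.
EXPLAIN
SELECT t1.id, t2.id
FROM table1 AS t1
STRAIGHT_JOIN table2 AS t2 ON t2.table1_id = t1.id;
```

In the two EXPLAIN outputs, compare the order in which the tables are listed along with the type, key and rows columns. If the optimizer's plan changes after ANALYZE TABLE, stale statistics were the likely culprit and STRAIGHT_JOIN may not be needed at all.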
If you see that:
- The automatically generated plan is not optimal and cannot be improved by the standard ways,
- The STRAIGHT_JOIN version is better, you understand that it always will be, and you understand why,

then use STRAIGHT_JOIN.