Wednesday, January 27, 2010

When your projection is not cost-free

First of all I have to admit that the title of this post is a bit misleading - the projection is almost never "cost-free". Indirectly, cost is generated by accessing rows in the row sources to obtain the columns selected, and with the introduction of System Statistics Oracle even assigns a direct CPU cost to accessing a particular column of a row - as has been pointed out by Joze Senegacnik and is demonstrated e.g. in Christian Antognini's "Troubleshooting Oracle Performance" book on page 117.

However, Oracle obviously treats work that has to be performed as part of the projection differently from work that has to be performed as part of the selection.

In particular, user-defined functions and scalar subqueries that are part of the projection will not be accounted for when calculating the overall query cost.

This holds true even for user-defined functions that have a cost assigned via the Extensible Optimizer framework. For more information regarding the "Extensible Optimizer" framework, read Adrian Billington's article on oracle-developer.net or refer e.g. to our latest "Expert Oracle Practices" OakTable book, where Joze Senegacnik dedicated a whole chapter to this topic.

Of course the best way to deal with this situation is to avoid user-defined functions and scalar subqueries and to replace them with appropriate joins/subqueries, as sketched below. In most cases this yields the best performance, provided that the optimizer comes up with a reasonable execution plan, and it also solves the issue of work not being reflected in the cost, because regular join constructs will be considered in the cost calculation.
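
To illustrate the idea with a minimal sketch (using hypothetical ORDERS and CUSTOMERS tables that are not part of the test case below, and assuming CUSTOMER_ID is unique in CUSTOMERS), a per-row scalar subquery lookup like this:

select
o.order_id
, (
select
c.name
from
customers c
where
c.customer_id = o.customer_id
) as customer_name
from
orders o;

can usually be rewritten as an (outer) join, which the optimizer is then able to cost and optimize properly, while preserving the "no match returns NULL" semantics of the scalar subquery:

select
o.order_id
, c.name as customer_name
from
orders o
, customers c
where
c.customer_id (+) = o.customer_id;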

However, what do you do if you're in the situation that you can't simply change the query (e.g. a third-party vendor application)?

If a user-defined function or scalar subquery is used as part of the selection (the WHERE clause), the cost-based optimizer will make use of any cost associated with the function, or evaluated for the scalar subquery, whereas the same construct used as part of the projection (the SELECT list) will not be taken into account for the cost.

In particular in the case of user-defined functions that perform costly operations, for example recursive SQL, and that are not declared as deterministic (or that are genuinely non-deterministic), this can make a significant difference.

The problem with functions that are not declared as deterministic is that Oracle's built-in caching feature, which can help with scalar subqueries and (since Oracle 10.2) with functions declared as deterministic, cannot be used to alleviate the potentially numerous, resource-intensive calls to the function.
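
For reference, enabling that cache for a function only requires the DETERMINISTIC keyword in the declaration - a minimal sketch using a made-up function, and of course only legitimate if the result really depends on nothing but the input parameters:

create or replace function f_lookup(in_id in number)
return number
deterministic
as
begin
-- only valid as DETERMINISTIC because the result depends solely on the input
return in_id * 2;
end f_lookup;
/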

In the case of more complex queries that make use of views, the view merging transformation applied to the query can therefore lead to quite different amounts of work being performed at runtime.

Consider the following example setup:


drop function generate_lio;

drop table t1;

create table t1 (
run_id integer not null, /* identify the process inserting the data */
batch_id integer not null, /* represents clustered data, could also be a (arriving) date */
a_value number null, /* represents sequence based data */
a_random number null, /* represents randomly scattered data */
a_date timestamp default systimestamp not null, /* represents the insert timestamp */
filler char(1) default 'x' not null /* can be used to size the row as required */
);

create index t1_idx1 on
t1 (
a_random
);

insert into t1 (
run_id
, batch_id
, a_value
, a_random
)
select
1 as run_id
, trunc((id - 1) / 400) + 1 as batch_id
, id as a_value
, trunc(dbms_random.value(1, 40000.999999999)) as a_random
from
(
select
level as id
from
dual
connect by
level <= 40000
);

commit;

exec dbms_stats.gather_table_stats(null, 't1')

create or replace function generate_lio(in_lio in number default 1)
return number
--deterministic
as
n_val number;
begin
select /*+ first_rows(1) */
run_id
into
n_val
from
(
select
*
from
(
select
rownum as rn
, run_id
, substrb(rowid, 1, 15) as block
, a_random
from
t1
where
a_random is not null
order by
a_random
)
where rownum <= in_lio
)
where
rn = in_lio;
return trunc(dbms_random.value(0, 1000)) * n_val;
end generate_lio;
/

associate statistics with functions generate_lio default cost (100000, 3, 0) default selectivity 100;


The code creates a table holding 40,000 rows and a simple index with a bad clustering factor. A user-defined function that generates logical I/O via that index is created, and a default cost (a CPU cost of 100,000, an I/O cost of 3 and a network cost of 0) as well as a default selectivity are associated with it using ASSOCIATE STATISTICS. The function deliberately uses the DBMS_RANDOM package to simulate non-deterministic behaviour. Each BATCH_ID in the table covers 400 rows, with 100 batch_ids in total.
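
If you want to verify the generated data distribution yourself, a quick sanity check could look like this:

select
sum(cnt) as total_rows
, count(*) as batches
, min(cnt) as min_batch_size
, max(cnt) as max_batch_size
from
(
select
batch_id
, count(*) as cnt
from
t1
group by
batch_id
);

This should report 40,000 rows, 100 batches and a batch size of 400 throughout.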

Now consider the following query:


/* Hint is required from 11g on to prevent GROUP BY placement */
select /*+ no_place_group_by */
sum(val), t1.filler
from
t1
, (
/* See the difference in the runtime statistics
between merging this view and not */
select /* no_merge */
generate_lio(1) as val
, batch_id
from
t1
) t2
where
t1.batch_id = t2.batch_id
and t1.batch_id = 42
group by
t1.filler;


This query will generate 160,000 rows by joining the 400 rows from each row source (400 * 400). Here is the execution plan (all tested on 11.1.0.7) when merging the view T2, which happens by default:


----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 8 | 146 (28)| 00:00:02 |
| 1 | HASH GROUP BY | | 1 | 8 | 146 (28)| 00:00:02 |
|* 2 | HASH JOIN | | 160K| 1250K| 115 (8)| 00:00:02 |
|* 3 | TABLE ACCESS FULL| T1 | 400 | 2000 | 55 (4)| 00:00:01 |
|* 4 | TABLE ACCESS FULL| T1 | 400 | 1200 | 55 (4)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("T1"."BATCH_ID"="BATCH_ID")
3 - filter("T1"."BATCH_ID"=42)
4 - filter("BATCH_ID"=42)

Column Projection Information (identified by operation id):
-----------------------------------------------------------

1 - (#keys=1) "T1"."FILLER"[CHARACTER,1], SUM("GENERATE_LIO"(1))[22]
2 - (#keys=1) "T1"."FILLER"[CHARACTER,1]
3 - "T1"."BATCH_ID"[NUMBER,22], "T1"."FILLER"[CHARACTER,1]
4 - "BATCH_ID"[NUMBER,22]


Notice in particular where the evaluation of the function takes place according to the "Projection" information ("+PROJECTION" option of DBMS_XPLAN.DISPLAY).
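
For reference, the plan output including the projection information was obtained along these lines:

explain plan for
select /*+ no_place_group_by */ sum(val), t1.filler
from t1
, (select /* no_merge */ generate_lio(1) as val, batch_id from t1) t2
where t1.batch_id = t2.batch_id
and t1.batch_id = 42
group by t1.filler;

select * from table(dbms_xplan.display(null, null, 'TYPICAL +PROJECTION'));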

And here is the execution plan when explicitly requesting to not merge the view T2, which can be done by turning the no_merge comment embedded in the inline view into a real hint, i.e. /*+ no_merge */:


-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 21 | 146 (28)| 00:00:02 |
| 1 | HASH GROUP BY | | 1 | 21 | 146 (28)| 00:00:02 |
|* 2 | HASH JOIN | | 160K| 3281K| 115 (8)| 00:00:02 |
|* 3 | TABLE ACCESS FULL | T1 | 400 | 2000 | 55 (4)| 00:00:01 |
| 4 | VIEW | | 400 | 6400 | 55 (4)| 00:00:01 |
|* 5 | TABLE ACCESS FULL| T1 | 400 | 2400 | 55 (4)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("T1"."BATCH_ID"="T2"."BATCH_ID")
3 - filter("T1"."BATCH_ID"=42)
5 - filter("BATCH_ID"=42)

Column Projection Information (identified by operation id):
-----------------------------------------------------------

1 - (#keys=1) "T1"."FILLER"[CHARACTER,1], SUM("VAL")[22]
2 - (#keys=1) "T1"."FILLER"[CHARACTER,1], "VAL"[NUMBER,22]
3 - "T1"."BATCH_ID"[NUMBER,22], "T1"."FILLER"[CHARACTER,1]
4 - "VAL"[NUMBER,22], "T2"."BATCH_ID"[NUMBER,22]
5 - "BATCH_ID"[NUMBER,22]


It is not that obvious from the "Projection" information, but in this case the function (the "VAL" of the "Projection" in operation id 4) is evaluated before the join takes place.

Notice that the (still undocumented) "NO_PLACE_GROUP_BY" hint is required from 11g on to prevent the optimizer from getting too clever with this kind of statement. The GROUP BY is only used here to simplify the result set processing by aggregating it into a single row, but in 11g the new GROUP BY placement feature by default pushes the GROUP BY into the view, effectively solving the issue of excessive function calls simply by reducing the size of the row sources that subsequently get joined. Since this is not supposed to be the point of this demonstration, the clever trick of pushing the GROUP BY into the view is prevented. However, it is interesting to note how new optimizer features can help to solve problems as a side-effect.

The problem described here can still be seen in 11g without any hints when not using a GROUP BY clause to aggregate the result set.

If you run this with statistics_level set to ALL and check the runtime statistics (DBMS_XPLAN.DISPLAY_CURSOR), you'll notice a significant difference between merging the view T2 or not. By default Oracle will merge the view, and obviously perform the "projection" as part of the HASH GROUP BY operation after joining the data, leading to 160,000 calls to the function, each generating three logical I/Os (when using 1 as parameter to the function).
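
Just as a side note, a minimal recipe for obtaining such runtime statistics could look like this (adding the GATHER_PLAN_STATISTICS hint to the query is an alternative to changing STATISTICS_LEVEL at session level):

alter session set statistics_level = all;

-- execute the query to be analyzed, then in the same session:

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

Here are the runtime statistics of the merged (default) variant: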


-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:14.58 | 480K| | | |
| 1 | HASH GROUP BY | | 1 | 1 | 1 |00:00:14.58 | 480K| 805K| 805K| 602K (0)|
|* 2 | HASH JOIN | | 1 | 160K| 160K|00:00:00.01 | 390 | 988K| 988K| 317K (0)|
|* 3 | TABLE ACCESS FULL| T1 | 1 | 400 | 400 |00:00:00.01 | 195 | | | |
|* 4 | TABLE ACCESS FULL| T1 | 1 | 400 | 400 |00:00:00.01 | 195 | | | |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("T1"."BATCH_ID"="BATCH_ID")
3 - filter("T1"."BATCH_ID"=42)
4 - filter("BATCH_ID"=42)


Compare the difference in runtime and the number of logical I/Os performed ("Buffers") to this one, when not merging the view:


------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.34 | 1590 | | | |
| 1 | HASH GROUP BY | | 1 | 1 | 1 |00:00:00.34 | 1590 | 805K| 805K| 369K (0)|
|* 2 | HASH JOIN | | 1 | 160K| 160K|00:00:00.02 | 1590 | 988K| 988K| 354K (0)|
|* 3 | TABLE ACCESS FULL | T1 | 1 | 400 | 400 |00:00:00.02 | 195 | | | |
| 4 | VIEW | | 1 | 400 | 400 |00:00:00.04 | 1395 | | | |
|* 5 | TABLE ACCESS FULL| T1 | 1 | 400 | 400 |00:00:00.01 | 195 | | | |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("T1"."BATCH_ID"="T2"."BATCH_ID")
3 - filter("T1"."BATCH_ID"=42)
5 - filter("BATCH_ID"=42)


So obviously it is a good idea in this particular case not to merge the view, but although an I/O cost has been explicitly assigned to the function, you can see that both execution plans have exactly the same cost and Oracle happily merges the view by default.

The costing looks different when the function is used as part of the selection instead:


select /*+ no_place_group_by */
sum(val), t1.filler
from
t1
, (
select
1 as val
, batch_id
, a_date
, run_id
from
t1
) t2
where
t1.batch_id = t2.batch_id
and t1.batch_id = 42
and generate_lio(t2.run_id) >= 0
group by
t1.filler;


Note that I had to change the function parameter to something that refers to a column expression, otherwise the optimizer treats "generate_lio(1)" as "independent" and adds a FILTER operation that evaluates "generate_lio" exactly once.

It is now obvious from the execution plan that the cost of executing the function is considered:


----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 11 | 1356 (4)| 00:00:17 |
| 1 | HASH GROUP BY | | 1 | 11 | 1356 (4)| 00:00:17 |
|* 2 | HASH JOIN | | 160K| 1718K| 1325 (2)| 00:00:16 |
|* 3 | TABLE ACCESS FULL| T1 | 400 | 2000 | 55 (4)| 00:00:01 |
|* 4 | TABLE ACCESS FULL| T1 | 400 | 2400 | 1265 (1)| 00:00:16 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("T1"."BATCH_ID"="BATCH_ID")
3 - filter("T1"."BATCH_ID"=42)
4 - filter("BATCH_ID"=42 AND "GENERATE_LIO"("RUN_ID")>=0)


Notice the increased cost of operation id 4 - calling the function 400 times with an associated I/O cost of 3 accounts for 400 * 3 = 1,200 on top of the base full table scan cost of 55, and adding the associated CPU cost gives roughly the 1,265 shown.

A similar behaviour can be seen when using scalar subqueries as part of the selection or projection.

Consider this query:


select /*+ no_place_group_by */
sum(val), t1.filler
from
t1
, (
select /* no_merge */
(
select
min(run_id)
from
t1 a
where
a.a_random = b.a_random
) as val
, batch_id
from
t1 b
) t2
where
t1.batch_id = t2.batch_id
and t1.batch_id = 42
group by
t1.filler;


Checking the execution plan you'll notice again the same overall cost of 146 as above - the scalar subquery used in the projection is not reflected in the cost either:


----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 146 (28)| 00:00:02 |
| 1 | SORT AGGREGATE | | 1 | 8 | | |
| 2 | TABLE ACCESS BY INDEX ROWID| T1 | 2 | 16 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | T1_IDX1 | 2 | | 1 (0)| 00:00:01 |
| 4 | HASH GROUP BY | | 1 | 13 | 146 (28)| 00:00:02 |
|* 5 | HASH JOIN | | 160K| 2031K| 115 (8)| 00:00:02 |
|* 6 | TABLE ACCESS FULL | T1 | 400 | 2000 | 55 (4)| 00:00:01 |
|* 7 | TABLE ACCESS FULL | T1 | 400 | 3200 | 55 (4)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("A"."A_RANDOM"=:B1)
5 - access("T1"."BATCH_ID"="BATCH_ID")
6 - filter("T1"."BATCH_ID"=42)
7 - filter("BATCH_ID"=42)

Column Projection Information (identified by operation id):
-----------------------------------------------------------

1 - (#keys=0) MIN("RUN_ID")[22]
2 - "RUN_ID"[NUMBER,22]
3 - "A".ROWID[ROWID,10]
4 - (#keys=1) "T1"."FILLER"[CHARACTER,1], SUM( (SELECT MIN("RUN_ID") FROM
"T1" "A" WHERE "A"."A_RANDOM"=:B1))[22]
5 - (#keys=1) "T1"."FILLER"[CHARACTER,1], "B"."A_RANDOM"[NUMBER,22]
6 - "T1"."BATCH_ID"[NUMBER,22], "T1"."FILLER"[CHARACTER,1]
7 - "BATCH_ID"[NUMBER,22], "B"."A_RANDOM"[NUMBER,22]


Notice again the subtle difference in the projection when preventing the view merge:


----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 21 | 146 (28)| 00:00:02 |
| 1 | SORT AGGREGATE | | 1 | 8 | | |
| 2 | TABLE ACCESS BY INDEX ROWID| T1 | 2 | 16 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | T1_IDX1 | 2 | | 1 (0)| 00:00:01 |
| 4 | HASH GROUP BY | | 1 | 21 | 146 (28)| 00:00:02 |
|* 5 | HASH JOIN | | 160K| 3281K| 115 (8)| 00:00:02 |
|* 6 | TABLE ACCESS FULL | T1 | 400 | 2000 | 55 (4)| 00:00:01 |
| 7 | VIEW | | 400 | 6400 | 55 (4)| 00:00:01 |
|* 8 | TABLE ACCESS FULL | T1 | 400 | 4400 | 55 (4)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("A"."A_RANDOM"=:B1)
5 - access("T1"."BATCH_ID"="T2"."BATCH_ID")
6 - filter("T1"."BATCH_ID"=42)
8 - filter("BATCH_ID"=42)

Column Projection Information (identified by operation id):
-----------------------------------------------------------

1 - (#keys=0) MIN("RUN_ID")[22]
2 - "RUN_ID"[NUMBER,22]
3 - "A".ROWID[ROWID,10]
4 - (#keys=1) "T1"."FILLER"[CHARACTER,1], SUM("VAL")[22]
5 - (#keys=1) "T1"."FILLER"[CHARACTER,1], "VAL"[NUMBER,22]
6 - "T1"."BATCH_ID"[NUMBER,22], "T1"."FILLER"[CHARACTER,1]
7 - "VAL"[NUMBER,22], "T2"."BATCH_ID"[NUMBER,22]
8 - "BATCH_ID"[NUMBER,22], "B"."A_RANDOM"[NUMBER,22]


It is also interesting to note that although the subquery is evaluated as part of the VIEW projection step in operation id 7, the scalar subquery is still shown at the top level of the plan, starting with operation id 1. It would be more accurate to show it as a child of operation id 7 in this particular case, but this is probably not supported by the plan output at present.

At runtime, however, the outcome is different from the function case, mainly due to the filter optimization / scalar subquery caching feature, which effectively makes the subquery behave as if it were deterministic - it will only get executed (roughly) as many times as there are distinct input values, which is the A_RANDOM column in this case.

In both cases the subquery gets executed only approx. 400 times, because there are only about 400 distinct A_RANDOM values among the 400 rows of BATCH_ID = 42 that feed the subquery.
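
This can easily be verified against the test data, e.g.:

select count(distinct a_random) as distinct_values
from t1
where batch_id = 42;

which should return a value close to 400, matching the number of "Starts" of the SORT AGGREGATE operation in the runtime plans below.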

View merged:


-----------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.32 | 1418 | | | |
| 1 | SORT AGGREGATE | | 398 | 1 | 398 |00:00:00.01 | 1028 | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| T1 | 398 | 2 | 781 |00:00:00.01 | 1028 | | | |
|* 3 | INDEX RANGE SCAN | T1_IDX1 | 398 | 2 | 781 |00:00:00.01 | 402 | | | |
| 4 | HASH GROUP BY | | 1 | 1 | 1 |00:00:00.32 | 1418 | 805K| 805K| 570K (0)|
|* 5 | HASH JOIN | | 1 | 160K| 160K|00:00:00.02 | 390 | 988K| 988K| 390K (0)|
|* 6 | TABLE ACCESS FULL | T1 | 1 | 400 | 400 |00:00:00.02 | 195 | | | |
|* 7 | TABLE ACCESS FULL | T1 | 1 | 400 | 400 |00:00:00.01 | 195 | | | |
-----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("A"."A_RANDOM"=:B1)
5 - access("T1"."BATCH_ID"="BATCH_ID")
6 - filter("T1"."BATCH_ID"=42)
7 - filter("BATCH_ID"=42)


View not merged:


-----------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.24 | 1418 | | | |
| 1 | SORT AGGREGATE | | 398 | 1 | 398 |00:00:00.02 | 1028 | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| T1 | 398 | 2 | 781 |00:00:00.02 | 1028 | | | |
|* 3 | INDEX RANGE SCAN | T1_IDX1 | 398 | 2 | 781 |00:00:00.02 | 402 | | | |
| 4 | HASH GROUP BY | | 1 | 1 | 1 |00:00:00.24 | 1418 | 805K| 805K| 362K (0)|
|* 5 | HASH JOIN | | 1 | 160K| 160K|00:00:00.02 | 1418 | 988K| 988K| 377K (0)|
|* 6 | TABLE ACCESS FULL | T1 | 1 | 400 | 400 |00:00:00.02 | 195 | | | |
| 7 | VIEW | | 1 | 400 | 400 |00:00:00.02 | 1223 | | | |
|* 8 | TABLE ACCESS FULL | T1 | 1 | 400 | 400 |00:00:00.01 | 195 | | | |
-----------------------------------------------------------------------------------------------------------------------------


It is obvious from the "Buffers" column that the scalar subquery has been executed at different steps of the execution plan.

However, in cases where you're not that lucky and the filter optimization doesn't work that efficiently, there still might be a significant difference between the merged and the unmerged variant of the query.

Again, when moving the subquery to the selection, the cost calculation looks quite different:


select /*+ no_place_group_by */
sum(val), t1.filler
from
t1
, (
select
run_id as val
, batch_id
from
t1
) t2
where
t1.batch_id = t2.batch_id
and t1.batch_id = 42
and t1.run_id >= (
select /*+ no_unnest */
min(run_id)
from
t1 a
where
a.a_random = t1.a_random
)
group by
t1.filler;



------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 19 | 349K (1)| 01:10:00 |
| 1 | HASH GROUP BY | | 1 | 19 | 349K (1)| 01:10:00 |
|* 2 | FILTER | | | | | |
|* 3 | HASH JOIN | | 160K| 2968K| 116 (9)| 00:00:02 |
|* 4 | TABLE ACCESS FULL | T1 | 400 | 2400 | 55 (4)| 00:00:01 |
|* 5 | TABLE ACCESS FULL | T1 | 400 | 5200 | 56 (6)| 00:00:01 |
| 6 | SORT AGGREGATE | | 1 | 8 | | |
| 7 | TABLE ACCESS BY INDEX ROWID| T1 | 2 | 16 | 3 (0)| 00:00:01 |
|* 8 | INDEX RANGE SCAN | T1_IDX1 | 2 | | 1 (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("T1"."RUN_ID">= (SELECT /*+ NO_UNNEST */ MIN("RUN_ID") FROM "T1"
"A" WHERE "A"."A_RANDOM"=:B1))
3 - access("T1"."BATCH_ID"="BATCH_ID")
4 - filter("BATCH_ID"=42)
5 - filter("T1"."BATCH_ID"=42)
8 - access("A"."A_RANDOM"=:B1)



It is interesting that the FILTER operation is executed "late" (and therefore potentially more often) - since it depends only on the T1.RUN_ID and T1.A_RANDOM columns it could be executed "earlier", while processing the T1 row source, and in fact this can be achieved by adding the PUSH_SUBQ hint to the subquery, as shown below. I haven't investigated this further, but maybe the optimizer doesn't cost the different subquery pushing options when explicitly requesting not to unnest it (the NO_UNNEST hint) - without the NO_UNNEST hint the subquery is transformed into a join in this particular case.
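
For reference, pushing the subquery evaluation would look something like this (keeping the NO_UNNEST hint to stay with the non-unnested plan shape):

select /*+ no_place_group_by */
sum(val), t1.filler
from
t1
, (
select
run_id as val
, batch_id
from
t1
) t2
where
t1.batch_id = t2.batch_id
and t1.batch_id = 42
and t1.run_id >= (
select /*+ push_subq no_unnest */
min(run_id)
from
t1 a
where
a.a_random = t1.a_random
)
group by
t1.filler;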

It can be seen that in this case the optimizer used a "worst case" approach, estimating that the scalar subquery gets executed many, many times. Very likely the cost increase can be explained by 160,000 times the per-execution cost of the scalar subquery (which might be less than 3 and simply gets rounded up in the EXPLAIN PLAN output), due to the "late" execution after the join.

At runtime the filter optimization again works very efficiently:


-------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.38 | 1418 | | | |
| 1 | HASH GROUP BY | | 1 | 1 | 1 |00:00:00.38 | 1418 | 805K| 805K| 362K (0)|
|* 2 | FILTER | | 1 | | 160K|00:00:00.16 | 1418 | | | |
|* 3 | HASH JOIN | | 1 | 160K| 160K|00:00:00.01 | 390 | 968K| 968K| 377K (0)|
|* 4 | TABLE ACCESS FULL | T1 | 1 | 400 | 400 |00:00:00.01 | 195 | | | |
|* 5 | TABLE ACCESS FULL | T1 | 1 | 400 | 400 |00:00:00.01 | 195 | | | |
| 6 | SORT AGGREGATE | | 398 | 1 | 398 |00:00:00.04 | 1028 | | | |
| 7 | TABLE ACCESS BY INDEX ROWID| T1 | 398 | 2 | 781 |00:00:00.02 | 1028 | | | |
|* 8 | INDEX RANGE SCAN | T1_IDX1 | 398 | 2 | 781 |00:00:00.02 | 402 | | | |
-------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("T1"."RUN_ID">=)
3 - access("T1"."BATCH_ID"="BATCH_ID")
4 - filter("BATCH_ID"=42)
5 - filter("T1"."BATCH_ID"=42)
8 - access("A"."A_RANDOM"=:B1)


So it is not obvious to me why the optimizer treats these two cases differently - in the case of the selection it uses a "worst-case" approach, but no such approach (in fact no cost at all) is applied in the case of the projection.

In summary, you need to be particularly careful when using functions as part of the projection in complex queries - the point in the plan at which they get evaluated might make a significant difference to the overall query performance.

As already mentioned, the best way to deal with such constructs is to avoid them. If you cannot, you should first evaluate whether the function can be declared as deterministic (its results will be cached from 10.2 on), or whether you can use the workaround of wrapping the function call into a scalar subquery (select f(x) from dual) to take advantage of the scalar subquery caching feature, which effectively makes the function call implicitly deterministic as well.
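
A minimal sketch of that wrapping, applied to the first example query above - since the function argument is a constant here, the scalar subquery caching should reduce the actual function executions drastically:

select /*+ no_place_group_by */
sum(val), t1.filler
from
t1
, (
select
/* wrap the function call into a scalar subquery to benefit from subquery caching */
(select generate_lio(1) from dual) as val
, batch_id
from
t1
) t2
where
t1.batch_id = t2.batch_id
and t1.batch_id = 42
group by
t1.filler;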

2 comments:

  1. Oracle is stating that we set "_optimizer_group_by_placement" = FALSE - Does this do the same thing as the hint in the post? Any idea how this will affect performance?

    Mark Wood

  2. Mark,

    if you are referring to the "no_place_group_by" hint, then yes, I think this does the same and effectively disables this optimization. There are certain bugs in 11.1 that you might hit and therefore support might suggest this setting.

    It will affect the performance in those cases where you currently benefit from the new optimization - there is no way to tell how much impact this will have on your performance without further analysis.

    If your system was already running on 10g, then your SQLs were already running without that optimization since it was introduced in 11.1 - maybe this can give you a clue about the impact, but it is not necessarily a fair comparison since you might benefit from other optimizer changes introduced in 11.1.

    For more information about that optimization, you can e.g. read the blog post by Jonathan Lewis.

    Randolf
