Thursday, May 7, 2015

Heuristic Temp Table Transformation - 2

Some time ago I demonstrated the non-cost-based decision for applying the temp table transformation when using CTEs (Common Table/Subquery Expressions). In this note I want to highlight another aspect of this behaviour.

Consider the following script, creating a table with deliberately wide columns:
create table a
as
  select
          rownum as id
        , rownum as id2
        , rpad('x', 4000) as large_vc1
        , rpad('x', 4000) as large_vc2
        , rpad('x', 4000) as large_vc3
from
          dual
connect by
          level <= 1000
;

exec dbms_stats.gather_table_stats(null, 'a')
and this query, along with its plans with and without the temp table transformation:
with cte
as
(
  select  /* inline */
          id
        , id2
        , large_vc1
        , large_vc2
        , large_vc3
from
          a
where
          1 = 1
)
select
        *
from
        (
          select id, count(*) from cte group by id
        ) a,
        (
          select id2, count(*) from cte group by id2
        ) b
where
        a.id = b.id2
;
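
For reference, the plans shown below were obtained along these lines; a minimal sketch using EXPLAIN PLAN and DBMS_XPLAN.DISPLAY (DBMS_XPLAN.DISPLAY_CURSOR against the actual execution would work just as well):
explain plan for
with cte as
(
  select /* inline */ id, id2, large_vc1, large_vc2, large_vc3
  from   a
  where  1 = 1
)
select *
from   (select id,  count(*) from cte group by id)  a
     , (select id2, count(*) from cte group by id2) b
where  a.id = b.id2;

select * from table(dbms_xplan.display);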

-- Plan with TEMP TABLE transformation
--------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                           |  1000 | 52000 |  1341   (1)| 00:00:01 |
|   1 |  TEMP TABLE TRANSFORMATION |                           |       |       |            |          |
|   2 |   LOAD AS SELECT           | SYS_TEMP_0FD9D6609_26FA32 |       |       |            |          |
|   3 |    TABLE ACCESS FULL       | A                         |  1000 |    11M|   452   (0)| 00:00:01 |
|*  4 |   HASH JOIN                |                           |  1000 | 52000 |   889   (1)| 00:00:01 |
|   5 |    VIEW                    |                           |  1000 | 26000 |   444   (1)| 00:00:01 |
|   6 |     HASH GROUP BY          |                           |  1000 |  4000 |   444   (1)| 00:00:01 |
|   7 |      VIEW                  |                           |  1000 |  4000 |   443   (0)| 00:00:01 |
|   8 |       TABLE ACCESS FULL    | SYS_TEMP_0FD9D6609_26FA32 |  1000 |    11M|   443   (0)| 00:00:01 |
|   9 |    VIEW                    |                           |  1000 | 26000 |   444   (1)| 00:00:01 |
|  10 |     HASH GROUP BY          |                           |  1000 |  4000 |   444   (1)| 00:00:01 |
|  11 |      VIEW                  |                           |  1000 |  4000 |   443   (0)| 00:00:01 |
|  12 |       TABLE ACCESS FULL    | SYS_TEMP_0FD9D6609_26FA32 |  1000 |    11M|   443   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------

-- Plan with CTE inlined (turn the /* inline */ comment into an /*+ inline */ hint)
-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |  1000 | 52000 |   907   (1)| 00:00:01 |
|*  1 |  HASH JOIN           |      |  1000 | 52000 |   907   (1)| 00:00:01 |
|   2 |   VIEW               |      |  1000 | 26000 |   453   (1)| 00:00:01 |
|   3 |    HASH GROUP BY     |      |  1000 |  4000 |   453   (1)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| A    |  1000 |  4000 |   452   (0)| 00:00:01 |
|   5 |   VIEW               |      |  1000 | 26000 |   453   (1)| 00:00:01 |
|   6 |    HASH GROUP BY     |      |  1000 |  4000 |   453   (1)| 00:00:01 |
|   7 |     TABLE ACCESS FULL| A    |  1000 |  4000 |   452   (0)| 00:00:01 |
-----------------------------------------------------------------------------
Looking at the query and plan output, the following becomes obvious:

- The mere existence of a WHERE clause (even if it is just "WHERE 1 = 1"), combined with the CTE being referenced more than once, triggers the transformation. Nothing new here: this was already demonstrated in the previous note mentioned above, as was the fact that the inlined CTE variant is cheaper in cost.

- There is a huge difference between the estimated size of the TEMP TABLE and the size of the corresponding row sources when the CTE is inlined

The latter is particularly noteworthy: Usually Oracle is pretty clever at optimizing the projection and carries only those columns actually required (this doesn't apply to the target expression of MERGE statements, by the way), which is reflected in the plan output for the inlined CTEs: the wide columns don't matter here because they aren't referenced, although they are mentioned in the CTE. But in the case of the temp table transformation, obviously all columns / expressions mentioned in the CTE get materialized, even if they are never referenced when the CTE is used.

So it would be nice if Oracle only materialized those columns / expressions actually used.
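
As a side note, the projection can be verified directly: DBMS_XPLAN supports a PROJECTION format modifier that adds a "Column Projection Information" section to the plan output, listing the columns each row source actually carries. A minimal sketch, run after an EXPLAIN PLAN of the query above:
select * from table(dbms_xplan.display(format => 'BASIC +PROJECTION'));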

Now you might ask why one would mention columns and expressions in the CTE that don't get used afterwards: Well, generic approaches sometimes lead to such constructs. Imagine the CTE part is static, including all possible attributes, but the actual usage of the CTE can be customized by a client. In such cases, where only a small part of the available attributes gets actually used, a temp table transformation can lead to a huge overhead in the size of the generated temp table. Preventing the transformation addresses this issue, but then the inlined CTE will have to be evaluated as many times as it is referenced, which might not be desirable either.
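
If the client knows at statement generation time which attributes it actually uses, one possible compromise is to keep the generic CTE inlined and materialize a second, narrow CTE on top of it that projects only the required columns: the temp table then stays small but is still written and read only once. Here is a sketch, relying on the undocumented (but widely used) INLINE and MATERIALIZE hints:
with cte_generic as
(
  select /*+ inline */ id, id2, large_vc1, large_vc2, large_vc3
  from   a
  where  1 = 1
),
cte as
(
  -- materialize only the columns referenced afterwards
  select /*+ materialize */ id, id2
  from   cte_generic
)
select *
from   (select id,  count(*) from cte group by id)  a
     , (select id2, count(*) from cte group by id2) b
where  a.id = b.id2;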

Monday, April 27, 2015

Enabling Edition Based Redefinition On A Schema With Object-Relational Tables

This is just a heads-up for those thinking about using Edition Based Redefinition (EBR) and enabling it on an existing schema with objects. Although EBR isn't exactly a new feature, its current adoption level is probably not that high (which will probably change in the future, since, as far as I understand, Oracle E-Business Suite now uses EBR as part of its default upgrade procedure).

I was recently contacted by someone who enabled EBR on an existing schema using ALTER USER ... ENABLE EDITIONS and had to use the "FORCE" option since there were (according to the official ALTER USER documentation) "objects that are not editionable and that depend on editionable type objects in the schema. ... In this case, all the objects that are not editionable and that depend on the editionable type objects in the schema being editions-enabled become invalid".

Although one could argue it is clearly mentioned in the documentation, the consequences are probably not that obvious to everyone in case those non-editionable objects are tables having columns based on user-defined types. So I'll state it here, hopefully clearly enough:

If you use the FORCE option of ALTER USER ... ENABLE EDITIONS to enable editions on an existing schema that already contains objects, and among those objects are tables having columns based on user-defined types, then effectively those tables will become invalid and stay invalid. There is no officially documented way to reverse this step or to compile the tables to become valid again (there is no ALTER TABLE ... COMPILE or similar). The affected tables can no longer be accessed, and all data they contain is hence officially gone, too.

This means the affected tables need to be restored from a (hopefully existing and usable) backup (I don't know if there's a dictionary hack available that is officially sanctioned by Oracle support to make the table valid again).

In my opinion the FORCE option should check whether the schema contains such dependent tables and, if so, error out with a message stating that these tables need to be dropped first (or moved to a different schema) before ENABLE EDITIONS can succeed. This would make the situation much clearer, rather than leaving behind tables in INVALID state that cannot be fixed/recompiled afterwards.
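
Until something like that happens, a simple pre-check before running ALTER USER ... ENABLE EDITIONS can help. The following is just a sketch of mine, not an official procedure: it lists tables of the current schema having columns based on user-defined types owned by that same schema (types owned by other editions-enabled schemas would require a similar check against ALL_TAB_COLUMNS):
-- Tables with columns whose type is owned by the current schema
select table_name, column_name, data_type
from   user_tab_columns
where  data_type_owner = user;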

Below is a simple test case that demonstrates the issue:
-- Create a test user for enabling editions
drop user ebr_test cascade;

create user ebr_test identified by ebr_test;

grant dba to ebr_test;

-- Connect as test user
connect ebr_test/ebr_test

-- and create an object-relational type
create type test_coll_type as table of number;

-- and a table having such a column of that type
create table test_ebr_table (col1 number, col2 number, test_coll test_coll_type) nested table test_coll store as test_coll_table;

-- Some test data
insert into test_ebr_table (col1, col2, test_coll) values (1, 1, test_coll_type(1, 2, 3));

commit;

-- Everything is fine so far
select * from test_ebr_table, table(test_coll);

-- Enable editions, FORCE is required
alter user ebr_test enable editions force;

-- This no longer works (ORA-04063: table "EBR_TEST.TEST_EBR_TABLE" has errors)
select * from test_ebr_table, table(test_coll);

-- Not even simple scalar values can be accessed from the table
select col1 from test_ebr_table;

-- The table has gone INVALID
select status from user_objects where object_name = 'TEST_EBR_TABLE';
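
To see the full extent of the invalidation, the test case can be finished off with a look at all invalid objects of the schema:
-- Check which objects of the test schema are now INVALID
select object_name, object_type, status
from   user_objects
where  status = 'INVALID';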