Sunday, December 21, 2014

New Version Of XPLAN_ASH Utility

A new version 4.2 of the XPLAN_ASH utility is available for download.

As usual the latest version can be downloaded here.

There were no too significant changes in this release, mainly some new sections related to I/O figures were added.

One thing to note is that some of the sections in recent releases may require a linesize larger than 700, so the script's settings have been changed to 800. If you use corresponding settings for CMD.EXE under Windows for example you might have to adjust accordingly to prevent ugly line wrapping.

Here are the notes from the change log:

- New sections "Concurrent activity I/O Summary based on ASH" and "Concurrent activity I/O Summary per Instance based on ASH" to see the I/O activity summary for concurrent activity

- Bug fixed: When using MONITOR as source for searching for the most recent SQL_ID executed by a given SID due to some filtering on date no SQL_ID was found. This is now fixed

- Bug fixed: In RAC GV$ASH_INFO should be used to determine available samples

- The "Parallel Execution Skew ASH" indicator is now weighted - so far any activity level per plan line and sample below the actual DOP counted as one, and the same if the activity level was above
The sum of the "ones" was then set relative to the total number of samples the plan line was active to determine the "skewness" indicator

Now the actual difference between the activity level and the actual DOP is calculated and compared to the number of total samples active times the actual DOP
This should give a better picture of the actual impact the skew has on the overall execution

- Most queries now use a NO_STATEMENT_QUEUING hint for environments where AUTO DOP is enabled and the XPLAN_ASH queries could get queued otherwise

- The physical I/O bytes on execution plan line level taken from "Real-Time SQL Monitoring" has now the more appropriate heading "ReadB" and "WriteB", I never liked the former misleading "Reads"/"Writes" heading

Sunday, October 26, 2014

Heuristic TEMP Table Transformation

There are at least three different ways how the Oracle optimizer can come up with a so called TEMP table transformation, that is materializing an intermediate result set:

- As part of a star transformation the repeated access to dimensions can be materialized

- As part of evaluating GROUPING SETs intermediate result sets can be materialized

- Common Subquery/Table Expressions (CTE, WITH clause)

Probably the most common usage of the materialization is in conjunction with the WITH clause.

This is nothing new but since I came across this issue several times recently, here's a short demonstration and a reminder that this so called "TEMP Table Transformation" - at least in the context of the WITH clause - isn't really cost-based, in contrast to most other optimizer transformations nowadays - although the unnest transformation of subqueries also has a "no-brainer" variant where costing isn't considered.

The logic simply seems to be: If the CTE expression is referenced more than once AND the CTE expression contains at least some (filter or join) predicate then it will be materialized.

While in most cases this makes sense to avoid the otherwise repeated evaluation of the CTE expression, there are cases where additional predicates that could be pushed inside the CTE would lead to different, significantly more efficient access paths than materializing the full CTE expression without applying the filters and filtering on the TEMP table afterwards.

Here are just two very simple examples that demonstrate the point, both based on this sample table setup:

create table t1
as
select 
        rownum as id
      , rpad('x', 100) as filler
from 
        dual
connect by 
        level <=1e5;

exec dbms_stats.gather_table_stats(null, 't1')

create index t1_idx on t1 (id);

The index on T1.ID opens up potentially a very precise access to rows.

Here is example number one:

with 
a as 
(
  select  /* inline */ 
          id
        , filler 
  from    
          t1 
  where 
          filler != 'x'
)
select
        t1.*
      , a1.filler
      , a2.filler
from
        t1
      , a a1
      , a a2
where
       a1.id = t1.id
and    a2.id = t1.id
and    t1.id = 1
and    a1.id = 1
and    a2.id = 1
;

-- 11.2.0.3 Plan without INLINE hint
------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                           |     1 |   236 |  1207   (1)| 00:00:01 |
|   1 |  TEMP TABLE TRANSFORMATION     |                           |       |       |            |          |
|   2 |   LOAD AS SELECT               | SYS_TEMP_0FD9D6619_229329 |       |       |            |          |
|*  3 |    TABLE ACCESS FULL           | T1                        | 99999 |    10M|   420   (1)| 00:00:01 |
|*  4 |   HASH JOIN                    |                           |     1 |   236 |   787   (1)| 00:00:01 |
|*  5 |    HASH JOIN                   |                           |     1 |   171 |   394   (1)| 00:00:01 |
|   6 |     TABLE ACCESS BY INDEX ROWID| T1                        |     1 |   106 |     2   (0)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN          | T1_IDX                    |     1 |       |     1   (0)| 00:00:01 |
|*  8 |     VIEW                       |                           | 99999 |  6347K|   392   (1)| 00:00:01 |
|   9 |      TABLE ACCESS FULL         | SYS_TEMP_0FD9D6619_229329 | 99999 |    10M|   392   (1)| 00:00:01 |
|* 10 |    VIEW                        |                           | 99999 |  6347K|   392   (1)| 00:00:01 |
|  11 |     TABLE ACCESS FULL          | SYS_TEMP_0FD9D6619_229329 | 99999 |    10M|   392   (1)| 00:00:01 |
------------------------------------------------------------------------------------------------------------

-- 11.2.0.4 Plan without INLINE hint
--------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name                      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                           |  9999M|  2197G|       | 28468  (92)| 00:00:02 |
|   1 |  TEMP TABLE TRANSFORMATION     |                           |       |       |       |            |          |
|   2 |   LOAD AS SELECT               | SYS_TEMP_0FD9D661A_229329 |       |       |       |            |          |
|*  3 |    TABLE ACCESS FULL           | T1                        | 99999 |    10M|       |   420   (1)| 00:00:01 |
|*  4 |   HASH JOIN                    |                           |  9999M|  2197G|  7520K| 28048  (93)| 00:00:02 |
|*  5 |    VIEW                        |                           | 99999 |  6347K|       |   392   (1)| 00:00:01 |
|   6 |     TABLE ACCESS FULL          | SYS_TEMP_0FD9D661A_229329 | 99999 |    10M|       |   392   (1)| 00:00:01 |
|*  7 |    HASH JOIN                   |                           | 99999 |    16M|       |   394   (1)| 00:00:01 |
|   8 |     TABLE ACCESS BY INDEX ROWID| T1                        |     1 |   106 |       |     2   (0)| 00:00:01 |
|*  9 |      INDEX RANGE SCAN          | T1_IDX                    |     1 |       |       |     1   (0)| 00:00:01 |
|* 10 |     VIEW                       |                           | 99999 |  6347K|       |   392   (1)| 00:00:01 |
|  11 |      TABLE ACCESS FULL         | SYS_TEMP_0FD9D661A_229329 | 99999 |    10M|       |   392   (1)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------------

-- 11.2.0.3/11.2.0.4 Plan with INLINE hint
-----------------------------------------------------------------------------------------
| Id  | Operation                      | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |        |     1 |   318 |     6   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                  |        |       |       |            |          |
|   2 |   NESTED LOOPS                 |        |     1 |   318 |     6   (0)| 00:00:01 |
|   3 |    NESTED LOOPS                |        |     1 |   212 |     4   (0)| 00:00:01 |
|*  4 |     TABLE ACCESS BY INDEX ROWID| T1     |     1 |   106 |     2   (0)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN          | T1_IDX |     1 |       |     1   (0)| 00:00:01 |
|   6 |     TABLE ACCESS BY INDEX ROWID| T1     |     1 |   106 |     2   (0)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN          | T1_IDX |     1 |       |     1   (0)| 00:00:01 |
|*  8 |    INDEX RANGE SCAN            | T1_IDX |     1 |       |     1   (0)| 00:00:01 |
|*  9 |   TABLE ACCESS BY INDEX ROWID  | T1     |     1 |   106 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------


The filter in the CTE expression is just there to fulfill the rules I've stated above, without it the TEMP table transformation wouldn't be considered at all. It could also be a (non-filtering) join condition, for example.

Notice the big difference in cost estimates between the plans with and without materialization. Clearly a cost-based evaluation should have rejected the TEMP table transformation, simply because it is a bad idea to materialize 100K rows and afterwards access this TEMP table twice to filter out exactly a single row, instead of accessing the original, untransformed row source twice via precise index access.

This is by the way an example of another anomaly that was only recently introduced (apparently in the 11.2.0.4 patch set / 12.1 release): Notice the bad cardinality estimate in the 11.2.0.4 plan with the TEMP table transformation - the filter on the TEMP table isn't evaluated properly (was already there in previous releases) and in addition the join cardinality is way off - 10G rows instead of a single row is not really a good estimate - and as a side effect the HASH JOIN uses a bad choice for the build row sources.

Another possible, perhaps less common variant is this example:

with 
a as 
(
  select  /* inline */ 
          id
        , filler 
  from 
          t1 
  where 
          filler != 'x'
)
select
        id
      , (select filler from a where id = x.id) as scal_val1
      , (select filler from a where id = x.id) as scal_val2
from
        t1 x
;

-- 12.1.0.2 Plan without INLINE hint
--------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                           |   100K|   488K|    77M  (1)| 00:50:26 |
|*  1 |  VIEW                      |                           | 99999 |  6347K|   392   (1)| 00:00:01 |
|   2 |   TABLE ACCESS FULL        | SYS_TEMP_0FD9D660F_229329 | 99999 |    10M|   392   (1)| 00:00:01 |
|*  3 |  VIEW                      |                           | 99999 |  6347K|   392   (1)| 00:00:01 |
|   4 |   TABLE ACCESS FULL        | SYS_TEMP_0FD9D660F_229329 | 99999 |    10M|   392   (1)| 00:00:01 |
|   5 |  TEMP TABLE TRANSFORMATION |                           |       |       |            |          |
|   6 |   LOAD AS SELECT           | SYS_TEMP_0FD9D660F_229329 |       |       |            |          |
|*  7 |    TABLE ACCESS FULL       | T1                        | 99999 |    10M|   420   (1)| 00:00:01 |
|   8 |   TABLE ACCESS FULL        | T1                        |   100K|   488K|   420   (1)| 00:00:01 |
--------------------------------------------------------------------------------------------------------

-- 12.1.0.2 Plan with INLINE hint
----------------------------------------------------------------------------------------------
| Id  | Operation                           | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |        |   100K|   488K|   398K  (1)| 00:00:16 |
|*  1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1     |     1 |   106 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | T1_IDX |     1 |       |     1   (0)| 00:00:01 |
|*  3 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1     |     1 |   106 |     2   (0)| 00:00:01 |
|*  4 |   INDEX RANGE SCAN                  | T1_IDX |     1 |       |     1   (0)| 00:00:01 |
|   5 |  TABLE ACCESS FULL                  | T1     |   100K|   488K|   420   (1)| 00:00:01 |
----------------------------------------------------------------------------------------------

This time I've shown plans from 12.1.0.2 - the latest available release as I write this - to demonstrate that this hasn't changed yet. What has changed in 12c is that the scalar subqueries are now actually represented in the final cost - in pre-12c these costs wouldn't be part of the total cost. So although due to that the cost difference between the two plans in 12c is much more significant than in pre-12c the optimizer still opts for materializing the CTE expression and running full table scans in the scalar subqueries on that temp table instead of taking advantage of the precise access path available - again very likely a pretty bad idea at runtime.

So whenever you make use of the WITH clause make sure you've considered the access paths that might be available when not materializing the result set.

Footnote

As of Oracle 12.1 the MATERIALIZE and INLINE hints are still not officially documented.