Chinar Aliyev's blog

November 6, 2018

Understanding Hybrid Histogram

Filed under: CBO,Hybrid Histogram,statistics — Chinar Aliyev @ 11:19 am

In this blog post we are going to explore important properties of the hybrid histogram (HH). The concepts of bucket size and average bucket size will be explained; these will help us understand how an HH is created. We will test the same table (with the same data) in Oracle Database 12.2.0.1 and 18.3.0.0, and review how a defect in the construction of the HH has been resolved. We will see how the fix "25994960: CARDINALITY MISESTIMATE FROM HYBRID HISTOGRAM" improves the quality of the HH: it prevents the most popular value(s) from being lost from the histogram. Oracle Database 18c contains the fix but 12.2 does not, so we run the following test case in both 12.2 and 18c.
Let's see the case:


CREATE TABLE t1 AS
SELECT
       CASE
         WHEN level >= 960 AND level <= 997 THEN 997
         ELSE level
       END AS col
FROM   dual
CONNECT BY level <= 1000;

The table data

 SELECT   col, COUNT ( * ), SUM (COUNT ( * )) OVER () cnt
    FROM   t1
GROUP BY   col
  HAVING   COUNT ( * ) > 1
ORDER BY   1;

       COL   COUNT(*)        CNT
---------- ---------- ----------
         1         23        638
         2         17        638
         3         24        638
         4         24        638
         5         24        638
         6         18        638
         7         16        638
         8         24        638
         9         14        638
        10         37        638
        11         14        638
        12         13        638
        13         16        638
        14         24        638
        15         21        638
        16         15        638
        17         20        638
        18         19        638
        19         25        638
        20         29        638
        21         25        638
        22         21        638
        23         19        638
        24         23        638
        25         15        638
        26         23        638
        27         23        638
        28         19        638
        29         15        638
       997         38        638

30 rows selected.

SQL>   SELECT   MIN (col), MAX (col), cnt
    FROM   (  SELECT   col, COUNT ( * ), SUM (COUNT ( * )) OVER () cnt
                FROM   t1
            GROUP BY   col
              HAVING   COUNT ( * ) = 1
            ORDER BY   1)
GROUP BY   cnt;

  MIN(COL)   MAX(COL)        CNT
---------- ---------- ----------
       601       1000        362

SQL> 

As you can see, the table contains 638 rows spread over 30 distinct values with frequencies greater than one, plus 362 rows whose values each occur exactly once. So there are 30 + 362 = 392 distinct values in the column. We use the same table data in both the 12.2 and 18.3 databases to generate the HH.

begin
        dbms_stats.gather_table_stats(
                ownname          => null,
                tabname          => 't1',
                method_opt       => 'for columns col size 80'
        );
end;
/

The histogram data from Oracle DB 12.2

SQL> select
  2          endpoint_value                                                            value,
        endpoint_number,
        endpoint_number - lag(endpoint_number,1,0) over(order by endpoint_number) bucket_size,
        endpoint_repeat_count
from
        user_tab_histograms
where
        table_name  = 'T1'
and     column_name = 'COL'
order by
        endpoint_value
; 
     VALUE ENDPOINT_NUMBER BUCKET_SIZE ENDPOINT_REPEAT_COUNT
---------- --------------- ----------- ---------------------
         1              23          23                    23
         2              40          17                    17
         3              64          24                    24
         4              88          24                    24
         5             112          24                    24
         6             130          18                    18
         7             146          16                    16
         8             170          24                    24
         9             184          14                    14
        10             221          37                    37
        11             235          14                    14
        12             248          13                    13
        13             264          16                    16
        14             288          24                    24
        15             309          21                    21
        16             324          15                    15
        17             344          20                    20
        18             363          19                    19
        19             388          25                    25
        20             417          29                    29
        21             442          25                    25
        22             463          21                    21
        23             482          19                    19
        24             505          23                    23
        25             520          15                    15
        26             543          23                    23
        27             566          23                    23
        28             585          19                    19
        29             600          15                    15
       607             607           7                     1
       614             614           7                     1
       621             621           7                     1
       628             628           7                     1
       635             635           7                     1
       642             642           7                     1
       649             649           7                     1
       655             655           6                     1
       662             662           7                     1
       669             669           7                     1
       676             676           7                     1
       683             683           7                     1
       690             690           7                     1
       697             697           7                     1
       704             704           7                     1
       711             711           7                     1
       718             718           7                     1
       725             725           7                     1
       732             732           7                     1
       739             739           7                     1
       745             745           6                     1
       752             752           7                     1
       759             759           7                     1
       766             766           7                     1
       773             773           7                     1
       780             780           7                     1
       787             787           7                     1
       794             794           7                     1
       801             801           7                     1
       808             808           7                     1
       815             815           7                     1
       822             822           7                     1
       828             828           6                     1
       835             835           7                     1
       842             842           7                     1
       849             849           7                     1
       856             856           7                     1
       863             863           7                     1
       870             870           7                     1
       877             877           7                     1
       884             884           7                     1
       891             891           7                     1
       898             898           7                     1
       905             905           7                     1
       911             911           6                     1
       918             918           7                     1
       925             925           7                     1
       932             932           7                     1
       939             939           7                     1
       946             946           7                     1
      1000            1000          54                     1

80 rows selected.

SQL> SELECT   num_distinct, num_buckets, density,histogram
  FROM   user_tab_col_statistics
 WHERE   table_name = 'T1' AND column_name = 'COL';

NUM_DISTINCT NUM_BUCKETS    DENSITY HISTOGRAM
------------ ----------- ---------- ---------------
         392          80    .001969 HYBRID

Now the histogram data from Oracle DB 18.3

SQL> select
  2          endpoint_value                                                            value,
        endpoint_number,
        endpoint_number - lag(endpoint_number,1,0) over(order by endpoint_number) bucket_size,
        endpoint_repeat_count
from
        user_tab_histograms
where
        table_name  = 'T1'
and     column_name = 'COL'
order by
        endpoint_value;

     VALUE ENDPOINT_NUMBER BUCKET_SIZE ENDPOINT_REPEAT_COUNT
---------- --------------- ----------- ---------------------
         1              23          23                    23
         2              40          17                    17
         3              64          24                    24
         4              88          24                    24
         5             112          24                    24
         6             130          18                    18
         7             146          16                    16
         8             170          24                    24
         9             184          14                    14
        10             221          37                    37
        11             235          14                    14
        12             248          13                    13
        13             264          16                    16
        14             288          24                    24
        15             309          21                    21
        16             324          15                    15
        17             344          20                    20
        18             363          19                    19
        19             388          25                    25
        20             417          29                    29
        21             442          25                    25
        22             463          21                    21
        23             482          19                    19
        24             505          23                    23
        25             520          15                    15
        26             543          23                    23
        27             566          23                    23
        28             585          19                    19
        29             600          15                    15
       608             608           8                     1
       615             615           7                     1
       622             622           7                     1
       629             629           7                     1
       637             637           8                     1
       644             644           7                     1
       651             651           7                     1
       658             658           7                     1
       666             666           8                     1
       673             673           7                     1
       680             680           7                     1
       687             687           7                     1
       695             695           8                     1
       702             702           7                     1
       709             709           7                     1
       716             716           7                     1
       724             724           8                     1
       731             731           7                     1
       738             738           7                     1
       745             745           7                     1
       752             752           7                     1
       760             760           8                     1
       767             767           7                     1
       774             774           7                     1
       781             781           7                     1
       789             789           8                     1
       796             796           7                     1
       803             803           7                     1
       810             810           7                     1
       818             818           8                     1
       825             825           7                     1
       832             832           7                     1
       839             839           7                     1
       847             847           8                     1
       854             854           7                     1
       861             861           7                     1
       868             868           7                     1
       876             876           8                     1
       883             883           7                     1
       890             890           7                     1
       897             897           7                     1
       905             905           8                     1
       912             912           7                     1
       919             919           7                     1
       926             926           7                     1
       933             933           7                     1
       941             941           8                     1
       948             948           7                     1
       955             955           7                     1
       997             997          42                    38
      1000            1000           3                     1

80 rows selected.

SQL> SELECT   num_distinct, num_buckets, density,histogram
  FROM   user_tab_col_statistics
 WHERE   table_name = 'T1' AND column_name = 'COL';

NUM_DISTINCT NUM_BUCKETS    DENSITY HISTOGRAM
------------ ----------- ---------- ---------------
         392          80      .0019 HYBRID

As you can see, the 12c HH does not include the most frequent value (997) but the 18c HH does. Although the table data are the same, the histogram data differ between the two databases, which is why the densities differ as well. To understand the problem we need to trace the statistics gathering process in both databases.
The trace files show that the same SQL statement and the same data were used to create the HH in both versions.

12c trace:

DBMS_STATS: Building Histogram for COL
DBMS_STATS:  bktnum=80, nnv=1000, snnv=1000, sndv=392, est_ndv=392, mnb=80
DBMS_STATS:  Trying hybrid histogram 
DBMS_STATS: select substrb(dump(val,16,0,64),1,240) ep,  freq, cdn, ndv, (sum(pop) over()) popcnt,  (sum(pop*freq) over()) popfreq,  substrb(dump(max(val) over(),16,0,64),1,240) maxval,  substrb(dump(min(val) over(),16,0,64),1,240) minval  from (select val, freq,  (sum(freq) over()) cdn, (count(*) over()) ndv,  (case when freq > ((sum(freq) over())/80)  then 1  else 0 end) pop from  (select /*+  no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring xmlindex_sel_idx_tbl opt_param('optimizer_inmemory_aware' 'false') no_substrb_pad  */ "COL"  val, count("COL") freq  from "SYSTEM"."T1" t  where "COL" is not null  group by "COL")) order by val
DBMS_STATS:  > cdn 1000, popFreq 638, popCnt 30, bktSize 6.91836734693877551020408163265306122449, bktSzFrc .91836734693877551020408163265306122449
DBMS_STATS:  Evaluating hybrid histogram:  cht.count 80, mnb 80, ssize 1000, min_ssize 2500, appr_ndv  TRUE, ndv 392, selNdv 15, selFreq 343, pct 100, avg_bktsize 13, csr.hreq TRUE, normalize TRUE
DBMS_STATS:   Histogram gathering flags: 7
DBMS_STATS:  Accepting histogram 

18c trace:

DBMS_STATS: Building Histogram for COL
DBMS_STATS:  bktnum=80, nnv=1000, snnv=1000, sndv=392, est_ndv=392, mnb=80
DBMS_STATS:  Trying hybrid histogram 
DBMS_STATS: select substrb(dump(val,16,0,64),1,240) ep,  freq, cdn, ndv, (sum(pop) over()) popcnt,  (sum(pop*freq) over()) popfreq,  substrb(dump(max(val) over(),16,0,64),1,240) maxval,  substrb(dump(min(val) over(),16,0,64),1,240) minval  from (select val, freq,  (sum(freq) over()) cdn, (count(*) over()) ndv,  (case when freq > ((sum(freq) over())/80)  then 1  else 0 end) pop from  (select /*+  no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring xmlindex_sel_idx_tbl opt_param('optimizer_inmemory_aware' 'false') no_substrb_pad  */ "COL"  val, count("COL") freq  from "CHINAR"."T1" t  where "COL" is not null  group by "COL")) order by val

(SQL_HH)

DBMS_STATS:  > cdn 1000, popFreq 638, popCnt 30, bktSize 7.24, bktSzFrc .24
DBMS_STATS:  Evaluating hybrid histogram:  cht.count 80, mnb 80, ssize 1000, min_ssize 2500, appr_ndv  TRUE, ndv 392, selNdv 16, selFreq 381, pct 100, avg_bktsize 13, csr.hreq TRUE, normalize TRUE
DBMS_STATS:   Histogram gathering flags: 7
DBMS_STATS:  Accepting histogram

It is clearly seen that both trace files report the same cardinality (cdn), popular frequency (popFreq), popular value count (popCnt), and average bucket size; only the (original) bucket size differs. It was calculated as 6.9183 in 12c and 7.24 in 18c, and this is why the 12c histogram does not contain the most frequent value(s).

Now, what is the bucket size and how is it calculated?

To create an HH, Oracle first calculates the bucket size and then applies it to the data returned by the (SQL_HH) statement. Let's quickly review that data. Both database versions retrieve the same result set:

  SELECT   val,
           freq,
           cdn,
           ndv,
           (SUM (pop) OVER ()) popcnt,
           (SUM (pop * freq) OVER ()) popfreq
    FROM   (SELECT   val,
                     freq,
                     (SUM (freq) OVER ()) cdn,
                     (COUNT ( * ) OVER ()) ndv,
                     (CASE
                          WHEN freq > ( (SUM (freq) OVER ()) / 80) THEN 1
                          ELSE 0
                      END)
                         pop
              FROM   (  SELECT   "COL" val,
                                 COUNT ("COL") freq
                          FROM   "CHINAR"."T1" t
                         WHERE   "COL" IS NOT NULL
                      GROUP BY   "COL"))
ORDER BY   val;

       VAL       FREQ        CDN        NDV     POPCNT    POPFREQ
---------- ---------- ---------- ---------- ---------- ----------
         1         23       1000        392         30        638
         2         17       1000        392         30        638
         3         24       1000        392         30        638
         4         24       1000        392         30        638
         5         24       1000        392         30        638
         6         18       1000        392         30        638
         7         16       1000        392         30        638
         8         24       1000        392         30        638
         9         14       1000        392         30        638
        10         37       1000        392         30        638
        11         14       1000        392         30        638
        12         13       1000        392         30        638
        13         16       1000        392         30        638
        14         24       1000        392         30        638
        15         21       1000        392         30        638
        16         15       1000        392         30        638
        17         20       1000        392         30        638
        18         19       1000        392         30        638
        19         25       1000        392         30        638
        20         29       1000        392         30        638
        21         25       1000        392         30        638
        22         21       1000        392         30        638
        23         19       1000        392         30        638
        24         23       1000        392         30        638
        25         15       1000        392         30        638
        26         23       1000        392         30        638
        27         23       1000        392         30        638
        28         19       1000        392         30        638
        29         15       1000        392         30        638
       601          1       1000        392         30        638
       602          1       1000        392         30        638
       603          1       1000        392         30        638
       604          1       1000        392         30        638
       605          1       1000        392         30        638
       606          1       1000        392         30        638
       607          1       1000        392         30        638
       608          1       1000        392         30        638
.................................................................
       947          1       1000        392         30        638
       948          1       1000        392         30        638
       949          1       1000        392         30        638
       950          1       1000        392         30        638
       951          1       1000        392         30        638
       952          1       1000        392         30        638
       953          1       1000        392         30        638
       954          1       1000        392         30        638
       955          1       1000        392         30        638
       956          1       1000        392         30        638
       957          1       1000        392         30        638
       958          1       1000        392         30        638
       959          1       1000        392         30        638
       997         38       1000        392         30        638
       998          1       1000        392         30        638
       999          1       1000        392         30        638
      1000          1       1000        392         30        638

392 rows selected.

The bucket size must be calculated properly so that the histogram captures as many of the most frequent values as possible. Where do the bucket and its size come from? Let's explain that first.

We have the cardinality CDN (the number of rows in the table, or the histogram cardinality when sample data is used), the number of popular values (PopCnt), and their total frequency (PopFreq). This means there are (NDV-PopCnt) distinct unpopular values and (CDN-PopFreq) unpopular rows (their total frequency). We also need to create N buckets (80 in our case), and the histogram should contain all top-frequency values. Each popular value should sit in its own bucket, so PopCnt buckets are reserved for popular values, leaving (N-PopCnt) buckets for unpopular rows. In effect, the column's rows are distributed into two types of buckets: popular and unpopular. Each unpopular bucket will therefore contain:

 Unpopular_rows/unpopular_buckets = (CDN-PopFreq) / (N-PopCnt)     (F.1)

This formula explains the concept of the bucket size: the bucket size is calculated per (F.1).
The minimum size of a popular bucket, on the other hand, is:

  PopBktSize = min(freq(PopValue_i)), i = 1..PopCnt
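Before applying these formulas to the two histograms, here is a minimal Python sketch of (F.1) (the function and variable names are mine, not dbms_stats internals), using the counts reported in the trace files above (CDN=1000, PopFreq=638, PopCnt=30, N=80):

```python
def bucket_size_f1(cdn, pop_freq, n_buckets, pop_cnt):
    """(F.1): spread the unpopular rows over the buckets that
    remain after reserving one bucket per popular value."""
    return (cdn - pop_freq) / (n_buckets - pop_cnt)

# Counts reported in both trace files above
print(bucket_size_f1(cdn=1000, pop_freq=638, n_buckets=80, pop_cnt=30))  # 7.24
```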

Now let's look at the two histograms above and work out how their bucket sizes were determined.

 BktSize_18c = (CDN-PopFreq) / (N-PopCnt) = (1000-638)/(80-30) = 7.24

As you can see, this matches the number reported in the 18c trace file. But why is the 12c bucket size different?
The answer is that formula (F.1) identifies the bucket size for the HH correctly, but there is one more thing to consider. (SQL_HH) returns the dataset used to build the HH, but first the minimum value of the column has to be placed in the first bucket of the HH (as a separate bucket) regardless of its popularity and size. So the minimum value forms the first bucket of our HH, and its size equals the frequency of that minimum value; formula (F.1) has to account for this. In addition, the last bucket of the HH holds the maximum column value, and its size is determined by the sizes of the previous buckets and the number of rows in the table.
So, if the minimum column value is popular, it is already accounted for in formula (F.1); otherwise, if it is not popular, we have to exclude it from (F.1). Thus the final bucket size is:

  If the minimum value is popular then
      BktSize = (CDN-PopFreq) / (N-PopCnt)
  else
      BktSize = (CDN-PopFreq-freq of the minimum value) / (N-PopCnt-1)

(F.2)

In our case the minimum value is 1 and its frequency is 23. Since the average bucket size is 13 (CDN/N = 1000/80, rounded up), the minimum value is popular, and therefore the bucket size should be 7.24 according to formula (F.2).

But 12c does not take this into account: it places the minimum value in the first bucket of the HH and then reduces the bucket size by subtracting its frequency, even though that frequency is already included. So:

BktSize_12c = (CDN-PopFreq-freq of the minimum value) / (N-PopCnt-1) = (1000-638-23)/(80-30-1) = 6.91836...
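Both branches of (F.2) are easy to check with a few lines of Python (a sketch under my own naming, not Oracle's code); the popularity test for the minimum value uses freq > CDN/N, i.e. 23 > 12.5 here:

```python
def bucket_size_f2(cdn, pop_freq, n_buckets, pop_cnt, min_freq):
    """(F.2): subtract the minimum value's dedicated first bucket
    only when that value is NOT already counted as popular."""
    if min_freq > cdn / n_buckets:        # minimum value is popular
        return (cdn - pop_freq) / (n_buckets - pop_cnt)
    return (cdn - pop_freq - min_freq) / (n_buckets - pop_cnt - 1)

# 18c applies (F.2): min value 1 has freq 23 > 1000/80 = 12.5, so it is popular
print(bucket_size_f2(1000, 638, 80, 30, 23))      # 7.24
# 12c always takes the second branch, hence the undersized bucket:
print((1000 - 638 - 23) / (80 - 30 - 1))          # 6.9183673...
```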

This undersized bucket prevents the most popular values from being reached (included): the smaller (improper) bucket size causes all 80 (N) buckets to be consumed before the most popular value is reached. Looking at the 12c histogram data, 79 buckets had already been created by the time the database reached the value 946, and the process stopped because the maximum bucket count was reached.

It should be noted that formula (F.2) gives the initial bucket size; an unpopular bucket can end up even smaller than the size defined by (F.2). You can see an example of this in the blog post by Jonathan Lewis, where the bucket size is calculated identically in both versions of Oracle Database:

DBMS_STATS:  > cdn 800, popFreq 434, popCnt 5, bktSize 52.14285714285714285714285714285714285714, bktSzFrc .14285714285714285714285714285714285714
DBMS_STATS:  Evaluating hybrid histogram:  cht.count 13, mnb 13, ssize 800, min_ssize 2500, appr_ndv  TRUE, ndv 22, selNdv 1, selFreq 109, pct 100, avg_bktsize 62, csr.hreq TRUE, normalize TRUE
DBMS_STATS:   Histogram gathering flags: 527

The bucket size is 52.14, yet the histogram contains buckets smaller than 52.14. This is because very few distinct values remain after the first 4 buckets are created (up to value=19 in the 18c histogram). So 13-4=9 buckets remained to be created, but only 22-13=9 distinct values remained (there are 13 distinct values up to and including 19), so each of those 9 distinct values had to go into the HH as a separate bucket. Another difference between the two histograms is that 18c reserves the last bucket for the maximum value but 12c does not. This was another drawback of 12c: reserving that bucket up front helps the database include the most popular values, whereas merging or removing the last bucket can cause a popular value to be lost.

Note: Formula (F.1) also helps us understand the concept of selectivity for the HH. Unpopular buckets can have different sizes, and for each unpopular distinct value we have:

   Rows per unpopular distinct value = unpopular_rows / unpopular_ndv = (CDN-PopFreq) / (NDV-PopCnt)
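With this post's numbers, the estimate works out to exactly one row per unpopular distinct value (a quick Python check; the variable names are mine):

```python
cdn, pop_freq = 1000, 638    # histogram cardinality and popular frequency
ndv, pop_cnt = 392, 30       # distinct values overall and popular ones
print((cdn - pop_freq) / (ndv - pop_cnt))   # 1.0 -> 362 rows over 362 values
```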

Summary

The bucket size for a hybrid histogram is calculated with formula (F.2). This is the initial bucket size; based on it and the dataset returned by (SQL_HH), the database builds the hybrid histogram. The average bucket size, in contrast, is used to identify whether a distinct value is popular. The minimum and maximum values are placed in separate buckets: the first bucket contains only the minimum column value, and its size is the frequency of the minimum value.


October 19, 2018

Join Skew Handling Subject to Filter Conditions-2

Filed under: Cardinality Estimation,CBO,Join,Skew,Tuning — Chinar Aliyev @ 11:33 am

In the previous blog post I described a way to estimate join size correctly. I used two virtual columns in order to avoid processing the filter predicates explicitly. A drawback of that method, however, is that a virtual column has to be created for each filter value. For example, given filter predicates such as:

dt.object_type = 'TABLE'
dt.object_type = 'INDEX'
dt.object_type in ('TABLE','INDEX')

For each such filter predicate we would need to (re)create virtual columns to let the optimizer compute the join cardinality properly.
In this blog post I am going to demonstrate another way to solve this problem without creating several virtual columns; in other words, we are going to generalize the method.
I made a small change to the test case:

create table dim_table (type_code number, object_type varchar2(10));
insert into dim_table values (1,'TABLE');
insert into dim_table values (2,'INDEX');
insert into dim_table values (3,'VIEW');
insert into dim_table values (4,'SYNONYM');
insert into dim_table values (5,'OTHER');
commit;
exec dbms_stats.gather_table_stats(null, 'dim_table');
drop table t1;

create table t1 
nologging
as
select
        decode(owner, 'SYS',1,'MDSYS',2,'SYSTEM',3,'ORDDATA',4,5) owner_code ,object_id, object_name, 
        decode(object_type, 'TABLE',1,'INDEX',2,'VIEW',3,'SYNONYM',4,5) type_code 
from
        dba_objects
where
        rownum <= 200000;
 
exec dbms_stats.gather_table_stats(null, 't1', cascade=>true,method_opt=>'for all columns size 254');

We are going to interpret the following SQL statement and its execution plan:

select 
       count(*)
from
        t1, dim_table dt
where
        t1.type_code   = dt.type_code 
and     dt.object_type in ('TABLE','INDEX')
and     t1.owner_code in (2,3);

(SQL 1)

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

Plan hash value: 3591903594                                                                                           
                                                                                                                      
----------------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |           |      1 |        |      1 |00:00:00.02 |     526 |       |       |          |
|   1 |  SORT AGGREGATE     |           |      1 |      1 |      1 |00:00:00.02 |     526 |       |       |          |
|*  2 |   HASH JOIN         |           |      1 |   1219 |    644 |00:00:00.01 |     526 |  2546K|  2546K|  788K (0)|
|*  3 |    TABLE ACCESS FULL| DIM_TABLE |      1 |      2 |      2 |00:00:00.01 |       6 |       |       |          |
|*  4 |    TABLE ACCESS FULL| T1        |      1 |   3048 |   3048 |00:00:00.01 |     519 |       |       |          |
----------------------------------------------------------------------------------------------------------------------
                                                                                                                      
Predicate Information (identified by operation id):                                                                   
---------------------------------------------------                                                                   
                                                                                                                      
   2 - access("T1"."TYPE_CODE"="DT"."TYPE_CODE")                                                                      
   3 - filter(("DT"."OBJECT_TYPE"='INDEX' OR "DT"."OBJECT_TYPE"='TABLE'))                                             
   4 - filter(("T1"."OWNER_CODE"=2 OR "T1"."OWNER_CODE"=3)) 

(EP1)                                                          

Now I am going to create the following extended statistics

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS( null,'dim_table',
  METHOD_OPT =>'FOR ALL COLUMNS SIZE 254 ' ||
                'FOR COLUMNS SIZE 254 (type_code,object_type)' );
DBMS_STATS.GATHER_TABLE_STATS( null,'t1',
  METHOD_OPT =>'FOR ALL COLUMNS SIZE 254 ' ||
                'FOR COLUMNS SIZE 254 (type_code,owner_code)' );
END;
/

Instead of creating virtual columns (extended statistics) that encode the filter-predicate values themselves, we have created extended statistics on the join and filter columns without specifying exact values, because we want to generalize the approach. Let`s look at the column statistics.

SELECT   table_name,
           column_name,
           num_distinct,
           histogram
    FROM   user_tab_col_statistics
   WHERE   table_name IN ('DIM_TABLE', 'T1')
ORDER BY   table_name

TABLE_NAME   COLUMN_NAME                              NUM_DISTINCT          HISTOGRAM
-----------  -------------------------------------    ----------------     --------------
DIM_TABLE    SYS_STUAA5GDQVG143JAEWWJJ8DEHC              5                  FREQUENCY
DIM_TABLE    OBJECT_TYPE                                 5                  FREQUENCY
DIM_TABLE    TYPE_CODE                                   5                  FREQUENCY
T1           OBJECT_NAME                             60608                  HYBRID
T1           OBJECT_ID                               73673                  HYBRID
T1           OWNER_CODE                                  5                  FREQUENCY
T1           TYPE_CODE                                   5                  FREQUENCY
T1           SYS_STUYID1$318FLYXAF1AITPI_H9             23                  FREQUENCY

The join and filter columns on their own do not allow us to estimate the join size: after applying the filters, the optimizer has no idea about the value range of the join columns. The extended statistics (ES) solve this problem. As you can see, we have two extended statistics here, DIM_TABLE.SYS_STUAA5GDQVG143JAEWWJJ8DEHC and T1.SYS_STUYID1$318FLYXAF1AITPI_H9, and their data distributions (there is a frequency histogram for each extended statistic) give us an opportunity to find the join columns` value ranges after applying the filter predicates and so to estimate the join size accurately.
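The mechanics of this can be sketched outside the database: with a frequency histogram on each side of the join, the join cardinality is the sum of frequency products over the join values that survive the filters on both sides. The frequencies below are hypothetical, purely for illustration:

```python
# Sketch of the idea (not Oracle internals): with a frequency histogram on
# each side of the join, the join cardinality is the sum of freq_a * freq_b
# over the join values present on both sides after filtering.
# All frequencies below are hypothetical, for illustration only.

def join_card(hist_a, hist_b):
    """Sum of frequency products over the common join values."""
    return sum(f * hist_b[v] for v, f in hist_a.items() if v in hist_b)

# hypothetical surviving rows per join value after the filters
t1_filtered  = {1: 300, 2: 350, 3: 100}   # t1.type_code -> row count
dim_filtered = {1: 1, 2: 1}               # dt.type_code -> row count

print(join_card(t1_filtered, dim_filtered))   # 300*1 + 350*1 = 650
```

The rest of the post is about obtaining exactly these per-value frequencies from the extended-statistics histograms.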
Let`s see their histogram data:

SELECT   endpoint_value val,
         endpoint_number - NVL (prev_endpoint, 0) freq,
         endpoint_number ep
  FROM   (SELECT   endpoint_number,
                   NVL (
                       LAG (endpoint_number, 1)
                           OVER (ORDER BY endpoint_number),
                       0)
                       prev_endpoint,
                   endpoint_value
            FROM   user_tab_histograms
           WHERE   table_name = 'DIM_TABLE'
                   AND column_name = 'SYS_STUAA5GDQVG143JAEWWJJ8DEHC')

       VAL       FREQ         EP
---------- ---------- ----------
1091848468          1          1
3944801947          1          2
6488719988          1          3
7817432188          1          4
8929074769          1          5

SELECT   endpoint_value val,
         endpoint_number - NVL (prev_endpoint, 0) freq,
         endpoint_number ep
  FROM   (SELECT   endpoint_number,
                   NVL (
                       LAG (endpoint_number, 1)
                           OVER (ORDER BY endpoint_number),
                       0)
                       prev_endpoint,
                   endpoint_value
            FROM   user_tab_histograms
           WHERE   table_name = 'T1'
                   AND column_name = 'SYS_STUYID1$318FLYXAF1AITPI_H9')

       VAL       FREQ         EP
---------- ---------- ----------
 331414609       4616       4616
1394367859         99       4715
1598248696       1520       6235
1716676118       2123       8358
1977102303       1650      10008
2498387478       6625      16633
2724409473        205      16838
2767490637        164      17002
3007640327         25      17027
3277363539          8      17035
3306944241         22      17057
3437869372        156      17213
3763665789        474      17687
3897101470        703      18390
6005663600         90      18480
7155777133      42233      60713
7410806291      12187      72900
7713001505        155      73055
7792531372         10      73065
7894566276        147      73212
8044813761        136      73348
9156127441        310      73658
9960197936         16      73674

23 rows selected.

But the column values are not directly usable until we apply the MOD(sys_op_combined_hash(val1, val2), 9999999999) expression. For example, if we are interested in finding the join column values of t1.type_code that survive the owner_code IN (2, 3) filter, we can use the following query:
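Conceptually, the virtual column folds each (type_code, owner_code) pair into one hashed value, and we probe the histogram with the hashes of the pairs allowed by the filter. sys_op_combined_hash is an internal Oracle function whose algorithm is not public, so the sketch below uses a stand-in hash only to illustrate the probing idea:

```python
# sys_op_combined_hash is an internal Oracle function; the stand-in hash
# below is NOT its algorithm and only illustrates the probing idea:
# each (join_value, filter_value) pair is folded into one hashed value,
# which is then looked up among the virtual column's histogram endpoints.
import hashlib

def combined_hash(*vals):                    # stand-in, NOT Oracle's algorithm
    digest = hashlib.sha256('|'.join(map(str, vals)).encode()).hexdigest()
    return int(digest, 16) % 9999999999      # mirrors the MOD(..., 9999999999)

# hashes of every (type_code, owner_code) pair allowed by owner_code IN (2, 3)
hist_keys = {combined_hash(tc, oc) for tc in range(1, 6) for oc in (2, 3)}

probe = combined_hash(3, 2)                  # type_code 3 paired with owner 2
print(probe in hist_keys)                    # True: this pair survives the filter
```

The SQL below does the same probing, but against the real histogram endpoints stored in user_tab_histograms.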

SELECT   *
                      FROM   (SELECT   endpoint_value t1_type_code,
                                       endpoint_number
                                       - NVL (prev_endpoint, 0)
                                           t1_type_code_freq,
                                       endpoint_number ep_t1_type_code
                                FROM   (SELECT   endpoint_number,
                                                 NVL (
                                                     LAG (
                                                         endpoint_number,
                                                         1)
                                                         OVER (
                                                             ORDER BY
                                                                 endpoint_number),
                                                     0)
                                                     prev_endpoint,
                                                 endpoint_value
                                          FROM   user_tab_histograms
                                         WHERE   table_name = 'T1'
                                                 AND column_name =
                                                        'TYPE_CODE'))
                             t1_hist_type_code,
                             (SELECT   endpoint_value t1_vc,
                                       endpoint_number
                                       - NVL (prev_endpoint, 0)
                                           t1_vc_freq,
                                       endpoint_number ep_t1_vc
                                FROM   (SELECT   endpoint_number,
                                                 NVL (
                                                     LAG (
                                                         endpoint_number,
                                                         1)
                                                         OVER (
                                                             ORDER BY
                                                                 endpoint_number),
                                                     0)
                                                     prev_endpoint,
                                                 endpoint_value
                                          FROM   user_tab_histograms
                                         WHERE   table_name = 'T1'
                                                 AND column_name =
                                                        'SYS_STUYID1$318FLYXAF1AITPI_H9'))
                             t1_hist_vc
                     WHERE   t1_hist_vc.t1_vc IN
                                     (MOD (
                                          sys_op_combined_hash (
                                              t1_hist_type_code.t1_type_code,
                                              2),
                                          9999999999),
                                      MOD (
                                          sys_op_combined_hash (
                                              t1_hist_type_code.t1_type_code,
                                              3),
                                          9999999999)
                                          )
T1_TYPE_CODE T1_TYPE_CODE_FREQ            EP      T1_VC         T1_VC_FREQ   EP_T1_VC
------------ -----------------      ----------   ----------     ----------   ----------
           1              2203            2203   7894566276     147          73212
           2              2869            5072   2724409473     205          16838
           3              7233           12305   1394367859     99           4715
           5             49158           73674   1716676118     2123         8358
           1              2203            2203   8044813761     136          73348
           2              2869            5072   3437869372     156          17213
           3              7233           12305   7792531372     10           73065
           4             12211           24516   3277363539     8            17035
           5             49158           73674   2767490637     164          17002

What does this SQL give us? It identifies the join column values that survive the filter predicate, together with their frequencies, which is exactly what we were looking for in order to estimate the join cardinality. T1_TYPE_CODE lists the join column values and T1_VC_FREQ their frequencies. In the same way we can find the corresponding data for dim_table, and finally we can produce the correct join cardinality with the following SQL:

---JoinCard.sql
SELECT   SUM (dm_vc_freq * t1_vc_freq) "JoinCard"
  FROM   (SELECT   *
            FROM   (SELECT   *
                      FROM   (SELECT   endpoint_number ep_dm_type_code ,endpoint_number-
                                                 NVL (
                                                     LAG (
                                                         endpoint_number,
                                                         1)
                                                         OVER (
                                                             ORDER BY
                                                                 endpoint_number),
                                                     0)
                                                     dm_type_code_freq,
                                                 endpoint_value dm_type_code
                                          FROM   user_tab_histograms
                                         WHERE   table_name = 'DIM_TABLE'
                                                 AND column_name =
                                                        'TYPE_CODE')
                             dm_hist_type_code,
                             (SELECT   endpoint_number ep_dm_vc,
                                                 endpoint_number-NVL (
                                                     LAG (
                                                         endpoint_number,
                                                         1)
                                                         OVER (
                                                             ORDER BY
                                                                 endpoint_number),
                                                     0)
                                                     dm_vc_freq,
                                                 endpoint_value dm_vc
                                          FROM   user_tab_histograms
                                         WHERE   table_name = 'DIM_TABLE'
                                                 AND column_name =
                                                        'SYS_STUAA5GDQVG143JAEWWJJ8DEHC')
                             dm_hist_vc
                     WHERE  /* this part contains values of filter predicates*/   
                          dm_hist_vc.dm_vc IN
                                     (MOD (
                                          sys_op_combined_hash (
                                              dm_hist_type_code.dm_type_code,
                                              'TABLE'),
                                          9999999999),
                                          MOD (
                                          sys_op_combined_hash (
                                              dm_hist_type_code.dm_type_code,
                                              'INDEX'),
                                          9999999999)
                                          )) dm_data,
                                          
                   (SELECT   *
                      FROM   ( SELECT   endpoint_number ep_t1_type_code,
                                                endpoint_number- NVL (
                                                     LAG (
                                                         endpoint_number,
                                                         1)
                                                         OVER (
                                                             ORDER BY
                                                                 endpoint_number),
                                                     0)
                                                     t1_type_code_freq,
                                                 endpoint_value t1_type_code
                                          FROM   user_tab_histograms
                                         WHERE   table_name = 'T1'
                                                 AND column_name =
                                                        'TYPE_CODE') 
                             t1_hist_type_code,
                             ( SELECT   endpoint_number ep_t1_vc,
                                                endpoint_number- NVL (
                                                     LAG (
                                                         endpoint_number,
                                                         1)
                                                         OVER (
                                                             ORDER BY
                                                                 endpoint_number),
                                                     0)
                                                     t1_vc_freq,
                                                 endpoint_value t1_vc
                                          FROM   user_tab_histograms
                                         WHERE   table_name = 'T1'
                                                 AND column_name =
                                                        'SYS_STUYID1$318FLYXAF1AITPI_H9' )
                             t1_hist_vc
                     WHERE   /* this part contains values of filter predicates*/
t1_hist_vc.t1_vc IN
                                     (MOD (
                                          sys_op_combined_hash (
                                              t1_hist_type_code.t1_type_code,
                                              2),
                                          9999999999),
                                      MOD (
                                          sys_op_combined_hash (
                                              t1_hist_type_code.t1_type_code,
                                              3),
                                          9999999999)
                                          )) t1_data
           WHERE   dm_data.dm_type_code = t1_data.t1_type_code)
  JoinCard
----------
       644

SQL>

(SQL2)

As seen in the (EP1) execution plan, the actual row count was 644, and we calculated exactly the same 644 using SQL2.
If we want to run the SQL for different filter values, we only need to plug those values into SQL2 and we will get the proper join cardinality. It means that, in this case, creating additional virtual columns per filter value is not required.
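The arithmetic of JoinCard.sql can be double-checked by hand from the frequencies listed earlier: the dim_table side keeps one row each for 'TABLE' (type_code 1) and 'INDEX' (type_code 2), and the t1 side contributes the virtual-column frequencies grouped by their type_code join value. A small sketch of that sum:

```python
# Re-checking JoinCard.sql by hand with the frequencies shown above:
# the dim_table side keeps one row each for ('TABLE', type_code 1) and
# ('INDEX', type_code 2); the t1 side contributes the virtual-column
# frequencies grouped by their type_code join value.
dm_rows = [(1, 1), (2, 1)]                 # (type_code, dm_vc_freq)
t1_rows = [(1, 147), (1, 136),             # (type_code, t1_vc_freq)
           (2, 205), (2, 156),
           (3, 99), (3, 10),
           (4, 8),
           (5, 2123), (5, 164)]

join_card = sum(df * tf
                for dc, df in dm_rows
                for tc, tf in t1_rows
                if dc == tc)
print(join_card)                           # 644, matching A-Rows in EP1
```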
But our work is not complete yet. We have only computed a number for the join cardinality; the optimizer is still not aware of it. In this case I recommend using the OPT_ESTIMATE hint to influence the optimizer. Yes, you can say that it has not been documented yet. Right, but I cannot say the hint is unusable: if you look at the MOS note "How To Use Optimizer Hints To Specify Cardinality For Join Operation (Doc ID 2402821.1)", you will see that support actually recommends using the OPT_ESTIMATE hint.
As a result:

SELECT   /*+ opt_estimate(join, (t1,dt), rows=644) */
       *
  FROM   t1, dim_table dt
 WHERE       t1.type_code = dt.type_code
         AND dt.object_type IN ('TABLE', 'INDEX')
         AND t1.owner_code IN (2, 3);

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

Plan hash value: 1679886631                                                                                          
                                                                                                                     
---------------------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |      1 |        |    644 |00:00:00.01 |     527 |       |       |          |
|*  1 |  HASH JOIN         |           |      1 |    644 |    644 |00:00:00.01 |     527 |  1856K|  1856K|  895K (0)|
|*  2 |   TABLE ACCESS FULL| DIM_TABLE |      1 |      2 |      2 |00:00:00.01 |       6 |       |       |          |
|*  3 |   TABLE ACCESS FULL| T1        |      1 |   3048 |   3048 |00:00:00.01 |     520 |       |       |          |
---------------------------------------------------------------------------------------------------------------------
                                                                                                                     
Predicate Information (identified by operation id):                                                                  
---------------------------------------------------                                                                  
                                                                                                                     
   1 - access("T1"."TYPE_CODE"="DT"."TYPE_CODE")                                                                     
   2 - filter(("DT"."OBJECT_TYPE"='INDEX' OR "DT"."OBJECT_TYPE"='TABLE'))                                            
   3 - filter(("T1"."OWNER_CODE"=2 OR "T1"."OWNER_CODE"=3))   

Summary

In this blog post we saw another strategy for estimating join size without creating additional virtual columns per filter value. In our example there are frequency histograms for both join columns, and SQL2 can be generalized to other combinations of histogram types. This approach completely solves cardinality estimation for two-table joins. In addition, we saw the power of extended statistics and their usability. As you may have noticed, in this and the previous case we avoided the general join cardinality formula; in effect, we avoided the join selectivity factor. It should be noted that when more than two tables are involved in a join it is not easy to achieve a correct join cardinality, and that is why join selectivity plays an important role: in that case the optimizer calculates JS and uses it during join processing. I will write about an approach for better join cardinality estimation for joins of more than two tables in a future blog post.

October 2, 2018

Join Skew Handling Subject to Filter Conditions

Filed under: Cardinality Estimation,CBO,Join,Skew,Tuning — Chinar Aliyev @ 1:52 pm

I wrote about this problem in my previous post, but the problem being raised again by Jonathan Lewis forced me to think about other possible solutions. The solution suggested in Jonathan`s blog is to apply the fact table`s column distribution to the dimension table`s column. In my opinion, if faked statistics make a column`s distribution no longer express its actual data, then it is not a good idea to apply the method to production systems. As he said, the method does not work as expected from 12.2 onward: for some predicate values it works, but for others it does not.
In this blog post I am going to describe another possible and supported way to correct the cardinality estimation of the join.
Let`s create the test case again

// tested in Oracle Database version: 18.3.0.0.0, 12.2.0.1.0, 12.1.0.2.0

create table dim_table (type_code number, object_type varchar2(10));
insert into dim_table values (1,'TABLE');
insert into dim_table values (2,'INDEX');
insert into dim_table values (3,'VIEW');
insert into dim_table values (4,'SYNONYM');
insert into dim_table values (5,'OTHER');
commit;
-- I do not create primary index for dim_table, because we consider the case for general purpose

exec dbms_stats.gather_table_stats(user,'dim_table');

create table t1 
nologging
as 
select 
        object_id, object_name, 
        decode(object_type, 'TABLE',1,'INDEX',2,'VIEW',3,'SYNONYM',4,5) type_code 
from 
        all_objects
where
        rownum <= 50000;
  
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
  
exec dbms_stats.gather_table_stats(null, 't1', cascade=>true,method_opt=>'for all columns size 254');

The value of type_code is 1 for object_type = 'TABLE', so:

explain plan for
select  *
from
        t1
where
        t1.type_code = 1; 

(SQL 1)

select * from table(dbms_xplan.display(null,null,null));

Plan hash value: 3617692013                                               
                                                                          
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 14336 |   602K|   726   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   | 14336 |   602K|   726   (1)| 00:00:01 |
--------------------------------------------------------------------------
                                                                          
Predicate Information (identified by operation id):                       
---------------------------------------------------                       
                                                                          
   1 - filter("T1"."TYPE_CODE"=1)   
  EP1

It means in my case, the cardinality of the following join should be calculated as 14336, but:

explain plan for
select 
        t1.object_id
from
        t1, dim_table dt
where
        t1.type_code   = dt.type_code 
and     dt.object_type = 'TABLE';

(SQL 2)

select * from table(dbms_xplan.display(null,null,null));
Plan hash value: 1679886631                                                     
                                                                                
--------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           | 80000 |  1328K|   728   (1)| 00:00:01 |
|*  1 |  HASH JOIN         |           | 80000 |  1328K|   728   (1)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| DIM_TABLE |     1 |     9 |     2   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T1        |   400K|  3125K|   725   (1)| 00:00:01 |
--------------------------------------------------------------------------------
                                                                                
Predicate Information (identified by operation id):                             
---------------------------------------------------                             
                                                                                
   1 - access("T1"."TYPE_CODE"="DT"."TYPE_CODE")                                
   2 - filter("DT"."OBJECT_TYPE"='TABLE')                                       
EP2

As mentioned in the previous blog post, the optimizer does not estimate the cardinality properly here due to several factors. First of all, we cannot say that the optimizer ignores the column distribution when estimating the join size. On the contrary, it calculates the join selectivity (JS) based on the join columns` distributions, so the histogram data of the join columns is used to calculate JS and, in turn, the join cardinality. Let`s prove it, and then we will return to our base query.

create table dim_table2 (type_code number, object_type varchar2(10));
 
insert into dim_table2 values (1,'TABLE');
insert into dim_table2 values (1,'TABLE');
insert into dim_table2 values (2,'INDEX');
insert into dim_table2 values (3,'VIEW');
insert into dim_table2 values (4,'SYNONYM');
insert into dim_table2 values (5,'OTHER');

commit;
 

exec dbms_stats.gather_table_stats(user,'dim_table2',method_opt=>'for all columns size skewonly');

explain plan for
select 
        t1.object_id
from
        t1, dim_table2 dt
where
        t1.type_code   = dt.type_code 
and     dt.object_type = 'TABLE';

(SQL 3)

select * from table(dbms_xplan.display(null,null,null));

Plan hash value: 1964253718                                                      
                                                                                 
---------------------------------------------------------------------------------
| Id  | Operation          | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |            |   138K|  2292K|   728   (1)| 00:00:01 |
|*  1 |  HASH JOIN         |            |   138K|  2292K|   728   (1)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| DIM_TABLE2 |     2 |    18 |     2   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T1         |   400K|  3125K|   725   (1)| 00:00:01 |
---------------------------------------------------------------------------------
                                                                                 
Predicate Information (identified by operation id):                              
---------------------------------------------------                              
                                                                                 
   1 - access("T1"."TYPE_CODE"="DT"."TYPE_CODE")                                 
   2 - filter("DT"."OBJECT_TYPE"='TABLE')
EP3

As you can see, although the number of distinct values (NDV) of the columns has not changed, the join cardinality has. NDV is 5 for both columns, which means the join cardinality should be

JCARD=Filtered_CARD(DIM_TABLE2)* Filtered_CARD(T1)/5=2*400k/5=160000. 

But it is not 160K; it was calculated as 138K! The reason is that, due to the histograms, JS is calculated from the frequency information rather than just from the NDVs.
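The difference between an NDV-only estimate and a histogram-aware one can be sketched as follows; the frequencies are hypothetical and chosen only to show that the two estimates diverge even with identical NDVs:

```python
# NDV-only estimate vs histogram-aware estimate. The per-value frequencies
# are hypothetical, chosen only to show the two estimates diverge even
# though NDV is 5 on both sides.
t1_freq = {1: 100_000, 2: 150_000, 3: 50_000, 4: 60_000, 5: 40_000}
d2_freq = {1: 2, 2: 1, 3: 1, 4: 1, 5: 1}     # 'TABLE' appears twice

n1, n2, ndv = sum(t1_freq.values()), sum(d2_freq.values()), 5

ndv_estimate  = n1 * n2 / ndv                # card1 * card2 / NDV
hist_estimate = sum(f * d2_freq[v] for v, f in t1_freq.items())

print(ndv_estimate, hist_estimate)           # 480000.0 500000
```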
So, if the optimizer does consider column distribution, what was wrong with the (SQL 2) statement? The answer is the filter predicates. They directly influence the join cardinality, while JS does not depend on the filter predicates. I pointed this out two years ago in my Hotsos Symposium 2016 paper. It means that if we find a way to remove the filter predicates from SQL2, so that their influence on the join cardinality estimation is completely eliminated, we will attain the desired result.
But how could we achieve it? The first thing that came to my mind was using extended statistics and rewriting the SQL statement. I thought it should work. So, what have I done? I added the following virtual column to the table and gathered a histogram for it.

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS( null,'dim_table',
  METHOD_OPT => 'FOR ALL COLUMNS SIZE 254 ' ||
                'FOR COLUMNS SIZE 254 (CASE "OBJECT_TYPE" WHEN ''TABLE'' THEN "TYPE_CODE"  END )' );
END;
/

select column_name,
       num_distinct,
       num_buckets,
       histogram from user_tab_col_statistics
where table_name='DIM_TABLE'

COLUMN_NAME                          NUM_DISTINCT NUM_BUCKETS  HISTOGRAM
------------------------------------ ------------ -----------  -----------
TYPE_CODE                            5            5            FREQUENCY
OBJECT_TYPE                          5            5            FREQUENCY
SYS_STU#S#WF25Z#QAHIHE#MOFFMM_       1            1            FREQUENCY

As you can see, we have a frequency histogram for each column, including our virtual column. Having histograms on these columns is very important in order to take advantage of them when calculating the join cardinality (JC) correctly.
Now we can rewrite the original SQL2 as one of the following SQL statements and check the estimate.

SELECT   t1.object_id
  FROM   t1, dim_table dt
 WHERE   t1.type_code =
             CASE dt.object_type 
             WHEN 'TABLE' THEN dt.type_code 
             END;
 (SQL4)

OR 

SELECT   t1.object_id
  FROM   t1, dim_table dt
 WHERE   t1.type_code = dt."SYS_STU#S#WF25Z#QAHIHE#MOFFMM_"; 
 (SQL5)

SQL2, SQL4 and SQL5 are semantically equivalent queries. The explain plans of SQL4 and SQL5 are the same:

Plan hash value: 1679886631                                                     
                                                                                
--------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           | 80000 |   781K|   728   (1)| 00:00:01 |
|*  1 |  HASH JOIN         |           | 80000 |   781K|   728   (1)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| DIM_TABLE |     1 |     2 |     2   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T1        |   400K|  3125K|   725   (1)| 00:00:01 |
--------------------------------------------------------------------------------
                                                                                
Predicate Information (identified by operation id):                             
---------------------------------------------------                             
                                                                                
   1 - access("T1"."TYPE_CODE"=CASE "OBJECT_TYPE" WHEN 'TABLE' THEN             
              "TYPE_CODE" END )                                                 
   2 - filter(CASE "OBJECT_TYPE" WHEN 'TABLE' THEN "TYPE_CODE" END  IS          
              NOT NULL)   
EP4

Unfortunately, we did not get the expected cardinality. What is the reason? Is something wrong with this method?
Let`s look at the optimizer trace file to catch the problem.

***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: DIM_TABLE  Alias: DT
  #Rows: 5  SSZ: 0  LGR: 0  #Blks:  1  AvgRowLen:  10.00  NEB: 0  ChainCnt:  0.00  ScanRate:  0.00  SPC: 0  RFL: 0  RNF: 0  CBK: 0  CHR: 0  KQDFLG: 1
  #IMCUs: 0  IMCRowCnt: 0  IMCJournalRowCnt: 0  #IMCBlocks: 0  IMCQuotient: 0.000000
  Column (#3): SYS_STU#S#WF25Z#QAHIHE#MOFFMM_(NUMBER)
    AvgLen: 2 NDV: 1 Nulls: 4 Density: 0.500000 Min: 1.000000 Max: 1.000000
***********************
Table Stats::
  Table: T1  Alias: T1
  #Rows: 400000  SSZ: 0  LGR: 0  #Blks:  2664  AvgRowLen:  43.00  NEB: 0  ChainCnt:  0.00  ScanRate:  0.00  SPC: 0  RFL: 0  RNF: 0  CBK: 0  CHR: 0  KQDFLG: 1
  #IMCUs: 0  IMCRowCnt: 0  IMCJournalRowCnt: 0  #IMCBlocks: 0  IMCQuotient: 0.000000
  Column (#3): TYPE_CODE(NUMBER)
    AvgLen: 3 NDV: 5 Nulls: 0 Density: 0.017180 Min: 1.000000 Max: 5.000000
    Histogram: Freq  #Bkts: 5  UncompBkts: 400000  EndPtVals: 5  ActualVal: yes
try to generate single-table filter predicates from ORs for query block SEL$1 (#0)
finally: "T1"."TYPE_CODE"=CASE "DT"."OBJECT_TYPE" WHEN 'TABLE' THEN "DT"."TYPE_CODE" END  AND CASE "DT"."OBJECT_TYPE" WHEN 'TABLE' THEN "DT"."TYPE_CODE" END  IS NOT NULL

It is seen that the optimizer detected the virtual column for join processing, but something is missing: the trace file displays all the basic column statistics for the virtual column, yet it says nothing about its frequency histogram. Why? Because the histogram has only one bucket and the virtual column has only one distinct value, so the optimizer decided it was not necessary to look at the histogram data.
So, what can we do next?
We have one distinct value and four nulls for the column, so let`s replace the null values with another value (one not present in the table) in order to create more buckets for the histogram. Such a histogram will prompt the optimizer to consider the frequency information it contains:

BEGIN
    DBMS_STATS.drop_extended_stats (
        NULL,
        'dim_table',
        '(CASE "OBJECT_TYPE" WHEN ''TABLE'' THEN "TYPE_CODE" END )');
END;
/
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS( null,'dim_table',
  METHOD_OPT => 'FOR ALL COLUMNS SIZE 254 ' ||
                'FOR COLUMNS SIZE 254 (CASE "OBJECT_TYPE" WHEN ''TABLE'' THEN "TYPE_CODE" ELSE (-1) END )' );
END;
/

select column_name,
       num_distinct,
       num_buckets,
       histogram from user_tab_col_statistics
where table_name='DIM_TABLE';

COLUMN_NAME                          NUM_DISTINCT NUM_BUCKETS  HISTOGRAM
------------------------------------ ------------ -----------  -----------
TYPE_CODE                            5            5            FREQUENCY
OBJECT_TYPE                          5            5            FREQUENCY
SYS_STUE_0VSFXQ_4T7QQLFLMZRDPV       2            2            FREQUENCY

Now, let`s see the execution plan of the modified query:

explain plan for
SELECT   t1.object_id
  FROM   t1, dim_table dt
 WHERE   t1.type_code =
             CASE dt.object_type 
             WHEN 'TABLE' THEN dt.type_code else -1 
             END;
select * from table(dbms_xplan.display);
Plan hash value: 1679886631                                                     
                                                                                
--------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           | 14336 |   168K|   728   (1)| 00:00:01 |
|*  1 |  HASH JOIN         |           | 14336 |   168K|   728   (1)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| DIM_TABLE |     5 |    20 |     2   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T1        |   400K|  3125K|   725   (1)| 00:00:01 |
--------------------------------------------------------------------------------
                                                                                
Predicate Information (identified by operation id):                             
---------------------------------------------------                             
                                                                                
   1 - access("T1"."TYPE_CODE"=CASE "OBJECT_TYPE" WHEN 'TABLE' THEN             
              "TYPE_CODE" ELSE (-1) END )                                       
  EP5
explain plan for
SELECT   t1.object_id
  FROM   t1, dim_table dt
 WHERE   t1.type_code = dt."SYS_STUE_0VSFXQ_4T7QQLFLMZRDPV"; 
 select * from table(dbms_xplan.display);
Plan hash value: 1679886631                                                     
                                                                                
--------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           | 14336 |   168K|   728   (1)| 00:00:01 |
|*  1 |  HASH JOIN         |           | 14336 |   168K|   728   (1)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| DIM_TABLE |     5 |    20 |     2   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T1        |   400K|  3125K|   725   (1)| 00:00:01 |
--------------------------------------------------------------------------------
                                                                                
Predicate Information (identified by operation id):                             
---------------------------------------------------                             
                                                                                
   1 - access("T1"."TYPE_CODE"=CASE "OBJECT_TYPE" WHEN 'TABLE' THEN             
              "TYPE_CODE" ELSE (-1) END )   

As you see, the last explain plans do not contain the filter predicate anymore, and we have got the correct cardinality. The trace file also shows that the optimizer used the histogram of the virtual column for join processing:

***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: DIM_TABLE  Alias: DT
  #Rows: 5  SSZ: 0  LGR: 0  #Blks:  1  AvgRowLen:  13.00  NEB: 0  ChainCnt:  0.00  ScanRate:  0.00  SPC: 0  RFL: 0  RNF: 0  CBK: 0  CHR: 0  KQDFLG: 1
  #IMCUs: 0  IMCRowCnt: 0  IMCJournalRowCnt: 0  #IMCBlocks: 0  IMCQuotient: 0.000000
  Column (#3): SYS_STUE_0VSFXQ_4T7QQLFLMZRDPV (NUMBER)
    AvgLen: 4 NDV: 2 Nulls: 0 Density: 0.100000 Min: -1.000000 Max: 1.000000
    Histogram: Freq  #Bkts: 2  UncompBkts: 5  EndPtVals: 2  ActualVal: yes
***********************
Table Stats::
  Table: T1  Alias: T1
  #Rows: 400000  SSZ: 0  LGR: 0  #Blks:  2664  AvgRowLen:  43.00  NEB: 0  ChainCnt:  0.00  ScanRate:  0.00  SPC: 0  RFL: 0  RNF: 0  CBK: 0  CHR: 0  KQDFLG: 1
  #IMCUs: 0  IMCRowCnt: 0  IMCJournalRowCnt: 0  #IMCBlocks: 0  IMCQuotient: 0.000000
  Column (#3): TYPE_CODE(NUMBER)
    AvgLen: 3 NDV: 5 Nulls: 0 Density: 0.017180 Min: 1.000000 Max: 5.000000
    Histogram: Freq  #Bkts: 5  UncompBkts: 400000  EndPtVals: 5  ActualVal: yes
try to generate single-table filter predicates from ORs for query block SEL$1 (#0)
finally: "T1"."TYPE_CODE"=CASE "DT"."OBJECT_TYPE" WHEN 'TABLE' THEN "DT"."TYPE_CODE" ELSE (-1) END 

Note: there is another way to solve the problem and will be published in a future blog post, stay tuned 🙂

September 20, 2018

A reason why SQL Plan Baseline is not used

Filed under: Bug,CBO,SQL Plan Baseline — Chinar Aliyev @ 7:15 am

Recently, I have seen a case where a SQL Plan Baseline (SPB) was not being used by the optimizer. No changes had been made (at the schema level or at the system level), but the optimizer simply ignored the existing SPB and switched to a cost-based decision in order to produce an execution plan for the SQL statement. It could also be seen that the dba_sql_plan_baselines.reproduced column was set to NO for this SPB. The SPB had been created for the SQL several months earlier and had been used by the optimizer as expected, but recently the optimizer started to ignore it. In addition, we had no chance to change the SQL text, as it is issued by a specific application. Environment: OS – Linux x86 64-bit, Oracle DB – 12.1.0.2.0.
Let`s see the SQL and its execution plans (without the predicate section).
The SQL text is something like:

SELECT NVL (SUM (U.VAL), 0) VALUE
  FROM (SELECT -A.VAL , B.DAT
          FROM TE A, TD B
         WHERE     A.BR = :B3
               AND A.DBT = :B2
               AND A.DC > :B1
               AND B.BR = A.BR
               AND B.DC = A.DC
               AND B.NDC IS NULL
        UNION ALL
        SELECT A.VAL, B.DAT
          FROM TE A, TD B
         WHERE     A.BR = :B3
               AND A.CRD = :B2
               AND A.DC > :B1
               AND B.BR = A.BR
               AND B.DC = A.DC
               AND B.NDC IS NULL) U
 WHERE U.DAT >= :B4

Current execution plan:

Plan hash value: 664049386                                                                                       
                                                                                                                 
---------------------------------------------------------------------------------------------------------------- 
| Id  | Operation                                | Name                | Rows  | Bytes | Cost (%CPU)| Time     | 
---------------------------------------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT                         |                     |       |       |    14 (100)|          | 
|   1 |  SORT AGGREGATE                          |                     |     1 |    59 |            |          | 
|   2 |   NESTED LOOPS                           |                     |     1 |    59 |    14   (0)| 00:00:01 | 
|   3 |    NESTED LOOPS                          |                     |     2 |    59 |    14   (0)| 00:00:01 | 
|   4 |     VIEW                                 | VW_JF_SET$5F0EEC53  |     2 |    78 |    10   (0)| 00:00:01 | 
|   5 |      UNION-ALL                           |                     |       |       |            |          | 
|   6 |       TABLE ACCESS BY INDEX ROWID BATCHED| TE                  |     1 |    33 |     5   (0)| 00:00:01 | 
|   7 |        INDEX RANGE SCAN                  | IDX_TE_ENTGR        |     1 |       |     4   (0)| 00:00:01 | 
|   8 |       TABLE ACCESS BY INDEX ROWID BATCHED| TE                  |     1 |    29 |     5   (0)| 00:00:01 | 
|   9 |        INDEX RANGE SCAN                  | IDX_TE_ENTGR        |     1 |       |     4   (0)| 00:00:01 | 
|  10 |     INDEX UNIQUE SCAN                    | IDX_TD_D            |     1 |       |     2   (0)| 00:00:01 | 
|  11 |    TABLE ACCESS BY INDEX ROWID           | TD                  |     1 |    20 |     2   (0)| 00:00:01 | 
---------------------------------------------------------------------------------------------------------------- 
                                                                                                                 
Note                                                                                                             
-----                                                                                                            
   - this is an adaptive plan      

Execution plan from SPB.

select * from table(dbms_xplan.display_sql_plan_baseline('SQL_74d0023a19b45cf8'))   
                                                                                                                 
Plan hash value: 443824370                                                                                       
                                                                                                                 
-----------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                      |       |       |    85 (100)|          |
|   1 |  SORT AGGREGATE                          |                      |     1 |    59 |            |          |
|   2 |   NESTED LOOPS                           |                      |     1 |    59 |    85   (0)| 00:00:01 |
|   3 |    NESTED LOOPS                          |                      |     2 |    59 |    85   (0)| 00:00:01 |
|   4 |     VIEW                                 | VW_JF_SET$5F0EEC53   |     2 |    78 |    79   (0)| 00:00:01 |
|   5 |      UNION-ALL                           |                      |       |       |            |          |
|   6 |       TABLE ACCESS BY INDEX ROWID BATCHED| TE                   |     1 |    33 |    42   (0)| 00:00:01 |
|   7 |        INDEX RANGE SCAN                  | IDX_TE_D             |    64 |       |     4   (0)| 00:00:01 |
|   8 |       TABLE ACCESS BY INDEX ROWID BATCHED| TE                   |     1 |    29 |    37   (0)| 00:00:01 |
|   9 |        INDEX RANGE SCAN                  | IDX_TE_C             |    57 |       |     4   (0)| 00:00:01 |
|  10 |     INDEX UNIQUE SCAN                    | IDX_TD_D             |     1 |       |     2   (0)| 00:00:01 |
|  11 |    TABLE ACCESS BY INDEX ROWID           | TD                   |     1 |    20 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------------    

As you see, both execution plans show that the database performs join factorization in order to eliminate an extra scan of the TD table, but they use different indexes. Of course, the cost of the SPB plan is greater than that of the current execution plan. I am not going to talk here about why that is, or about the cardinality calculations of the index scans. But the optimizer should select the SPB even if its cost is higher than that of the other available plans.
To understand the problem we need to trace SPM, which is part of the optimizer trace. If the SQL statement is in the library cache then we can use the dbms_sqldiag.dump_trace procedure to obtain a trace for the existing cursor; otherwise we can use alter system set events ‘trace [rdbms.SQL_Optimizer.*][sql:anb0kap085nf8]’ for that purpose.

begin
       dbms_sqldiag.dump_trace(p_sql_id=>'anb0kap085nf8',
                               p_child_number=>2,  
                               p_component=>'Compiler',
                               p_file_id=>'OPTRC');
end;
 /
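
If you are not sure which child cursors exist (and hence which p_child_number to pass), a quick look at v$sql helps; this query is my addition, not from the original investigation:

-- List the child cursors for the statement to pick the one that ignored the SPB.
SELECT child_number,
       plan_hash_value,
       sql_plan_baseline,    -- NULL when the baseline was not used
       executions
  FROM v$sql
 WHERE sql_id = 'anb0kap085nf8';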

We want to dump the execution plan of child number 2, because for that child the optimizer did not choose the SPB.
Now let`s see the content of the trace file. The following lines indicate that a matching plan was found in the SQL Plan Management Base and the optimizer tried to reproduce it.

Registered qb: SEL$3 0x66ce1d20 (PARSER)
---------------------
QUERY BLOCK SIGNATURE
---------------------
  signature (): qb_name=SEL$3 nbfros=2 flg=0
    fro(0): flg=4 objn=79226 hint_alias="A"@"SEL$3"
    fro(1): flg=4 objn=79220 hint_alias="B"@"SEL$3"

SPM: statement found in SMB
..........................
SPM: finding a match for the generated plan, planId = ...
SPM: fixed planId`s of plan baseline are: 490788028 
SPM: using qksan to reproduce, cost and select accepted plan, sig = 7420456529887702107
SPM: plan reproducibility round 1 (plan outline + session OFE)
SPM: using qksan to reproduce accepted plan, planId = 490788028
  Bind List follows:
.....

But the following lines of the trace show that the database failed to reproduce it due to the error ORA-00942: table or view does not exist.

 
SPM: recursive compile failed so skipping current accepted plan, planName = SQL_PLAN_6dyq1ws7kyb2v1d40d4bc, sig = 7420456529887702107
SPM: OCIcode = 942, OCImesg = ORA-00942: table or view does not exist

SPM: current user id=74, name=...
SPM: login user id=3406, name=...
SPM: schema user id=74, name=...
SPM: stmt with r. compile error: SELECT NVL(SUM(U.VAL), 0) VALUE FROM (SELECT -A.VAL , B.DAT FROM TE A, TD B WHERE A.BR = :B3 AND A.DBT = :B2 AND A.DC > :B1 AND B.BR = A.BR AND B.DC = A.DC AND B.NDC IS NULL UNION ALL SELECT A.VAL, B.DAT FROM TE A, TD B WHERE A.BR = :B3 AND A.CRD = :B2 AND A.DC > :B1 AND B.BR = A.BR AND B.DC = A.DC AND B.NDC IS NULL) U WHERE U.DAT >= :B4 
SPM: plan reproducibility round 2 (hinted OFE only)
SPM: using qksan to reproduce accepted plan, planId = 490788028
  Bind List follows:
....
SPM: recursive compile failed so skipping current accepted plan, planName = SQL_PLAN_6dyq1ws7kyb2v1d40d4bc, sig = 7420456529887702107
SPM: OCIcode = 942, OCImesg = ORA-00942: table or view does not exist

SPM: current user id=74, name=...
SPM: login user id=3406, name=...
SPM: schema user id=74, name=...
SPM: stmt with r. compile error: SELECT NVL(SUM(U.VAL), 0) VALUE FROM (SELECT -A.VAL , B.DAT FROM TE A, TD B WHERE A.BR = :B3 AND A.DBT = :B2 AND A.DC > :B1 AND B.BR = A.BR AND B.DC = A.DC AND B.NDC IS NULL UNION ALL SELECT A.VAL, B.DAT FROM TE A, TD B WHERE A.BR = :B3 AND A.CRD = :B2 AND A.DC > :B1 AND B.BR = A.BR AND B.DC = A.DC AND B.NDC IS NULL) U WHERE U.DAT >= :B4 

Since reproduction failed, the optimizer set the reproduced status to NO (visible in dba_sql_plan_baselines.reproduced) and fell back to generating the execution plan via a cost-based decision.

SPM: change REPRODUCED status to NO, planName = SQL_PLAN_6dyq1ws7kyb2v8b690358
SPM: REPRODUCED status changes: cntRepro = 1, bitvecRepro = 000
SPM: couldn`t reproduce any enabled+accepted plan so using the cost-based plan, planId = 3220595646
SPM: re-parse to use selected accepted plan, planId = 3220595646
kkfdapdml:0 oct=3, pgadep=2, pgapls=1, uid=74, recursive ok? No

SPM: finding a match for the generated plan, planId = 4180781194
SPM: failed to reproduce selected accepted plan on re-parse, planId = 4180781194
------- START SPM Plan Dump -------
SPM: failed to reproduce the plan using the following info:
  parse_schema name        : ...
  plan_baseline signature  : 7420456529887702107
  plan_baseline plan_id    : 2338915160
SPM: generated non-matching plan:
----- Explain Plan Dump -----
----- Plan Table -----
 
============
Plan Table
============
-------------------------------------------------------------------------+-----------------------------------+
| Id  | Operation                                   | Name               | Rows  | Bytes | Cost  | Time      |
-------------------------------------------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT                            |                    |       |       |    14 |           |
| 1   |  SORT AGGREGATE                             |                    |     1 |    59 |       |           |
| 2   |   HASH JOIN                                 |                    |     1 |    59 |    14 |  00:00:01 |
| 3   |    NESTED LOOPS                             |                    |     1 |    59 |    14 |  00:00:01 |
| 4   |     NESTED LOOPS                            |                    |     2 |    59 |    14 |  00:00:01 |
| 5   |      STATISTICS COLLECTOR                   |                    |       |       |       |           |
| 6   |       VIEW                                  | VW_JF_SET$5F0EEC53 |     2 |    78 |    10 |  00:00:01 |
| 7   |        UNION-ALL                            |                    |       |       |       |           |
| 8   |         TABLE ACCESS BY INDEX ROWID BATCHED | TE                 |     1 |    33 |     5 |  00:00:01 |
| 9   |          INDEX RANGE SCAN                   | IDX_TE_ENTGR       |     1 |       |     4 |  00:00:01 |
| 10  |         TABLE ACCESS BY INDEX ROWID BATCHED | TENTRY             |     1 |    29 |     5 |  00:00:01 |
| 11  |          INDEX RANGE SCAN                   | IDX_TE_ENTGR       |     1 |       |     4 |  00:00:01 |
| 12  |      INDEX UNIQUE SCAN                      | IDX_TD_D           |     1 |       |     2 |  00:00:01 |
| 13  |     TABLE ACCESS BY INDEX ROWID             | TD                 |     1 |    20 |     2 |  00:00:01 |
| 14  |    TABLE ACCESS BY INDEX ROWID BATCHED      | TD                 |     1 |    20 |     2 |  00:00:01 |
| 15  |     INDEX SKIP SCAN                         | IDX_TD_DAT         |     1 |       |     2 |  00:00:01 |
-------------------------------------------------------------------------+-----------------------------------+
Predicate Information:
----------------------
2 - access("B"."BR"="ITEM_2" AND "B"."DC"="ITEM_1")
8 - filter("A"."DBT"=:B2)
9 - access("A"."BR"=:B3 AND "A"."DC">:B1)
10 - filter("A"."CRD"=:B2)
11 - access("A"."BR"=:B3 AND "A"."DC">:B1)
12 - access("B"."BR"="ITEM_2" AND "B"."DC"="ITEM_1")
13 - filter(("B"."DAT">=:B4 AND "B"."NDC" IS NULL))
14 - filter("B"."NDC" IS NULL)
15 - access("B"."DAT">=:B4)
15 - filter("B"."DAT">=:B4)
....
Content of other_xml column
===========================
  db_version     : 12.1.0.2
  parse_schema   : ...
  plan_hash_full : 1239650874
  plan_hash      : 664049386
  plan_hash_2    : 4180781194        
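
The changed status can be confirmed from the dictionary (my own check, not from the trace file):

-- Confirm which accepted plans the optimizer failed to reproduce.
SELECT plan_name, enabled, accepted, reproduced
  FROM dba_sql_plan_baselines
 WHERE sql_handle = 'SQL_74d0023a19b45cf8';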

The next question is why we got the ORA-00942: table or view does not exist error, and what caused it.
To find out, I also enabled SQL Trace for the session that I used to generate the optimizer trace file via dbms_sqldiag.dump_trace, and I found the following information in the combined trace file:

kkoadsTimeLimitFromSrc(Exit) timeLimit=0
kkoadsTimeLimit: source=Voodoo timeLimit=10
kkoadsMaxTabCount: source=auto maxTabCnt=2
  kkoadsSetTimeSlice: #candTabs=4 slice=625
=====================
PARSE ERROR #140084361341696:len=296 dep=3 uid=74 oct=2 lid=74 tim=9296515611962 err=942
insert into ora_debug_table
       (time, txt0, txt1, txt2, txt3, txt4,
        num0, num1, num2, num3, num4, num5, num6, num7, num8, num9)
values (:1,:2,:3,:4,:5,:6,:7,:8,:9,:10,:11,:12,:13,:14,:15,:16)
WAIT #140084361341696: nam='Disk file operations I/O' ela= 62 FileOperation=8 fileno=0 filetype=8 obj#=68 tim=9296515612111
CLOSE #140084361341696:c=0,e=6,dep=3,type=0,tim=9296515612417
CLOSE #140084361348056:c=0,e=54,dep=2,type=0,tim=9296515612457
CLOSE #140084363314552:c=0,e=4,dep=2,type=0,tim=9296515612477
SPM: recursive compile failed so skipping current accepted plan, planName = SQL_PLAN_6dyq1ws7kyb2v1d40d4bc, sig = 7420456529887702107
SPM: OCIcode = 942, OCImesg = ORA-00942: table or view does not exist

As you see, the database did not find the ora_debug_table table, and that is why it could not reproduce the desired plan; no such table existed in the database. I created a global temporary table with that name in the SYS schema, and the problem appears to be solved.

CREATE GLOBAL TEMPORARY TABLE sys.ora_debug_table (
   time DATE,
   txt0 CLOB,
   txt1 CLOB,
   txt2 CLOB,
   txt3 CLOB,
   txt4 CLOB,
   num0 NUMBER,
   num1 NUMBER,
   num2 NUMBER,
   num3 NUMBER,
   num4 NUMBER,
   num5 NUMBER,
   num6 NUMBER,
   num7 NUMBER,
   num8 NUMBER,
   num9 NUMBER
);

CREATE PUBLIC SYNONYM ora_debug_table FOR sys.ora_debug_table;
GRANT SELECT, INSERT,DELETE ON  sys.ora_debug_table TO public;

But this is not a real/good solution for the case. First of all, I think it is a bug and I am going to open an SR for it; in addition, it is not a good idea to create a table to support the optimizer when we do not know the nature of that table. There are several transient tables under the SYS schema, such as FINALHIST$, which is used to keep transient data when creating histograms; that is what gave me the idea to create this table. But it can cause problems when applying PSUs or other (optimizer) patches, or during future upgrades, because of the existence of such a table.
In such situations, the supported way is to extract the optimizer hints from the SPB and create a SQL Patch for the statement. Then you can disable the SPB until the problem is solved.
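
One way to obtain the hint text is to dump the stored outline of the baseline plan (a sketch; note that in this particular case the display function itself recompiles the statement, so it may hit the same ORA-00942, in which case the hints must be pulled from the SMB tables directly):

-- Show the full outline hints stored with the accepted baseline plan.
SELECT *
  FROM TABLE (
          dbms_xplan.display_sql_plan_baseline (
             sql_handle => 'SQL_74d0023a19b45cf8',
             format     => 'OUTLINE'));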

declare
  sqltext clob;
begin
  select sql_fulltext into sqltext from v$sqlarea where sql_id = 'anb0kap085nf8' and rownum = 1;
  sys.dbms_sqldiag_internal.i_create_patch(sql_text=>sqltext,
  hint_text=>
     'FACTORIZE_JOIN(@"SET$1"("B"@"SEL$2" "B"@"SEL$3"))
      INDEX(@"SEL$04170C39" "B"@"SEL$2" ("TD"."BR" "TD"."DC"))
      LEADING(@"SEL$04170C39" "VW_JF_SET$5F0EEC53"@"SEL$55E0DA61" "B"@"SEL$2")
      USE_NL(@"SEL$04170C39" "B"@"SEL$2")
      NLJ_BATCHING(@"SEL$04170C39" "B"@"SEL$2")
      INDEX_RS_ASC(@"SEL$C7139F67" "A"@"SEL$2" ("TE"."DBT" "TE"."BR"))
      BATCH_TABLE_ACCESS_BY_ROWID(@"SEL$C7139F67" "A"@"SEL$2")
      INDEX_RS_ASC(@"SEL$7E1B4AF5" "A"@"SEL$3" ("TE"."CRD" "TE"."BR"))
      BATCH_TABLE_ACCESS_BY_ROWID(@"SEL$7E1B4AF5" "A"@"SEL$3")',
  name=>'sqlp_anb0kap085nf8');
end;
/
       
DECLARE
   i   INTEGER;
BEGIN
   i               :=
      DBMS_SPM.ALTER_SQL_PLAN_BASELINE (sql_handle =>'SQL_74d0023a19b45cf8',
                                        plan_name =>'SQL_PLAN_6dyq1ws7kyb2v1d40d4bc',
                                        attribute_name => 'ENABLED',
                                        attribute_value => 'NO'
      );

   DBMS_OUTPUT.put_line (i);
END;
/
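
Once the patch is in place, you can check that new child cursors actually pick it up (a quick verification query, my addition to the original post):

-- Verify that new child cursors are built with the SQL patch applied.
SELECT child_number, plan_hash_value, sql_patch
  FROM v$sql
 WHERE sql_id = 'anb0kap085nf8';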

The last question is why the SPB had been used for several months but only recently we faced the error. My answer is of the form: “if condition X is true, then action Y needs to be performed”. So I think some condition recently became true for the optimizer, it attempted the extra operation, and the error occurred.

July 25, 2018

Group by optimization in Oracle Database 12c R2

Filed under: CBO,Transformation — Chinar Aliyev @ 6:05 am

In this quick post we will see a group by optimization in Oracle Database 12c R2. It is actually a group by elimination: if the grouping column is a primary key (or unique), the optimizer eliminates the unnecessary GROUP BY operation.
Let`s see the following case:


CREATE TABLE t2
 as
            SELECT   level id,
                     rpad('desc',level) des,
                     round(DBMS_RANDOM.VALUE (1, 50)*level/225,4) amount      
              FROM   DUAL
CONNECT BY LEVEL <= 1000000;

alter table t2 add constraint pk_t2 primary key(id);

execute dbms_stats.gather_table_stats(null,'t2');

      SELECT id, COUNT ( * )
        FROM t2
      GROUP BY id;

I have set OFE to 12.1 and 12.2, executed the above query, and posted the execution statistics (provided by Real-Time SQL Monitoring) here.

1)
SQL Monitoring Report
SQL Text
------------------------------
select /*+ monitor OPTIMIZER_FEATURES_ENABLE('12.1.0.1') */ id,count(*) from t2 group by id

Global Information
------------------------------
...

Global Stats
=================================================
| Elapsed |   Cpu   |  Other   | Fetch | Buffer |
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  |
=================================================
|    0.90 |    0.30 |     0.60 |  2001 |   4085 |
=================================================

SQL Plan Monitoring Details (Plan Hash Value=3993855611)
=============================================================================================================================
| Id |       Operation        | Name  |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Activity | Activity Detail |
|    |                        |       | (Estim) |      | Active(s) | Active |       | (Actual) |   (%)    |   (# samples)   |
=============================================================================================================================
|  0 | SELECT STATEMENT       |       |         |      |         9 |     +0 |     1 |       1M |          |                 |
|  1 |   SORT GROUP BY NOSORT |       |      1M | 2104 |         9 |     +0 |     1 |       1M |          |                 |
|  2 |    INDEX FULL SCAN     | PK_T2 |      1M | 2104 |         9 |     +0 |     1 |       1M |          |                 |
=============================================================================================================================


2) SQL Monitoring Report

SQL Text
------------------------------
select /*+ monitor OPTIMIZER_FEATURES_ENABLE('12.2.0.1') */ id,count(*) from t2 group by id

Global Information
------------------------------
 ...


Global Stats
=================================================
| Elapsed |   Cpu   |  Other   | Fetch | Buffer |
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  |
=================================================
|    0.37 |    0.08 |     0.29 |  2001 |   4098 |
=================================================

SQL Plan Monitoring Details (Plan Hash Value=566354312)
=============================================================================================================================
| Id |       Operation        | Name  |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Activity | Activity Detail |
|    |                        |       | (Estim) |      | Active(s) | Active |       | (Actual) |   (%)    |   (# samples)   |
=============================================================================================================================
|  0 | SELECT STATEMENT       |       |         |      |         9 |     +0 |     1 |       1M |          |                 |
|  1 |   INDEX FAST FULL SCAN | PK_T2 |      1M |  576 |         9 |     +0 |     1 |       1M |          |                 |
=============================================================================================================================

3) SQL Monitoring Report

SQL Text
------------------------------
select /*+ monitor OPTIMIZER_FEATURES_ENABLE('12.1.0.1') INDEX_FFS(@"SEL$1" "T2"@"SEL$1" ("T2"."ID")) */ id,count(*) from t2 group by id

...

Global Stats
=================================================
| Elapsed |   Cpu   |  Other   | Fetch | Buffer |
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  |
=================================================
|    1.75 |    0.89 |     0.86 |  2001 |   2103 |
=================================================

SQL Plan Monitoring Details (Plan Hash Value=1738727289)
======================================================================================================================================
| Id |        Operation        | Name  |  Rows   | Cost |   Time    | Start  | Execs |   Rows   |  Mem  | Activity | Activity Detail |
|    |                         |       | (Estim) |      | Active(s) | Active |       | (Actual) | (Max) |   (%)    |   (# samples)   |
======================================================================================================================================
|  0 | SELECT STATEMENT        |       |         |      |         8 |     +1 |     1 |       1M |     . |    50.00 | Cpu (1)         |
|  1 |   HASH GROUP BY         |       |      1M | 3662 |         9 |     +0 |     1 |       1M |  41MB |    50.00 | Cpu (1)         |
|  2 |    INDEX FAST FULL SCAN | PK_T2 |      1M |  576 |         1 |     +1 |     1 |       1M |     . |          |                 |
======================================================================================================================================

So, in 12cR1 (or with OFE = 12.1) the database by default uses an INDEX FULL SCAN followed by a SORT GROUP BY NOSORT operation to execute the query. These are single-block (sequential) reads over the index blocks, and no sorting is required to group the data. As you see, the database spent 0.9 seconds to complete the SQL: 0.3 seconds on CPU and 0.6 seconds on other waits.

But if we run the SQL in 12cR2 (or with OFE = 12.2) the SORT GROUP BY operation disappears: elapsed time drops from 0.9 to 0.37 seconds, and CPU time from 0.3 to 0.08 seconds. The database uses an INDEX FAST FULL SCAN instead, so multi-block reads are performed. This is an improvement for the group by operation in this case.

In the third SQL, I used OFE = 12.1 but forced the optimizer to use an INDEX FAST FULL SCAN; as you can see, the elapsed time (and also CPU time) increased significantly due to the additional HASH GROUP BY operation.
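
Should the elimination ever need to be suppressed (for example, while testing plan stability), the transformation appears to be hint-controlled: the outline shown further below contains ELIM_GROUPBY. The negative form is my assumption; check v$sql_hint on your release before relying on it:

-- Sketch: suppress group-by elimination for one statement.
-- NO_ELIM_GROUPBY is assumed to be the negative counterpart of the
-- ELIM_GROUPBY outline hint; verify it exists in v$sql_hint first.
SELECT /*+ NO_ELIM_GROUPBY(@"SEL$1") */
       id, COUNT (*)
  FROM t2
 GROUP BY id;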

Now, let's look at the explain plan and make a few notes.

explain plan for
select
/*+
    OPTIMIZER_FEATURES_ENABLE('12.2.0.1') */
 id,count(*) from t2
group by id;

select * from table(dbms_xplan.display(null,null,'+outline'));

Plan hash value: 566354312                                                    
                                                                              
------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |  1000K|  4882K|   576   (2)| 00:00:01 |
|   1 |  INDEX FAST FULL SCAN| PK_T2 |  1000K|  4882K|   576   (2)| 00:00:01 |
------------------------------------------------------------------------------
                                                                              
Outline Data                                                                  
-------------                                                                 
                                                                              
  /*+                                                                         
      BEGIN_OUTLINE_DATA                                                      
      INDEX_FFS(@"SEL$9BB7A81A" "T2"@"SEL$1" ("T2"."ID"))                     
      OUTLINE(@"SEL$1")                                                       
      ELIM_GROUPBY(@"SEL$1")                                                  
      OUTLINE(@"SEL$47952E7A")                                                
      ELIM_GROUPBY(@"SEL$47952E7A")                                           
      OUTLINE_LEAF(@"SEL$9BB7A81A")                                           
      ALL_ROWS                                                                
      DB_VERSION('12.2.0.1')                                                  
      OPTIMIZER_FEATURES_ENABLE('12.2.0.1')                                   
      IGNORE_OPTIM_EMBEDDED_HINTS                                             
      END_OUTLINE_DATA                                                        
  */                

The outline data contains ELIM_GROUPBY, and we see the following lines in the CBO trace file:

QB before group-by removal:******* UNPARSED QUERY IS *******
SELECT "T2"."ID" "ID",COUNT(*) "COUNT(*)" FROM "SYS"."T2" "T2" GROUP BY "T2"."ID"
QB before group-by elimination:******* UNPARSED QUERY IS *******
SELECT "T2"."ID" "ID",COUNT(*) "COUNT(*)" FROM "SYS"."T2" "T2" GROUP BY "T2"."ID"
Registered qb: SEL$47952E7A 0x6bb73080 (ELIMINATION OF GROUP BY SEL$1; SEL$1)
---------------------
QUERY BLOCK SIGNATURE
---------------------
  signature (): qb_name=SEL$47952E7A nbfros=1 flg=0
    fro(0): flg=0 objn=30312 hint_alias="T2"@"SEL$1"

QB after group-by elimination:******* UNPARSED QUERY IS *******
SELECT "T2"."ID" "ID",1 "COUNT(*)" FROM "SYS"."T2" "T2"
Registered qb: SEL$9BB7A81A 0x6bb73080 (ELIMINATION OF GROUP BY SEL$47952E7A; SEL$47952E7A)
---------------------
QUERY BLOCK SIGNATURE
---------------------
  signature (): qb_name=SEL$9BB7A81A nbfros=1 flg=0
    fro(0): flg=0 objn=30312 hint_alias="T2"@"SEL$1"

QB after group-by removal:******* UNPARSED QUERY IS *******
SELECT "T2"."ID" "ID",1 "COUNT(*)" FROM "SYS"."T2" "T2"
DCL: Checking validity of group-by elimination SEL$9BB7A81A (#1)
DCL: Result of group-by elimination: Invalid
OJE: Begin: find best directive for query block SEL$9BB7A81A (#1)
OJE: End: finding best directive for query block SEL$9BB7A81A (#1)
PM: Considering predicate move-around in query block SEL$9BB7A81A (#1)

.....................................................................
Final query after transformations:******* UNPARSED QUERY IS *******
SELECT "T2"."ID" "ID",1 "COUNT(*)" FROM "SYS"."T2" "T2"
kkoqbc: optimizing query block SEL$9BB7A81A (#1)

So, the original query has been transformed into SELECT "T2"."ID" "ID", 1 "COUNT(*)" FROM "SYS"."T2" "T2" by eliminating the group by.
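The CBO trace excerpts quoted in this post come from a 10053 optimizer trace; here is a sketch of how to generate one (the trace file location depends on your diagnostic settings):

ALTER SESSION SET events '10053 trace name context forever, level 1';

EXPLAIN PLAN FOR
SELECT id, COUNT(*) FROM t2 GROUP BY id;

ALTER SESSION SET events '10053 trace name context off';

-- the trace is written to the session's trace directory; its full path
-- can be found in v$diag_info under the name 'Default Trace File'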

Note 1: Although there is a primary/unique constraint, invisibility of the index prevents the optimizer from eliminating the group by operation.

Note 2: A unique constraint on the grouping column is not enough; the column must also be declared NOT NULL for the group by operation to be eliminated.
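Note 2 can be sketched like this (table and constraint names here are made up for illustration). With only a unique constraint the elimination is not valid, because Oracle allows multiple NULLs in a unique column, so a NULL group could have COUNT(*) > 1; adding NOT NULL makes the rewrite safe:

CREATE TABLE t4 (id NUMBER, val NUMBER);
ALTER TABLE t4 ADD CONSTRAINT uk_t4 UNIQUE (id);

-- id is unique but still nullable: group by is NOT eliminated
EXPLAIN PLAN FOR SELECT id, COUNT(*) FROM t4 GROUP BY id;

ALTER TABLE t4 MODIFY (id NOT NULL);

-- now each id identifies exactly one row and cannot be NULL,
-- so the group by can be eliminated (ELIM_GROUPBY in the outline)
EXPLAIN PLAN FOR SELECT id, COUNT(*) FROM t4 GROUP BY id;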

Note 3: If there is a multi-column primary/unique key then the group by operation is not eliminated:

CREATE TABLE t3 (c1 NUMBER NOT NULL, c2 DATE NOT NULL, c3 NUMBER);

ALTER TABLE t3 ADD CONSTRAINT pk_t3 PRIMARY KEY (c1,c2);

BEGIN
    FOR i IN 1 .. 1000
    LOOP
        INSERT INTO t3 (c1, c2, c3)
          VALUES   (i, TRUNC (SYSDATE) + 1, i * 100);
    END LOOP;
END;
/

execute dbms_stats.gather_table_stats(null,'t3');

EXPLAIN PLAN
    FOR

          SELECT   c1, c2, COUNT ( * )
            FROM   t3
        GROUP BY   c1, c2;

SELECT   * FROM table (DBMS_XPLAN.display (NULL, NULL, '+outline'));

Plan hash value: 3048307469                                                   
                                                                              
------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |   708 |  8496 |     4   (0)| 00:00:01 |
|   1 |  SORT GROUP BY NOSORT|       |   708 |  8496 |     4   (0)| 00:00:01 |
|   2 |   INDEX FULL SCAN    | PK_T3 |  1000 | 12000 |     4   (0)| 00:00:01 |
------------------------------------------------------------------------------
                                                                              
Outline Data                                                                  
-------------                                                                 
                                                                              
  /*+                                                                         
      BEGIN_OUTLINE_DATA                                                      
      INDEX(@"SEL$1" "T3"@"SEL$1" ("T3"."C1" "T3"."C2"))                      
      OUTLINE_LEAF(@"SEL$1")                                                  
      ALL_ROWS                                                                
      DB_VERSION('12.2.0.1')                                                  
      OPTIMIZER_FEATURES_ENABLE('12.2.0.1')                                   
      IGNORE_OPTIM_EMBEDDED_HINTS                                             
      END_OUTLINE_DATA                                                        
  */           

Subquery Elimination in Oracle Database 12c R2

Filed under: CBO,Transformation — Chinar Aliyev @ 6:04 am

First of all, I want to note that Franck Pachot has already written about Subquery Elimination. I am not going to repeat his points here; instead we are going to review different types of queries, interpret how the optimizer eliminates them, and see which conditions allow the optimizer to perform this transformation.

The test case:


CREATE TABLE t1
 as
            SELECT   level id,
                     rpad('desc',level) des,
                     round(DBMS_RANDOM.VALUE (1, 50)*level/225,4) amount      
              FROM   DUAL
CONNECT BY LEVEL <= 1000;

execute dbms_stats.gather_table_stats(null,'t1');
EXPLAIN PLAN FOR

        SELECT   *
          FROM   t1 x
         WHERE   EXISTS (SELECT   1
                           FROM   t1 y
                          WHERE   x.id = y.id);
(Query1)

SELECT   * FROM table (DBMS_XPLAN.display);
                                             

The execution plan of the SQL in Oracle Database 12c R1 is

Plan hash value: 1850585620                                                
                                                                           
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |  1000 |  8000 |    48   (0)| 00:00:01 |
|*  1 |  HASH JOIN SEMI    |      |  1000 |  8000 |    48   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T1   |  1000 |  4000 |    24   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T1   |  1000 |  4000 |    24   (0)| 00:00:01 |
---------------------------------------------------------------------------
                                                                           
Predicate Information (identified by operation id):                        
---------------------------------------------------                        
                                                                           
   1 - access("X"."ID"="Y"."ID")              

But the execution plan for the same SQL in Oracle Database 12c R2 is

Plan hash value: 3617692013                                               
                                                                          
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  1000 |  4000 |    22   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |  1000 |  4000 |    22   (0)| 00:00:01 |
--------------------------------------------------------------------------
                                                                          
Predicate Information (identified by operation id):                       
---------------------------------------------------                       
                                                                          
   1 - filter("X"."ID" IS NOT NULL)

As you can see, the subquery has been removed; this is a new transformation in Oracle Database 12c R2. If you look at the predicate section of the 12c R2 plan, there is an additional predicate: ("X"."ID" IS NOT NULL).
Why is this condition there?

According to (Query1), each value of the outer table will always match at least one value from the inner table, except for NULL values: an equality comparison evaluates to UNKNOWN when any of its inputs is NULL, so a row with a NULL ID can never satisfy the EXISTS condition. To remove this threat, the ("X"."ID" IS NOT NULL) predicate has been added. Generally, we can say that the subquery has been replaced with a filter predicate.
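The NULL threat can be demonstrated directly (a hypothetical check; T1.ID contains no NULLs as created, so we insert one temporarily):

INSERT INTO t1 (id, des, amount) VALUES (NULL, 'null row', 0);

-- the NULL row never satisfies x.id = y.id (the comparison is UNKNOWN),
-- so EXISTS excludes it...
SELECT COUNT(*) FROM t1 x
 WHERE EXISTS (SELECT 1 FROM t1 y WHERE x.id = y.id);

-- ...and the rewritten query returns the same rows only because
-- of the IS NOT NULL filter
SELECT COUNT(*) FROM t1 x WHERE x.id IS NOT NULL;

ROLLBACK;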
Logically, if we use a NOT EXISTS subquery, then the ("X"."ID" IS NULL) predicate will be added, since NOT ("X"."ID" IS NOT NULL) = ("X"."ID" IS NULL), as we can see in the execution plan below.

EXPLAIN PLAN FOR

        SELECT   *
          FROM   t1 x
         WHERE  NOT EXISTS (SELECT   1
                           FROM   t1 y
                          WHERE   x.id = y.id);

SELECT   * FROM table (DBMS_XPLAN.display);

Plan hash value: 3617692013                                               
                                                                          
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   511 |    22   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   511 |    22   (0)| 00:00:01 |
--------------------------------------------------------------------------
                                                                          
Predicate Information (identified by operation id):                       
---------------------------------------------------                       
                                                                          
   1 - filter("X"."ID" IS NULL)    

So far there is only a join (correlation) predicate in the subquery; let's add an additional filter predicate and look at the execution plan:

EXPLAIN PLAN
    FOR

        SELECT   x.id, x.amount
          FROM   t1 x
         WHERE   EXISTS (SELECT   NULL
                           FROM   t1 y
                          WHERE   x.id = y.id AND y.id > 100);

SELECT   * FROM table (DBMS_XPLAN.display (NULL, NULL, NULL))

Plan hash value: 912003647                                                     
                                                                               
-------------------------------------------------------------------------------
| Id  | Operation           | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |         |   901 | 20723 |    44   (0)| 00:00:01 |
|*  1 |  HASH JOIN SEMI     |         |   901 | 20723 |    44   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL | T1      |  1000 | 10000 |    22   (0)| 00:00:01 |
|   3 |   VIEW              | VW_SQ_1 |   901 | 11713 |    22   (0)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL| T1      |   901 |  3604 |    22   (0)| 00:00:01 |
-------------------------------------------------------------------------------
                                                                               
Predicate Information (identified by operation id):                            
---------------------------------------------------                            
                                                                               
   1 - access("X"."ID"="ITEM_1")                                               
   4 - filter("Y"."ID">100)                                                    

It is seen that SQE (Subquery Elimination) has not been performed. Why has the CBO not performed this transformation when we add an additional predicate? What is going on?
Let's look at the CBO trace file:

SQE: Trying SQ elimination.
SQE: Trying SQ elimination.
SQE: Query valid for SQ elimination:******* UNPARSED QUERY IS *******
SELECT "X"."ID" "ID","X"."AMOUNT" "AMOUNT" FROM "SYS"."T1" "X" WHERE  EXISTS (SELECT NULL "NULL" FROM "SYS"."T1" "Y" WHERE "X"."ID"="Y"."ID" AND "Y"."ID">100)
SQE: Query after SQ elimination:******* UNPARSED QUERY IS *******
SELECT "X"."ID" "ID","X"."AMOUNT" "AMOUNT" FROM "SYS"."T1" "X" WHERE  EXISTS (SELECT NULL "NULL" FROM "SYS"."T1" "Y" WHERE "X"."ID"="Y"."ID" AND "Y"."ID">100)
kkqctdrvTD-start on query block SEL$1 (#0)

What does this mean? As you see, the optimizer tried to eliminate the subquery but did not. The explanation is that the query is a candidate for SQE, but the conditions that allow the optimizer to perform the desired transformation are not all met. We have the additional (y.id > 100) filter predicate in the subquery. This is a semi join, and by adding the predicate we have produced a completely different query; that is why the optimizer cannot safely move this predicate to the outer query and remove the subquery.
Uniqueness, however, allows the optimizer to perform the transformation in this case. So let's create a unique index on the T1.ID column and check the execution plan again.

CREATE UNIQUE INDEX idx_uk_id  ON t1 (id);
EXPLAIN PLAN
    FOR

        SELECT   x.id, x.amount
          FROM   t1 x
         WHERE   EXISTS (SELECT   NULL
                           FROM   t1 y
                          WHERE   x.id = y.id AND y.id > 100);

SELECT   * FROM table (DBMS_XPLAN.display (NULL, NULL, NULL))

Plan hash value: 3617692013                                               
                                                                          
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   901 |  9010 |    22   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |   901 |  9010 |    22   (0)| 00:00:01 |
--------------------------------------------------------------------------
                                                                          
Predicate Information (identified by operation id):                       
---------------------------------------------------                       
                                                                          
   1 - filter("X"."ID">100)   

EP2

As you can see, the optimizer has removed the subquery, and from the CBO trace file:

SQE: Trying SQ elimination.
SQE: Trying SQ elimination.
SQE: Query valid for SQ elimination:******* UNPARSED QUERY IS *******
SELECT "X"."ID" "ID","X"."AMOUNT" "AMOUNT" FROM "SYS"."T1" "X" WHERE  EXISTS (SELECT NULL "NULL" FROM "SYS"."T1" "Y" WHERE "X"."ID"="Y"."ID" AND "Y"."ID">100)
Registered qb: SEL$D0DB0F9A 0x6bab00c0 (SUBQUERY ELIMINATE SEL$1; SEL$2)
---------------------
QUERY BLOCK SIGNATURE
---------------------
  signature (): qb_name=SEL$D0DB0F9A nbfros=1 flg=0
    fro(0): flg=0 objn=30057 hint_alias="X"@"SEL$1"

SQE: Query after SQ elimination:******* UNPARSED QUERY IS *******
SELECT "X"."ID" "ID","X"."AMOUNT" "AMOUNT" FROM "SYS"."T1" "X" WHERE "X"."ID">100 AND "X"."ID" IS NOT NULL


Final query after transformations:******* UNPARSED QUERY IS *******
SELECT "X"."ID" "ID","X"."AMOUNT" "AMOUNT" FROM "SYS"."T1" "X" WHERE "X"."ID">100 AND "X"."ID" IS NOT NULL

Although the EP2 execution plan does not display the predicate section fully, the trace file shows that the predicate section of the transformed SQL actually contains two conditions. Thus, if there is a unique index on the join column, SQE is performed. But what happens if this index is invisible?

alter index idx_uk_id invisible;

EXPLAIN PLAN
    FOR

        SELECT   x.id, x.amount
          FROM   t1 x
         WHERE   EXISTS (SELECT   NULL
                           FROM   t1 y
                          WHERE   x.id = y.id AND y.id > 100);

SELECT   * FROM table (DBMS_XPLAN.display (NULL, NULL, NULL))

Plan hash value: 912003647                                                     
                                                                               
-------------------------------------------------------------------------------
| Id  | Operation           | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |         |   901 | 20723 |    44   (0)| 00:00:01 |
|*  1 |  HASH JOIN SEMI     |         |   901 | 20723 |    44   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL | T1      |  1000 | 10000 |    22   (0)| 00:00:01 |
|   3 |   VIEW              | VW_SQ_1 |   901 | 11713 |    22   (0)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL| T1      |   901 |  3604 |    22   (0)| 00:00:01 |
-------------------------------------------------------------------------------
                                                                               
Predicate Information (identified by operation id):                            
---------------------------------------------------                            
                                                                               
   1 - access("X"."ID"="ITEM_1")                                               
   4 - filter("Y"."ID">100)       

Although there is a unique index, its invisibility prevents the optimizer from considering it during the optimization process.
But if we have a primary key on the join column and make the primary key index invisible, will the optimizer still transform the query and eliminate the subquery?
Let's see:

drop index idx_uk_id;
alter table t1 add constraint pk_t1 primary key(id);
alter index pk_t1 invisible;
 
EXPLAIN PLAN
    FOR

        SELECT   x.id, x.amount
          FROM   t1 x
         WHERE   EXISTS (SELECT   NULL
                           FROM   t1 y
                          WHERE   x.id = y.id AND y.id > 100);

SELECT   * FROM table (DBMS_XPLAN.display (NULL, NULL, NULL))

Plan hash value: 3617692013                                               
                                                                          
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   901 |  9010 |    22   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |   901 |  9010 |    22   (0)| 00:00:01 |
--------------------------------------------------------------------------
                                                                          
Predicate Information (identified by operation id):                       
---------------------------------------------------                       
                                                                          
   1 - filter("X"."ID">100)                                               
                                                                          
Note                                                                      
-----                                                                     
   - rely constraint used for this statement                              

And from trace file:

SQE: Trying SQ elimination.
SQE: Trying SQ elimination.
SQE: Query valid for SQ elimination:******* UNPARSED QUERY IS *******
SELECT "X"."ID" "ID","X"."AMOUNT" "AMOUNT" FROM "SYS"."T1" "X" WHERE  EXISTS (SELECT NULL "NULL" FROM "SYS"."T1" "Y" WHERE "X"."ID"="Y"."ID" AND "Y"."ID">100)
Registered qb: SEL$D0DB0F9A 0x6b399108 (SUBQUERY ELIMINATE SEL$1; SEL$2)
---------------------
QUERY BLOCK SIGNATURE
---------------------
  signature (): qb_name=SEL$D0DB0F9A nbfros=1 flg=0
    fro(0): flg=0 objn=30057 hint_alias="X"@"SEL$1"

SQE: Query after SQ elimination:******* UNPARSED QUERY IS *******
SELECT "X"."ID" "ID","X"."AMOUNT" "AMOUNT" FROM "SYS"."T1" "X" WHERE "X"."ID">100
kkqctdrvTD-start on query block SEL$D0DB0F9A (#0)

The same is also true for a unique constraint: even if the supporting unique index is invisible, the optimizer still performs the transformation, but it will add COL IS NOT NULL as a second predicate.
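For completeness, the unique-constraint case can be sketched like this (run against a fresh copy of the test table; the constraint name uk_t1 is made up, and the expected predicates are my inference from the behavior described above):

ALTER TABLE t1 ADD CONSTRAINT uk_t1 UNIQUE (id);
ALTER INDEX uk_t1 INVISIBLE;

-- SQE should still be performed, but since ID remains nullable the
-- optimizer is expected to add "X"."ID" IS NOT NULL next to "X"."ID">100
EXPLAIN PLAN FOR
SELECT x.id, x.amount
  FROM t1 x
 WHERE EXISTS (SELECT NULL FROM t1 y WHERE x.id = y.id AND y.id > 100);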

Note that there is no "X"."ID" IS NOT NULL condition here: because of the primary key, the column cannot contain NULL values. As the Note section of the execution plan shows, the optimizer used the RELY constraint during the optimization phase and performed SQE.
In addition, if a NOT EXISTS subquery is used, we will see the ("X"."ID"<=100) predicate in the plan, since NOT ("X"."ID">100) = ("X"."ID"<=100).
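The NOT EXISTS claim can be checked with a sketch like this (the expected filter is my inference from the negation rule above, not a captured plan):

EXPLAIN PLAN FOR
SELECT x.id, x.amount
  FROM t1 x
 WHERE NOT EXISTS (SELECT NULL FROM t1 y WHERE x.id = y.id AND y.id > 100);

SELECT * FROM table (DBMS_XPLAN.display);

-- expected predicate section:  filter("X"."ID"<=100)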

Let's look at the following query:

alter index pk_t1 visible;
EXPLAIN PLAN
    FOR

        SELECT   x.id, x.amount
          FROM   t1 x
         WHERE  not EXISTS (SELECT   NULL
                           FROM   t1 y
                          WHERE   x.id = y.id AND y.amount > 100);

SELECT   * FROM table (DBMS_XPLAN.display (NULL, NULL, NULL))
Plan hash value: 3617692013                                               
                                                                          
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   474 |  4740 |    22   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |   474 |  4740 |    22   (0)| 00:00:01 |
--------------------------------------------------------------------------
                                                                          
Predicate Information (identified by operation id):                       
---------------------------------------------------                       
                                                                          
   1 - filter(LNNVL("X"."AMOUNT">100))                                    

As you can see, the threat from NULL values is handled by applying the LNNVL function!
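LNNVL(condition) returns TRUE when the condition evaluates to FALSE or UNKNOWN, which is exactly the semantics the NOT EXISTS rewrite needs for a nullable column. A quick sketch:

-- rows where amount > 100 is FALSE or UNKNOWN (i.e. amount <= 100 or NULL)
SELECT COUNT(*) FROM t1 WHERE LNNVL(amount > 100);

-- the same condition spelled out explicitly
SELECT COUNT(*) FROM t1 WHERE amount <= 100 OR amount IS NULL;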
Now let's look at the following query:

ALTER TABLE t1 ADD  amount2 number;

UPDATE   t1
   SET   amount2 = amount;

COMMIT;

EXPLAIN PLAN
    FOR

        SELECT   x.id, x.amount
          FROM   t1 x
         WHERE   (EXISTS (SELECT   NULL
                           FROM   t1 y
                          WHERE   x.id = y.id AND y.amount > 100)
                 or 
                 NOT EXISTS (SELECT   NULL
                           FROM   t1 z
                          WHERE   x.id = z.id AND z.amount2 > 150 and z.amount2 <=200))
                 and 
                  EXISTS (SELECT   NULL
                           FROM   t1 y
                          WHERE   x.id = y.id AND y.amount <= 1000)
                 ;
                          

SELECT   * FROM table (DBMS_XPLAN.display (NULL, NULL, '+OUTLINE'))


Plan hash value: 3617692013                                               
                                                                          
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   999 | 22977 |    22   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |   999 | 22977 |    22   (0)| 00:00:01 |
--------------------------------------------------------------------------
                                                                          
Outline Data                                                              
-------------                                                             
                                                                          
  /*+                                                                     
      BEGIN_OUTLINE_DATA                                                  
      FULL(@"SEL$1450B833" "X"@"SEL$1")                                   
      OUTLINE(@"SEL$4")                                                   
      OUTLINE(@"SEL$3")                                                   
      OUTLINE(@"SEL$2")                                                   
      OUTLINE(@"SEL$1")                                                   
      ELIMINATE_SQ(@"SEL$4")                                              
      ELIMINATE_SQ(@"SEL$3")                                              
      ELIMINATE_SQ(@"SEL$2")                                              
      OUTLINE_LEAF(@"SEL$1450B833")                                       
      ALL_ROWS                                                            
      DB_VERSION('12.2.0.1')                                              
      OPTIMIZER_FEATURES_ENABLE('12.2.0.1')                               
      IGNORE_OPTIM_EMBEDDED_HINTS                                         
      END_OUTLINE_DATA                                                    
  */                                                                      
                                                                          
Predicate Information (identified by operation id):                       
---------------------------------------------------                       
                                                                          
   1 - filter("X"."AMOUNT"<=1000 AND ("X"."AMOUNT">100 OR                 
              LNNVL("X"."AMOUNT2">150) OR LNNVL("X"."AMOUNT2"<=200)))

In this case we have three subqueries combined with disjunction and conjunction. The optimizer has eliminated all of them and constructed a composite predicate from the subquery conditions. If the subqueries are not eliminated, we get the following execution plan:

EXPLAIN PLAN
    FOR

        SELECT 
         /*+ 
           NO_ELIMINATE_SQ(@"SEL$4")                                              
           NO_ELIMINATE_SQ(@"SEL$3")                                              
           NO_ELIMINATE_SQ(@"SEL$2")                                              
         */
          x.id, x.amount
          FROM   t1 x
         WHERE   (EXISTS (SELECT   NULL
                           FROM   t1 y
                          WHERE   x.id = y.id AND y.amount > 100)
                 or 
                 NOT EXISTS (SELECT   NULL
                           FROM   t1 z
                          WHERE   x.id = z.id AND z.amount2 > 150 and z.amount2 <=200))
                 and 
                  EXISTS (SELECT   NULL
                           FROM   t1 y
                          WHERE   x.id = y.id AND y.amount <= 1000)
                 ;
                          

SELECT   * FROM table (DBMS_XPLAN.display(NULL, NULL, ' +OUTLINE')); 
Plan hash value: 914341523                                                                  
                                                                                            
--------------------------------------------------------------------------------------      
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |      
--------------------------------------------------------------------------------------      
|   0 | SELECT STATEMENT             |       |     1 |    10 |  1023   (1)| 00:00:01 |      
|*  1 |  FILTER                      |       |       |       |            |          |      
|*  2 |   TABLE ACCESS FULL          | T1    |  1000 | 10000 |    22   (0)| 00:00:01 |      
|*  3 |   TABLE ACCESS BY INDEX ROWID| T1    |     1 |    10 |     2   (0)| 00:00:01 |      
|*  4 |    INDEX UNIQUE SCAN         | PK_T1 |     1 |       |     1   (0)| 00:00:01 |      
|*  5 |   TABLE ACCESS BY INDEX ROWID| T1    |     1 |    17 |     2   (0)| 00:00:01 |      
|*  6 |    INDEX UNIQUE SCAN         | PK_T1 |     1 |       |     1   (0)| 00:00:01 |      
--------------------------------------------------------------------------------------      
                                                                                            
Outline Data                                                                                
-------------                                                                               
                                                                                            
  /*+                                                                                       
      BEGIN_OUTLINE_DATA                                                                    
      INDEX_RS_ASC(@"SEL$2" "Y"@"SEL$2" ("T1"."ID"))                                        
      INDEX_RS_ASC(@"SEL$3" "Z"@"SEL$3" ("T1"."ID"))                                        
      PQ_FILTER(@"SEL$CF29FC89" SERIAL)                                                     
      ORDER_SUBQ(@"SEL$CF29FC89" "SEL$2" "SEL$3")                                           
      FULL(@"SEL$CF29FC89" "X"@"SEL$1")                                                     
      OUTLINE(@"SEL$4")                                                                     
      OUTLINE(@"SEL$1")                                                                     
      UNNEST(@"SEL$4")                                                                      
      OUTLINE(@"SEL$6590669A")                                                              
      ELIMINATE_JOIN(@"SEL$6590669A" "Y"@"SEL$4")                                           
      OUTLINE_LEAF(@"SEL$CF29FC89")                                                         
      OUTLINE_LEAF(@"SEL$3")                                                                
      OUTLINE_LEAF(@"SEL$2")                                                                
      ALL_ROWS                                                                              
      DB_VERSION('12.2.0.1')                                                                
      OPTIMIZER_FEATURES_ENABLE('12.2.0.1')                                                 
      IGNORE_OPTIM_EMBEDDED_HINTS                                                           
      END_OUTLINE_DATA                                                                      
  */                                                                                        
                                                                                            
Predicate Information (identified by operation id):                                         
---------------------------------------------------                                         
                                                                                            
   1 - filter( EXISTS (SELECT /*+ NO_ELIMINATE_SQ */ 0 FROM "T1" "Y" WHERE                  
              "Y"."ID"=:B1 AND "Y"."AMOUNT">100) OR  NOT EXISTS (SELECT /*+ NO_ELIMINATE_SQ 
              */ 0 FROM "T1" "Z" WHERE "Z"."ID"=:B2 AND "Z"."AMOUNT2">150 AND               
              "Z"."AMOUNT2"<=200))                                                          
   2 - filter("AMOUNT"<=1000)                                                               
   3 - filter("Y"."AMOUNT">100)                                                             
   4 - access("Y"."ID"=:B1)                                                                 
   5 - filter("Z"."AMOUNT2">150 AND "Z"."AMOUNT2"<=200)                                     
   6 - access("Z"."ID"=:B1)

Thus, the cost of the query is reduced significantly by eliminating the subquery.

July 11, 2018

Library Cache Pin wait and distributed queries

Filed under: ASH,Bug,Tuning — Chinar Aliyev @ 5:53 am

In this blog post I am going to discuss an issue I faced last week. The situation was that, during a specific time period, application users could not connect to the database, and I was asked to check the database side for any symptoms. I have historical ASH data and will use it to describe the problem and its solution. Environment: OS – Linux x86 64-bit, Oracle DB – 12.1.0.2.0, CPU Count = 4.
First of all, we need to understand the database workload and session activity within the database. The following SQL statement shows what kind of workload the database had.

SELECT event,
       ROUND (cnt * 100 / (SUM (cnt) OVER ()), 2) "%",
       ROUND (SUM (cnt) OVER () / 6, 2) "DB Time in Minute",
       ROUND (SUM (cnt) OVER () / 6 /20,2) "AAS",
       (select value from v$parameter where name='cpu_count') "Cpu Count"
  FROM (SELECT NVL (event, 'CPU') event, COUNT ( * ) cnt
          FROM dba_hist_active_sess_history
         WHERE sample_time BETWEEN TO_DATE ('04-07-2018 17:50:00',
                                            'dd-mm-yyyy hh24:mi:ss'
                                   )
                               AND  TO_DATE ('04-07-2018 18:10:00',
                                             'dd-mm-yyyy hh24:mi:ss'
                                    )
        GROUP BY event
        ORDER BY 2 DESC);

EVENT                                %         DB Time in Minute   AAS     Cpu Count
------------------------------------ -------   -----------------   -----   ---------
library cache pin                    99.47     5320                266     4
SQL*Net message from dblink            .47     5320                266     4
CPU                                    .02     5320                266     4
SQL*Net more data from dblink          .01     5320                266     4
ksdxexeotherwait                       .01     5320                266     4
enq: TX - row lock contention          .01     5320                266     4
db file scattered read                   0     5320                266     4
db file async I/O submit                 0     5320                266     4

What do the numbers tell us? As can be seen, 99.5% of DB time was consumed by the library cache pin event. Also, we have a twenty-minute time interval and DB time is 5320 minutes, therefore the average active sessions figure is:

 AAS = DB Time / Time Interval = 5320 / 20 = 266

But the system has only 4 CPUs, so how is this possible? Was there a CPU bottleneck? After all, AAS is dramatically greater than the number of CPUs.

AAS vs Cpu Count   266 vs 4 ?
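The arithmetic behind the query's "DB Time in Minute" and "AAS" columns can be checked with a few lines. This is a sketch, assuming each row sampled into DBA_HIST_ACTIVE_SESS_HISTORY represents roughly 10 seconds of DB time (which is the reason the query divides the sample count by 6 to get minutes); the sample count below is the hypothetical value consistent with the output above.

```python
# Hypothetical number of ASH rows for the 20-minute window,
# consistent with the 5320-minute DB time shown above.
samples = 31920

db_time_min = samples * 10 / 60   # each AWR ASH sample covers ~10 seconds
interval_min = 20                 # 17:50 to 18:10

aas = db_time_min / interval_min  # average active sessions
print(db_time_min, aas)           # 5320.0 266.0
```

The same division by 6 and by 20 appears in the SQL above; the point is simply that 266 active sessions on average were recorded against a 4-CPU box.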

Ok, do you think the library cache pin waits occurred due to insufficient CPU? It is possible, but not in this case. We can settle this question by looking at operating system statistics.

SELECT distinct snap_id
          FROM dba_hist_active_sess_history
         WHERE sample_time BETWEEN TO_DATE ('04-07-2018 17:50:00',
                                            'dd-mm-yyyy hh24:mi:ss'
                                   )
                               AND  TO_DATE ('04-07-2018 18:10:00',
                                             'dd-mm-yyyy hh24:mi:ss'
                                    );
   SNAP_ID
----------
     15144
     15145

And the following query gives us some information about the values of some OS statistics.

SELECT
           os1.stat_name,
           os1.VALUE prev_value,
           os2.VALUE cur_value,
           (os2.VALUE - os1.VALUE) VALUE
  FROM dba_hist_osstat os1, dba_hist_osstat os2
 WHERE     os1.snap_id = 15144
         AND os2.snap_id = 15145
         AND os1.stat_name = os2.stat_name
         AND os1.stat_name IN
        ('SYS_TIME', 'USER_TIME', 'BUSY_TIME', 'IDLE_TIME', 'LOAD','IOWAIT_TIME');

STAT_NAME        PREV_VALUE   CUR_VALUE     VALUE
------------     -----------  -----------   ------------
IDLE_TIME        3364835132   3366212913    1377781
BUSY_TIME        499756687    499814678      57991
USER_TIME        448589295    448637270      47975
SYS_TIME         47664441     47674199       9758
IOWAIT_TIME      74915285     74925120       9835
LOAD             .26953125    .139648438    -.12988281

And

busy% = BUSY_TIME / (BUSY_TIME + IDLE_TIME + IOWAIT_TIME) = 57991 / (57991 + 1377781 + 9835) ≈ 0.04

It means the system was only about 4% busy in that time interval, and LOAD was only 0.14.
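The busy fraction can be reproduced directly from the snapshot deltas shown above (the OS statistics are in centiseconds, but the units cancel out in the ratio):

```python
# Deltas between snapshots 15144 and 15145, from the query output above.
busy, idle, iowait = 57991, 1377781, 9835

busy_fraction = busy / (busy + idle + iowait)
print(round(busy_fraction * 100, 1))  # 4.0 -> the system was ~4% busy
```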

As you see, the OS statistics tell us the system was in fact not busy in that time interval (load was 0.14 and the system has 4 CPUs). But why is AAS significantly bigger than the load average (AAS >> LOAD and AAS >> CPU count)? Why is there such a big difference between AAS and the number of CPUs?

The explanation is that these sessions were considered active within the database, but from the OS point of view they were not really active. They were busy performing remote requests and waiting for responses from a remote database. During this waiting interval the local server processes were doing nothing except waiting, yet on the DB side their status was active; the SQL*Net message from dblink event tells us that database sessions were performing remote requests and waiting for something from the remote database. Let`s see how DB time was distributed over the SQL statements.

SELECT sql_id, sample, ROUND (sample * 100 / SUM (sample) OVER (), 2) "%"
      FROM (SELECT sql_id, COUNT ( * ) sample
              FROM dba_hist_active_sess_history
             WHERE sample_time BETWEEN TO_DATE ('04-07-2018 17:50:00',
                                                'dd-mm-yyyy hh24:mi:ss'
                                       )
                                   AND  TO_DATE ('04-07-2018 18:10:00',
                                                 'dd-mm-yyyy hh24:mi:ss'
                                        )
                  AND event = 'library cache pin'
           GROUP BY sql_id
           ORDER BY 2 DESC);

SQL_ID            SAMPLE          %
------------- ---------- ----------
akb5ynxbfcgta      29700      93.54
fnsxa2h8r0pf6       1339       4.22
4b791875tzuag        566       1.78
81pgzg0bxhqcq        108        .34
ddf9f04xgcb95         39        .12

What is the SQL with ID akb5ynxbfcgta? It is a simple PL/SQL block that generates a unique id when the application tries to connect to the database:

BEGIN
   :1   := USR.api_pr.generate_id (:2);
END;

(STMT 1)

Sessions waiting for the library cache pin event were mostly running this SQL:

SELECT event, COUNT ( * )
  FROM dba_hist_active_sess_history
 WHERE sample_time BETWEEN TO_DATE ('04-07-2018 17:50:00',
                                    'dd-mm-yyyy hh24:mi:ss'
                           )
                       AND  TO_DATE ('04-07-2018 18:10:00',
                                     'dd-mm-yyyy hh24:mi:ss'
                            )
       AND sql_id = 'akb5ynxbfcgta'
GROUP BY event
ORDER BY 2 DESC;

EVENT                   COUNT(*)
------------------      ---------
library cache pin       29700
ksdxexeotherwait        4

So, what should we do for the next step? Fortunately, the DBA_HIST_ACTIVE_SESS_HISTORY (V$ACTIVE_SESSION_HISTORY) view has columns about the blocking session and can even describe the wait event chain. The following SQL statement gives us information about the blocking sessions.

SELECT blocking_session, blocking_session_serial#, COUNT ( * )
  FROM dba_hist_active_sess_history
 WHERE sample_time BETWEEN TO_DATE ('04-07-2018 17:50:00',
                                    'dd-mm-yyyy hh24:mi:ss'
                           )
                       AND  TO_DATE ('04-07-2018 18:10:00',
                                     'dd-mm-yyyy hh24:mi:ss'
                            )
       AND sql_id = 'akb5ynxbfcgta'
GROUP BY blocking_session, blocking_session_serial#
ORDER BY 3 DESC;

BLOCKING_SESSION BLOCKING_SESSION_SERIAL#   COUNT(*)
---------------- ------------------------ ----------
              59                    46957      29594
             146                    38992        105
                                                   4
             420                    31202          1

The session with sid=59, serial#=46957 is the main blocking session. What was it doing in that time interval?

SELECT NVL (event, 'CPU') event, COUNT ( * )
  FROM dba_hist_active_sess_history
 WHERE sample_time BETWEEN TO_DATE ('04-07-2018 17:50:00',
                                    'dd-mm-yyyy hh24:mi:ss'
                           )
                       AND  TO_DATE ('04-07-2018 18:10:00',
                                     'dd-mm-yyyy hh24:mi:ss'
                            )
       AND session_id = 59
       AND session_serial# = 46957
GROUP BY event;

EVENT                     COUNT(*)
-----------------------   --------
library cache pin         106
ksdxexeotherwait          2

As you see, this session was also waiting for the library cache pin event. Then who was the blocker of this session?

SELECT blocking_session, blocking_session_serial#, COUNT ( * )
  FROM dba_hist_active_sess_history
 WHERE sample_time BETWEEN TO_DATE ('04-07-2018 17:50:00',
                                    'dd-mm-yyyy hh24:mi:ss'
                           )
                       AND  TO_DATE ('04-07-2018 18:10:00',
                                     'dd-mm-yyyy hh24:mi:ss'
                            )
       AND session_id = 59
       AND session_serial# = 46957
GROUP BY blocking_session, blocking_session_serial#
ORDER BY 3 DESC;

BLOCKING_SESSION BLOCKING_SESSION_SERIAL#   COUNT(*)
---------------- ------------------------ ----------
             146                    38992        105
                                                   2
             420                    31202          1

Ok, now, what was the session with sid=146 doing?

SELECT NVL (event, 'CPU') event, COUNT ( * )
  FROM dba_hist_active_sess_history
 WHERE sample_time BETWEEN TO_DATE ('04-07-2018 17:50:00',
                                    'dd-mm-yyyy hh24:mi:ss'
                           )
                       AND  TO_DATE ('04-07-2018 18:10:00',
                                     'dd-mm-yyyy hh24:mi:ss'
                            )
       AND session_id = 146
       AND session_serial# = 38992
GROUP BY event
ORDER BY 2 DESC;

EVENT                                COUNT(*)
------------------------------------ ----------
SQL*Net message from dblink          119

Interesting! The session was waiting for a response from a remote database! Let's find the SQL it was executing.

SELECT sql_id, COUNT ( * )
  FROM dba_hist_active_sess_history
 WHERE sample_time BETWEEN TO_DATE ('04-07-2018 17:50:00',
                                    'dd-mm-yyyy hh24:mi:ss'
                           )
                       AND  TO_DATE ('04-07-2018 18:10:00',
                                     'dd-mm-yyyy hh24:mi:ss'
                            )
       AND session_id = 146
       AND session_serial# = 38992
GROUP BY sql_id
ORDER BY 2 DESC;

SQL_ID          COUNT(*)
------------- ----------
6w1wcftvt2ajx        119

I found the SQL with sql_id='6w1wcftvt2ajx' and it is something like:

SELECT ROWNUM,
       CASE
          WHEN ROWNUM = 1 AND :B8 = '1'
          THEN
             TO_CHAR(F1@TODB1 (:B4,
                                                            TO_CHAR (TO_DATE (:B3,
                                                                              'DD.MM.YYYY'
                                                                     )
                                                                     - 1,
                                                                     'DDMMYYYY'
                                                            )
                     ))
          ELSE
             NULL
       END
          ST,
       CASE
          WHEN ROWNUM = 1 AND :B8 = '1'
          THEN
             TO_CHAR(F1@TODB1 (:B4,
                                                            TO_CHAR (TO_DATE (:B2,
                                                                              'DD.MM.YYYY'
                                                                     ),
                                                                     'DDMMYYYY'
                                                            )
                     ))
          ELSE
             NULL
       END
          EN,
       A.*
  FROM (SELECT  ....
          FROM (SELECT ......
                       TO_CHAR (NVL (c, (SELECT c1
                                                      FROM ta@TODB1
                                                     WHERE ac = :B6)
                                )
                       )
                          ...,
                       NVL ( (SELECT TRIM (T.c1)
                                FROM tc@TODB1 T
                               WHERE T.MCCID = TO_CHAR (TR.TERMSIC)),
                            (SELECT TRIM (T.c1)
                               FROM tc@TODB1 T
                              WHERE T.id = :B7)
                       )
                          catid,....,
                       COALESCE (COALESCE (...,
                                           (SELECT c1
                                              FROM R@TODB2 
                                             WHERE TO_CHAR (TEX.EXCHANGEDATE,
                                                            'dd.mm.yyyy'
                                                   ) =
                                                      TO_CHAR ( (SELECT MAX(D)
                                                                   FROM R@TODB2  
                                                                  WHERE D <
                                                                           TR.D),
                                                               'dd.mm.yyyy'
                                                      )
                                                   AND ....)
                                           * TR.A
                                 ),
                                 TR.A
                       )
                          NA
                  FROM       C@TODB2 TR
                          LEFT JOIN
                             DA@TODB2 DIC_ACCT
                          ON ...
                       LEFT JOIN
                         R@TODB2 TEX
                       ON ...
                 WHERE ..

(STMT2)

As you see, there are two database links in the SQL. Firstly, why was this statement waiting for the SQL*Net message from dblink event?

Let`s see some information that ASH provides us.

SELECT sql_plan_line_id, sql_exec_id, sql_exec_start
  FROM dba_hist_active_sess_history
 WHERE sample_time BETWEEN TO_DATE ('04-07-2018 17:50:00',
                                    'dd-mm-yyyy hh24:mi:ss'
                           )
                       AND  TO_DATE ('04-07-2018 18:10:00',
                                     'dd-mm-yyyy hh24:mi:ss'
                            )
       AND sql_id = '6w1wcftvt2ajx'
       AND session_id = 146
       AND session_serial# = 38992
       AND ROWNUM<=10;
  
SQL_PLAN_LINE_ID SQL_EXEC_ID SQL_EXEC_START
---------------- ----------- --------------------
                    20717997 29-jun-2018 18:04:41
                    20717997 29-jun-2018 18:04:41
                    20717997 29-jun-2018 18:04:41
                    20717997 29-jun-2018 18:04:41
                    20717997 29-jun-2018 18:04:41
                    20717997 29-jun-2018 18:04:41
                    20717997 29-jun-2018 18:04:41
                    20717997 29-jun-2018 18:04:41
                    20717997 29-jun-2018 18:04:41
                    20717997 29-jun-2018 18:04:41 

Interesting: there is no plan line recorded (SQL_PLAN_LINE_ID is null for all samples), and the SQL had started executing several days before! Then why was this SQL stuck and still not complete?

To clarify: if SQL_PLAN_LINE_ID is null and the session was waiting for the "SQL*Net message from dblink" event, it means the session had sent a remote request and was waiting for a response from the remote DB; it had not yet started to execute the local part of the SQL statement, which is why SQL_PLAN_LINE_ID was not recorded. So the session was spending its whole time waiting for the response from the remote DB.

The SQL contains two database links. When I checked the DB1 database, I did not find any problem there, but in DB2 (a cluster database) I found something interesting.

Firstly, I identified the user id that my local database uses to connect to the remote database (DB2). For that user id, the ASH data in the remote database returned the following information:

SELECT event, session_id, session_serial#, COUNT ( * )
  FROM dba_hist_active_sess_history
 WHERE sample_time BETWEEN TO_DATE ('04-07-2018 17:50:00',
                                    'dd-mm-yyyy hh24:mi:ss'
                           )
                       AND  TO_DATE ('04-07-2018 18:10:00',
                                     'dd-mm-yyyy hh24:mi:ss'
                            )
       AND user_id = 242
GROUP BY session_id, event, session_serial#
ORDER BY 4;

EVENT                           SESSION_ID    SESSION_SERIAL#   COUNT(*)
----------------------------    ------------  ---------------   --------- 
inactive transaction branch     829           37606             120

It means that during that period of time only one session was recorded (the activities of other sessions were not captured: their calls were too short to fall into the samples, so they were not candidates for ASH).
This event is used to control a transaction branch. I searched MOS and found some bugs related to that event; one of them is:

Bug 21354456 – Query over dblink hanging waiting for “inactive transaction branch” (Doc ID 21354456.8)

I did not stop there and immediately took a call stack of the process:

SQL> oradebug setospid 81569
Oracle pid: 196, Unix process pid: 81569, image: .....
SQL> ORADEBUG TRACEFILE_NAME;
/u01/app/oracle/diag/rdbms/../../trace/.._ora_81569.trc
SQL> oradebug short_stack
ksedsts()+244<-ksdxfstk()+58<-ksdxcb()+918<-sspuser()+224<-__sighandler()<-semtimedop()+10<-skgpwwait()+200<-ksliwat()+2091<-kslwaitctx()+161<-kslwait()+140<-k2gInsert()+1214<-k2lbeg()+1369<-k2sbeg()+279<-k2serv()+557<-opiodr()+1165<-ttcpip()+2699<-opitsk()+1734<-opiino()+945<-opiodr()+1165<-opidrv()+587<-sou2o()+145<-opimai_real()+154<-ssthrdmain()+412<-main()+236

As the bug description says, if the session is waiting for the inactive transaction branch event and the call stack indicates the session is in k2gInsert, then it is this bug. Of course, I killed the session, the library cache pin events disappeared, and application users were able to connect to the database again. But to solve the mentioned problem completely I have to upgrade my Grid Infrastructure (RAC software) to Oracle Grid/DB 12.2. So I have got an additional job and am going to do that 🙂 .
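The check for the telltale frame in a captured short stack can be automated; the following is a sketch with a hypothetical helper, using the frame names from the oradebug output above:

```python
def has_bug_signature(short_stack: str, marker: str = "k2gInsert") -> bool:
    """Return True if an oradebug short_stack string contains the frame
    (k2gInsert) that identifies the 'inactive transaction branch' hang."""
    # Frames are separated by '<-'; strip the '()+offset' suffix from each.
    frames = [f.split("()")[0] for f in short_stack.split("<-")]
    return marker in frames

stack = ("ksedsts()+244<-ksdxfstk()+58<-ksdxcb()+918<-sspuser()+224"
         "<-__sighandler()<-semtimedop()+10<-skgpwwait()+200<-ksliwat()+2091"
         "<-kslwaitctx()+161<-kslwait()+140<-k2gInsert()+1214<-k2lbeg()+1369")
print(has_bug_signature(stack))  # True
```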

In this blog post I used ASH data by manually executing SQL statements to extract information and understand the performance problem. But there is another way: Tanel Poder`s awesome dash_wait_chains.sql script. Let`s use it:

SQL>  @dash_wait_chains  event2 1=1 "timestamp'2018-07-04 17:50:00'" "timestamp'2018-07-04 18:10:00'"

%This     SECONDS        AAS WAIT_CHAIN
------ ---------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  95%      303620        253 -> library cache pin  -> library cache pin  -> SQL*Net message from dblink
   3%        8720        7.3 -> library cache pin  -> ksdxexeotherwait
   1%        4130        3.4 -> library cache pin  -> library cache pin  -> ksdxexeotherwait
   0%        1510        1.3 -> SQL*Net message from dblink
   0%        1050         .9 -> library cache pin  -> SQL*Net message from dblink
   0%          50          0 -> ON CPU
   0%          40          0 -> ksdxexeotherwait
   0%          40          0 -> SQL*Net more data from dblink
   0%          20          0 -> enq: TX - row lock contention  -> [idle blocker 1,154,55777]
   0%          10          0 -> db file async I/O submit
   0%          10          0 -> db file scattered read

11 rows selected.

As you see, most of the DB time was consumed by the library cache pin event, and the sessions' blocker was itself waiting for responses from the remote database. We can see the blockers with the following script:

SQL>  @dash_wait_chains  session_id||':'||session_serial#||'-'||event2 1=1 "timestamp'2018-07-04 17:50:00'" "timestamp'2018-07-04 18:10:00'"

%This     SECONDS        AAS WAIT_CHAIN
------ ---------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
   0%        1190          1 -> 146:38992-SQL*Net message from dblink
   0%        1050         .9 -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1030         .9 -> 94:51427-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1030         .9 -> 105:36467-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1030         .9 -> 904:46554-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1030         .9 -> 1254:40336-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1030         .9 -> 841:48060-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1030         .9 -> 104:39021-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1030         .9 -> 908:11015-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1030         .9 -> 901:44142-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1030         .9 -> 139:54040-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1030         .9 -> 843:58381-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1030         .9 -> 868:13714-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1030         .9 -> 866:18708-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1030         .9 -> 1191:57205-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1030         .9 -> 1290:31446-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1030         .9 -> 818:18989-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1030         .9 -> 852:36860-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1020         .8 -> 127:60895-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1020         .8 -> 1186:63072-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1020         .8 -> 435:8431-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1020         .8 -> 136:44975-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1020         .8 -> 108:23692-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1020         .8 -> 788:65523-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1020         .8 -> 76:52180-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1020         .8 -> 1167:2340-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1020         .8 -> 21:38451-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1020         .8 -> 909:33920-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1020         .8 -> 412:22964-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink
   0%        1020         .8 -> 44:45628-library cache pin  -> 59:46957-library cache pin  -> 146:38992-SQL*Net message from dblink

30 rows selected.

It is clearly seen that the blocker is the session with sid=59 and serial#=46957, as we identified manually above.

Final note:

What is the relationship between library cache pin and SQL*Net message from dblink in this case?

When I looked at the code of (STMT1) (the source of USR.api_pr.generate_id), I saw that the statement itself also uses the same database link (and the same objects) to retrieve data from the remote database (it also tries to perform some remote DML operations). During the execution (call), library cache pins are obtained on the library cache data heaps in order to read the heaps and to prevent modification of dependent objects (pins are taken in exclusive mode to modify data heaps). Pinning objects requires the heaps to be loaded into memory, and sessions must obtain pins on the appropriate heaps. The library cache pin wait event indicates that another session holds the pin in an incompatible mode. As a result, all of these queries reference the remote database, and that is why the problem took place.

Summary

In this blog post we saw how to use ASH data to interpret a performance problem. We saw that AAS indicates database workload directly, although it can be dramatically bigger than the number of CPUs the system has. So if AAS >> num CPUs, it does not always mean there is a CPU bottleneck at the OS level: sessions could be waiting for responses from a remote database and, in fact, their server processes may be doing nothing in the local system.

June 25, 2018

Join cardinality, filter predicates and bitmap join indexes

Filed under: Cardinality Estimation,CBO,Tuning — Chinar Aliyev @ 11:25 am

Several weeks ago Jonathan Lewis wrote an article about a join cardinality estimation problem and how a bitmap join index (BJI) could help solve it. In this blog post we are going to review and interpret the case. We will also see the magic behind join selectivity and how filter predicates influence join size estimation. In addition, you will be given other possible solutions for the mentioned problem. Moreover, we will see that a BJI actually helps (influences) the optimizer to produce a better cardinality estimate. Finally, I will try to explain the possible reasons why Oracle Database does not collect statistics for the bitmap join index`s virtual column and does not use them even if you set them via the dbms_stats package.
The test case (Oracle Database version 12.2.0.1):

drop table t1;
drop table dim_table;
 
create table dim_table (type_code number, object_type varchar2(10));
 
insert into dim_table values (1,'TABLE');
insert into dim_table values (2,'INDEX');
insert into dim_table values (3,'VIEW');
insert into dim_table values (4,'SYNONYM');
insert into dim_table values (5,'OTHER');
 
alter table dim_table add constraint dim_table_pk primary key (type_code) using index;
 
exec dbms_stats.gather_table_stats(user,'dim_table',cascade=>true);
 
create table t1 
nologging
as 
select 
        object_id, object_name, 
        decode(object_type, 'TABLE',1,'INDEX',2,'VIEW',3,'SYNONYM',4,5) type_code 
from 
        all_objects
where
        rownum <= 50000;
 
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
 
 
create  bitmap index t1_b1 on t1(dt.object_type)
from    t1, dim_table dt
where   t1.type_code = dt.type_code
;
 
exec dbms_stats.gather_table_stats(null, 't1', cascade=>true,method_opt=>'for all columns size 254');
select  t1.object_id
from
        t1, dim_table  dt
where
        t1.type_code   = dt.type_code 
and     dt.object_type = 'TABLE'
;
 
select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

                                                                                                                    
---------------------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |      1 |        |  14312 |00:00:00.11 |    2714 |       |       |          |
|*  1 |  HASH JOIN         |           |      1 |  80000 |  14312 |00:00:00.11 |    2714 |  2545K|  2545K|  739K (0)|
|*  2 |   TABLE ACCESS FULL| DIM_TABLE |      1 |      1 |      1 |00:00:00.01 |       3 |       |       |          |
|   3 |   TABLE ACCESS FULL| T1        |      1 |    400K|    400K|00:00:00.05 |    2708 |       |       |          |
---------------------------------------------------------------------------------------------------------------------
                                                                                                                     
Predicate Information (identified by operation id):                                                                  
---------------------------------------------------                                                                  
                                                                                                                     
   1 - access("T1"."TYPE_CODE"="DT"."TYPE_CODE")                                                                     
   2 - filter("DT"."OBJECT_TYPE"='TABLE')

EP 1
                                    

As we see, the join cardinality has been estimated as 80000 instead of 14312, a significant difference between the estimated and actual cardinalities. But if we remove the filter condition, then we get the following execution plan.

------------------------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name         | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |              |      1 |        |    400K|00:00:00.63 |    3478 |       |       |          |
|*  1 |  HASH JOIN         |              |      1 |    400K|    400K|00:00:00.63 |    3478 |  2545K|  2545K| 1035K (0)|
|   2 |   INDEX FULL SCAN  | DIM_TABLE_PK |      1 |      5 |      5 |00:00:00.01 |       1 |       |       |          |
|   3 |   TABLE ACCESS FULL| T1           |      1 |    400K|    400K|00:00:00.07 |    3477 |       |       |          |
------------------------------------------------------------------------------------------------------------------------
                                                                                                                        
Predicate Information (identified by operation id):                                                                     
---------------------------------------------------                                                                     
                                                                                                                        
   1 - access("T1"."TYPE_CODE"="DT"."TYPE_CODE") 
                                                                      
EP2

So in this case we have got an exact cardinality estimation. These two execution plans tell us that filter conditions directly influence the join size estimation, but we need to know how. Let's look at the following three queries in order to understand the process of join size estimation.

SELECT   *
  FROM   t1, t2;

(Query1)

SELECT   *
  FROM   t1, t2
 WHERE   t1.c1 = t2.c2;

(Query2)

SELECT   *
  FROM   t1, t2
 WHERE   t1.c1 = t2.c2 AND t1.f1 = 'filter1' AND t2.f2 = 'filter2';

(Query3)

Here Query1 is a Cartesian product and Query2 is an inner join without any filters; I will call it a "pure join". Query2 is actually a fraction, or subset, of Query1. To identify this fraction, the concept of join selectivity has been introduced.

The definition: Join selectivity (JS) is the ratio of the "pure join" cardinality to the size of the Cartesian product.

In other words:

JS = CARD (Query2)/CARD (Query1)

This is the definition of JS and, as you can see, JS does not depend on filter predicates, because there are none. It is known that the size of a Cartesian product is the product of the numbers of rows of the tables. So:

CARD (Query1) = num_rows(t1)*num_rows(t2) 

Then JS will be:

JS = CARD (Query2)/( num_rows(t1)*num_rows(t2)).  

Formula 1

Here NUM_ROWS is the number of rows in the table.
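Formula 1 is simple enough to sketch in a few lines (an illustration only, not optimizer code; the 400000 and 5 row counts come from the test tables used in this post, where the pure join of T1 and DIM_TABLE returns 400000 rows, as EP2 shows):

```python
# A sketch of Formula 1: join selectivity is the pure-join cardinality
# divided by the size of the Cartesian product.
def join_selectivity(card_pure_join, num_rows_t1, num_rows_t2):
    return card_pure_join / (num_rows_t1 * num_rows_t2)

# Pure join of T1 (400000 rows) and DIM_TABLE (5 rows) returns 400000 rows:
print(join_selectivity(400_000, 400_000, 5))  # -> 0.2
```

This 0.2 is exactly the sel value that will appear in the optimizer trace later in the post.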

In the dictionary we have object statistics (num_rows(t1), num_rows(t2) and others are known). It means that to calculate JS we need to calculate the size of the "pure join", without any filter. Depending on which types of object statistics are provided, the optimizer first calculates the size of the "pure join", then calculates JS, and finally calculates the cardinality for Query3 as follows:

CARD (Query3) = JS*CARD (t1)*CARD (t2) = 
CARD (Query2)* CARD (t1)*CARD (t2)/( num_rows(t1)*num_rows(t2))                                                  

Formula 2 

Here CARD(t1) and CARD(t2) are the table cardinalities after applying the filter predicates in Query3 (t1.f1 = 'filter1' AND t2.f2 = 'filter2').

We are trying to understand how the optimizer calculates the cardinality for Query3. As can be seen from Formula 2, JS does not depend on the filter predicates by definition; it depends only on the join columns. That is why, when we add filters to a query, we see a wrong estimation.
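As a sanity check (a sketch, not optimizer code), Formula 2 with the numbers from this test case reproduces the 80000 estimate seen in EP1: CARD(Query2) = 400000 (from EP2), num_rows(t1) = 400000, num_rows(dt) = 5, and after the filters CARD(dt) = 1 while T1 is unfiltered, so CARD(t1) = 400000.

```python
# A sketch of Formula 2: the Query3 cardinality is the join selectivity
# times the filtered table cardinalities.
def join_card(card_pure_join, num_rows_t1, num_rows_t2, card_t1, card_t2):
    js = card_pure_join / (num_rows_t1 * num_rows_t2)  # Formula 1
    return js * card_t1 * card_t2                      # Formula 2

# JS = 400000/(400000*5) = 0.2; only DIM_TABLE is filtered (CARD = 1):
print(round(join_card(400_000, 400_000, 5, 400_000, 1)))  # -> 80000
```

This matches the E-Rows of 80000 in EP1, even though the actual join returns 14312 rows: the filter on DIM_TABLE shrinks the value range of the join column, which the formula cannot see.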

When estimating the cardinality of Query2, the optimizer considers all appropriate join column values provided by histograms (if they are available; otherwise it uses the number of distinct values). But for Query3 the optimizer has no chance (at least in the current implementation of the Oracle Database optimizer) to know the join columns' value ranges after the filter conditions have been applied. This is what causes the wrong cardinality estimation: the optimizer just uses Formula 2, which means an "average" JS is used for the join size estimation.

Let's look at the following SQL and its execution plan; we have forced the optimizer to use an index (bitmap) access path.

select 
        /*+ index(t1 t1_b1) */
        t1.object_id
from
        t1, dim_table dt
where
        t1.type_code   = dt.type_code 
and     dt.object_type = 'TABLE'
;
 
(Query4)

select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST')); 

Plan hash value: 1201576309                                                                            
                                                                                                       
-------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       |      1 |        |  14312 |00:00:00.06 |     445 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1    |      1 |  80000 |  14312 |00:00:00.06 |     445 |
|   2 |   BITMAP CONVERSION TO ROWIDS       |       |      1 |        |  14312 |00:00:00.02 |       4 |
|*  3 |    BITMAP INDEX SINGLE VALUE        | T1_B1 |      1 |        |      2 |00:00:00.01 |       4 |
-------------------------------------------------------------------------------------------------------
                                                                                                       
Predicate Information (identified by operation id):                                                    
---------------------------------------------------                                                    
                                                                                                       
   3 - access("T1"."SYS_NC00004$"='TABLE')  

What does the execution plan tell us? Query4 is a join, and the estimated cardinality for the statement is 80000 rows again. But is it the join cardinality or the single-table cardinality? If you look at the execution plan, you see that it is the single-table cardinality, not the join cardinality. If so, what was the calculated join cardinality for Query4?

We can see the answer in the following hinted SQL statement.

select 
        /*+ index(t1 t1_b1) NO_ELIMINATE_JOIN(@"SEL$1" "DT"@"SEL$1") */
        t1.object_id
from
        t1, dim_table dt
where
        t1.type_code   = dt.type_code 
and     dt.object_type = 'TABLE'
;
 
select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST +outline'));
---------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |           |      1 |        |  14312 |00:00:00.11 |     448 |       |       |          |
|*  1 |  HASH JOIN                           |           |      1 |  16000 |  14312 |00:00:00.11 |     448 |  2545K|  2545K|  701K (0)|
|*  2 |   TABLE ACCESS FULL                  | DIM_TABLE |      1 |      1 |      1 |00:00:00.01 |       3 |       |       |          |
|   3 |   TABLE ACCESS BY INDEX ROWID BATCHED| T1        |      1 |  80000 |  14312 |00:00:00.05 |     445 |       |       |          |
|   4 |    BITMAP CONVERSION TO ROWIDS       |           |      1 |        |  14312 |00:00:00.01 |       4 |       |       |          |
|*  5 |     BITMAP INDEX SINGLE VALUE        | T1_B1     |      1 |        |      2 |00:00:00.01 |       4 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------------
                                                                                                                                       
Predicate Information (identified by operation id):                                                                                    
---------------------------------------------------                                                                                    
                                                                                                                                       
   1 - access("T1"."TYPE_CODE"="DT"."TYPE_CODE")                                                                                       
   2 - filter("DT"."OBJECT_TYPE"='TABLE')                                                                                              
   5 - access("T1"."SYS_NC00004$"='TABLE')        

EP3 
  

It can be seen that the optimizer in fact calculates the join cardinality as 16000 instead of 80000. That is not bad when compared with the actual row count: 16000 vs 14312. In addition, even if the database does not eliminate the join, the execution plan EP3 is much better than EP1: retrieving the data with EP1 took 2714 buffer gets, but with EP3 only 448. The only extra buffer gets come from the unnecessary scan of the dimension table.

Now let's look at the optimizer trace file. First, the database tried to select the best access path for the single tables.

SINGLE TABLE ACCESS PATH 
  Single Table Cardinality Estimation for T1[T1] 
  SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE
  Table: T1  Alias: T1
    Card: Original: 400000.000000  Rounded: 400000  Computed: 400000.000000  Non Adjusted: 400000.000000
  Scan IO  Cost (Disk) =   726.000000
  Scan CPU Cost (Disk) =   95042730.560000
  Total Scan IO  Cost  =   726.000000 (scan (Disk))
                       =   726.000000
  Total Scan CPU  Cost =   95042730.560000 (scan (Disk))
                       =   95042730.560000
  Access Path: TableScan
    Cost:  728.524203  Resp: 728.524203  Degree: 0
      Cost_io: 726.000000  Cost_cpu: 95042731
      Resp_io: 726.000000  Resp_cpu: 95042731
  ****** trying bitmap/domain indexes ******
 ****** Costing Index T1_B1
  SPD: Return code in qosdDSDirSetup: NOCTX, estType = INDEX_SCAN
  SPD: Return code in qosdDSDirSetup: NOCTX, estType = INDEX_FILTER
  Column (#2): OBJECT_TYPE(VARCHAR2)
    AvgLen: 7 NDV: 5 Nulls: 0 Density: 0.200000
  Estimated selectivity: 0.200000 , col: #2 
  Access Path: index (AllEqRange)
    Index: T1_B1
    resc_io: 3.000000  resc_cpu: 23414
    ix_sel: 0.200000  ix_sel_with_filters: 0.200000 
    Cost: 3.000622  Resp: 3.000622  Degree: 0
  Bitmap nodes:
    Used T1_B1
      Cost = 3.000622, sel = 0.200000
  Access path: Bitmap index - accepted
    Cost: 1281.132710 Cost_io: 1280.200000 Cost_cpu: 35118917.488000 Sel: 0.200000
    Not Believed to be index-only
  ****** finished trying bitmap/domain indexes ******
  Best:: AccessPath: IndexBitmap
         Cost: 1281.132710  Degree: 1  Resp: 1281.132710  Card: 400000.000000  Bytes: 0.000000
Card adjusted,  Rounded Card: 80000  Computed Card: 80000.000000

As you see, the optimizer was forced to select the INDEX access path, and the index distinct keys allowed the optimizer to estimate the single-table cardinality, which was computed as 80000.
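The adjusted 80000 can be reproduced with simple arithmetic (a sketch under the trace's own numbers: OBJECT_TYPE has NDV = 5 and no histogram, so the default filter selectivity is 1/NDV = 0.2):

```python
# How the 80000 single-table estimate arises: 1/NDV selectivity on
# OBJECT_TYPE (NDV = 5) applied to the 400000 rows of T1.
num_rows, ndv = 400_000, 5
sel = 1 / ndv                 # density 0.2, as shown in the trace
card = num_rows * sel
print(round(card))  # -> 80000
```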
It is obvious that DIM_TABLE has a uniform distribution on the filter (and also join) columns, and its cardinality was computed as 1:

Access path analysis for DIM_TABLE
***************************************
SINGLE TABLE ACCESS PATH 
  Single Table Cardinality Estimation for DIM_TABLE[DT] 
  SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE

 kkecdn: Single Table Predicate:"DT"."OBJECT_TYPE"='TABLE'
  Estimated selectivity: 0.200000 , col: #2 
  Table: DIM_TABLE  Alias: DT
    Card: Original: 5.000000  Rounded: 1  Computed: 1.000000  Non Adjusted: 1.000000
  Scan IO  Cost (Disk) =   2.000000
  Scan CPU Cost (Disk) =   7971.440000
  Cost of predicates:
    io = NOCOST, cpu = 50.000000, sel = 0.200000 flag = 2048  ("DT"."OBJECT_TYPE"='TABLE')
  Total Scan IO  Cost  =   2.000000 (scan (Disk))
                         + 0.000000 (io filter eval) (= 0.000000 (per row) * 5.000000 (#rows))
                       =   2.000000
  Total Scan CPU  Cost =   7971.440000 (scan (Disk))
                         + 250.000000 (cpu filter eval) (= 50.000000 (per row) * 5.000000 (#rows))
                       =   8221.440000
  Access Path: TableScan
    Cost:  2.000218  Resp: 2.000218  Degree: 0
      Cost_io: 2.000000  Cost_cpu: 8221
      Resp_io: 2.000000  Resp_cpu: 8221
  Best:: AccessPath: TableScan
         Cost: 2.000218  Degree: 1  Resp: 2.000218  Card: 1.000000  Bytes: 0.000000

Regardless of the join method, the join cardinality was computed as 16000:

Now joining: T1[T1]#1
***************
NL Join
  Outer table: Card: 1.000000  Cost: 2.000218  Resp: 2.000218  Degree: 1  Bytes: 
Access path analysis for T1
  Scan IO  Cost (Disk) =   726.000000
  Scan CPU Cost (Disk) =   95042730.560000
  Cost of predicates:
    io = NOCOST, cpu = 50.000000, sel = 0.200000 flag = 2048  ("T1"."TYPE_CODE"="DT"."TYPE_CODE")
  Total Scan IO  Cost  =   726.000000 (scan (Disk))
                         + 0.000000 (io filter eval) (= 0.000000 (per row) * 400000.000000 (#rows))
                       =   726.000000
  Total Scan CPU  Cost =   95042730.560000 (scan (Disk))
                         + 20000000.000000 (cpu filter eval) (= 50.000000 (per row) * 400000.000000 (#rows))
                       =   115042730.560000
  Inner table: T1  Alias: T1
  Access Path: TableScan
    NL Join:  Cost: 731.055594  Resp: 731.055594  Degree: 1
      Cost_io: 728.000000  Cost_cpu: 115050952
      Resp_io: 728.000000  Resp_cpu: 115050952
  ****** trying bitmap/domain indexes ******
  ****** finished trying bitmap/domain indexes ******
  Scan IO  Cost (Disk) =   726.000000
  Scan CPU Cost (Disk) =   95042730.560000
  Cost of predicates:
    io = NOCOST, cpu = 50.000000, sel = 0.200000 flag = 2048  ("T1"."TYPE_CODE"="DT"."TYPE_CODE")
  Total Scan IO  Cost  =   726.000000 (scan (Disk))
                         + 0.000000 (io filter eval) (= 0.000000 (per row) * 400000.000000 (#rows))
                       =   726.000000
  Total Scan CPU  Cost =   95042730.560000 (scan (Disk))
                         + 20000000.000000 (cpu filter eval) (= 50.000000 (per row) * 400000.000000 (#rows))
                       =   115042730.560000
  Inner table: T1  Alias: T1
  Access Path: TableScan
    NL Join:  Cost: 731.055594  Resp: 731.055594  Degree: 1
      Cost_io: 728.000000  Cost_cpu: 115050952
      Resp_io: 728.000000  Resp_cpu: 115050952
  ****** trying bitmap/domain indexes ******
  ****** finished trying bitmap/domain indexes ******

  Best NL cost: 731.055594
          resc: 731.055594  resc_io: 728.000000  resc_cpu: 115050952
          resp: 731.055594  resp_io: 728.000000  resc_cpu: 115050952
  SPD: Return code in qosdDSDirSetup: NOCTX, estType = JOIN

Join Card:  16000.000000 = outer (1.000000) * inner (80000.000000) * sel (0.200000)
Join Card - Rounded: 16000 Computed: 16000.000000 
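The "Join Card" trace line is plain arithmetic and can be checked directly:

```python
# Reproducing the trace line's arithmetic (just the printed formula, not
# optimizer internals): join card = outer * inner * join selectivity.
outer_card, inner_card, join_sel = 1.0, 80_000.0, 0.2
join_card = outer_card * inner_card * join_sel
print(round(join_card))  # -> 16000
```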

But we have a bitmap join index here, and that is why the star transformation was considered:

OST: STAR TRANSFORMATION PLANS
***************************************
Best join order so far: 1
  Marked for join-back elimination: "DT"@"SEL$1"
Join order[1]:  T1[T1]#0
***********************
Best so far:  Table#: 0  cost: 1281.132710  card: 80000.000000  bytes: 640000.000000
***********************
OST: oldrsc = 1283.361336, oldcst = 1283.361336, rsc = 1281.132710, cst = 1281.132710, minRatio = 0
Registered qb: SEL$83D525E5 0x6b0f06c8 (BITMAP JOIN INDEX JOINBACK ELIMINATION SEL$1; SEL$1; "DT"@"SEL$1")
---------------------
QUERY BLOCK SIGNATURE
---------------------
  signature (): qb_name=SEL$83D525E5 nbfros=1 flg=0
    fro(0): flg=0 objn=144415 hint_alias="T1"@"SEL$1"

OST: ST plan accepted
*********************************
Number of join permutations tried: 3
*********************************

So the trace file indicates that "join-back elimination" (or BITMAP JOIN INDEX JOINBACK ELIMINATION) has been performed to avoid unnecessarily scanning the dimension table. In this case the Oracle database kept the join "cardinality" as the fact table cardinality. Actually it is not the join cardinality but the single-table cardinality.

As a result we see that the database in fact uses the bitmap join index to calculate the join cardinality. Of course, it cannot take advantage of the column distribution in this case, due to lack of opportunity, but it still helps the optimizer to reduce the error percentage of the cardinality estimation.

The next question is: why does the optimizer not use the object statistics that we create for the bitmap join index's virtual column?
In fact "T1"."SYS_NC00004$" is not a real virtual column (see execution plan EP3). This column is just a symbolic link referring to the BJI, and it is an empty column.

SQL> select count(*) from t1 where "SYS_NC00004$" is null;

  COUNT(*)
----------
    400000

SQL>

That is why I think the access("T1"."SYS_NC00004$"='TABLE') filter predicate is also a symbolic filter predicate. Therefore, the database currently does not gather statistics for this type of column.

Then how can the database gather statistics for this virtual column? It seems hard to achieve. When we create other virtual columns, the column data is derived while scanning the table, and the database can analyze that data to create appropriate object statistics such as histograms.

SQL>  select t1.*,"SYS_NC00004$" from t1 where rownum<=5;

 OBJECT_ID OBJECT_NAME     TYPE_CODE   SYS_NC00004$
---------- -------------   ----------  ----------
        16 TS$             1
        20 ICOL$           1
         8 C_FILE#_BLOCK#  5
        37 I_OBJ2          2
        22 USER$           1

SQL>

Logically, to display/extract values for the "SYS_NC00004$" column we have to specify a join, not a single-table statement; that is why we see NULLs for the column. (There could be a possibility of extracting the KEY VALUE of the index from the INDEX ENTRY, because the fact table's ROWIDs could be used with a "rowid mapping" function to find a specific INDEX KEY value, but is that really necessary? For regular virtual columns it does not work like that.) When we create virtual columns, the data for the column is derived from one table and we can see values for the column, but the BJI's virtual column data is derived from a join. This supports the idea that the "SYS_NC00004$" column's statistics could be gathered from the BJI segment directly. As is known, a BJI is a collection of {(Key Value1, 1010010…0), (Key Value2, 0010000…0), … (Key Value5, 1110110…1)}, and in this case a KEY VALUE and its frequency can be computed as {Key Value, sum(its bitmap representation over the fact table)}. This information could help the optimizer estimate the fact table cardinality efficiently.
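The key-frequency idea can be illustrated with a toy popcount calculation (purely hypothetical bitmaps and key values, not Oracle's on-disk bitmap format):

```python
# Hypothetical sketch: each BJI key maps to a bitmap over fact-table rows;
# a key's frequency is simply the number of set bits in its bitmap.
bji = {
    "TABLE": 0b1010010,  # hypothetical bitmaps for illustration
    "INDEX": 0b0010000,
    "VIEW":  0b1110110,
}
freq = {key: bin(bitmap).count("1") for key, bitmap in bji.items()}
print(freq)  # -> {'TABLE': 3, 'INDEX': 1, 'VIEW': 5}
```

With such per-key frequencies available, the optimizer could in principle estimate the filtered fact-table cardinality exactly instead of averaging.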
Are there any other solutions?

Since a BJI is a precomputed join that allows the database to eliminate the join and retrieve data from the fact table in an efficient way, it supports the idea that materialized views could also be an option. We can create a histogram on the materialized view and, as desired, get an exact estimation.

CREATE MATERIALIZED VIEW mv1 compress
ENABLE QUERY REWRITE
AS
SELECT dt.object_type
FROM    t1, dim_table dt
WHERE   t1.type_code = dt.type_code;

EXEC DBMS_STATS.gather_table_stats(USER, 'mv1', method_opt=>'for all columns size 254');

-----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |      1 |        |  14312 |00:00:00.04 |     593 |
|*  1 |  MAT_VIEW REWRITE ACCESS FULL| MV1  |      1 |  14312 |  14312 |00:00:00.04 |     593 |
-----------------------------------------------------------------------------------------------
                                                                                               
Predicate Information (identified by operation id):                                            
---------------------------------------------------                                            
                                                                                               
   1 - filter("MV1"."OBJECT_TYPE"='TABLE')

As you see, the MV gives us a performance result comparable with the BJI. We got 593 buffer gets with the MV and 445 buffer gets with the BJI. It is obvious that the BJI has better compression (i.e. bitmap compression), but the performance of the MV is still reasonable.

What about dynamic sampling? Let's use automatic dynamic sampling and see the result.

alter session set optimizer_dynamic_sampling=11;

select  /*+ no_rewrite*/
         dt.object_type
from
        t1, dim_table dt
where
        t1.type_code   = dt.type_code 
and     dt.object_type in ('TABLE');
 
select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

---------------------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |      1 |        |  14312 |00:00:00.11 |    2714 |       |       |          |
|*  1 |  HASH JOIN         |           |      1 |  19203 |  14312 |00:00:00.11 |    2714 |  1856K|  1856K|  616K (0)|
|*  2 |   TABLE ACCESS FULL| DIM_TABLE |      1 |      1 |      1 |00:00:00.01 |       3 |       |       |          |
|   3 |   TABLE ACCESS FULL| T1        |      1 |    400K|    400K|00:00:00.04 |    2708 |       |       |          |
---------------------------------------------------------------------------------------------------------------------
                                                                                                                     
Predicate Information (identified by operation id):                                                                  
---------------------------------------------------                                                                  
                                                                                                                     
   1 - access("T1"."TYPE_CODE"="DT"."TYPE_CODE")                                                                     
   2 - filter("DT"."OBJECT_TYPE"='TABLE')                                                                            
                                                                                                                     
Note                                                                                                                 
-----                                                                                                                
   - dynamic statistics used: dynamic sampling (level=AUTO)                                                          
                                                                                                                     

So dynamic sampling also helped the optimizer to produce a much better cardinality estimation than the object statistics did in EP1.
If you want more information about join cardinality estimation, refer to my article presented at Hotsos 2016:

https://www.scribd.com/document/369165079/Join-Cardinality-Estimation-Methods

Summary

We saw that the BJI actually helps the optimizer to produce a good enough estimation, because it reduces the data retrieved from the fact table. Of course, in this case we cannot take advantage of the column distribution: the BJI reduces the fact table data "on average" (because it does not know the value ranges of the join columns after the filter predicates are applied), which allows the optimizer to correct/calculate the join cardinality on average. Also, we can use the NO_ELIMINATE_JOIN hint in order to control the cardinality estimation; although the dimension table will be scanned unnecessarily, the performance will still be better than without the BJI (EP3 vs EP1). In addition, materialized views and automatic dynamic sampling can be useful in cases where there is a wrong join size estimation.

May 10, 2018

SQL Tuning approach and a Real life example

Filed under: Rewriting,Tuning — Chinar Aliyev @ 10:27 am

As with other optimization problems, an SQL tuning task (STT) also requires finding an "optimal path" (OP) to reach the goal. The OP depends on many factors: where we are right now (the starting point A), what we are given (the initial conditions), and where we need to reach, or what result is required (the end point B). Depending on the initial conditions and the required result, finding the OP can be a very sophisticated process.

To solve an STT, we first need to understand the business process; this gives us opportunities and widens the range of tuning approaches. We also need to understand which optimization techniques exist, which of them are implemented in commercial DBMSs, and how we can take advantage of them. Based on the implemented techniques and the initial conditions, the query optimizer (QO) will generate an efficient execution plan (EP).

Recently I was asked to tune an SQL statement. Let's look at the following SQL. The Oracle database version is 11.2.0.4 and the OS is Red Hat EL 6.7. The original SQL is quite complicated; I have removed some parts of it, but its equivalent looks like:

SELECT *
  FROM (SELECT  
         Ah.TRN_NO
        ,Ah.TRN_DATE
        ...
        ,ROW_NUMBER() Over(ORDER BY Tb.TRN_DATE DESC) Rn
          FROM (SELECT Ah.TRN_NO
                      ,Ah.TRN_DATE
                      ...
                      FROM transaction_history AH
                      ,(SELECT t.BR
                              ,t.Ac_No
                              ,t.C
                          FROM Accounts t
                         WHERE t.Cust_No = 'SSSSS' -- a given customer number
                           AND t.A_St = 'A'
                           AND t.R_St = 'O'
                           AND t.Type = 'U') Ca
                 WHERE Ah.Event <> 'REVL'
                   AND Ah.Br = Ca.Br
                   AND Ah.Ac_No = Ca.Ac_No
                   AND Ah.c = Ca.c
               ) Tb) d
WHERE d.Rn <= 15
ORDER BY d.Trn_Date DESC;

There are two tables. The ACCOUNTS table contains customers' account numbers and other information; a customer can have several accounts in that table. The transactions for each account are kept in the TRANSACTION_HISTORY table. Each time the query is executed, we are given a customer number that uniquely identifies the customer.
The above SQL statement tells us that it was written to find the last fifteen transactions of a given customer (to show the last fifteen transactions when the application is opened). Now let's look at the execution plan of the statement. Before executing the query, I flushed the OS and database buffer caches to see the exact performance statistics/problem.

                                                                                                                                                      
--------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name                   | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |                        |      1 |        |     15 |00:07:14.03 |   83945 |  76419 |       |       |          |
|*  1 |  VIEW                           |                        |      1 |     12 |     15 |00:07:14.03 |   83945 |  76419 |       |       |          |
|*  2 |   WINDOW SORT PUSHED RANK       |                        |      1 |     12 |     16 |00:07:14.03 |   83945 |  76419 |   160K|   160K|  142K (0)|
|   3 |    NESTED LOOPS                 |                        |      1 |     12 |    121K|00:07:13.54 |   83945 |  76419 |       |       |          |
|   4 |     NESTED LOOPS                |                        |      1 |     14 |    121K|00:00:04.39 |    1004 |    775 |       |       |          |
|*  5 |      TABLE ACCESS BY INDEX ROWID| ACCOUNTS               |      1 |      1 |     85 |00:00:00.81 |      84 |     84 |       |       |          |
|*  6 |       INDEX RANGE SCAN          | IX_ACC                 |      1 |      1 |     88 |00:00:00.06 |       4 |      4 |       |       |          |
|*  7 |      INDEX RANGE SCAN           | IX_TR_HISTORY          |     85 |     14 |    121K|00:00:03.53 |     920 |    691 |       |       |          |
|*  8 |     TABLE ACCESS BY INDEX ROWID | TRANSACTION_HISTORY    |    121K|    113 |    121K|00:07:08.97 |   82941 |  75644 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------------------------------------
                                                                                                                                                        
Predicate Information (identified by operation id):                                                                                                     
---------------------------------------------------                                                                                                     
                                                                                                                                                        
   1 - filter("D"."RN"<=15)                                                                                                                             
   2 - filter(ROW_NUMBER() OVER ( ORDER BY INTERNAL_FUNCTION("AH"."TRN_DT") DESC )<=15)                                                                 
   5 - filter(("T"."TYPE"='U' AND "T"."A_STAT"='A' AND "T"."R_STAT"='O'))                                                               
   6 - access("T"."CUST_NO"='SSSSSS')                                                                                                                  
   7 - access("AH"."AC_NO"="T"."AC_NO" AND "AH"."BR"="T"."BR")                                                                     
   8 - filter(("AH"."EVENT"<>'REVL' AND "AH"."C"="T"."C"))

As you see, 7.14 minutes were spent executing the SQL, and most of that time was consumed retrieving data from the TRANSACTION_HISTORY table. In line 8 of the execution plan the database spent 7.09 minutes, which means 7.09/7.14*100 = 99.30%; so almost all of the time was spent scanning the TRANSACTION_HISTORY table. Why was such a long time required to retrieve rows from the transaction table?
It is obvious that the index IX_TR_HISTORY is not selective. According to the execution plan, in line 5 we see that there are 85 accounts in the ACCOUNTS table for the given customer. For each account the database scanned the IX_TR_HISTORY index (so 85 times) and got 121K rows. That is why line 8 was executed 121K times. But retrieving the appropriate data from the indexes did not consume a lot of time. What is the problem there, and how can we solve it?

As you see from the 8th line of the execution plan, the database did 75644 physical reads in order to extract 120K rows from the table via index ROWID. Actually, a lot of physical reads required such amount of time. So what do we have to do now? How could the response time be reduced in this case?
As we noted before, there can be several account numbers per customer, and for each account there can be thousands or even millions of transactions. Do we have to scan all transactions to find the last 15 of them? Can we avoid scanning all transactions in this case?

Actually, as the execution plan shows, after joining the two tables the database performs a SORT operation and selects the last 15 transactions. We need to find a way to avoid a complete scan of all rows (all transactions of the accounts). My first thought was to use the index (walking it in DESCENDING order and stopping after retrieving 15 entries) to find the latest transactions without reading all of them. In reality, we do not need to process all transactions of an account. How can we rewrite the original SQL to achieve that?

If we look at the data in our tables (the execution plan above also tells us about the data), it can be seen that there is a limited number of accounts (in our case 85), but there are lots of transactions. This raises another question: how much of the ACCOUNTS table data belongs to customers that have fewer than 10 accounts?

SQL> SELECT   ROUND (cnt10 / total * 100, 2) "less than 10 %"
      FROM   (SELECT   SUM (CASE WHEN cnt <= 10 THEN cnt ELSE 0 END) cnt10,
                   SUM (cnt) total
            FROM   (  SELECT                               /*+ parallel(10) */
                            cust_no, COUNT ( * ) cnt
                        FROM   accounts
                       WHERE   a_stat = 'A' AND r_stat = 'O'
                    GROUP BY   cust_no));  

less than 10 %
--------------
         98.94

SQL>

So approximately 99% of the table data belongs to customers that have fewer than 10 accounts, which gives me an idea for the next step.
If we simply join the two tables here, we have to join all rows to get the ordered data, because there is no guarantee that a join driven by INDEX RANGE SCAN DESCENDING/TABLE ACCESS BY INDEX ROWID operations will produce an ordered result that we can then limit with a ROWNUM clause or an analytic function. In principle, as we have seen, there is a limited number of accounts per customer (mostly fewer than ten), but the number of transactions per account can grow over time into the thousands or millions. That is why I decided to write a procedure/function that generates a simple SQL statement producing the result I want; I can then open a cursor for that generated SQL and fetch the required data.

DECLARE
    custno  VARCHAR2 (100) := 'SSSS'; -- this is a given customer number.
    mysql   CLOB; 
    v_cur     SYS_REFCURSOR;
BEGIN
    FOR c
    IN (SELECT t.BR
                    ,t.AC_NO
                    ,t.C
                FROM Accounts  t
               WHERE t.Cust_No = custno
                 AND t.A_Stat = 'A'
                 AND t.R_Stat = 'O'
                 AND t.Type = 'U')
    LOOP
        IF mysql IS NULL
        THEN
            mysql :=
                'select * from (SELECT /*+ index_desc(ah IX_TR_HISTORY) */ 
                       Ah.TRN_NO
                      ,Ah.TRN_DATE
                      ...
                       FROM  transaction_history AH WHERE Ah.Event <> '
                || CHR (39)|| 'REVL'|| CHR (39)|| ' AND Ah.Ac_No ='
                || CHR (39)|| c.ac_no || CHR (39)|| ' and Ah.BR='
                || CHR (39)|| c.BR|| CHR (39)|| ' and  Ah.C='
                || CHR (39)|| c.c|| CHR (39)
                || ' order by TRN_DT desc)   where rownum<=15';
        ELSE
            mysql :=
                mysql || CHR (10) || ' union all ' || CHR (10)
                || 'select * from (SELECT /*+ index_desc(ah IX_TR_HISTORY) */ 
                       Ah.TRN_NO
                      ,Ah.TRN_DATE
                      ...
                       FROM  transaction_history AH WHERE Ah.Event <> '
                || CHR (39)|| 'REVL'|| CHR (39)|| ' AND Ah.Ac_No ='
                || CHR (39)|| c.ac_no|| CHR (39)|| ' and Ah.BR='
                || CHR (39)|| c.BR|| CHR (39)|| ' and  Ah.C='
                || CHR (39)|| c.c || CHR (39)
                || ' order by TRN_DT desc)   where rownum<=15';
        END IF;
    END LOOP;

    mysql :=
           'select * from (select * from ('
        || mysql
        || ') ORDER BY Trn_Dt DESC) where rownum<=15';
--insert into t values(MYSQL);
 OPEN v_cur FOR  mysql;
.....
close v_cur;
---select * from t;

END;

It took less than half a second to complete.

What have we done here? We iterated over each account number and produced simple SQL statements that support an exact INDEX RANGE SCAN DESCENDING operation, which lets us avoid the sort operation and the scanning of unnecessary rows/transactions. Then we combined all of these statements with the UNION ALL clause. As a result we got the following SQL statement.

SELECT   *
  FROM   (  SELECT   *
              FROM   (SELECT   *
                        FROM   (  SELECT   /*+ index_desc(ah IX_TR_HISTORY) */  
                                           Ah.TRN_NO
                                          ,Ah.TRN_DATE
                                          ...

                                    FROM   transaction_history  ah
                                   WHERE       ah.event <> 'REVL'
                                           AND ah.ac_no = 'QQQQQQQQQQQQQQQQQQQ' 
                                           AND ah.br = '110'
                                           AND ah.c = 'UUU'
                                ORDER BY   trn_date DESC)
                       WHERE   ROWNUM <= 15
                      UNION ALL
                      SELECT   *
                        FROM   (  SELECT   /*+ index_desc(ah IX_TR_HISTORY) */  
                                           Ah.TRN_NO
                                          ,Ah.TRN_DATE
                                          ...

                                    FROM   transaction_history  ah
                                   WHERE       ah.event <> 'REVL'
                                           AND ah.ac_no = 'AAAAAAAAAAAAAAAAAAA'
                                           AND ah.br = '110'
                                           AND ah.c = 'RRR'
                                ORDER BY   trn_date DESC)
                       WHERE   ROWNUM <= 15
                       UNION ALL
                      ...
                      SELECT   *
                        FROM   (  SELECT   /*+ index_desc(ah IX_TR_HISTORY) */  
                                           Ah.TRN_NO
                                          ,Ah.TRN_DATE
                                          ...

                                    FROM   transaction_history  ah
                                   WHERE       ah.event <> 'REVL'
                                           AND ah.ac_no = 'DDDDDDDDDDDDDDDDDDD' 
                                           AND ah.br = '154'
                                           AND ah.c = 'AAA'
                                ORDER BY   trn_date DESC)
                       WHERE   ROWNUM <= 15)
          ORDER BY   trn_date DESC)
 WHERE   ROWNUM <= 15;

And its execution plan

--------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                     | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                          |      1 |        |     15 |00:00:12.82 |    2355 |   2166 |       |       |          |
|*  1 |  COUNT STOPKEY                      |                          |      1 |        |     15 |00:00:12.82 |    2355 |   2166 |       |       |          |
|   2 |   VIEW                              |                          |      1 |    139 |     15 |00:00:12.82 |    2355 |   2166 |       |       |          |
|*  3 |    SORT ORDER BY STOPKEY            |                          |      1 |    139 |     15 |00:00:12.82 |    2355 |   2166 |  4096 |  4096 | 4096  (0)|
|   4 |     VIEW                            |                          |      1 |    139 |    977 |00:00:12.82 |    2355 |   2166 |       |       |          |
|   5 |      UNION-ALL                      |                          |      1 |        |    977 |00:00:12.82 |    2355 |   2166 |       |       |          |
|*  6 |       COUNT STOPKEY                 |                          |      1 |        |      2 |00:00:01.27 |     168 |    175 |       |       |          |
|   7 |        VIEW                         |                          |      1 |      1 |      2 |00:00:01.27 |     168 |    175 |       |       |          |
|*  8 |         TABLE ACCESS BY INDEX ROWID | TRANSACTION_HISTORY      |      1 |      1 |      2 |00:00:01.27 |     168 |    175 |       |       |          |
|*  9 |          INDEX RANGE SCAN DESCENDING| IX_TR_HISTORY            |      1 |      1 |    163 |00:00:00.08 |       6 |     13 |       |       |          |
|* 10 |       COUNT STOPKEY                 |                          |      1 |        |      1 |00:00:00.01 |       5 |      2 |       |       |          |
|  11 |        VIEW                         |                          |      1 |      1 |      1 |00:00:00.01 |       5 |      2 |       |       |          |
|* 12 |         TABLE ACCESS BY INDEX ROWID | TRANSACTION_HISTORY      |      1 |      1 |      1 |00:00:00.01 |       5 |      2 |       |       |          |
|* 13 |          INDEX RANGE SCAN DESCENDING| IX_TR_HISTORY            |      1 |      1 |      1 |00:00:00.01 |       4 |      2 |       |       |          |
|* 14 |       COUNT STOPKEY                 |                          |      1 |        |     15 |00:00:00.06 |      17 |     13 |       |       |          |
...
|*342 |       COUNT STOPKEY                 |                          |      1 |        |     15 |00:00:00.10 |      19 |     15 |       |       |          |
| 343 |        VIEW                         |                          |      1 |      1 |     15 |00:00:00.10 |      19 |     15 |       |       |          |
|*344 |         TABLE ACCESS BY INDEX ROWID | TRANSACTION_HISTORY      |      1 |      1 |     15 |00:00:00.10 |      19 |     15 |       |       |          |
|*345 |          INDEX RANGE SCAN DESCENDING| IX_TR_HISTORY            |      1 |      1 |     15 |00:00:00.01 |       4 |      0 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------------------------------------------
                                                                                                                                                       
Predicate Information (identified by operation id):                                                                                                    
---------------------------------------------------                                                                                                    
                                                                                                                                                       
   1 - filter(ROWNUM<=15)                                                                                                                              
   3 - filter(ROWNUM<=15)                                                                                                                              
   6 - filter(ROWNUM<=15)                                                                                                                              
   8 - filter(("AH"."C"='UUU' AND "AH"."EVENT"<>'REVL'))                                                                                          
   9 - access("AH"."AC_NO"='QQQQQQQQQQQQQQQQQQQ' AND "AH"."BR"='110')                                                                          
  10 - filter(ROWNUM<=15)                                                                                                                              
  12 - filter(("AH"."C"='RRR' AND "AH"."EVENT"<>'REVL'))                                                                                          
  13 - access("AH"."AC_NO"='AAAAAAAAAAAAAAAAAAA' AND "AH"."BR"='110')                                                                          
  14 - filter(ROWNUM<=15)                                                                                                                              
  ....
 341 - access("AH"."AC_NO"='EEEEEEEEEEEEEEEEEEE' AND "AH"."BR"='167')                                                                          
 342 - filter(ROWNUM<=15)                                                                                                                              
 344 - filter(("AH"."C"='AAA' AND "AH"."EVENT"<>'REVL'))                                                                                          
 345 - access("AH"."AC_NO"='DDDDDDDDDDDDDDDDDDD' AND "AH"."BR"='154')                                                                          

So, in this case we scanned only 977 rows instead of 121K in order to get the last fifteen transactions. Also, the database did 2166 physical read requests instead of 76419. And the execution time is now 13 seconds instead of 7.14 minutes. As a result the query is 7.14*60/13 ≈ 33 times faster. Note that in both test cases I flushed the database buffer cache and the Linux file system cache in order to compare the performance results fairly.

I want to note that the INDEX_DESC hint should be set explicitly in the SQL, because there can be cases where the cost of the INDEX RANGE SCAN operation is equal to (or less than) that of the INDEX RANGE SCAN DESCENDING operation, and Oracle prefers the former. In that case you would see an additional SORT ORDER BY STOPKEY operation in the execution plan.
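As a side note, on 12c the same per-account top-N could in principle be expressed without generating dynamic SQL, using the lateral join (CROSS APPLY) syntax together with the row-limiting clause. The following is only a hypothetical sketch under that assumption; table, column and index names are reused from the examples above, and the resulting plan would still have to be verified:

```sql
-- Hypothetical 12c sketch: per-account top-15 via CROSS APPLY, then a
-- global top-15 over the combined result. Whether each lateral branch is
-- driven by INDEX RANGE SCAN DESCENDING must be checked in the actual
-- plan; the INDEX_DESC hint may still be required.
SELECT *
  FROM (SELECT tr.*
          FROM accounts t
               CROSS APPLY (SELECT /*+ index_desc(ah IX_TR_HISTORY) */
                                   ah.*
                              FROM transaction_history ah
                             WHERE ah.ac_no = t.ac_no
                               AND ah.br    = t.br
                               AND ah.c     = t.c
                               AND ah.event <> 'REVL'
                             ORDER BY ah.trn_date DESC
                             FETCH FIRST 15 ROWS ONLY) tr
         WHERE t.cust_no = :custno
           AND t.a_stat  = 'A'
           AND t.r_stat  = 'O'
           AND t.type    = 'U'
         ORDER BY tr.trn_date DESC)
 WHERE ROWNUM <= 15;
```

The UNION ALL generation shown above remains the portable option for older releases; the sketch simply pushes the same "15 rows per account, then 15 rows overall" logic into a single statement.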

Summary

In a SQL tuning task we generally try to find an execution plan/path that lets the database process the minimum number of rows. That is the goal of SQL tuning, and the query optimizer always tries to find such a plan. All the optimization techniques the optimizer uses (query transformation, join order, access paths) are designed to minimize row processing and to pass as few rows as possible, as early as possible, to the next step of the execution plan. But sometimes we can approach the task in a quite different way: rewriting the SQL, executing different parts of the given SQL separately, and joining all the small parts together in order to produce the final result with minimum row processing.

April 30, 2018

A Real life performance issue due to Adaptive Dynamic Sampling

Filed under: Adaptive Sampling,SPD — Chinar Aliyev @ 7:15 am

A couple of weeks ago, I was asked to improve the performance of a particular SQL query which took, on average, 4 to 6 seconds to complete. Below is the corresponding SQL Monitoring report, obtained by adding the /*+ monitor */ hint, taken from a 12.1.0.2 Oracle instance running under Red Hat EL7:

Global Information
------------------------------
 Status              :  DONE (ALL ROWS)      
 Instance ID         :  1                    
 Session             :  ...(4199:16935) 
 SQL ID              :  09a03j4w71zd2        
 SQL Execution ID    :  16777216             
 ...
 Duration            :  .43294s              
 Module/Action       : ...   
 Service             : ...               
 Program             : ...     
 Fetch Calls         :  1                    

Global Stats
=================================================
| Elapsed |   Cpu   |  Other   | Fetch | Buffer |
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  |
=================================================
|    0.43 |    0.43 |     0.01 |     1 |  16775 |
=================================================

Interestingly, while the execution time of this query is really 4 seconds, the above report shows only 0.43 second for the total duration of this query. So the first fundamental question raised by this issue is: where are those 4 seconds coming from?

The SQL statement is constructed dynamically by the application, using values passed by users through a web application.

To understand exactly what is happening behind the scenes, I enabled SQL Trace and ran TKPROF on the resulting trace file, shown below:

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.53       0.54          0         13          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.42       0.43          0      16775          0          65
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.96       0.97          0      16788          0          65

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 2601  
Number of plan statistics captured: 1

In contrast to the 0.43 second of elapsed time reported by SQL Monitor, TKPROF shows a total elapsed time of 0.96 second. In fact Oracle spent practically no time in the execution phase of this query and consumed only 0.42 second during the fetch phase; most of the query time was consumed by the parse step. If we go back to the global statistics provided by the SQL Monitor report, we find that the reported duration is 0.43 second. Can we then infer that the duration provided by SQL Monitor will always correspond to the query execution time?

In fact it is probably correct to say that the duration in SQL Monitor represents the sum of the execute and fetch phases' elapsed times, and as such it does not include parse time. (But since the query took 4 seconds, why does TKPROF show only 0.54 second for the parse phase?)

In addition, as you can see, the SQL Monitor report shows 16775 buffer gets, while TKPROF reports the same number of buffer gets, but in the fetch phase. Finally, it becomes clear that the initial performance issue has turned into a parsing problem which we have to address. So the second question is: what made the parsing time so long?

As shown below in the TKPROF output, there are a lot of SQL queries which contain the SAMPLE clause and the “DS_SVC” comment:

SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring 
  optimizer_features_enable(default) no_parallel result_cache(snapshot=3600) 
  */ SUM(C1) 
FROM
 (SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "K")  */ 1 AS C1 FROM 
  T2 SAMPLE BLOCK(0.101744, 8) SEED(1)  "K" WHERE 
  ("K"."BR"=1)) innerQuery
 

SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring 
  optimizer_features_enable(default) no_parallel result_cache(snapshot=3600) 
  OPT_ESTIMATE(@"innerQuery", TABLE, "K", ROWS=17185285.24) */ C1, C2, C3 
FROM
 (SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "K")  */ 4294967295 AS C1, 
  COUNT(*) AS C2, SUM(CASE WHEN ("K"."BRANCH"=1) THEN 1 ELSE 0 END) AS C3  
  FROM T2 SAMPLE BLOCK(0.101744, 8) SEED(1)  "K" WHERE 
  ("K"."BR"=1)) innerQuery

The above trace file queries indicate that Oracle executed adaptive dynamic sampling queries in order to create a more efficient execution plan, as part of adaptive query optimization. In fact, most of the query time was consumed by the execution of these adaptive dynamic sampling queries. In the next section of this article we will see what these ADS queries are and why they were used.

The SQL text of the problematic query is rather long, therefore I am not going to post it here. However, it is worth mentioning that it contains several scalar subqueries, shown below:

SELECT  t1.c1,
.........................
             || t1.c1
             || NVL (
                   (SELECT K.TERML
                      FROM t2 K   
                     WHERE     K.BR = 1
                           AND K.DC =
                                  (SELECT T.cn
                                     FROM t3 T
                                    WHERE T.BR = 1 AND T.DC = TE.DC)),
                   ', ')
       END
          A1,
....................................
      NVL (
          (SELECT t4.VALUE
             FROM t4  
            WHERE     t4.dc = TE.dc
                  AND t4.dbt_code = 728
                  AND (   t4.crdacc IN
                             ('' || 'SSSSSSSSSSSSSSSSS' || '')
                       OR t4.dbtacc IN
                             ('' || 'SSSSSSSSSSSSSSSSS' || ''))
                  AND ROWNUM = 1),
          0)
          FEE, 
          ..............................
      NVL (
          (SELECT t5.VALUE
             FROM t5  
            WHERE     t5.dc = TE.dc
                  AND t5.dbt_code = 533
                  AND (   t5.crdacc IN
                             ('' || 'SSSSSSSSSSSSSSSSS' || '')
                       OR t5.dbtacc IN
                             ('' || 'SSSSSSSSSSSSSSSSS' || ''))
                  AND ROWNUM = 1),
          0)
          CASHBACK  
  FROM                                                            
      T1 TE 
       LEFT JOIN t6 
          ON .....................
       LEFT JOIN .....................
       (SELECT SUB Query with joins and SET Operators
       ) TR
         ........................
 WHERE     (   te.dbtacc IN ('' || 'SSSSSSSSSSSSSSSSS' || '')
            OR te.crdacc IN ('' || 'SSSSSSSSSSSSSSSSS' || ''))
       AND te.vldat BETWEEN TO_DATE (...)
                                    AND TO_DATE (...)
       AND TE.VALUE  0
       AND TE.dbtcode IN (...);   
  

And the execution plan of the query is:

                                                                                             
Plan hash value: 2881986075                                                                                                
                                                                                                                           
---------------------------------------------------------------------------------------------------------------------      
| Id  | Operation                                  | Name                   | Rows  | Bytes | Cost (%CPU)| Time     |      
---------------------------------------------------------------------------------------------------------------------      
|   0 | SELECT STATEMENT                           |                        | 54979 |   156M|  2710K  (1)| 00:01:46 |      
|   1 |  TABLE ACCESS BY INDEX ROWID               | T2                     |     1 |    42 |     3   (0)| 00:00:01 |      
|*  2 |   INDEX UNIQUE SCAN                        | UK_T2                  |     1 |       |     2   (0)| 00:00:01 |      
|   3 |    TABLE ACCESS BY INDEX ROWID             | T3                     |     1 |    17 |     2   (0)| 00:00:01 |      
|*  4 |     INDEX UNIQUE SCAN                      | PK_T3                  |     1 |       |     1   (0)| 00:00:01 |      
|   5 |  TABLE ACCESS BY INDEX ROWID BATCHED       | T4                     |     1 |    24 |     2   (0)| 00:00:01 |      
|*  6 |   INDEX SKIP SCAN                          | UK_T4                  |     1 |       |     1   (0)| 00:00:01 |      
...
|  42 |  COUNT STOPKEY                             |                        |       |       |            |          |      
|* 43 |   TABLE ACCESS BY INDEX ROWID BATCHED      | T5                     |     1 |    51 |     5   (0)| 00:00:01 |      
|* 44 |    INDEX SKIP SCAN                         | T5_ED                  |     1 |       |     4   (0)| 00:00:01 |      
...
|* 51 |  FILTER                                    |                        |       |       |            |          |      
|* 52 |   HASH JOIN RIGHT OUTER                    |                        | 54979 |   156M|  2080   (3)| 00:00:01 |      
|* 53 |    TABLE ACCESS FULL                       | T6                     |   277 |  7756 |     6  (34)| 00:00:01 |      
|  54 |    NESTED LOOPS OUTER                      |                        | 54979 |   155M|  2074   (3)| 00:00:01 |      
|* 55 |     HASH JOIN OUTER                        |                        |    64 |   183K|  1882   (3)| 00:00:01 |      
|  56 |      NESTED LOOPS OUTER                    |                        |    64 |   180K|  1655   (1)| 00:00:01 |      
|* 57 |       HASH JOIN OUTER                      |                        |    64 |  8896 |    47   (9)| 00:00:01 |      
|* 58 |        TABLE ACCESS BY INDEX ROWID BATCHED | T1                     |    73 |  5183 |    43   (5)| 00:00:01 |      
|  59 |         BITMAP CONVERSION TO ROWIDS        |                        |       |       |            |          |      
|  60 |          BITMAP OR                         |                        |       |       |            |          |      
|  61 |           BITMAP CONVERSION FROM ROWIDS    |                        |       |       |            |          |      
|  62 |            SORT ORDER BY                   |                        |       |       |            |          |      
|* 63 |             INDEX RANGE SCAN               | IDX_T1_DBT             |       |       |     4   (0)| 00:00:01 |      
|  64 |           BITMAP CONVERSION FROM ROWIDS    |                        |       |       |            |          |      
|  65 |            SORT ORDER BY                   |                        |       |       |            |          |      
|* 66 |             INDEX RANGE SCAN               | IDX_T1_CRD             |       |       |     4   (0)| 00:00:01 |      
...                                                                                                                           
Predicate Information (identified by operation id):                                                                        
---------------------------------------------------                                                                        
...                                                                                                                           
Note                                                                                                                       
-----                                                                                                                      
   - dynamic statistics used: dynamic sampling (level=2)                                                                   
   - this is an adaptive plan                                                                                              
   - 7 Sql Plan Directives used for this statement   

Thanks to the above Note we know now that Oracle has used 7 SQL Plan Directives.

Now let’s ask an interesting question: since the statistics are not stale, why did the database use ADS (Adaptive Dynamic Sampling) in this case?

The reason is very probably not related to the original query itself. ADS was triggered for a portion of this query for which Oracle found, during the parse phase, existing SQL Plan Directives (SPDs). These SPDs were created for portions of other SQL queries which match portions of our problematic SQL statement.
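To see which directives have been recorded against the tables involved, one can query the data dictionary. This is a sketch using the 12c dictionary views; the OWNER filter is an assumption that the objects live in our own schema:

```sql
-- List SQL Plan Directives and the objects they apply to
-- (DBA_SQL_PLAN_DIRECTIVES / DBA_SQL_PLAN_DIR_OBJECTS, 12c).
SELECT d.directive_id,
       o.owner,
       o.object_name,
       d.type,
       d.state,
       d.reason
  FROM dba_sql_plan_directives d
       JOIN dba_sql_plan_dir_objects o
          ON o.directive_id = d.directive_id
 WHERE o.owner = USER   -- assumption: objects are in our own schema
 ORDER BY o.object_name;
```

The REASON column (for example "SINGLE TABLE CARDINALITY MISESTIMATE") indicates why each directive was created.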

Furthermore, we know that from 12c onward, for every execution of a SQL statement, the query engine monitors and compares the cardinality estimate with the actual rows of each node of the execution plan (starting from the child nodes of each parent). If there are significant differences between the actual and estimated cardinalities, Oracle creates a SQL Plan Directive for each such node. Thanks to these SPDs the optimizer knows that some nodes of the execution plan had bad cardinality estimates which need to be improved. It is also quite possible that Oracle will create a lot of SPDs that directly influence the execution plan, and therefore the execution time, of the next run of one or many resembling queries, including newly created ones. This is why having fresh object statistics does not necessarily mean that Oracle will always come up with good cardinality estimates; this is where SPDs enter into action and start to be used. Before explaining the nature of the ADS queries, let us look at the following lines from SQL Trace/TKPROF:

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        2      0.53       0.54          0         13          0           0
Execute      2      0.00       0.00          0          0          0           0
Fetch        1      0.42       0.43          0      16775          0          65
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        5      0.96       0.97          0      16788          0          65

Misses in library cache during parse: 1


OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse       59      0.06       0.06          0          0          0           0
Execute    500      0.00       0.00          0          0          0           0
Fetch      500      3.13       3.13          0      91230          0         498
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     1059      3.20       3.20          0      91230          0         498

Misses in library cache during parse: 10

   31  user  SQL statements in session.
    3  internal SQL statements in session.
   34  SQL statements in session.

As you can see, the original SQL statement took 0.96 second and consumed 16788 buffer gets. But 3.2 seconds (more than three times as much) were spent on recursive SQL statements, i.e. the ADS queries, and 91230 buffer gets (more than 5 times as many) were processed for them.

It looks like SQL Monitor does not include parse time information; it covers the execution and fetch phases. I think this is expected behavior, because SQL Monitor is designed to gather and interpret SQL execution statistics for long-running queries, and in general parse time is very small compared with execution time. Also, the SQL Execution ID is constructed and recorded only when the SQL starts executing, and it is a key for SQL Monitor.
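For reference, the monitoring report shown at the top of this post can be regenerated with the documented DBMS_SQLTUNE API. A minimal sketch, using the SQL ID from the Global Information section above:

```sql
-- Fetch the SQL Monitor report in text form for our statement.
-- Other formats ('HTML', 'ACTIVE') are also supported by this function.
SELECT DBMS_SQLTUNE.report_sql_monitor (
          sql_id => '09a03j4w71zd2',
          type   => 'TEXT')
          AS report
  FROM dual;
```

Since the report is built from the execution-time statistics that SQL Monitor captures, it cannot show the parse-phase time we are chasing here; SQL Trace/TKPROF remains the right tool for that.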

Now, let’s look at the ADS statements that I extracted from SQL Trace/TKPROF to understand why and for what purpose they were used.

SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring 
  optimizer_features_enable(default) no_parallel result_cache(snapshot=3600) 
  */ SUM(C1) 
FROM
 (SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "K")  */ 1 AS C1 FROM 
  T2 SAMPLE BLOCK(0.101744, 8) SEED(1)  "K" WHERE 
  ("K"."BR"=1)) innerQuery
 

SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring 
  optimizer_features_enable(default) no_parallel result_cache(snapshot=3600) 
  OPT_ESTIMATE(@"innerQuery", TABLE, "K", ROWS=17185285.24) */ C1, C2, C3 
FROM
 (SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "K")  */ 4294967295 AS C1, 
  COUNT(*) AS C2, SUM(CASE WHEN ("K"."BRANCH"=1) THEN 1 ELSE 0 END) AS C3  
  FROM T2 SAMPLE BLOCK(0.101744, 8) SEED(1)  "K" WHERE 
  ("K"."BR"=1)) innerQuery

Both of these queries were used to estimate the cardinality and the cost of the full table scan operation. As you can see, adaptive sampling was applied to scalar subqueries as well. Why? A scalar subquery returns only one row, so why do we see ADS for it here? The answer is that optimization is also performed at the query block level: each query block is evaluated separately, and efficient access paths need to be generated for each of them. (A heuristic approach could also be applied for specific subqueries, so is ADS really required here or not?) But we also see the following statements:

SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring 
  optimizer_features_enable(default) no_parallel result_cache(snapshot=3600) 
  OPT_ESTIMATE(@"innerQuery", TABLE, "K", ROWS=17185285.24) 
  OPT_ESTIMATE(@"innerQuery", INDEX_SKIP_SCAN, "K", "IDX_T1$001", 
  ROWS=17185285.24) OPT_ESTIMATE(@"innerQuery", INDEX_FILTER, "K", 
  "IDX_T2$001", ROWS=17185285.24) */ C1, C2, C3 
FROM
 (SELECT /*+ qb_name("innerQuery") INDEX( "K" "T1")  */ 
  COUNT(*) AS C1, 4294967295 AS C2, COUNT(*) AS C3  FROM T2 
  "K" WHERE ("K"."BRANCH"=1)) innerQuery
  
  
  SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring 
  optimizer_features_enable(default) no_parallel result_cache(snapshot=3600) 
  OPT_ESTIMATE(@"innerQuery", TABLE, "K", ROWS=17185285.24) 
  OPT_ESTIMATE(@"innerQuery", INDEX_SKIP_SCAN, "K", "IDX_T2$001", 
  ROWS=17185285.24) OPT_ESTIMATE(@"innerQuery", INDEX_FILTER, "K", 
  "IDX_T2$001", ROWS=17185285.24) */ C1, C2, C3 
FROM
 (SELECT /*+ qb_name("innerQuery") INDEX( "K" "IDX_T2_$002")  */ COUNT(*) 
  AS C1, 4294967295 AS C2, COUNT(*) AS C3  FROM T2 "K" WHERE 
  ("K"."BRANCH"=1)) innerQuery

DSQ 1.2

What do they mean? Adaptive Optimization performed some analysis: if there are indexes, it tries to compute cardinality and cost for INDEX operations (INDEX RANGE SCAN, INDEX SKIP SCAN, and INDEX FULL SCAN). If there are several indexes, we can see several AS queries for the corresponding indexes. It should be noted that Oracle does not use SAMPLING when indexes are present (that is, when an index access path is available). Oracle uses the same types of DSQ for the T4 and T5 tables:

SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring 
  optimizer_features_enable(default) no_parallel result_cache(snapshot=3600) 
  */ C1, C2, C3 
FROM
 (SELECT /*+ qb_name("innerQuery") INDEX( T6 "IDX_T5_DBT")  */ 
  COUNT(*) AS C1, 4294967295 AS C2, COUNT(*) AS C3  FROM T5 
  "T5" WHERE ("T5"."DBT"='SSSSSSSS')) innerQuery;

SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring 
  optimizer_features_enable(default) no_parallel result_cache(snapshot=3600) 
  OPT_ESTIMATE(@"innerQuery", INDEX_SCAN, "T5", "IDX_T5", 
  ROWS=653) */ C1, C2, C3 
FROM
 (SELECT /*+ qb_name("innerQuery") INDEX( "T5" "IDX_T5_DBT")  */ 
  COUNT(*) AS C1, 4294967295 AS C2, COUNT(*) AS C3  FROM T5 
  "T5"  WHERE ("T5"."DBT"='SSSSSSS')) innerQuery; 

And finally, for the T1 table:

SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring 
  optimizer_features_enable(default) no_parallel result_cache(snapshot=3600) 
  */ SUM(C1) 
FROM
 (SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "TE")  */ 1 AS C1 FROM 
  T1 SAMPLE BLOCK(0.0107362, 8) SEED(1)  "TE" WHERE 
  ("TE"."DBTACC"='...' OR "TE"."CRDACC"=
  '...') AND ("TE"."VALUE"<>0) AND ("TE"."VALUEDATE">=...

Oracle increased the sample size from 0.0107362 to 0.687115 during the optimization phase due to insufficient samples. The sample size was doubled on each attempt: 0.0107362, 0.0214723, 0.0429447, …, 0.687115. The AS statements were executed several times, and as a result the response time increased dramatically.
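The growth of the sample percentage can be illustrated with a simple query (an illustration only, not an Oracle-internal mechanism; the starting percentage 0.0107362 is taken from the trace above):

```sql
-- Illustration: the sample percentage roughly doubles on each retry.
-- Six doublings take the initial 0.0107362 close to the observed
-- final value 0.687115 (0.0107362 * 2^6 = 0.6871168).
SELECT 0.0107362 * POWER(2, LEVEL - 1) AS sample_pct
FROM   dual
CONNECT BY LEVEL <= 7;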

SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring 
  optimizer_features_enable(default) no_parallel result_cache(snapshot=3600) 
  */ SUM(C1) 
FROM
 (SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "TE")  */ 1 AS C1 FROM 
  T1 SAMPLE BLOCK(0.687115, 8) SEED(1)  "TE" WHERE 
  ("TE"."DBTACC"='...' OR "TE"."CRDACC"=
  '...') AND ("TE"."VALUE"<>0) AND ("TE"."VALUEDATE">=...

To solve the problem we can make the following changes:

--Disable using SPD
alter session set "_optimizer_dsdir_usage_control"=0; 
-- Disable creation of SPD
alter session set "_sql_plan_directive_mgmt_control"=0;

There is another question. Oracle Database 12c R1 uses AS when there is significant cardinality misestimation, even if you do not have stale object statistics. It means that, in such cases, the optimizer thinks adaptive statistics are more efficient than object statistics. But how efficient is it to obtain statistics through AS queries? There are some obvious issues that I described in three articles four years ago:

https://community.toadworld.com/platforms/oracle/w/wiki/11022.automaticadaptive-dynamic-sampling-in-oracle-12c-part-1

https://community.toadworld.com/platforms/oracle/w/wiki/11036.automaticadaptive-dynamic-sampling-in-oracle-12c-part-2

https://community.toadworld.com/platforms/oracle/w/wiki/11052.automaticadaptive-dynamic-sampling-in-oracle-12c-part-3

Note: If we set optimizer_dynamic_sampling=11, the database will use AS for all possible SQL plan operations (this is the auto mode), but by default the optimizer executes AS queries only for specific SQL constructs (or plan operations) of a statement when an SPD is available.
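For reference, the settings discussed above can be changed at the session level as follows (a sketch; whether to enable them is a per-system decision, and optimizer_adaptive_statistics exists as of 12.2):

```sql
-- Auto mode: AS is considered for all possible plan operations
ALTER SESSION SET optimizer_dynamic_sampling = 11;

-- In 12.2 and later, adaptive statistics (and hence the AS queries
-- driven by SPDs) are disabled by default; this re-enables them
ALTER SESSION SET optimizer_adaptive_statistics = TRUE;
```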

Summary.

As a part of adaptive optimization, Oracle Database 12c R1 relies fully on adaptive sampling to validate precomputed object statistics. The technique is used when the database considers or detects a possibility of cardinality misestimation for a particular plan operation. To detect cardinality misestimation, SQL Plan Directives (SPD) were introduced. They are designed for plan operations or SQL constructs, not for particular SQL statements. When the database parses a statement, it searches the SPD repository for plan directives matching the SQL constructs of the statement. If a matching SPD is found, the database needs to perform adaptive sampling in order to estimate cardinality and cost for the access paths. As part of AS, the database also uses indexes to cost INDEX operations, and in this case sampling is not used (due to the availability of an efficient access path, the index scan). This means the database workload triggers AS for different SQL statements and can cause performance issues, which is why (I think) adaptive statistics are disabled by default in Oracle Database 12c R2. The database still continues to create SPDs but does not use AS queries unless optimizer_adaptive_statistics=true; its default value is false. Finally, we saw that SQL Monitor does not contain information about the parse phase of SQL statements.
