Chinar Aliyev`s blog

April 8, 2019

The Performance of Parallel Hash Join and Automatic Skew Handling

Filed under: CBO,Join,Parallel Execution,Skew,Tuning — Chinar Aliyev @ 6:32 am

As it is known that HYBRID HASH distribution method (DM) is designed to prevent broadcasting “huge” number of rows from the left row source of Parallel Join (PJ). Although Oracle favors to redistribute only few numbers of rows via BROADCAST DM but the HYBRID HASH SKEW DM distributes “popular” values via BROADCAST in the case of skew detection in data of probe table. So, if the popular values consume significant portion of build table and high degree of parallelism is requested then depending on your data(volume) and SQL statement the Parallel Hash Join (PHJ) performance can be worse. Before introducing the test case I want to share some new changes have been done in the recent Oracle Database versions that related to Parallel Join Skew Handling (PJSH).
First of all, If there is a histogram for join column (probe table), based on statistics and hidden parameters (_px_join_skew_minfreq, _px_join_skew_ratio see Randolf`s blog post, the article is based on 12cR1) the database identifies that there is not “enough” skew to enable skew handling, then you cannot enable PJSH even via PQ_SKEW.
But there can be cases which statistics might not describe the data properly and you want to enable PJSH. For this purpose starting from Oracle Database 12cR2 there is a new parameter _px_join_skewed_values_count . The parameter allows us to set top-n frequent values (not popular values, in contrast to decision based on histogram) for optimizer to consider them as skewed values. The default value is 0. If we set the value different than 0, then Oracle will generate top-n frequent values using dynamic sampling queries (DSQ) and will distribute the values via BROADCAST DM for build table and will read the values ROUND-ROBIN fashion from PROBE table. In addition, the mentioned enhancement has been improved in Oracle 18c, so that we are allowed even to set sampling percent (and also time restriction in order to reduce parse time when required) for DSQ (_px_join_skew_sampling_percent, _px_join_skew_sampling_time_limit).

As Randolf has explained in his blog post after skew detection the database does execute dynamic sampling query in order to find exact values. But Oracle Database 18c afterward it is not always needed to generate DSQ to find the exact skewed values. Oracle finds them from histogram when it is possible (mostly possible, in the case of the full values are stored in the histograms) it is controlled by the _px_join_skew_use_histogram parameter and default value is TRUE.
Now let`s see the below test case:

-- Test system:
-- OS: OEL7 64bit,RAM=187GB, CPU=48
-- DB: Oracle 18.0.0, SGA=20GB, PGA=10G

create table t1 
as
select
        case when level=1 then 1  
             when level>1 and  level<=10 then 2
             when level>10 and level<=300000 then 3
             when level>300000 and level<=650000 then 4
             when level>650000 and level<=1e6 then 5
             else level end c1,
             trunc(sysdate)+mod(level, 2) dat
      , case when mod(level, 2)=0 then 'X' else 'Y' end typ,
      lpad('x',1e4,'*') f1
      ,lpad('x',1e4,'x') f2
from
        dual
connect by
        level <= 1.3e6
;

create table t2 
as
select
        case when level<=100000 then 2
             when level>100000 and level<=200000 then 3
             when level>200000 and level<=300000 then 4
             when level>300000 and level<=400000 then 5
             when level>400000 and level<=500000 then 1
             else level end c1
            ,case when level<=100000 then 'B'
             when level>100000 and level<=200000 then 'C'
             when level>200000 and level<=300000 then 'D'
             when level>300000 and level<=400000 then 'E'
             when level>400000 and level<=500000 then 'A'
             else '0' end chrtp
             ,trunc(sysdate)+mod(level, 2) dat
            ,case when mod(level, 2)=0 then 'X' else 'Y' end typ
            ,lpad('x',1e4,'*') f1
            ,lpad('x',1e4,'x') f2
from
        dual
connect by
        level <= 2e6
;


-- In fact statistics for t1 table may have not been gathered but histogram is necessary for t2 table
execute dbms_stats.gather_table_stats(user,'t1',method_opt=>'for all columns size 1',degree=>6);
execute dbms_stats.gather_table_stats(user,'t2',method_opt=>'for all columns size 1 for columns c1 size skewonly',degree=>6);

select table_name tab ,column_name col, histogram hist from user_tab_col_statistics
where  table_name in ('T2','T1')
and column_name='C1'
order by 1;
TAB        COL        HIST
---------- ---------- ----------
T1         C1         NONE
T2         C1         HYBRID	

The size of segments:

select segment_name,bytes/1024/1024/1024 gb from user_segments where segment_name in ('T1','T2');

SEGMENT_NA         GB
---------- ----------
T1         19.8789063
T2            30.5625

Now our query and explain plan :

EXPLAIN PLAN    FOR
        SELECT   COUNT (ff1), COUNT (f2)
          FROM   (SELECT /*+ parallel(10) */
                        t1 .f1 ff1, t2.*
                    FROM   t1, t2
                   WHERE   t1.c1 = t2.c1
                           AND chrtp NOT IN ('A', 'E', 'D', 'C'));

SELECT   * FROM table (DBMS_XPLAN.display);
Plan hash value: 1705792393
 
----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name     | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |          |     1 |  8013 |       |   252K  (1)| 00:00:10 |        |      |            |
|   1 |  SORT AGGREGATE                  |          |     1 |  8013 |       |            |          |        |      |            |
|   2 |   PX COORDINATOR                 |          |       |       |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM)           | :TQ10002 |     1 |  8013 |       |            |          |  Q1,02 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE               |          |     1 |  8013 |       |            |          |  Q1,02 | PCWP |            |
|*  5 |      HASH JOIN                   |          |  1545K|    11G|   498M|   252K  (1)| 00:00:10 |  Q1,02 | PCWP |            |
|   6 |       PX RECEIVE                 |          |  1300K|  4965M|       | 78333   (1)| 00:00:04 |  Q1,02 | PCWP |            |
|   7 |        PX SEND HYBRID HASH       | :TQ10000 |  1300K|  4965M|       | 78333   (1)| 00:00:04 |  Q1,00 | P->P | HYBRID HASH|
|   8 |         STATISTICS COLLECTOR     |          |       |       |       |            |          |  Q1,00 | PCWC |            |
|   9 |          PX BLOCK ITERATOR       |          |  1300K|  4965M|       | 78333   (1)| 00:00:04 |  Q1,00 | PCWC |            |
|  10 |           TABLE ACCESS FULL      | T1       |  1300K|  4965M|       | 78333   (1)| 00:00:04 |  Q1,00 | PCWP |            |
|  11 |       PX RECEIVE                 |          |  1508K|  5767M|       |   120K  (1)| 00:00:05 |  Q1,02 | PCWP |            |
|  12 |        PX SEND HYBRID HASH (SKEW)| :TQ10001 |  1508K|  5767M|       |   120K  (1)| 00:00:05 |  Q1,01 | P->P | HYBRID HASH|
|  13 |         PX BLOCK ITERATOR        |          |  1508K|  5767M|       |   120K  (1)| 00:00:05 |  Q1,01 | PCWC |            |
|* 14 |          TABLE ACCESS FULL       | T2       |  1508K|  5767M|       |   120K  (1)| 00:00:05 |  Q1,01 | PCWP |            |
----------------------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   5 - access("T1"."C1"="T2"."C1")
  14 - filter("CHRTP"<>'A' AND "CHRTP"<>'E' AND "CHRTP"<>'D' AND "CHRTP"<>'C')
 
Note
-----
   - dynamic statistics used: dynamic sampling (level=6)
   - Degree of Parallelism is 10 because of hint

As it is seen by default the skew has been detected and PJSH enabled. If we look at the optimizer trace file we will see following lines:

kkoBloomFilter: join ndv:0 reduction:0.202548 (limit:0.050000)  rejected because distinct value ratio
skewRatio:10, skewMinFreq:30, minNDV:254, skewThreshold:0.039370
ind:0, csel:0.050916, skew count:1
ind:1, csel:0.048718, skew count:2
ind:2, csel:0.048718, skew count:3
ind:3, csel:0.047802, skew count:4
ind:4, csel:0.046337, skew count:5
ind:5, csel:0.000001, skew count:5
ind:6, csel:0.000001, skew count:5
ind:7, csel:0.000001, skew count:5
ind:8, csel:0.000001, skew count:5
ind:9, csel:0.000001, skew count:5
.................................
ind:253, csel:0.000001, skew count:5
Skewed value count:5 scaling:0 degree:10
kkopqSaveJoinSkewDecision:: Saving join information : qbcname:SEL$F5BB74E1, table name:T2, srcQbcName:SEL$2
kkopqSkewInfo: Set of 5 skewed values retrieved from: histogram
AP: Computing costs for inflection point at min value 0.00

So, in our case the skewed values are extracted from the histogram(hybrid), did not have to be executed DSQ to find them. As it is seen from the trace file, Oracle has looped over the histogram endpoint values and has tried to identify skew values. If selectivity of an endpoint values greater than skewThreshold then the value considered as skewed (or contains _px_join_skew_minfreq percent of total rows). It supports an idea that optimizer can treat non-popular value as skewed in case of small skewThreshold and we can control it via hidden parameters as above mentioned. That is why optimizer calculates selectivity for each endpoint value (ind:0- ind:253).
Let`s see the execution statistics:

SQL Monitoring Report

SQL Text
------------------------------
select count(ff1),count(f2) from ( select /*+ parallel(10) */ t1.f1 ff1,t2.* from t1,t2 where t1.c1=t2.c1 and chrtp not in ('A','E','D','C') )

Global Information
------------------------------
 Status              :  DONE (ALL ROWS)                             
 Instance ID         :  1                                           
 Session             :  .                            
 SQL ID              :  8q3ms7t6cptcz                               
 SQL Execution ID    :  16777217                                    
 Execution Started   :  03/31/2019 17:19:31                         
 First Refresh Time  :  03/31/2019 17:19:31                         
 Last Refresh Time   :  03/31/2019 17:20:53                         
 Duration            :  82s                                         
 Module/Action       :  . 
 Service             :  SYS$USERS                                   
 Program             :  .   
 Fetch Calls         :  1                                           

Global Stats
==============================================================================================
| Elapsed |   Cpu   |    IO    | Concurrency | Fetch | Buffer | Read | Read  | Write | Write |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Calls |  Gets  | Reqs | Bytes | Reqs  | Bytes |
==============================================================================================
|     852 |     320 |      490 |          42 |     1 |     8M | 220K |  90GB |  168K |  40GB |
==============================================================================================

Parallel Execution Details (DOP=10 , Servers Allocated=20)
============================================================================================================================================================
|      Name      | Type  | Server# | Elapsed |   Cpu   |    IO    | Concurrency | Buffer | Read  | Read  | Write | Write |           Wait Events           |
|                |       |         | Time(s) | Time(s) | Waits(s) |  Waits(s)   |  Gets  | Reqs  | Bytes | Reqs  | Bytes |           (sample #)            |
============================================================================================================================================================
| PX Coordinator | QC    |         |    0.09 |    0.09 |     0.00 |             |     14 |     4 | 32768 |       |     . |                                 |
| p000           | Set 1 |       1 |      66 |      23 |       39 |        4.16 |   2664 | 16828 |   4GB | 16816 |   4GB | buffer busy waits (4)           |
|                |       |         |         |         |          |             |        |       |       |       |       | latch: cache buffers chains (2) |
|                |       |         |         |         |          |             |        |       |       |       |       | PX Deq: Table Q Normal (1)      |
|                |       |         |         |         |          |             |        |       |       |       |       | direct path read temp (8)       |
|                |       |         |         |         |          |             |        |       |       |       |       | direct path write temp (28)     |
|                |       |         |         |         |          |             |        |       |       |       |       | local write wait (1)            |
| p001           | Set 1 |       2 |      67 |      23 |       39 |        5.08 |   2652 | 16818 |   4GB | 16812 |   4GB | buffer busy waits (3)           |
|                |       |         |         |         |          |             |        |       |       |       |       | latch: cache buffers chains (2) |
|                |       |         |         |         |          |             |        |       |       |       |       | direct path read temp (11)      |
|                |       |         |         |         |          |             |        |       |       |       |       | direct path write temp (24)     |
|                |       |         |         |         |          |             |        |       |       |       |       | local write wait (3)            |
| p002           | Set 1 |       3 |      66 |      23 |       39 |        4.35 |   2654 | 16835 |   4GB | 16818 |   4GB | buffer busy waits (3)           |
|                |       |         |         |         |          |             |        |       |       |       |       | latch: cache buffers chains (1) |
|                |       |         |         |         |          |             |        |       |       |       |       | direct path read temp (11)      |
|                |       |         |         |         |          |             |        |       |       |       |       | direct path write temp (29)     |
| p003           | Set 1 |       4 |      65 |      22 |       39 |        3.70 |   2672 | 16842 |   4GB | 16829 |   4GB | buffer busy waits (3)           |
|                |       |         |         |         |          |             |        |       |       |       |       | latch: cache buffers chains (3) |
|                |       |         |         |         |          |             |        |       |       |       |       | direct path read temp (12)      |
|                |       |         |         |         |          |             |        |       |       |       |       | direct path write temp (20)     |
|                |       |         |         |         |          |             |        |       |       |       |       | local write wait (1)            |
| p004           | Set 1 |       5 |      65 |      23 |       39 |        4.05 |   2637 | 16814 |   4GB | 16814 |   4GB | buffer busy waits (4)           |
|                |       |         |         |         |          |             |        |       |       |       |       | direct path read temp (12)      |
|                |       |         |         |         |          |             |        |       |       |       |       | direct path write temp (22)     |
|                |       |         |         |         |          |             |        |       |       |       |       | local write wait (4)            |
| p005           | Set 1 |       6 |      67 |      23 |       39 |        4.73 |   2646 | 16816 |   4GB | 16810 |   4GB | buffer busy waits (3)           |
|                |       |         |         |         |          |             |        |       |       |       |       | direct path read temp (9)       |
|                |       |         |         |         |          |             |        |       |       |       |       | direct path write temp (25)     |
| p006           | Set 1 |       7 |      66 |      23 |       40 |        4.07 |   2636 | 16811 |   4GB | 16811 |   4GB | buffer busy waits (3)           |
|                |       |         |         |         |          |             |        |       |       |       |       | latch: cache buffers chains (1) |
|                |       |         |         |         |          |             |        |       |       |       |       | direct path read temp (11)      |
|                |       |         |         |         |          |             |        |       |       |       |       | direct path write temp (29)     |
|                |       |         |         |         |          |             |        |       |       |       |       | local write wait (2)            |
| p007           | Set 1 |       8 |      66 |      23 |       39 |        4.19 |   2658 | 16827 |   4GB | 16821 |   4GB | buffer busy waits (1)           |
|                |       |         |         |         |          |             |        |       |       |       |       | latch: cache buffers chains (2) |
|                |       |         |         |         |          |             |        |       |       |       |       | direct path read temp (11)      |
|                |       |         |         |         |          |             |        |       |       |       |       | direct path write temp (23)     |
|                |       |         |         |         |          |             |        |       |       |       |       | local write wait (3)            |
| p008           | Set 1 |       9 |      65 |      22 |       39 |        3.90 |   2712 | 16860 |   4GB | 16831 |   4GB | buffer busy waits (4)           |
|                |       |         |         |         |          |             |        |       |       |       |       | direct path read temp (11)      |
|                |       |         |         |         |          |             |        |       |       |       |       | direct path write temp (27)     |
| p009           | Set 1 |      10 |      66 |      23 |       39 |        4.27 |   2673 | 16838 |   4GB | 16826 |   4GB | buffer busy waits (3)           |
|                |       |         |         |         |          |             |        |       |       |       |       | direct path read temp (10)      |
|                |       |         |         |         |          |             |        |       |       |       |       | direct path write temp (21)     |
|                |       |         |         |         |          |             |        |       |       |       |       | local write wait (1)            |
| p00a           | Set 2 |       1 |      19 |    9.31 |       10 |             |     1M |  6574 |   6GB |       |     . | direct path read (11)           |
| p00b           | Set 2 |       2 |      19 |    8.90 |       11 |             |   729K |  4590 |   4GB |       |     . | direct path read (12)           |
| p00c           | Set 2 |       3 |      20 |    9.44 |       10 |             |   795K |  5137 |   5GB |       |     . | direct path read (10)           |
| p00d           | Set 2 |       4 |      19 |    8.80 |       10 |             |   688K |  4275 |   4GB |       |     . | direct path read (12)           |
| p00e           | Set 2 |       5 |      20 |    9.11 |       11 |             |   777K |  5003 |   5GB |       |     . | direct path read (13)           |
| p00f           | Set 2 |       6 |      20 |      10 |       10 |             |   868K |  5360 |   5GB |       |     . | direct path read (13)           |
| p00g           | Set 2 |       7 |      19 |      10 |     9.26 |             |   961K |  6047 |   6GB |       |     . | direct path read (12)           |
| p00h           | Set 2 |       8 |      20 |      10 |       10 |             |   880K |  5804 |   6GB |       |     . | direct path read (7)            |
| p00i           | Set 2 |       9 |      19 |    8.69 |       10 |             |   697K |  4398 |   4GB |       |     . | direct path read (14)           |
| p00j           | Set 2 |      10 |      20 |      10 |       10 |             |   789K |  5004 |   5GB |       |     . | direct path read (12)           |
============================================================================================================================================================

SQL Plan Monitoring Details (Plan Hash Value=1705792393)
==============================================================================================================================================================================================================
| Id |             Operation              |   Name   |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   | Read  | Read  | Write | Write |  Mem  | Temp  | Activity |         Activity Detail          |
|    |                                    |          | (Estim) |       | Active(s) | Active |       | (Actual) | Reqs  | Bytes | Reqs  | Bytes | (Max) | (Max) |   (%)    |           (# samples)            |
==============================================================================================================================================================================================================
|  0 | SELECT STATEMENT                   |          |         |       |        52 |    +31 |     1 |        1 |       |       |       |       |     . |     . |    10.37 | buffer busy waits (31)           |
|    |                                    |          |         |       |           |        |       |          |       |       |       |       |       |       |          | latch: cache buffers chains (11) |
|    |                                    |          |         |       |           |        |       |          |       |       |       |       |       |       |          | Cpu (8)                          |
|    |                                    |          |         |       |           |        |       |          |       |       |       |       |       |       |          | local write wait (15)            |
|  1 |   SORT AGGREGATE                   |          |       1 |       |         1 |    +82 |     1 |        1 |       |       |       |       |     . |     . |          |                                  |
|  2 |    PX COORDINATOR                  |          |         |       |         1 |    +82 |    21 |       10 |     4 | 32768 |       |       |     . |     . |          |                                  |
|  3 |     PX SEND QC (RANDOM)            | :TQ10002 |       1 |       |         2 |    +81 |    10 |       10 |       |       |       |       |     . |     . |          |                                  |
|  4 |      SORT AGGREGATE                |          |       1 |       |        29 |    +54 |    10 |       10 |       |       |       |       |     . |     . |          |                                  |
|  5 |       HASH JOIN                    |          |      2M |  252K |        83 |     +0 |    10 |       1M |  168K |  40GB |  168K |  40GB | 759MB |  41GB |    61.40 | Cpu (31)                         |
|    |                                    |          |         |       |           |        |       |          |       |       |       |       |       |       |          | direct path read temp (106)      |
|    |                                    |          |         |       |           |        |       |          |       |       |       |       |       |       |          | direct path write temp (248)     |
|  6 |        PX RECEIVE                  |          |      1M | 78333 |        53 |     +2 |    10 |      10M |       |       |       |       |     . |     . |     1.28 | Cpu (8)                          |
|  7 |         PX SEND HYBRID HASH        | :TQ10000 |      1M | 78333 |        55 |     +0 |    10 |      10M |       |       |       |       |     . |     . |     2.71 | Cpu (17)                         |
|  8 |          STATISTICS COLLECTOR      |          |         |       |        53 |     +2 |    10 |       1M |       |       |       |       |     . |     . |          |                                  |
|  9 |           PX BLOCK ITERATOR        |          |      1M | 78333 |        53 |     +2 |    10 |       1M |       |       |       |       |     . |     . |          |                                  |
| 10 |            TABLE ACCESS FULL       | T1       |      1M | 78333 |        55 |     +0 |   134 |       1M | 20472 |  20GB |       |       |     . |     . |     4.47 | Cpu (15)                         |
|    |                                    |          |         |       |           |        |       |          |       |       |       |       |       |       |          | direct path read (13)            |
| 11 |        PX RECEIVE                  |          |      2M |  121K |        17 |    +54 |    10 |       2M |       |       |       |       |     . |     . |     0.32 | Cpu (1)                          |
|    |                                    |          |         |       |           |        |       |          |       |       |       |       |       |       |          | PX Deq: Table Q Normal (1)       |
| 12 |         PX SEND HYBRID HASH (SKEW) | :TQ10001 |      2M |  121K |        16 |    +54 |    10 |       2M |       |       |       |       |     . |     . |     0.16 | Cpu (1)                          |
| 13 |          PX BLOCK ITERATOR         |          |      2M |  121K |        16 |    +54 |    10 |       2M |       |       |       |       |     . |     . |          |                                  |
| 14 |           TABLE ACCESS FULL        | T2       |      2M |  121K |        17 |    +53 |   132 |       2M | 31720 |  31GB |       |       |     . |     . |    19.30 | Cpu (18)                         |
|    |                                    |          |         |       |           |        |       |          |       |       |       |       |       |       |          | direct path read (103)           |
==============================================================================================================================================================================================================

The duration of the SQL was 82 second. But significant part (more than 50%) of elapsed time was consumed by I/O waits. In the table T1 the number of skewed values is 1M and therefore if you look at the plan line 7, the PX SEND HYBRID HASH operation BROADCASTED these values and we have got 10M rows in that plan line. That huge number of rows (and the volume) caused a problem for the PHJ in the line 5. The database was forced to read (from disk/temp) and write (to temp) huge amount of data (read 90GB/wrote 40GB) in order to complete the HASH JOIN.
Now if we disable Automatic Skew Handling, we will get the followings:

SQL Monitoring Report

SQL Text
------------------------------
select count(ff1),count(f2) from ( select /*+ parallel(10) no_pq_skew(t2) */ t1.f1 ff1,t2.* from t1,t2 where t1.c1=t2.c1 and chrtp not in ('A','E','D','C') )

Global Information
------------------------------
 Status              :  DONE (ALL ROWS)                             
 Instance ID         :  1                                           
 SQL ID              :  7f7u503r8m4pb                               
 SQL Execution ID    :  16777216                                    
 Execution Started   :  03/31/2019 17:22:13                         
 First Refresh Time  :  03/31/2019 17:22:13                         
 Last Refresh Time   :  03/31/2019 17:22:31                         
 Duration            :  18s                                         
.
Global Stats
===============================================================================================
| Elapsed |   Cpu   |    IO    | Concurrency | Fetch | Buffer | Read  | Read  | Write | Write |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Calls |  Gets  | Reqs  | Bytes | Reqs  | Bytes |
===============================================================================================
|     188 |      70 |      118 |        0.00 |     1 |     8M | 74845 |  56GB | 22650 |   5GB |
===============================================================================================

Parallel Execution Details (DOP=10 , Servers Allocated=20)
=======================================================================================================================================================
|      Name      | Type  | Server# | Elapsed |   Cpu   |    IO    | Concurrency | Buffer | Read | Read  | Write | Write |         Wait Events         |
|                |       |         | Time(s) | Time(s) | Waits(s) |  Waits(s)   |  Gets  | Reqs | Bytes | Reqs  | Bytes |         (sample #)          |
=======================================================================================================================================================
| PX Coordinator | QC    |         |    0.05 |    0.04 |     0.00 |             |     14 |    2 | 16384 |       |     . |                             |
| p000           | Set 1 |       1 |    1.68 |    1.18 |     0.50 |             |      1 |  415 | 101MB |   415 | 101MB | direct path write temp (1)  |
| p001           | Set 1 |       2 |    1.66 |    1.16 |     0.49 |             |      1 |  415 | 101MB |   415 | 101MB |                             |
| p002           | Set 1 |       3 |    1.63 |    1.13 |     0.50 |             |      1 |  412 | 100MB |   411 | 100MB |                             |
| p003           | Set 1 |       4 |    1.63 |    1.16 |     0.47 |        0.00 |      1 |  417 | 101MB |   417 | 101MB | direct path write temp (1)  |
| p004           | Set 1 |       5 |    1.64 |    1.18 |     0.46 |        0.00 |      1 |  417 | 101MB |   417 | 101MB |                             |
| p005           | Set 1 |       6 |      10 |    4.52 |     5.41 |             |      1 | 6326 |   1GB |  6326 |   1GB | direct path write temp (3)  |
| p006           | Set 1 |       7 |    1.62 |    1.18 |     0.44 |        0.00 |      1 |  415 | 101MB |   415 | 101MB |                             |
| p007           | Set 1 |       8 |      10 |    4.42 |     5.41 |        0.00 |      1 | 5548 |   1GB |  5548 |   1GB | direct path read temp (1)   |
|                |       |         |         |         |          |             |        |      |       |       |       | direct path write temp (4)  |
| p008           | Set 1 |       9 |    1.69 |    1.18 |     0.51 |             |      1 |  419 | 101MB |   419 | 101MB | direct path write temp (1)  |
| p009           | Set 1 |      10 |      11 |    5.38 |     5.57 |        0.00 |      1 | 7867 |   2GB |  7867 |   2GB | direct path read temp (1)   |
|                |       |         |         |         |          |             |        |      |       |       |       | direct path write temp (6)  |
| p00a           | Set 2 |       1 |      14 |    4.49 |       10 |             |   725K | 4832 |   5GB |       |     . | direct path read (6)        |
| p00b           | Set 2 |       2 |      15 |    5.03 |       10 |             |   873K | 5692 |   6GB |       |     . | db file sequential read (1) |
|                |       |         |         |         |          |             |        |      |       |       |       | direct path read (8)        |
| p00c           | Set 2 |       3 |      14 |    4.81 |       10 |             |   805K | 5214 |   5GB |       |     . | direct path read (9)        |
| p00d           | Set 2 |       4 |      14 |    4.27 |     9.40 |             |   719K | 4487 |   4GB |       |     . | direct path read (8)        |
| p00e           | Set 2 |       5 |      16 |    4.80 |       11 |             |   800K | 5063 |   5GB |       |     . | direct path read (11)       |
| p00f           | Set 2 |       6 |      14 |    4.94 |     8.73 |             |   915K | 5871 |   6GB |       |     . | direct path read (8)        |
| p00g           | Set 2 |       7 |      14 |    4.76 |     9.26 |             |   893K | 5345 |   5GB |       |     . | direct path read (9)        |
| p00h           | Set 2 |       8 |      16 |    4.85 |       11 |             |   795K | 5148 |   5GB |       |     . | direct path read (8)        |
| p00i           | Set 2 |       9 |      14 |    4.63 |       10 |             |   815K | 5084 |   5GB |       |     . | direct path read (8)        |
| p00j           | Set 2 |      10 |      16 |    4.97 |       11 |             |   864K | 5456 |   5GB |       |     . | direct path read (10)       |
=======================================================================================================================================================

SQL Plan Monitoring Details (Plan Hash Value=2644066362)
====================================================================================================================================================================================================
| Id |           Operation           |   Name   |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   | Read  | Read  | Write | Write |  Mem  | Temp  | Activity |       Activity Detail       |
|    |                               |          | (Estim) |       | Active(s) | Active |       | (Actual) | Reqs  | Bytes | Reqs  | Bytes | (Max) | (Max) |   (%)    |         (# samples)         |
====================================================================================================================================================================================================
|  0 | SELECT STATEMENT              |          |         |       |         1 |    +18 |     1 |        1 |       |       |       |       |     . |     . |          |                             |
|  1 |   SORT AGGREGATE              |          |       1 |       |         1 |    +18 |     1 |        1 |       |       |       |       |     . |     . |          |                             |
|  2 |    PX COORDINATOR             |          |         |       |         1 |    +18 |    21 |       10 |     2 | 16384 |       |       |     . |     . |          |                             |
|  3 |     PX SEND QC (RANDOM)       | :TQ10002 |       1 |       |         3 |    +16 |    10 |       10 |       |       |       |       |     . |     . |          |                             |
|  4 |      SORT AGGREGATE           |          |       1 |       |         7 |    +12 |    10 |       10 |       |       |       |       |     . |     . |          |                             |
|  5 |       HASH JOIN               |          |      2M |  252K |        18 |     +1 |    10 |       1M | 22650 |   5GB | 22650 |   5GB |   1GB |   6GB |    15.56 | Cpu (3)                     |
|    |                               |          |         |       |           |        |       |          |       |       |       |       |       |       |          | direct path read temp (2)   |
|    |                               |          |         |       |           |        |       |          |       |       |       |       |       |       |          | direct path write temp (16) |
|  6 |        PX RECEIVE             |          |      1M | 78333 |        11 |     +2 |    10 |       1M |       |       |       |       |     . |     . |     2.22 | Cpu (3)                     |
|  7 |         PX SEND HYBRID HASH   | :TQ10000 |      1M | 78333 |        11 |     +2 |    10 |       1M |       |       |       |       |     . |     . |     4.44 | Cpu (6)                     |
|  8 |          STATISTICS COLLECTOR |          |         |       |        11 |     +2 |    10 |       1M |       |       |       |       |     . |     . |     0.74 | Cpu (1)                     |
|  9 |           PX BLOCK ITERATOR   |          |      1M | 78333 |        11 |     +2 |    10 |       1M |       |       |       |       |     . |     . |          |                             |
| 10 |            TABLE ACCESS FULL  | T1       |      1M | 78333 |        12 |     +1 |   134 |       1M | 20472 |  20GB |       |       |     . |     . |    39.26 | Cpu (5)                     |
|    |                               |          |         |       |           |        |       |          |       |       |       |       |       |       |          | direct path read (48)       |
| 11 |        PX RECEIVE             |          |      2M |  121K |         7 |    +12 |    10 |       2M |       |       |       |       |     . |     . |          |                             |
| 12 |         PX SEND HYBRID HASH   | :TQ10001 |      2M |  121K |         5 |    +12 |    10 |       2M |       |       |       |       |     . |     . |     2.22 | Cpu (3)                     |
| 13 |          PX BLOCK ITERATOR    |          |      2M |  121K |         5 |    +12 |    10 |       2M |       |       |       |       |     . |     . |          |                             |
| 14 |           TABLE ACCESS FULL   | T2       |      2M |  121K |         8 |    +10 |   132 |       2M | 31720 |  31GB |       |       |     . |     . |    35.56 | Cpu (10)                    |
|    |                               |          |         |       |           |        |       |          |       |       |       |       |       |       |          | db file sequential read (1) |
|    |                               |          |         |       |           |        |       |          |       |       |       |       |       |       |          | direct path read (37)       |
====================================================================================================================================================================================================

The Join has been completed within 18 second without skew handling. So, four times faster. Oracle needed only 5GB data to write to disk due to true HASH distribution. There can be a question that if we can reduce DOP but keep Auto Skew Handling then read/write (from/to disk) will be reduced and it can improve the PHJ. That is right, but problem is that if we reduce the DOP then we will have less PX producer slaves to scan PHJ inputs (Build, Probe tables) and it also will increase to acquire the data from these tables and will affect the duration of SQL. So, in such situation disabling Auto Skew Handling may give better performance.To add more, if high DOP is requested in this case redistribution cost can be very high.
Lastly, in the query I have added a filter condition as chrtp not in (‘A’,’E’,’D’,’C’). There is a strong correlation between c1 and chrtp columns. Although, the database identified the values c1=1,2,3,4,5 as skewed but they did not need to be redistributed from T1 table. The database currently does not consider filter predicates during the skew detection process. Thus, you know your data well and it may help you to reduce the data that going to be redistributed/joined by adding additional predicate(s) to the probe/build tables.

1 Comment »

  1. […] data (id column), you can see that there is significant skew , however optimizer does not apply the automatic skew handling mechanism to the join. Because there is a relationship between DOP and number of top frequent values (NTFV), […]

    Pingback by Parallel Shared Hash Join | Chinar Aliyev`s blog — April 29, 2019 @ 5:36 am | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

Blog at WordPress.com.

%d bloggers like this: