The problem was described by Randolf in his article. He used two table and demonstrated that if majority of foreign key values are NULL then Parallel Outer Join (POJ) can suffer the skew related to the NULL values. Generally, in practice foreign keys (majority) are not null values or we can see the symptom very occasionally. But the true NULL value skew is caused by outer join itself when more than two tables are involved. In this case null values are automatically generated due to definition of the join. Starting from Oracle Database 18c the mentioned problem is solved automatically. I did a little change on Randolf`s test case in order to demonstrate the null values handling for his (two table) test case as well as three table join.
The test case:
-- OS:OEL 7.3 64bit, DB:Oracle 18c create table t1 compress as select /*+ use_nl(a b) */ rownum as id , 2*rownum id1 , rpad('x', 100) as filler from (select /*+ cardinality(1e5) */ * from dual connect by level <= 2e5) a, (select /*+ cardinality(20) */ * from dual connect by level <= 20) b ; create table t2 compress as select rownum as id , case when rownum <= 1e5 then 1 when rownum <= 5e5 then 2 else rownum end as fk_id_skew , rownum as fk_id_uniform , case when rownum <= 5e5 then mod(rownum, 2e6) + 1 else null end as fk_id_sparse , rpad('x', 100) as filler from (select /*+ cardinality(1e5) */ * from dual connect by level <= 2e5) a, (select /*+ cardinality(20) */ * from dual connect by level <= 20) b ; create table t3 compress as select rownum as id ,4*rownum id3 , rpad('x', 100) as filler from (select /*+ cardinality(1e5) */ * from dual connect by level <= 2e5) a, (select /*+ cardinality(20) */ * from dual connect by level <= 20) b ;
Firstly, let`s see the following query and explain plan when OFE = 12.2.0.1 as:
alter session set optimizer_features_enable='12.2.0.1'; explain plan for select count(t_2_filler), count(t_3_filler) from ( select /*+ monitor no_merge parallel(4) leading(t2 t1) */ t1.id, t1.filler t1f1,t2.id,t2.filler t2f1 , regexp_replace(t2.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c') || regexp_replace(t2.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'i') as t_2_filler , regexp_replace(t3.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c') || regexp_replace(t3.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'i') as t_3_filler from t2 left outer join t1 on t1.id=t2.fk_id_sparse left outer join t3 on t1.id1=t3.id3 ) ; (Q1) select * from table(dbms_xplan.display(null,null,null)); ---------------------------------------------------------- Plan hash value: 614020325 -------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib | -------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 4004 | 2235 (2)| 00:00:01 | | | | | 1 | SORT AGGREGATE | | 1 | 4004 | | | | | | | 2 | PX COORDINATOR | | | | | | | | | | 3 | PX SEND QC (RANDOM) | :TQ10004 | 1 | 4004 | | | Q1,04 | P->S | QC (RAND) | | 4 | SORT AGGREGATE | | 1 | 4004 | | | Q1,04 | PCWP | | | 5 | VIEW | | 4008K| 14G| 2235 (2)| 00:00:01 | Q1,04 | PCWP | | |* 6 | HASH JOIN RIGHT OUTER | | 4008K| 848M| 2235 (2)| 00:00:01 | Q1,04 | PCWP | | | 7 | PX RECEIVE | | 4000K| 408M| 687 (1)| 00:00:01 | Q1,04 | PCWP | | | 8 | PX SEND HASH | :TQ10002 | 4000K| 408M| 687 (1)| 00:00:01 | Q1,02 | P->P | HASH | | 9 | PX BLOCK ITERATOR | | 4000K| 408M| 687 (1)| 00:00:01 | Q1,02 | PCWC | | | 10 | TABLE ACCESS FULL | T3 | 4000K| 408M| 687 (1)| 00:00:01 | Q1,02 | PCWP | | | 11 | PX RECEIVE | | 4000K| 438M| 1542 (2)| 00:00:01 | Q1,04 | PCWP | | | 12 | PX SEND HASH | :TQ10003 | 4000K| 438M| 1542 (2)| 00:00:01 | Q1,03 | P->P | HASH | |* 13 | HASH JOIN OUTER BUFFERED| | 4000K| 438M| 1542 (2)| 00:00:01 | Q1,03 | PCWP | | | 14 | PX RECEIVE | | 4000K| 392M| 850 (1)| 00:00:01 | Q1,03 | PCWP | | | 15 | PX SEND HASH | :TQ10000 | 4000K| 392M| 850 (1)| 00:00:01 | Q1,00 | P->P | HASH | | 16 | PX BLOCK ITERATOR | | 4000K| 392M| 850 (1)| 00:00:01 | Q1,00 | PCWC | | | 17 | TABLE ACCESS FULL | T2 | 4000K| 392M| 850 (1)| 00:00:01 | Q1,00 | PCWP | | | 18 | PX RECEIVE | | 4000K| 45M| 685 (1)| 00:00:01 | Q1,03 | PCWP | | | 19 | PX SEND HASH | :TQ10001 | 4000K| 45M| 685 (1)| 00:00:01 | Q1,01 | P->P | HASH | | 20 | PX BLOCK ITERATOR | | 4000K| 45M| 685 (1)| 00:00:01 | Q1,01 | PCWC | | | 21 | TABLE ACCESS FULL | T1 | 4000K| 45M| 685 (1)| 00:00:01 | Q1,01 | PCWP | | -------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 6 - access("T1"."ID1"="T3"."ID3"(+)) 13 - access("T1"."ID"(+)="T2"."FK_ID_SPARSE") Note ----- - Degree of Parallelism is 4 because of hint
In my test system it took 103 second to complete. And If we look at the V$PQ_TQSTAT view:
DFO_NUMBER TQ_ID SERVER_TYPE INSTANCE PROCESS NUM_ROWS BYTES ROW_SHARE DATA_SHARE WAITS TIMEOUTS AVG_LATENCY ---------- ---------- --------------- -------- --------------- ---------- ---------- --------- ---------- ---------- ---------- ----------- 1 0 Producer 1 P000 972383 101673907 24.31 24.29 21 0 0 1 P001 940880 98214810 23.52 23.46 25 2 0 1 P002 1018596 106684695 25.46 25.49 23 1 0 1 P003 1068141 112036474 26.70 26.76 29 5 0 Consumer 1 P004 124986 13514973 3.12 3.23 1903 1899 0 1 P005 124403 13451856 3.11 3.21 1903 1899 0 1 P006 125324 13551533 3.13 3.24 1903 1899 0 1 P007 3625287 378091524 90.63 90.32 1903 1899 0 1 Producer 1 P000 968234 13352477 24.21 24.52 20 1 0 1 P001 1020964 13665164 25.52 25.10 24 2 0 1 P002 972210 13338121 24.31 24.50 23 5 0 1 P003 1038592 14089338 25.96 25.88 23 2 0 Consumer 1 P004 1000886 13623627 25.02 25.02 2713 2046 0 1 P005 998520 13592045 24.96 24.96 2710 2053 0 1 P006 1000610 13619103 25.02 25.01 2735 2069 0 1 P007 999984 13610325 25.00 25.00 2080 1985 0 2 Producer 1 P004 970775 105866492 24.27 24.27 26 4 0 1 P005 993664 108303963 24.84 24.83 24 5 0 1 P006 1027726 112089217 25.69 25.69 27 5 0 1 P007 1007835 109977478 25.20 25.21 26 6 0 Consumer 1 P000 1001034 109172059 25.03 25.03 1964 1913 0 1 P001 1000896 109157355 25.02 25.02 1971 1915 0 1 P002 1000071 109067176 25.00 25.00 1966 1915 0 1 P003 997999 108840560 24.95 24.95 1972 1918 0 3 Producer 1 P004 124986 13514946 3.12 3.23 0 0 0 1 P005 124403 13451852 3.11 3.21 0 0 0 1 P006 125324 13551519 3.13 3.24 0 0 0 1 P007 3625287 378091546 90.63 90.32 0 0 0 Consumer 1 P000 125012 13517675 3.13 3.23 3256 2469 0 1 P001 124715 13485635 3.12 3.22 4159 2819 0 1 P002 124813 13496259 3.12 3.22 4288 2887 0 1 P003 3625460 378110294 90.64 90.33 3669 2649 0 4 Producer 1 P000 1 44 25.00 25.00 2003 968 0 1 P001 1 44 25.00 25.00 2765 1285 0 1 P002 1 44 25.00 25.00 3010 1399 0 1 P003 1 44 25.00 25.00 1748 752 0 Consumer 1 QC 4 176 100.00 100.00 6 0 0
As you see there is a significant skew between PX slaves due to initial and null values caused by outer join. But if we change the OFE to 18.1.0 and hinted as below then we will get the following plan:
select count(t_2_filler), count(t_3_filler) from ( select /*+ monitor no_merge parallel(4) leading(t2 t1) no_swap_join_inputs(T3) */ t1.id, t1.filler t1f1,t2.id,t2.filler t2f1 , regexp_replace(t2.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c') || regexp_replace(t2.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'i') as t_2_filler , regexp_replace(t3.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c') || regexp_replace(t3.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'i') as t_3_filler from t2 left outer join t1 on t1.id=t2.fk_id_sparse left outer join t3 on t1.id1=t3.id3 ) ; (Q2) select * from table(dbms_xplan.display(null,null,null)); Plan hash value: 3676694016 ------------------------------------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib | ------------------------------------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 4004 | 2235 (2)| 00:00:01 | | | | | 1 | SORT AGGREGATE | | 1 | 4004 | | | | | | | 2 | PX COORDINATOR | | | | | | | | | | 3 | PX SEND QC (RANDOM) | :TQ10004 | 1 | 4004 | | | Q1,04 | P->S | QC (RAND) | | 4 | SORT AGGREGATE | | 1 | 4004 | | | Q1,04 | PCWP | | | 5 | VIEW | | 4008K| 14G| 2235 (2)| 00:00:01 | Q1,04 | PCWP | | |* 6 | HASH JOIN OUTER | | 4008K| 848M| 2235 (2)| 00:00:01 | Q1,04 | PCWP | | | 7 | PX RECEIVE | | 4000K| 438M| 1542 (2)| 00:00:01 | Q1,04 | PCWP | | | 8 | PX SEND HASH (NULL RANDOM) | :TQ10002 | 4000K| 438M| 1542 (2)| 00:00:01 | Q1,02 | P->P | HASH | |* 9 | HASH JOIN OUTER BUFFERED | | 4000K| 438M| 1542 (2)| 00:00:01 | Q1,02 | PCWP | | | 10 | PX RECEIVE | | 4000K| 392M| 850 (1)| 00:00:01 | Q1,02 | PCWP | | | 11 | PX SEND HASH (NULL RANDOM)| :TQ10000 | 4000K| 392M| 850 (1)| 00:00:01 | Q1,00 | P->P | HASH | | 12 | PX BLOCK ITERATOR | | 4000K| 392M| 850 (1)| 00:00:01 | Q1,00 | PCWC | | | 13 | TABLE ACCESS FULL | T2 | 4000K| 392M| 850 (1)| 00:00:01 | Q1,00 | PCWP | | | 14 | PX RECEIVE | | 4000K| 45M| 685 (1)| 00:00:01 | Q1,02 | PCWP | | | 15 | PX SEND HASH | :TQ10001 | 4000K| 45M| 685 (1)| 00:00:01 | Q1,01 | P->P | HASH | | 16 | PX BLOCK ITERATOR | | 4000K| 45M| 685 (1)| 00:00:01 | Q1,01 | PCWC | | | 17 | TABLE ACCESS FULL | T1 | 4000K| 45M| 685 (1)| 00:00:01 | Q1,01 | PCWP | | | 18 | PX RECEIVE | | 4000K| 408M| 687 (1)| 00:00:01 | Q1,04 | PCWP | | | 19 | PX SEND HASH | :TQ10003 | 4000K| 408M| 687 (1)| 00:00:01 | Q1,03 | P->P | HASH | | 20 | PX BLOCK ITERATOR | | 4000K| 408M| 687 (1)| 00:00:01 | Q1,03 | PCWC | | | 21 | TABLE ACCESS FULL | T3 | 4000K| 408M| 687 (1)| 00:00:01 | Q1,03 | PCWP | | ------------------------------------------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 6 - access("T1"."ID1"="T3"."ID3"(+)) 9 - access("T1"."ID"(+)="T2"."FK_ID_SPARSE") Note ----- - Degree of Parallelism is 4 because of hint
It is seen that there is a new (modified) version of true HASH distribution (not hybrid) in the plan shape. The no_swap_join_inputs hint is not necessary to be specified here, If we remove the hint, we will get HASH JOIN RIGHT OUTER instead of HASH JOIN OUTER. The key point here is that “Preserved” rowsource (initially it can be a table, but latter can be a result of join) of the join will be checked and the NULL values are distributed RANDOMLY in order to attain uniform distribution, in other words to avoid the join skew. The plan lines 11 and 8 indicate that NULL values accordingly consumed from T2 table and the first outer join will be distributed randomly (other values will be distributed via HASH distribution)
. The query (Q2) took 49 second to complete in my test system.
Let’s display the result of v$PQ_TQSTAT:
DFO_NUMBER TQ_ID SERVER_TYPE INSTANCE PROCESS NUM_ROWS BYTES ROW_SHARE DATA_SHARE WAITS TIMEOUTS AVG_LATENCY ---------- ---------- --------------- -------- --------------- ---------- ---------- --------- ---------- ---------- ---------- ----------- 1 0 Producer 1 P000 973412 101972494 24.34 24.36 32 11 0 1 P001 1029942 108261300 25.75 25.86 32 8 0 1 P002 1013635 105773077 25.34 25.27 32 10 0 1 P003 983011 102602967 24.58 24.51 31 9 0 Consumer 1 P004 999987 104651085 25.00 25.00 115 111 0 1 P005 999403 104587864 24.99 24.98 116 112 0 1 P006 1000324 104687541 25.01 25.01 115 111 0 1 P007 1000286 104683348 25.01 25.01 116 112 0 1 Producer 1 P000 975402 13325594 24.39 24.48 22 1 0 1 P001 1007163 13717028 25.18 25.19 19 1 0 1 P002 1048553 14051184 26.21 25.81 23 2 0 1 P003 968882 13351294 24.22 24.52 20 3 0 Consumer 1 P004 1000886 13623627 25.02 25.02 3118 1458 0 1 P005 998520 13592045 24.96 24.96 2213 1077 0 1 P006 1000610 13619103 25.02 25.01 3460 1582 0 1 P007 999984 13610325 25.00 25.00 3900 1809 0 2 Producer 1 P004 999987 104651082 25.00 25.00 0 0 0 1 P005 999403 104587812 24.99 24.98 0 0 0 1 P006 1000324 104687551 25.01 25.01 0 0 0 1 P007 1000286 104683370 25.01 25.01 0 0 0 Consumer 1 P000 1000013 104653811 25.00 25.00 175 125 0 1 P001 999715 104621619 24.99 24.99 171 121 0 1 P002 999813 104632267 25.00 25.00 175 124 0 1 P003 1000459 104702118 25.01 25.01 170 123 0 3 Producer 1 P004 1042597 113581472 26.06 26.04 24 4 0 1 P005 972928 106168339 24.32 24.34 22 5 0 1 P006 1011547 110318955 25.29 25.29 23 4 0 1 P007 972928 106168360 24.32 24.34 23 2 0 Consumer 1 P000 1001034 109172059 25.03 25.03 841 452 0 1 P001 1000896 109157355 25.02 25.02 3606 1840 0 1 P002 1000071 109067176 25.00 25.00 3849 1772 0 1 P003 997999 108840536 24.95 24.95 3844 1792 0 4 Producer 1 P000 1 44 25.00 25.00 2827 1263 0 1 P001 1 44 25.00 25.00 6032 2857 0 1 P002 1 44 25.00 25.00 5604 2524 0 1 P003 1 44 25.00 25.00 5728 2596 0 Consumer 1 QC 4 176 100.00 100.00 103 99 0
As you see there is not skew anymore and the query completed very fast than previous one. The automatic null value handling for outer join is controlled by _px_join_skew_null_handling parameter and its default value is true.
Leave a comment