Chinar Aliyev`s blog

May 10, 2018

SQL Tuning approach and a Real life example

Filed under: Rewriting,Tuning — Chinar Aliyev @ 10:27 am

As in other optimization problems, SQL Tuning task (STT) also requires to find an “optimal path” (OP) to reach the goal. OP depends on a lot of factors like where are we right now (starting point – A), what are we given? (The initial conditions), where do we need to reach? or what result is required to attain (End point B)? Depending the initial conditions and the required result, to find OP could very sophisticate process.

To solve a STT, firstly, we need to understand business process that gives us an opportunity and increases tuning approaches. Also, we need to understand which optimization techniques exists, which of them were implemented in commercial DBMS and how we can take advantage of them. According to implemented techniques and initial conditions the query optimizer (QO) will generate the efficient execution plan (EP).

Recently, I have been asked to tune an SQL. Let`s see the following SQL statement. Oracle database version is 11.2.0.4 and OS is Red Hat EL 6.7. The original SQL is quite complicate, I have removed some parts of the SQL but its equivalent looks like:

SELECT *
  FROM (SELECT  
         Ah.TRN_NO
        ,Ah.TRN_DATE
        ...
        ,ROW_NUMBER() Over(ORDER BY Tb.TRN_DATE DESC) Rn
          FROM (SELECT Ah.TRN_NO
                      ,Ah.TRN_DATE
                      ...
                      FROM transaction_history AH
                      ,(SELECT t.BR
                              ,t.Ac_No
                              ,t.C
                          FROM Accounts t
                         WHERE t.Cust_No = 'SSSSS' – a given Customer number
                           AND t.A_St = 'A'
                           AND t.R_St = 'O'
                           AND t.Type = 'U') Ca
                 WHERE Ah.Event <> 'REVL'
                   AND Ah.Br = Ca.Br
                   AND Ah.Ac_No = Ca.Ac_No
                   AND Ah.c = Ca.c
               ) Tb) d
WHERE d.Rn <= 15
ORDER BY d.Trn_Date DESC;

There are two tables, ACCOUNTS table contains customer`s account numbers and other information. For every client there can be several accounts in that table. Transactions for each account are kept in TRANSACTION_HISTORY table. For each time to execute the query we are given customer`s number that uniquely identifies the customer.
The above SQL statement tells us that it was written to find last fifteen transactions of the given customers (to show last fifteen transactions when application is opened). Now let`s see the execution plan of the statement. Before execution the query, I have flushed OS and the database`s buffer cache to see the exact performance statistics/problem.

                                                                                                                                                      
--------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name                   | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |                        |      1 |        |     15 |00:07:14.03 |   83945 |  76419 |       |       |          |
|*  1 |  VIEW                           |                        |      1 |     12 |     15 |00:07:14.03 |   83945 |  76419 |       |       |          |
|*  2 |   WINDOW SORT PUSHED RANK       |                        |      1 |     12 |     16 |00:07:14.03 |   83945 |  76419 |   160K|   160K|  142K (0)|
|   3 |    NESTED LOOPS                 |                        |      1 |     12 |    121K|00:07:13.54 |   83945 |  76419 |       |       |          |
|   4 |     NESTED LOOPS                |                        |      1 |     14 |    121K|00:00:04.39 |    1004 |    775 |       |       |          |
|*  5 |      TABLE ACCESS BY INDEX ROWID| ACCOUNTS               |      1 |      1 |     85 |00:00:00.81 |      84 |     84 |       |       |          |
|*  6 |       INDEX RANGE SCAN          | IX_ACC                 |      1 |      1 |     88 |00:00:00.06 |       4 |      4 |       |       |          |
|*  7 |      INDEX RANGE SCAN           | IX_TR_HISTORY          |     85 |     14 |    121K|00:00:03.53 |     920 |    691 |       |       |          |
|*  8 |     TABLE ACCESS BY INDEX ROWID | TRANSACTION_HISTORY    |    121K|    113 |    121K|00:07:08.97 |   82941 |  75644 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------------------------------------
                                                                                                                                                        
Predicate Information (identified by operation id):                                                                                                     
---------------------------------------------------                                                                                                     
                                                                                                                                                        
   1 - filter("D"."RN"<=15)                                                                                                                             
   2 - filter(ROW_NUMBER() OVER ( ORDER BY INTERNAL_FUNCTION("AH"."TRN_DT") DESC )<=15)                                                                 
   5 - filter(("T"."TYPE"='U' AND "T"."A_STAT"='A' AND "T"."R_STAT"='O'))                                                               
   6 - access("T"."CUST_NO"='SSSSSS')                                                                                                                  
   7 - access("AH"."AC_NO"="T"."AC_NO" AND "AH"."BR"="T"."BR")                                                                     
   8 - filter(("AH"."EVENT"<>'REVL' AND "AH"."C"="T"."C"))

As you see, 7.14 minute was spend to execute the SQL and the most of time was consumed to retrieve data from TRANSACTION_HISTORY the table. In the 8th line of the execution plan, the database spent 7.09 second, it means 7.9/7.14*100=99.30%, so almost all time was spent to scan TRANSACTION_HISTORY table. Why such a big time was required to retrieve rows from the transaction table?
It is obvious that the index IX_TR_HISTORY is not an unselective. According the execution plan, in the line 5 we see there are 85 accounts in the ACCCOUNT table for the given customer. For the each account the database scanned IX_TR_HISTORY index (so 85 times) and got 121K rows. That is why the line 8 was executed 121K times. But the retrieving the appropriate data from INDEXES were not consumed lots of time. What is the problem there and how can we solve it?

As you see from the 8th line of the execution plan, the database did 75644 physical reads in order to extract 120K rows from the table via index ROWID. Actually, a lot of physical reads required such amount of time. So what do we have to do now? How could the response time be reduced in this case?
As we noted before, there can be several account number for a customer and for each account there can be thousands and even millions of transactions. Do we have to scan all transactions to find out last 15 of them? Can we avoid scanning all transactions in this case?

Actually, as execution plan says, after join of two tables, the database performs SORT operation and selects last 15 transactions. We need to find a way to avoid complete scan of all rows (transactions of the accounts). My first assumption was using INDEX (walking DESCENDING order and stop after retrieving 15 entries) to find the latest transactions without reading all of them. In reality, we do not need to process all transactions of the account. How can we rewrite the original SQL to achieve that?

If we look at the data of our tables (above execution plan also tells us about the data) it can be seen that there are limited number of accounts (in our case it is 85), but there are lots of transactions. There is another question. How much data is located in ACCOUNTS table that customers have less than 10 accounts?

SQL> SELECT   ROUND (cnt10 / total * 100, 2) "less than 10 %"
      FROM   (SELECT   SUM (CASE WHEN cnt <= 10 THEN cnt ELSE 0 END) cnt10,
                   SUM (cnt) total
            FROM   (  SELECT                               /*+ parallel(10) */
                            cust_no, COUNT ( * ) cnt
                        FROM   accounts
                       WHERE   a_stat = 'A' AND r_stat = 'O'
                    GROUP BY   cust_no));  

less than 10 %
--------------
         98.94

SQL>

So approximately 99% table data consumed by the customers that have less than 10 number of accounts, it gives me an idea to do next step.
If we talk about joining of two tables here, to get the ORDERED data, we have to join all rows, because there is no guarantee that the join of INDEX RANGE SCAN DESCENDING/TABLE ACCESS BY INDEX ROWID operators will give us ORDERED result and then we can limit it by ROWNUM clause or analytic function. In principle, as we see there are limited number of accounts per customers (mostly ten accounts per customers), but the transactions per account could increase over the time passed like thousands or millions. That why I decided to write a procedure/function to generate simple SQL that provide the result I want, then I can open cursor for that generated SQL and can fetch required data.

DECLARE
    custno IN VARCHAR2(100):='SSSS'; -- this is a given customer number.
    mysql   CLOB; 
    v_cur     SYS_REFCURSOR;
BEGIN
    FOR c
    IN (SELECT t.BR
                    ,t.ACC_NO
                    ,t.C
                FROM Accounts  t
               WHERE t.Cust_No = custno
                 AND t.A_Stat = 'A'
                 AND t.R_Stat = 'O'
                 AND t.Type = 'U')
    LOOP
        IF mysql is null then
        THEN
            mysql :=
                'select * from (SELECT /*+ index_desc ah IX_TR_HISTORY) */ 
                       Ah.TRN_NO
                      ,Ah.TRN_DATE
                      ...
                       FROM  transaction_history AH WHERE Ah.Event <> '
                || CHR (39)|| 'REVL'|| CHR (39)|| ' AND Ah.Ac_No ='
                || CHR (39)|| c.ac_no || CHR (39)|| ' and Ah.BR='
                || CHR (39)|| c.BR|| CHR (39)|| ' and  Ah.C='
                || CHR (39)|| c.c|| CHR (39)
                || ' order by TRN_DT desc)   where rownum<=15';
        ELSE
            mysql :=
                mysql || CHR (10) || ' union all ' || CHR (10)
                || 'select * from (SELECT /*+ index_desc(ah IX_TR_HISTORY) */ 
                       Ah.TRN_NO
                      ,Ah.TRN_DATE
                      ...
                       FROM  transaction_history AH WHERE Ah.Event <> '
                || CHR (39)|| 'REVL'|| CHR (39)|| ' AND Ah.Ac_No ='
                || CHR (39)|| c.ac_no|| CHR (39)|| ' and Ah.BR='
                || CHR (39)|| c.BR|| CHR (39)|| ' and  Ah.C='
                || CHR (39)|| c.c || CHR (39)
                || ' order by TRN_DT desc)   where rownum<=15';
        END IF;
    END LOOP;

    mysql :=
           'select * from (select * from ('
        || mysql
        || ') ORDER BY Trn_Dt DESC) where rownum<=15';
--insert into t values(MYSQL);
 OPEN v_cur FOR  mysql;
.....
close v_cur;
---select * from t;

END;

It just took less than half a second to complete.

What have we done here? We went each account number and produced simple SQLs that support exact INDEX RANGE SCAN DESCENDING operation which helps us to avoid sort operation and scanning unnecessary rows/transactions. And we have combined all SQL s with UNION ALL clause. As a result we have got following SQL statement.

SELECT   *
  FROM   (  SELECT   *
              FROM   (SELECT   *
                        FROM   (  SELECT   /*+ index_desc(ah IX_TR_HISTORY) */  
                                           Ah.TRN_NO
                                          ,Ah.TRN_DATE
                                          ...

                                    FROM   transaction_history  ah
                                   WHERE       ah.event <> 'REVL'
                                           AND ah.ac_no = 'QQQQQQQQQQQQQQQQQQQ' 
                                           AND ah.br = '110'
                                           AND ah.c = 'UUU'
                                ORDER BY   trn_date DESC)
                       WHERE   ROWNUM <= 15
                      UNION ALL
                      SELECT   *
                        FROM   (  SELECT   /*+ index_desc(ah IX_TR_HISTORY) */  
                                           Ah.TRN_NO
                                          ,Ah.TRN_DATE
                                          ...

                                    FROM   transaction_history  ah
                                   WHERE       ah.event <> 'REVL'
                                           AND ah.ac_no = 'AAAAAAAAAAAAAAAAAAA'
                                           AND ah.br = '110'
                                           AND ah.c = 'RRR'
                                ORDER BY   trn_date DESC)
                       WHERE   ROWNUM <= 15
                       UNION ALL
                      ...
                      SELECT   *
                        FROM   (  SELECT   /*+ index_desc(ah IX_TR_HISTORY) */  
                                           Ah.TRN_NO
                                          ,Ah.TRN_DATE
                                          ...

                                    FROM   transaction_history  ah
                                   WHERE       ah.event <> 'REVL'
                                           AND ah.ac_no = 'DDDDDDDDDDDDDDDDDDD' 
                                           AND ah.br = '154'
                                           AND ah.c = 'AAA'
                                ORDER BY   trn_date DESC)
                       WHERE   ROWNUM <= 15)
          ORDER BY   trn_date DESC)
 WHERE   ROWNUM <= 15;

And its execution plan

--------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                     | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                          |      1 |        |     15 |00:00:12.82 |    2355 |   2166 |       |       |          |
|*  1 |  COUNT STOPKEY                      |                          |      1 |        |     15 |00:00:12.82 |    2355 |   2166 |       |       |          |
|   2 |   VIEW                              |                          |      1 |    139 |     15 |00:00:12.82 |    2355 |   2166 |       |       |          |
|*  3 |    SORT ORDER BY STOPKEY            |                          |      1 |    139 |     15 |00:00:12.82 |    2355 |   2166 |  4096 |  4096 | 4096  (0)|
|   4 |     VIEW                            |                          |      1 |    139 |    977 |00:00:12.82 |    2355 |   2166 |       |       |          |
|   5 |      UNION-ALL                      |                          |      1 |        |    977 |00:00:12.82 |    2355 |   2166 |       |       |          |
|*  6 |       COUNT STOPKEY                 |                          |      1 |        |      2 |00:00:01.27 |     168 |    175 |       |       |          |
|   7 |        VIEW                         |                          |      1 |      1 |      2 |00:00:01.27 |     168 |    175 |       |       |          |
|*  8 |         TABLE ACCESS BY INDEX ROWID | TRANSACTION_HISTORY      |      1 |      1 |      2 |00:00:01.27 |     168 |    175 |       |       |          |
|*  9 |          INDEX RANGE SCAN DESCENDING| IX_TR_HISTORY            |      1 |      1 |    163 |00:00:00.08 |       6 |     13 |       |       |          |
|* 10 |       COUNT STOPKEY                 |                          |      1 |        |      1 |00:00:00.01 |       5 |      2 |       |       |          |
|  11 |        VIEW                         |                          |      1 |      1 |      1 |00:00:00.01 |       5 |      2 |       |       |          |
|* 12 |         TABLE ACCESS BY INDEX ROWID | TRANSACTION_HISTORY      |      1 |      1 |      1 |00:00:00.01 |       5 |      2 |       |       |          |
|* 13 |          INDEX RANGE SCAN DESCENDING| IX_TR_HISTORY            |      1 |      1 |      1 |00:00:00.01 |       4 |      2 |       |       |          |
|* 14 |       COUNT STOPKEY                 |                          |      1 |        |     15 |00:00:00.06 |      17 |     13 |       |       |          |
...
|*342 |       COUNT STOPKEY                 |                          |      1 |        |     15 |00:00:00.10 |      19 |     15 |       |       |          |
| 343 |        VIEW                         |                          |      1 |      1 |     15 |00:00:00.10 |      19 |     15 |       |       |          |
|*344 |         TABLE ACCESS BY INDEX ROWID | TRANSACTION_HISTORY      |      1 |      1 |     15 |00:00:00.10 |      19 |     15 |       |       |          |
|*345 |          INDEX RANGE SCAN DESCENDING| IX_TR_HISTORY            |      1 |      1 |     15 |00:00:00.01 |       4 |      0 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------------------------------------------
                                                                                                                                                       
Predicate Information (identified by operation id):                                                                                                    
---------------------------------------------------                                                                                                    
                                                                                                                                                       
   1 - filter(ROWNUM<=15)                                                                                                                              
   3 - filter(ROWNUM<=15)                                                                                                                              
   6 - filter(ROWNUM<=15)                                                                                                                              
   8 - filter(("AH"."C"='UUU' AND "AH"."EVENT"<>'REVL'))                                                                                          
   9 - access("AH"."AC_NO"='QQQQQQQQQQQQQQQQQQQ' AND "AH"."BR"='110')                                                                          
  10 - filter(ROWNUM<=15)                                                                                                                              
  12 - filter(("AH"."C"='RRR' AND "AH"."EVENT"<>'REVL'))                                                                                          
  13 - access("AH"."AC_NO"='AAAAAAAAAAAAAAAAAAA' AND "AH"."BR"='110')                                                                          
  14 - filter(ROWNUM<=15)                                                                                                                              
  ....
 341 - access("AH"."AC_NO"='EEEEEEEEEEEEEEEEEEE' AND "AH"."BR"='167')                                                                          
 342 - filter(ROWNUM<=15)                                                                                                                              
 344 - filter(("AH"."C"='AAA' AND "AH"."EVENT"<>'REVL'))                                                                                          
 345 - access("AH"."AC_NO"='DDDDDDDDDDDDDDDDDDD' AND "AH"."BR"='154')                                                                          

So, in this case we have only scanned 977 rows instead of 121K in order to get last fifteen transactions. Also, the database did 2166 physical request instead of 76419. And the execution time now is 13 second instead of 7.14 minute. As a result we have got 7.14*60/13=33 times faster query. Just remember that in both test cases I have flushed database`s and Linux cache to compare the performance results.

I want to note that the INDEX_DESC hint should be explicitly set in the SQL, because there could be cases the cost of INDEX RANGE scan operation could be equal (or less than) to that of INDEX RANGE SCAN DESCENDING operation and oracle favors to choice the previous ones. And in this case you would see additional SORT ORDER BY STOPKEY optimizer operation in the execution plan.

Summary

In SQL Tuning task, generally we try to find an execution plan/path that gives the database to proceed minimum rows. It is the goal of SQL tuning and query optimizer always try to find such a plan. All optimization techniques (query transformation, join order, access paths) are used by optimizer are designed to find a way to minimize row processing and send the next step of execution plan minimum rows early as much as possible. But sometimes we can approach to the task with quite different ways like rewriting SQL, execute different parts of the given SQL separately and join all small parts together in order to produce the final result with minimum rows processing.

Advertisements

2 Comments »

  1. One concern I have here: If you have any reason at all not to trust the data in the Accounts table, then this is subject to SQL injection attacks. There should be some sort of either validating the data in the table when you use it, or other sanitization (like with DBMS_ASSERT) to escape any bad characters that might be in the data.

    (I realize you meant this as just a demo of your technique, but SQL injection is important enough that I think it’s worth at least mentioning.)

    Comment by Jason B. — May 11, 2018 @ 2:28 pm | Reply

  2. Hi Jason B.

    Thanks for the comment, ofcourse it is just a demo to demonstrate the case and its solution. I was not going to talk about security issues here )).

    Comment by Chinar Aliyev — May 14, 2018 @ 5:29 am | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

Create a free website or blog at WordPress.com.

%d bloggers like this: