Thursday, November 28, 2013

INSERT ALL

During a performance inspection of some E-Business Suite batch programs running on 11.2.0.3, I noticed almost 99% of the consistent reads missing in the tkprof file for a particular insert statement. It was a giant INSERT ALL statement where the accompanying select statement contained several subqueries in a "case when exists (query)" construct. And those subqueries in the select clause was what was missing from the tkprof file. I've seen this behaviour before in version 9 databases in regular SELECT statements. But since version 10 this anomaly was resolved.

This is my reproduction of the scenario on 11.2.0.3: a small table containing 100 rows, a larger table containing 100,000 rows, and two new tables where the rows are copied into.

SQL> create table small_table (id,name)
  2  as
  3   select level
  4        , 'Name ' || to_char(level)
  5     from dual
  6  connect by level <= 100
  7  /
 
Table created.
 
SQL> create table large_table (id,name)
  2  as
  3   select level
  4        , 'Name ' || to_char(level)
  5     from dual
  6  connect by level <= 100000
  7  /
 
Table created.
 
SQL> create table new_table_1
  2  ( id         number
  3  , name       varchar2(11)
  4  , average_id number
  5  )
  6  /
 
Table created.
 
SQL> create table new_table_2
  2  ( id         number
  3  , name       varchar2(11)
  4  , average_id number
  5  )
  6  /
 
Table created.
 
SQL> begin
  2    dbms_stats.gather_table_stats(user,'small_table');
  3    dbms_stats.gather_table_stats(user,'large_table');
  4    dbms_stats.gather_table_stats(user,'new_table_1');
  5    dbms_stats.gather_table_stats(user,'new_table_2');
  6  end;
  7  /
 
PL/SQL procedure successfully completed.

For each of the 100 rows of the small_table, I'm selecting the average ID of 1000 rows in the large_table, as a scalar subquery.
The execution plan shows 100 starts of the scalar subquery on large_table. And we can see that it took 3.24 seconds.

SQL> set serveroutput off
SQL> alter session set statistics_level = all
  2  /
 
Session altered.
 
SQL> select s.id
  2       , s.name
  3       , ( select avg(l.id)
  4             from large_table l
  5            where ceil(l.id / 1000) = s.id
  6         ) avg_id
  7    from small_table s
  8  /
 
        ID NAME                                              AVG_ID
---------- --------------------------------------------- ----------
         1 Name 1                                             500,5
         2 Name 2                                            1500,5
         3 Name 3                                            2500,5
...[95 lines removed]...
        99 Name 99                                          98500,5
       100 Name 100                                         99500,5
 
100 rows selected.
 
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
  2  /
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------
SQL_ID  7167wkgunj23z, child number 0
-------------------------------------
select s.id      , s.name      , ( select avg(l.id)            from
large_table l           where ceil(l.id / 1000) = s.id        ) avg_id
from small_table s
 
Plan hash value: 1885542926
 
--------------------------------------------------------------------------------------------
| Id  | Operation          | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |             |      1 |        |    100 |00:00:00.01 |       9 |
|   1 |  SORT AGGREGATE    |             |    100 |      1 |    100 |00:00:03.28 |   58900 |
|*  2 |   TABLE ACCESS FULL| LARGE_TABLE |    100 |   1000 |    100K|00:00:03.24 |   58900 |
|   3 |  TABLE ACCESS FULL | SMALL_TABLE |      1 |    100 |    100 |00:00:00.01 |       9 |
--------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter(CEIL("L"."ID"/1000)=:B1)
 
 
22 rows selected.

A full table scan against small_table and for each row of small_table (100) a full table scan against large_table. Now, let's see what the plan looks like when the same query is wrapped up in an INSERT SELECT statement.

SQL> insert into new_table_1
  2  select s.id
  3       , s.name
  4       , ( select avg(l.id)
  5             from large_table l
  6            where ceil(l.id / 1000) = s.id
  7         ) avg_id
  8    from small_table s
  9  /
 
100 rows created.
 
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
  2  /
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------
SQL_ID  dfg734y8ym7vk, child number 0
-------------------------------------
insert into new_table_1 select s.id      , s.name      , ( select
avg(l.id)            from large_table l           where ceil(l.id /
1000) = s.id        ) avg_id   from small_table s
 
Plan hash value: 1885542926
 
--------------------------------------------------------------------------------------------------
| Id  | Operation                | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |             |      1 |        |      0 |00:00:03.25 |   58942 |
|   1 |  LOAD TABLE CONVENTIONAL |             |      1 |        |      0 |00:00:03.25 |   58942 |
|   2 |   SORT AGGREGATE         |             |    100 |      1 |    100 |00:00:03.25 |   58900 |
|*  3 |    TABLE ACCESS FULL     | LARGE_TABLE |    100 |   1000 |    100K|00:00:03.21 |   58900 |
|   4 |   TABLE ACCESS FULL      | SMALL_TABLE |      1 |    100 |    100 |00:00:00.01 |       2 |
--------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - filter(CEIL("L"."ID"/1000)=:B1)
 
 
23 rows selected.

An additional LOAD TABLE CONVENTIONAL step, and the selection against large_table still visible. Next, I'm going to use the same query with an INSERT ALL statement.

SQL> rollback
  2  /
 
Rollback complete.
 
SQL> insert all
  2    when mod(id,2) = 0 then into new_table_1
  3    when mod(id,2) = 1 then into new_table_2
  4  select s.id
  5       , s.name
  6       , ( select avg(l.id)
  7             from large_table l
  8            where ceil(l.id / 1000) = s.id
  9         ) avg_id
10    from small_table s
11  /
 
100 rows created.
 
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
  2  /
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------
SQL_ID  2m5kj5cvf1jx1, child number 0
-------------------------------------
insert all   when mod(id,2) = 0 then into new_table_1   when mod(id,2)
= 1 then into new_table_2 select s.id      , s.name      , ( select
avg(l.id)            from large_table l           where ceil(l.id /
1000) = s.id        ) avg_id   from small_table s
 
Plan hash value: 1492625026
 
---------------------------------------------------------------------------------------------
| Id  | Operation           | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT    |             |      1 |        |      0 |00:00:03.27 |   58947 |
|   1 |  MULTI-TABLE INSERT |             |      1 |        |      0 |00:00:03.27 |   58947 |
|   2 |   TABLE ACCESS FULL | SMALL_TABLE |      1 |    100 |    100 |00:00:00.01 |       2 |
|   3 |   INTO              | NEW_TABLE_1 |      0 |        |      0 |00:00:00.01 |       0 |
|   4 |   INTO              | NEW_TABLE_2 |      0 |        |      0 |00:00:00.01 |       0 |
---------------------------------------------------------------------------------------------
 
 
19 rows selected.

And look: the selections against large_table have disappeared. And it looks like the insertion is what takes 3.27 seconds. But the MULTI-TABLE INSERT step is just accumulating the time and buffers from the steps below, visible and invisible.

I repeated this test on 12.1.0.1. The optimizer in this version has improved, as can be seen by this plan for the exact same statement:

--------------------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT      |             |      1 |        |      0 |00:00:00.09 |     304 |       |       |          |
|   1 |  MULTI-TABLE INSERT   |             |      1 |        |      0 |00:00:00.09 |     304 |       |       |          |
|*  2 |   HASH JOIN OUTER     |             |      1 |    100K|    100 |00:00:00.09 |     294 |  1696K|  1696K| 1491K (0)|
|   3 |    TABLE ACCESS FULL  | SMALL_TABLE |      1 |    100 |    100 |00:00:00.01 |       3 |       |       |          |
|   4 |    VIEW               | VW_SSQ_1    |      1 |    100K|    100 |00:00:00.09 |     291 |       |       |          |
|   5 |     SORT GROUP BY     |             |      1 |    100K|    100 |00:00:00.09 |     291 | 73728 | 73728 |          |
|   6 |      TABLE ACCESS FULL| LARGE_TABLE |      1 |    100K|    100K|00:00:00.03 |     291 |       |       |          |
|   7 |   INTO                | NEW_TABLE_1 |      0 |        |      0 |00:00:00.01 |       0 |       |       |          |
|   8 |   INTO                | NEW_TABLE_2 |      0 |        |      0 |00:00:00.01 |       0 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------

A much better plan: no more 100 starts of the query against large_table, but one nice HASH JOIN OUTER. However, the goal here is not to have the most efficient execution plan, but a plan equal to the 11.2.0.3 plan, that's why I added the optimizer hint NO_QUERY_TRANSFORMATION. And this is what the INSERT ALL looks like in 12.1.0.1:

SQL> insert all
  2    when mod(id,2) = 0 then into new_table_1
  3    when mod(id,2) = 1 then into new_table_2
  4  select /*+ no_query_transformation */ s.id
  5       , s.name
  6       , ( select avg(l.id)
  7             from large_table l
  8            where ceil(l.id / 1000) = s.id
  9         ) avg_id
 10    from small_table s
 11  /

100 rows created.

SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
  2  /

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  atb16ym82sygz, child number 0
-------------------------------------
insert all   when mod(id,2) = 0 then into new_table_1   when mod(id,2)
= 1 then into new_table_2 select /*+ no_query_transformation */ s.id
  , s.name      , ( select avg(l.id)            from large_table l
     where ceil(l.id / 1000) = s.id        ) avg_id   from small_table s

Plan hash value: 220198846

---------------------------------------------------------------------------------------------
| Id  | Operation           | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT    |             |      1 |        |      0 |00:00:02.03 |   29113 |
|   1 |  MULTI-TABLE INSERT |             |      1 |        |      0 |00:00:02.03 |   29113 |
|   2 |   SORT AGGREGATE    |             |    100 |      1 |    100 |00:00:02.03 |   29100 |
|*  3 |    TABLE ACCESS FULL| LARGE_TABLE |    100 |   1000 |    100K|00:00:02.00 |   29100 |
|   4 |   VIEW              |             |      1 |    100 |    100 |00:00:02.03 |   29103 |
|   5 |    TABLE ACCESS FULL| SMALL_TABLE |      1 |    100 |    100 |00:00:00.01 |       3 |
|   6 |   INTO              | NEW_TABLE_1 |      0 |        |      0 |00:00:00.01 |       0 |
|   7 |   INTO              | NEW_TABLE_2 |      0 |        |      0 |00:00:00.01 |       0 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter(CEIL("L"."ID"/1000)=:B1)

27 rows selected.

So in 12.1.0.1 the scalar subqueries against large_table are visible again. Problem solved, unless you're not on 12c yet.

2 comments:

  1. Rob,

    Always a fan of your posts. Thanks for sharing that.

    ReplyDelete
  2. Really something Grate in this article Thanks for sharing this. We are providing ORACLE courses training online. After reading this slightly am changed my way of introduction about my training to people. And also refer my website for ORACLE Training and solutions of ORACLE applications. Please Visit Us @ ORACLE training courses online

    ReplyDelete