Thursday, November 28, 2013

INSERT ALL

During a performance inspection of some E-Business Suite batch programs running on 11.2.0.3, I noticed almost 99% of the consistent reads missing in the tkprof file for a particular insert statement. It was a giant INSERT ALL statement where the accompanying select statement contained several subqueries in a "case when exists (query)" construct. And those subqueries in the select clause was what was missing from the tkprof file. I've seen this behaviour before in version 9 databases in regular SELECT statements. But since version 10 this anomaly was resolved.

This is my reproduction of the scenario on 11.2.0.3: a small table containing 100 rows, a larger table containing 100,000 rows, and two new tables where the rows are copied into.

SQL> create table small_table (id,name)
  2  as
  3   select level
  4        , 'Name ' || to_char(level)
  5     from dual
  6  connect by level <= 100
  7  /
 
Table created.
 
SQL> create table large_table (id,name)
  2  as
  3   select level
  4        , 'Name ' || to_char(level)
  5     from dual
  6  connect by level <= 100000
  7  /
 
Table created.
 
SQL> create table new_table_1
  2  ( id         number
  3  , name       varchar2(11)
  4  , average_id number
  5  )
  6  /
 
Table created.
 
SQL> create table new_table_2
  2  ( id         number
  3  , name       varchar2(11)
  4  , average_id number
  5  )
  6  /
 
Table created.
 
SQL> begin
  2    dbms_stats.gather_table_stats(user,'small_table');
  3    dbms_stats.gather_table_stats(user,'large_table');
  4    dbms_stats.gather_table_stats(user,'new_table_1');
  5    dbms_stats.gather_table_stats(user,'new_table_2');
  6  end;
  7  /
 
PL/SQL procedure successfully completed.

For each of the 100 rows of the small_table, I'm selecting the average ID of 1000 rows in the large_table, as a scalar subquery.
The execution plan shows 100 starts of the scalar subquery on large_table. And we can see that it took 3.24 seconds.

SQL> set serveroutput off
SQL> alter session set statistics_level = all
  2  /
 
Session altered.
 
SQL> select s.id
  2       , s.name
  3       , ( select avg(l.id)
  4             from large_table l
  5            where ceil(l.id / 1000) = s.id
  6         ) avg_id
  7    from small_table s
  8  /
 
        ID NAME                                              AVG_ID
---------- --------------------------------------------- ----------
         1 Name 1                                             500,5
         2 Name 2                                            1500,5
         3 Name 3                                            2500,5
...[95 lines removed]...
        99 Name 99                                          98500,5
       100 Name 100                                         99500,5
 
100 rows selected.
 
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
  2  /
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------
SQL_ID  7167wkgunj23z, child number 0
-------------------------------------
select s.id      , s.name      , ( select avg(l.id)            from
large_table l           where ceil(l.id / 1000) = s.id        ) avg_id
from small_table s
 
Plan hash value: 1885542926
 
--------------------------------------------------------------------------------------------
| Id  | Operation          | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |             |      1 |        |    100 |00:00:00.01 |       9 |
|   1 |  SORT AGGREGATE    |             |    100 |      1 |    100 |00:00:03.28 |   58900 |
|*  2 |   TABLE ACCESS FULL| LARGE_TABLE |    100 |   1000 |    100K|00:00:03.24 |   58900 |
|   3 |  TABLE ACCESS FULL | SMALL_TABLE |      1 |    100 |    100 |00:00:00.01 |       9 |
--------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter(CEIL("L"."ID"/1000)=:B1)
 
 
22 rows selected.

A full table scan against small_table and for each row of small_table (100) a full table scan against large_table. Now, let's see what the plan looks like when the same query is wrapped up in an INSERT SELECT statement.

SQL> insert into new_table_1
  2  select s.id
  3       , s.name
  4       , ( select avg(l.id)
  5             from large_table l
  6            where ceil(l.id / 1000) = s.id
  7         ) avg_id
  8    from small_table s
  9  /
 
100 rows created.
 
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
  2  /
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------
SQL_ID  dfg734y8ym7vk, child number 0
-------------------------------------
insert into new_table_1 select s.id      , s.name      , ( select
avg(l.id)            from large_table l           where ceil(l.id /
1000) = s.id        ) avg_id   from small_table s
 
Plan hash value: 1885542926
 
--------------------------------------------------------------------------------------------------
| Id  | Operation                | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |             |      1 |        |      0 |00:00:03.25 |   58942 |
|   1 |  LOAD TABLE CONVENTIONAL |             |      1 |        |      0 |00:00:03.25 |   58942 |
|   2 |   SORT AGGREGATE         |             |    100 |      1 |    100 |00:00:03.25 |   58900 |
|*  3 |    TABLE ACCESS FULL     | LARGE_TABLE |    100 |   1000 |    100K|00:00:03.21 |   58900 |
|   4 |   TABLE ACCESS FULL      | SMALL_TABLE |      1 |    100 |    100 |00:00:00.01 |       2 |
--------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - filter(CEIL("L"."ID"/1000)=:B1)
 
 
23 rows selected.

An additional LOAD TABLE CONVENTIONAL step, and the selection against large_table still visible. Next, I'm going to use the same query with an INSERT ALL statement.

SQL> rollback
  2  /
 
Rollback complete.
 
SQL> insert all
  2    when mod(id,2) = 0 then into new_table_1
  3    when mod(id,2) = 1 then into new_table_2
  4  select s.id
  5       , s.name
  6       , ( select avg(l.id)
  7             from large_table l
  8            where ceil(l.id / 1000) = s.id
  9         ) avg_id
10    from small_table s
11  /
 
100 rows created.
 
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
  2  /
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------
SQL_ID  2m5kj5cvf1jx1, child number 0
-------------------------------------
insert all   when mod(id,2) = 0 then into new_table_1   when mod(id,2)
= 1 then into new_table_2 select s.id      , s.name      , ( select
avg(l.id)            from large_table l           where ceil(l.id /
1000) = s.id        ) avg_id   from small_table s
 
Plan hash value: 1492625026
 
---------------------------------------------------------------------------------------------
| Id  | Operation           | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT    |             |      1 |        |      0 |00:00:03.27 |   58947 |
|   1 |  MULTI-TABLE INSERT |             |      1 |        |      0 |00:00:03.27 |   58947 |
|   2 |   TABLE ACCESS FULL | SMALL_TABLE |      1 |    100 |    100 |00:00:00.01 |       2 |
|   3 |   INTO              | NEW_TABLE_1 |      0 |        |      0 |00:00:00.01 |       0 |
|   4 |   INTO              | NEW_TABLE_2 |      0 |        |      0 |00:00:00.01 |       0 |
---------------------------------------------------------------------------------------------
 
 
19 rows selected.

And look: the selections against large_table have disappeared. And it looks like the insertion is what takes 3.27 seconds. But the MULTI-TABLE INSERT step is just accumulating the time and buffers from the steps below, visible and invisible.

I repeated this test on 12.1.0.1. The optimizer in this version has improved, as can be seen by this plan for the exact same statement:

--------------------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT      |             |      1 |        |      0 |00:00:00.09 |     304 |       |       |          |
|   1 |  MULTI-TABLE INSERT   |             |      1 |        |      0 |00:00:00.09 |     304 |       |       |          |
|*  2 |   HASH JOIN OUTER     |             |      1 |    100K|    100 |00:00:00.09 |     294 |  1696K|  1696K| 1491K (0)|
|   3 |    TABLE ACCESS FULL  | SMALL_TABLE |      1 |    100 |    100 |00:00:00.01 |       3 |       |       |          |
|   4 |    VIEW               | VW_SSQ_1    |      1 |    100K|    100 |00:00:00.09 |     291 |       |       |          |
|   5 |     SORT GROUP BY     |             |      1 |    100K|    100 |00:00:00.09 |     291 | 73728 | 73728 |          |
|   6 |      TABLE ACCESS FULL| LARGE_TABLE |      1 |    100K|    100K|00:00:00.03 |     291 |       |       |          |
|   7 |   INTO                | NEW_TABLE_1 |      0 |        |      0 |00:00:00.01 |       0 |       |       |          |
|   8 |   INTO                | NEW_TABLE_2 |      0 |        |      0 |00:00:00.01 |       0 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------

A much better plan: no more 100 starts of the query against large_table, but one nice HASH JOIN OUTER. However, the goal here is not to have the most efficient execution plan, but a plan equal to the 11.2.0.3 plan, that's why I added the optimizer hint NO_QUERY_TRANSFORMATION. And this is what the INSERT ALL looks like in 12.1.0.1:

SQL> insert all
  2    when mod(id,2) = 0 then into new_table_1
  3    when mod(id,2) = 1 then into new_table_2
  4  select /*+ no_query_transformation */ s.id
  5       , s.name
  6       , ( select avg(l.id)
  7             from large_table l
  8            where ceil(l.id / 1000) = s.id
  9         ) avg_id
 10    from small_table s
 11  /

100 rows created.

SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
  2  /

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  atb16ym82sygz, child number 0
-------------------------------------
insert all   when mod(id,2) = 0 then into new_table_1   when mod(id,2)
= 1 then into new_table_2 select /*+ no_query_transformation */ s.id
  , s.name      , ( select avg(l.id)            from large_table l
     where ceil(l.id / 1000) = s.id        ) avg_id   from small_table s

Plan hash value: 220198846

---------------------------------------------------------------------------------------------
| Id  | Operation           | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT    |             |      1 |        |      0 |00:00:02.03 |   29113 |
|   1 |  MULTI-TABLE INSERT |             |      1 |        |      0 |00:00:02.03 |   29113 |
|   2 |   SORT AGGREGATE    |             |    100 |      1 |    100 |00:00:02.03 |   29100 |
|*  3 |    TABLE ACCESS FULL| LARGE_TABLE |    100 |   1000 |    100K|00:00:02.00 |   29100 |
|   4 |   VIEW              |             |      1 |    100 |    100 |00:00:02.03 |   29103 |
|   5 |    TABLE ACCESS FULL| SMALL_TABLE |      1 |    100 |    100 |00:00:00.01 |       3 |
|   6 |   INTO              | NEW_TABLE_1 |      0 |        |      0 |00:00:00.01 |       0 |
|   7 |   INTO              | NEW_TABLE_2 |      0 |        |      0 |00:00:00.01 |       0 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter(CEIL("L"."ID"/1000)=:B1)

27 rows selected.

So in 12.1.0.1 the scalar subqueries against large_table are visible again. Problem solved, unless you're not on 12c yet.