This is my reproduction of the scenario on 11.2.0.3: a small table containing 100 rows, a larger table containing 100,000 rows, and two new tables where the rows are copied into.
SQL> create table small_table (id,name) 2 as 3 select level 4 , 'Name ' || to_char(level) 5 from dual 6 connect by level <= 100 7 / Table created. SQL> create table large_table (id,name) 2 as 3 select level 4 , 'Name ' || to_char(level) 5 from dual 6 connect by level <= 100000 7 / Table created. SQL> create table new_table_1 2 ( id number 3 , name varchar2(11) 4 , average_id number 5 ) 6 / Table created. SQL> create table new_table_2 2 ( id number 3 , name varchar2(11) 4 , average_id number 5 ) 6 / Table created. SQL> begin 2 dbms_stats.gather_table_stats(user,'small_table'); 3 dbms_stats.gather_table_stats(user,'large_table'); 4 dbms_stats.gather_table_stats(user,'new_table_1'); 5 dbms_stats.gather_table_stats(user,'new_table_2'); 6 end; 7 / PL/SQL procedure successfully completed.
For each of the 100 rows of the small_table, I'm selecting the average ID of 1000 rows in the large_table, as a scalar subquery.
The execution plan shows 100 starts of the scalar subquery on large_table. And we can see that it took 3.24 seconds.
SQL> set serveroutput off SQL> alter session set statistics_level = all 2 / Session altered. SQL> select s.id 2 , s.name 3 , ( select avg(l.id) 4 from large_table l 5 where ceil(l.id / 1000) = s.id 6 ) avg_id 7 from small_table s 8 / ID NAME AVG_ID ---------- --------------------------------------------- ---------- 1 Name 1 500,5 2 Name 2 1500,5 3 Name 3 2500,5 ...[95 lines removed]... 99 Name 99 98500,5 100 Name 100 99500,5 100 rows selected. SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last')) 2 / PLAN_TABLE_OUTPUT -------------------------------------------------------------------------------------------------------------------------------- SQL_ID 7167wkgunj23z, child number 0 ------------------------------------- select s.id , s.name , ( select avg(l.id) from large_table l where ceil(l.id / 1000) = s.id ) avg_id from small_table s Plan hash value: 1885542926 -------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | -------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 100 |00:00:00.01 | 9 | | 1 | SORT AGGREGATE | | 100 | 1 | 100 |00:00:03.28 | 58900 | |* 2 | TABLE ACCESS FULL| LARGE_TABLE | 100 | 1000 | 100K|00:00:03.24 | 58900 | | 3 | TABLE ACCESS FULL | SMALL_TABLE | 1 | 100 | 100 |00:00:00.01 | 9 | -------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter(CEIL("L"."ID"/1000)=:B1) 22 rows selected.
A full table scan against small_table and for each row of small_table (100) a full table scan against large_table. Now, let's see what the plan looks like when the same query is wrapped up in an INSERT SELECT statement.
SQL> insert into new_table_1 2 select s.id 3 , s.name 4 , ( select avg(l.id) 5 from large_table l 6 where ceil(l.id / 1000) = s.id 7 ) avg_id 8 from small_table s 9 / 100 rows created. SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last')) 2 / PLAN_TABLE_OUTPUT -------------------------------------------------------------------------------------------------------------------------------- SQL_ID dfg734y8ym7vk, child number 0 ------------------------------------- insert into new_table_1 select s.id , s.name , ( select avg(l.id) from large_table l where ceil(l.id / 1000) = s.id ) avg_id from small_table s Plan hash value: 1885542926 -------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | -------------------------------------------------------------------------------------------------- | 0 | INSERT STATEMENT | | 1 | | 0 |00:00:03.25 | 58942 | | 1 | LOAD TABLE CONVENTIONAL | | 1 | | 0 |00:00:03.25 | 58942 | | 2 | SORT AGGREGATE | | 100 | 1 | 100 |00:00:03.25 | 58900 | |* 3 | TABLE ACCESS FULL | LARGE_TABLE | 100 | 1000 | 100K|00:00:03.21 | 58900 | | 4 | TABLE ACCESS FULL | SMALL_TABLE | 1 | 100 | 100 |00:00:00.01 | 2 | -------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter(CEIL("L"."ID"/1000)=:B1) 23 rows selected.
An additional LOAD TABLE CONVENTIONAL step, and the selection against large_table still visible. Next, I'm going to use the same query with an INSERT ALL statement.
SQL> rollback 2 / Rollback complete. SQL> insert all 2 when mod(id,2) = 0 then into new_table_1 3 when mod(id,2) = 1 then into new_table_2 4 select s.id 5 , s.name 6 , ( select avg(l.id) 7 from large_table l 8 where ceil(l.id / 1000) = s.id 9 ) avg_id 10 from small_table s 11 / 100 rows created. SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last')) 2 / PLAN_TABLE_OUTPUT -------------------------------------------------------------------------------------------------------------------------------- SQL_ID 2m5kj5cvf1jx1, child number 0 ------------------------------------- insert all when mod(id,2) = 0 then into new_table_1 when mod(id,2) = 1 then into new_table_2 select s.id , s.name , ( select avg(l.id) from large_table l where ceil(l.id / 1000) = s.id ) avg_id from small_table s Plan hash value: 1492625026 --------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | --------------------------------------------------------------------------------------------- | 0 | INSERT STATEMENT | | 1 | | 0 |00:00:03.27 | 58947 | | 1 | MULTI-TABLE INSERT | | 1 | | 0 |00:00:03.27 | 58947 | | 2 | TABLE ACCESS FULL | SMALL_TABLE | 1 | 100 | 100 |00:00:00.01 | 2 | | 3 | INTO | NEW_TABLE_1 | 0 | | 0 |00:00:00.01 | 0 | | 4 | INTO | NEW_TABLE_2 | 0 | | 0 |00:00:00.01 | 0 | --------------------------------------------------------------------------------------------- 19 rows selected.
And look: the selections against large_table have disappeared. And it looks like the insertion is what takes 3.27 seconds. But the MULTI-TABLE INSERT step is just accumulating the time and buffers from the steps below, visible and invisible.
I repeated this test on 12.1.0.1. The optimizer in this version has improved, as can be seen by this plan for the exact same statement:
-------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem | -------------------------------------------------------------------------------------------------------------------------- | 0 | INSERT STATEMENT | | 1 | | 0 |00:00:00.09 | 304 | | | | | 1 | MULTI-TABLE INSERT | | 1 | | 0 |00:00:00.09 | 304 | | | | |* 2 | HASH JOIN OUTER | | 1 | 100K| 100 |00:00:00.09 | 294 | 1696K| 1696K| 1491K (0)| | 3 | TABLE ACCESS FULL | SMALL_TABLE | 1 | 100 | 100 |00:00:00.01 | 3 | | | | | 4 | VIEW | VW_SSQ_1 | 1 | 100K| 100 |00:00:00.09 | 291 | | | | | 5 | SORT GROUP BY | | 1 | 100K| 100 |00:00:00.09 | 291 | 73728 | 73728 | | | 6 | TABLE ACCESS FULL| LARGE_TABLE | 1 | 100K| 100K|00:00:00.03 | 291 | | | | | 7 | INTO | NEW_TABLE_1 | 0 | | 0 |00:00:00.01 | 0 | | | | | 8 | INTO | NEW_TABLE_2 | 0 | | 0 |00:00:00.01 | 0 | | | | --------------------------------------------------------------------------------------------------------------------------
A much better plan: no more 100 starts of the query against large_table, but one nice HASH JOIN OUTER. However, the goal here is not to have the most efficient execution plan, but a plan equal to the 11.2.0.3 plan, that's why I added the optimizer hint NO_QUERY_TRANSFORMATION. And this is what the INSERT ALL looks like in 12.1.0.1:
SQL> insert all 2 when mod(id,2) = 0 then into new_table_1 3 when mod(id,2) = 1 then into new_table_2 4 select /*+ no_query_transformation */ s.id 5 , s.name 6 , ( select avg(l.id) 7 from large_table l 8 where ceil(l.id / 1000) = s.id 9 ) avg_id 10 from small_table s 11 / 100 rows created. SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last')) 2 / PLAN_TABLE_OUTPUT --------------------------------------------------------------------------------------------------------------------------------------- SQL_ID atb16ym82sygz, child number 0 ------------------------------------- insert all when mod(id,2) = 0 then into new_table_1 when mod(id,2) = 1 then into new_table_2 select /*+ no_query_transformation */ s.id , s.name , ( select avg(l.id) from large_table l where ceil(l.id / 1000) = s.id ) avg_id from small_table s Plan hash value: 220198846 --------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | --------------------------------------------------------------------------------------------- | 0 | INSERT STATEMENT | | 1 | | 0 |00:00:02.03 | 29113 | | 1 | MULTI-TABLE INSERT | | 1 | | 0 |00:00:02.03 | 29113 | | 2 | SORT AGGREGATE | | 100 | 1 | 100 |00:00:02.03 | 29100 | |* 3 | TABLE ACCESS FULL| LARGE_TABLE | 100 | 1000 | 100K|00:00:02.00 | 29100 | | 4 | VIEW | | 1 | 100 | 100 |00:00:02.03 | 29103 | | 5 | TABLE ACCESS FULL| SMALL_TABLE | 1 | 100 | 100 |00:00:00.01 | 3 | | 6 | INTO | NEW_TABLE_1 | 0 | | 0 |00:00:00.01 | 0 | | 7 | INTO | NEW_TABLE_2 | 0 | | 0 |00:00:00.01 | 0 | --------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter(CEIL("L"."ID"/1000)=:B1) 27 rows selected.
So in 12.1.0.1 the scalar subqueries against large_table are visible again. Problem solved, unless you're not on 12c yet.
Rob,
ReplyDeleteAlways a fan of your posts. Thanks for sharing that.