At CIBER we are very proud to announce that Cary Millsap will give his one-day seminar Mastering Oracle Trace Data in the Netherlands. The event will take place at the Carlton President Hotel in Utrecht on Wednesday, May 23. You can register and read more about this event here.
The seminar is aimed at DBAs, database application developers, data warehouse specialists and anyone who cares about the speed of a database application. I'm sure that, among many other things, you'll leave the seminar with a very clear mindset about performance. If you are in doubt whether you should come, please read one of his many excellent papers, Thinking Clearly About Performance.
It's an opportunity you don't want to miss. Hope to see you there soon!
Wednesday, March 28, 2012
Mastering Oracle Trace Data with Cary Millsap
Posted by Rob van Wijk at 10:42 PM (2 comments)
Tuesday, March 27, 2012
Third OGh APEX Day
Yesterday was the third annual APEX day, organized by the Dutch Oracle user group OGh. It's the biggest APEX-only event in the world, I've been told, with approximately 280 attendees. Learco Brizzi, Marti Koppelmans and I were very proud to again have a great lineup of presenters and presentations.
The day started with a keynote by Patrick Wolf, showing and telling about a lot of new 4.2 features. Of course, he could not be sure that every single feature will eventually make it into the 4.2 release. The APEX team has focused on mobile development. I was most impressed by the demo showing how relatively easy it will be to develop a great looking iPhone application: write a query, select a list type, and voilà. And I think this is why the Oracle community likes APEX so much: it makes developing applications easy. It reminded me of this presentation by Kathy Sierra: it makes you feel awesome.
Next up were the three parallel tracks. I heard quite a lot of people saying they were having a hard time making choices between great presentations, and I was no exception to that rule.
I saw Roel Hartman's "5 Cool Things you can do with HTML 5". A great presentation, especially the first real slide :-); I loved the websockets demo with Beaconpush, and I learned a lot.
The next presentation was John Scott's "APEX 4 - Error Handling enhancements". Having seen quite a number of John's presentations in the past, I know he always takes a subject to the next level, with last year's translation plugin as a highlight. This time he showed how easy it is to automatically log any errors in JIRA, so users don't have to report errors anymore: you already know about them. Unfortunately, his last demo didn't work. He wanted to show how to automatically add a screenshot from the user's browser to the JIRA ticket.
The fourth presentation was Sergei Martens' "Building desktop-like web applications with Ext JS & Apex". He was very enthusiastic about APEX applications using Ext JS, but he didn't quite succeed in explaining to me which aspects he liked so much. I agree some of the built-in themes are quite boring, but I don't think the video showed a "sexy" theme. To me, a great design is a simple design, but that's a matter of personal preference. Next, he had severe trouble with a very slow Windows (a pleonasm?) laptop. This must be a presenter's nightmare, especially since he seemed well prepared, but it was out of his control. He handled the situation remarkably well, though.
The last presentation of the day was Margreet den Hartigh's and Alex Nuijten's "From Zero to APEX": a customer story about a Uniface application that was rebuilt using APEX, by a team of Uniface developers who not only had to learn APEX, but also PL/SQL, HTML, CSS, Subversion, Javascript and a lot more. Almost all Dutch municipalities use this application. It looks like they could gain a lot more by consolidating the databases of all municipalities into a single hosted database, with some partitioning, VPD and the resource manager. The cost of ownership would decrease dramatically that way. Free tip from me :-)
A nice dinner with several visitors and presenters ended a great APEX day.
Posted by Rob van Wijk at 9:38 PM (0 comments)
Sunday, March 25, 2012
Connect By Filtering
A hierarchical query is typically executed using a plan that starts with the operation CONNECT BY WITH FILTERING, which has two child operations. The first child operation implements the START WITH clause, and the second child operation contains a step called CONNECT BY PUMP, implementing the recursive part of the query. Here is an example of such a plan, using the well-known hierarchical query on table EMP:
SQL> select lpad(' ', 2 * level - 2, ' ') || ename as ename
2 , level
3 , job
4 , deptno
5 from emp
6 connect by mgr = prior empno
7 start with mgr is null
8 /
ENAME LEVEL JOB DEPTNO
-------------------- ---------- --------------------------- ----------
KING 1 PRESIDENT 10
JONES 2 MANAGER 20
SCOTT 3 ANALYST 20
ADAMS 4 CLERK 20
FORD 3 ANALYST 20
SMITH 4 CLERK 20
BLAKE 2 MANAGER 30
ALLEN 3 SALESMAN 30
WARD 3 SALESMAN 30
MARTIN 3 SALESMAN 30
TURNER 3 SALESMAN 30
JAMES 3 CLERK 30
CLARK 2 MANAGER 10
MILLER 3 CLERK 10
14 rows selected.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
SQL_ID d2c7xqxbr112u, child number 0
-------------------------------------
select lpad(' ', 2 * level - 2, ' ') || ename as ename , level , job , deptno from emp connect by
mgr = prior empno start with mgr is null
Plan hash value: 1869448388
--------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------
|* 1 | CONNECT BY WITH FILTERING| | 1 | | 14 |00:00:00.02 | 15 | 6 | 2048 | 2048 | 2048 (0)|
|* 2 | TABLE ACCESS FULL | EMP | 1 | 1 | 1 |00:00:00.01 | 3 | 6 | | | |
|* 3 | HASH JOIN | | 4 | | 13 |00:00:00.01 | 12 | 0 | 1452K| 1452K| 853K (0)|
| 4 | CONNECT BY PUMP | | 4 | | 14 |00:00:00.01 | 0 | 0 | | | |
| 5 | TABLE ACCESS FULL | EMP | 4 | 2 | 56 |00:00:00.01 | 12 | 0 | | | |
--------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("MGR"=PRIOR NULL)
2 - filter("MGR" IS NULL)
3 - access("MGR"=PRIOR NULL)
24 rows selected.
You can see a great and more detailed explanation of connect by with filtering here on Christian Antognini's blog.
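Conceptually, this plan does the same work you could spell out yourself with the recursive subquery factoring clause: the START WITH child is the anchor member, and the join with CONNECT BY PUMP is the recursive member. A hedged sketch of that equivalent (11.2 syntax; the output order may differ unless you add a SEARCH DEPTH FIRST clause):

```sql
with hierarchy (ename, empno, job, deptno, lvl) as
( -- anchor member: corresponds to the START WITH child operation
  select ename, empno, job, deptno, 1
  from   emp
  where  mgr is null
  union all
  -- recursive member: corresponds to the join with CONNECT BY PUMP
  select e.ename, e.empno, e.job, e.deptno, h.lvl + 1
  from   hierarchy h
  join   emp e
  on     e.mgr = h.empno
)
select lpad(' ', 2 * lvl - 2, ' ') || ename as ename
,      lvl
,      job
,      deptno
from   hierarchy
/
```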
When I was researching the new recursive subquery factoring clause one and a half years ago, comparing a standard hierarchical query on EMP using recursive subquery factoring with one using the good old connect by, I stumbled upon a new optimizer algorithm for implementing recursive queries:
SQL> select lpad(' ', 2 * level - 2, ' ') || ename as ename
2 , level
3 , job
4 , deptno
5 from emp
6 connect by mgr = prior empno
7 start with mgr is null
8 /
ENAME LEVEL JOB DEPTNO
-------------------- ---------- --------- ----------
KING 1 PRESIDENT 10
JONES 2 MANAGER 20
SCOTT 3 ANALYST 20
ADAMS 4 CLERK 20
FORD 3 ANALYST 20
SMITH 4 CLERK 20
BLAKE 2 MANAGER 30
ALLEN 3 SALESMAN 30
WARD 3 SALESMAN 30
MARTIN 3 SALESMAN 30
TURNER 3 SALESMAN 30
JAMES 3 CLERK 30
CLARK 2 MANAGER 10
MILLER 3 CLERK 10
14 rows selected.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
SQL_ID d2c7xqxbr112u, child number 0
-------------------------------------
select lpad(' ', 2 * level - 2, ' ') || ename as ename , level
, job , deptno from emp connect by mgr = prior empno
start with mgr is null
Plan hash value: 763482334
-------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
-------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 14 |00:00:00.02 | 6 | 6 |
|* 1 | CONNECT BY NO FILTERING WITH START-WITH| | 1 | | 14 |00:00:00.02 | 6 | 6 |
| 2 | TABLE ACCESS FULL | EMP | 1 | 14 | 14 |00:00:00.02 | 6 | 6 |
-------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("MGR"=PRIOR NULL)
filter("MGR" IS NULL)
22 rows selected.
You might wonder what I did to make two exactly identical queries use different execution plans, but I'll address that later. First, I'd like to show that there are two optimizer hints available with which you can control which algorithm the optimizer uses:
SQL> select *
2 from v$sql_hint
3 where name like '%CONNECT_BY_FILTERING%'
4 /
NAME SQL_FEATURE CLASS
----------------------- ------------ -----------------------
INVERSE TARGET_LEVEL PROPERTY VERSION VERSION_OUTLINE
----------------------- ------------ ---------- ---------- ---------------
CONNECT_BY_FILTERING QKSFM_ALL CONNECT_BY_FILTERING
NO_CONNECT_BY_FILTERING 2 16 10.2.0.2 10.2.0.2
NO_CONNECT_BY_FILTERING QKSFM_ALL CONNECT_BY_FILTERING
CONNECT_BY_FILTERING 2 16 10.2.0.2 10.2.0.2
2 rows selected.
And this surprised me. As the version column suggests, the no_connect_by_filtering hint and the accompanying new algorithm were already introduced in version 10.2.0.2! I checked with my old 10.2.0.4 database, and the algorithm is indeed present and usable there:
SQL> select version
2 from v$instance
3 /
VERSION
---------------------------------------------------
10.2.0.4.0
1 row selected.
SQL> select /*+ no_connect_by_filtering gather_plan_statistics */
2 lpad(' ', 2 * level - 2, ' ') || ename as ename
3 , level
4 , job
5 , deptno
6 from emp
7 connect by mgr = prior empno
8 start with mgr is null
9 /
ENAME LEVEL JOB DEPTNO
-------------------- ---------- --------------------------- ----------
KING 1 PRESIDENT 10
JONES 2 MANAGER 20
SCOTT 3 ANALYST 20
ADAMS 4 CLERK 20
FORD 3 ANALYST 20
SMITH 4 CLERK 20
BLAKE 2 MANAGER 30
ALLEN 3 SALESMAN 30
WARD 3 SALESMAN 30
MARTIN 3 SALESMAN 30
TURNER 3 SALESMAN 30
JAMES 3 CLERK 30
CLARK 2 MANAGER 10
MILLER 3 CLERK 10
14 rows selected.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 39kr5s8dxz7j0, child number 0
-------------------------------------
select /*+ no_connect_by_filtering gather_plan_statistics */ lpad(' ', 2 * level - 2, '
') || ename as ename , level , job , deptno from emp connect by mgr = prior
empno start with mgr is null
Plan hash value: 763482334
----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------------
|* 1 | CONNECT BY NO FILTERING WITH START-WITH| | 1 | | 14 |00:00:00.01 | 3 |
| 2 | TABLE ACCESS FULL | EMP | 1 | 14 | 14 |00:00:00.01 | 3 |
----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("MGR"=PRIOR NULL)
filter("MGR" IS NULL)
21 rows selected.
But you need the no_connect_by_filtering hint in version 10.2.0.4 for this query. If you do not provide the hint, this is the result:
SQL> select /*+ gather_plan_statistics */
2 lpad(' ', 2 * level - 2, ' ') || ename as ename
3 , level
4 , job
5 , deptno
6 from emp
7 connect by mgr = prior empno
8 start with mgr is null
9 /
ENAME LEVEL JOB DEPTNO
-------------------- ---------- --------------------------- ----------
KING 1 PRESIDENT 10
JONES 2 MANAGER 20
SCOTT 3 ANALYST 20
ADAMS 4 CLERK 20
FORD 3 ANALYST 20
SMITH 4 CLERK 20
BLAKE 2 MANAGER 30
ALLEN 3 SALESMAN 30
WARD 3 SALESMAN 30
MARTIN 3 SALESMAN 30
TURNER 3 SALESMAN 30
JAMES 3 CLERK 30
CLARK 2 MANAGER 10
MILLER 3 CLERK 10
14 rows selected.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 6zhtnf720u0bm, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ lpad(' ', 2 * level - 2, ' ') || ename as ename , level
, job , deptno from emp connect by mgr = prior empno start with mgr is null
Plan hash value: 1869448388
-----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------
|* 1 | CONNECT BY WITH FILTERING| | 1 | | 14 |00:00:00.01 | 15 | 2048 | 2048 | 2048 (0)|
|* 2 | TABLE ACCESS FULL | EMP | 1 | 1 | 1 |00:00:00.01 | 3 | | | |
|* 3 | HASH JOIN | | 4 | | 13 |00:00:00.01 | 12 | 1452K| 1452K| 843K (0)|
| 4 | CONNECT BY PUMP | | 4 | | 14 |00:00:00.01 | 0 | | | |
| 5 | TABLE ACCESS FULL | EMP | 4 | 2 | 56 |00:00:00.01 | 12 | | | |
-----------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("MGR"=PRIOR NULL)
2 - filter("MGR" IS NULL)
3 - access("MGR"=PRIOR NULL)
24 rows selected.
That explains why I didn't see CONNECT BY NO FILTERING WITH START-WITH earlier. It seems Oracle adjusted the cost calculation of connect by queries somewhere between 10.2.0.4 and 11.2.0.2. Just look at the cost in both execution plans on 10.2.0.4, using a regular explain plan statement and "select * from table(dbms_xplan.display)":
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 50 | 3 (0)| 00:00:01 |
|* 1 | CONNECT BY WITH FILTERING| | | | | |
|* 2 | TABLE ACCESS FULL | EMP | 1 | 29 | 3 (0)| 00:00:01 |
|* 3 | HASH JOIN | | | | | |
| 4 | CONNECT BY PUMP | | | | | |
| 5 | TABLE ACCESS FULL | EMP | 2 | 50 | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 350 | 3 (0)| 00:00:01 |
|* 1 | CONNECT BY NO FILTERING WITH START-WITH| | | | | |
| 2 | TABLE ACCESS FULL | EMP | 14 | 350 | 3 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
The cost of 3 is due to the full table scan of EMP, and no additional cost is added for the hierarchical query.
These are the plans from 11.2.0.2:
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 156 | 15 (20)| 00:00:01 |
|* 1 | CONNECT BY WITH FILTERING| | | | | |
|* 2 | TABLE ACCESS FULL | EMP | 1 | 25 | 4 (0)| 00:00:01 |
|* 3 | HASH JOIN | | 2 | 76 | 9 (12)| 00:00:01 |
| 4 | CONNECT BY PUMP | | | | | |
|* 5 | TABLE ACCESS FULL | EMP | 13 | 325 | 4 (0)| 00:00:01 |
----------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 728 | 5 (20)| 00:00:01 |
|* 1 | CONNECT BY NO FILTERING WITH START-WITH| | | | | |
| 2 | TABLE ACCESS FULL | EMP | 14 | 350 | 4 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
The numbers from 11.2.0.2 show more sophistication than just the cost of the table scan. The optimizer can't know how many levels deep the data goes, but version 10.2.0.4 apparently assumed one level and left the total cost at 3. I'm curious in which version between 10.2.0.4 and 11.2.0.2 this cost calculation changed. If anyone reading this has a version in between and would like to check, please let me know in the comments. My guess would be that 11.2.0.1 introduced the cost change.
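Checking on your own version takes just a couple of statements (a sketch, assuming the standard SCOTT.EMP table is available):

```sql
explain plan for
select ename
,      level
from   emp
connect by mgr = prior empno
start with mgr is null
/
select * from table(dbms_xplan.display)
/
```

If the plan shows CONNECT BY NO FILTERING WITH START-WITH with a cost above that of the plain full table scan, your version already contains the adjusted cost calculation.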
What does CONNECT BY NO FILTERING WITH START-WITH do?
Let's explore this using the following table:
SQL> create table t (id, parent_id, value, indicator)
2 as
3 select level - 1
4 , case level when 1 then null else trunc((level-1)/10) end
5 , round(dbms_random.value * 1000)
6 , case mod(level,10) when 4 then 'N' else 'Y' end
7 from dual
8 connect by level <= 100000
9 /
Table created.
SQL> alter table t
2 add constraint cbt_pk
3 primary key (id)
4 /
Table altered.
SQL> create index i1 on t (parent_id,indicator)
2 /
Index created.
SQL> exec dbms_stats.gather_table_stats(user,'t',cascade=>true)
The data is tree shaped: the root node has nine child nodes, and every other parent node has exactly ten. One tenth of the rows, those with an id ending in the digit 3, have their indicator column set to 'N'. This query makes it clearer what the data looks like:
SQL> select *
2 from t
3 where id < 24 or id > 99997
4 order by id
5 /
ID PARENT_ID VALUE I
---------- ---------- ---------- -
0 656 Y
1 0 289 Y
2 0 365 Y
3 0 644 N
4 0 364 Y
5 0 841 Y
6 0 275 Y
7 0 529 Y
8 0 500 Y
9 0 422 Y
10 1 598 Y
11 1 104 Y
12 1 467 Y
13 1 296 N
14 1 105 Y
15 1 220 Y
16 1 692 Y
17 1 793 Y
18 1 29 Y
19 1 304 Y
20 2 467 Y
21 2 716 Y
22 2 837 Y
23 2 432 N
99998 9999 609 Y
99999 9999 24 Y
26 rows selected.
When I hear the word "filter", I almost immediately associate it with a WHERE clause. But a WHERE clause in a connect by query is not what is meant by connect by filtering. The documentation states:
Oracle processes hierarchical queries as follows:
1. A join, if present, is evaluated first, whether the join is specified in the FROM clause or with WHERE clause predicates.
2. The CONNECT BY condition is evaluated.
3. Any remaining WHERE clause predicates are evaluated.
So a WHERE clause predicate is evaluated AFTER the connect by has done its job. You can see that happening here:
SQL> explain plan
2 for
3 select id
4 , parent_id
5 , sys_connect_by_path(id,'->') scbp
6 from t
7 where indicator = 'N'
8 connect by parent_id = prior id
9 start with parent_id is null
10 /
Explained.
SQL> select * from table(dbms_xplan.display)
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2502271019
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 11 | 319 | 164 (3)| 00:00:02 |
|* 1 | FILTER | | | | | |
|* 2 | CONNECT BY WITH FILTERING | | | | | |
|* 3 | TABLE ACCESS FULL | T | 1 | 11 | 80 (2)| 00:00:01 |
| 4 | NESTED LOOPS | | 10 | 240 | 82 (2)| 00:00:01 |
| 5 | CONNECT BY PUMP | | | | | |
| 6 | TABLE ACCESS BY INDEX ROWID| T | 10 | 110 | 2 (0)| 00:00:01 |
|* 7 | INDEX RANGE SCAN | I1 | 10 | | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("INDICATOR"='N')
2 - access("PARENT_ID"=PRIOR "ID")
3 - filter("PARENT_ID" IS NULL)
7 - access("PARENT_ID"="connect$_by$_pump$_002"."prior id ")
22 rows selected.
The "indicator = 'N'" predicate is at step 1, which is executed after the CONNECT BY WITH FILTERING at step 2. Note that although this query is executed in 11.2.0.2, the optimizer has chosen the old CONNECT BY WITH FILTERING.
Connect by filtering means filtering inside the CONNECT BY clause itself. Note that this gives a semantically different query: a predicate inside the CONNECT BY prunes whole subtrees during the recursion, while a WHERE clause predicate only removes rows after the hierarchy has been built. Here is an example using the predicate "indicator = 'N'" inside the CONNECT BY clause:
SQL> select id
2 , parent_id
3 , sys_connect_by_path(id,'->') scbp
4 from t
5 connect by parent_id = prior id
6 and indicator = 'N'
7 start with parent_id is null
8 /
ID PARENT_ID SCBP
---------- ---------- --------------------------------------------------
0 ->0
3 0 ->0->3
33 3 ->0->3->33
333 33 ->0->3->33->333
3333 333 ->0->3->33->333->3333
33333 3333 ->0->3->33->333->3333->33333
6 rows selected.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
SQL_ID dzkjzrrzgnvd5, child number 0
-------------------------------------
select id , parent_id , sys_connect_by_path(id,'->') scbp
from t connect by parent_id = prior id and indicator = 'N'
start with parent_id is null
Plan hash value: 3164577763
---------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 6 |00:00:00.01 | 294 | | | |
|* 1 | CONNECT BY WITH FILTERING | | 1 | | 6 |00:00:00.01 | 294 | 2048 | 2048 | 2048 (0)|
|* 2 | TABLE ACCESS FULL | T | 1 | 1 | 1 |00:00:00.01 | 277 | | | |
| 3 | NESTED LOOPS | | 6 | 5 | 5 |00:00:00.01 | 17 | | | |
| 4 | CONNECT BY PUMP | | 6 | | 6 |00:00:00.01 | 0 | | | |
| 5 | TABLE ACCESS BY INDEX ROWID| T | 6 | 5 | 5 |00:00:00.01 | 17 | | | |
|* 6 | INDEX RANGE SCAN | I1 | 6 | 5 | 5 |00:00:00.01 | 12 | | | |
---------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("PARENT_ID"=PRIOR NULL)
2 - filter("PARENT_ID" IS NULL)
6 - access("PARENT_ID"="connect$_by$_pump$_002"."prior id " AND "INDICATOR"='N')
27 rows selected.
In the A-Rows column, you can see that the connect by filtering was effective here: only the necessary rows were read. And this is the key difference between the two connect by algorithms: with CONNECT BY WITH FILTERING, you can filter within each recursion step, whereas CONNECT BY NO FILTERING WITH START-WITH has to read everything, perform the recursion in memory, and then return the result. With this example, the latter is much less efficient:
SQL> select /*+ no_connect_by_filtering */ id
2 , parent_id
3 , sys_connect_by_path(id,'->') scbp
4 from t
5 connect by parent_id = prior id
6 and indicator = 'N'
7 start with parent_id is null
8 /
ID PARENT_ID SCBP
---------- ---------- --------------------------------------------------
0 ->0
3 0 ->0->3
33 3 ->0->3->33
333 33 ->0->3->33->333
3333 333 ->0->3->33->333->3333
33333 3333 ->0->3->33->333->3333->33333
6 rows selected.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 3fcr31tp83by9, child number 0
-------------------------------------
select /*+ no_connect_by_filtering */ id , parent_id ,
sys_connect_by_path(id,'->') scbp from t connect by parent_id =
prior id and indicator = 'N' start with parent_id is null
Plan hash value: 2303479083
----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 6 |00:00:00.14 | 277 |
|* 1 | CONNECT BY NO FILTERING WITH START-WITH| | 1 | | 6 |00:00:00.14 | 277 |
| 2 | TABLE ACCESS FULL | T | 1 | 100K| 100K|00:00:00.01 | 277 |
----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("PARENT_ID"=PRIOR NULL)
filter("PARENT_ID" IS NULL)
22 rows selected.
100K rows were read, and the A-Time was 0.14 seconds instead of 0.01 seconds. I wondered where those 0.14 seconds went, since the plan shows they were NOT spent on the full table scan. Using Tom Kyte's runstats_pkg reveals this:
SQL> declare
2 cursor c1
3 is
4 select /*+ connect_by_filtering */ id
5 , parent_id
6 , sys_connect_by_path(id,'->') scbp
7 from t
8 connect by parent_id = prior id
9 and indicator = 'N'
10 start with parent_id is null
11 ;
12 cursor c2
13 is
14 select /*+ no_connect_by_filtering */ id
15 , parent_id
16 , sys_connect_by_path(id,'->') scbp
17 from t
18 connect by parent_id = prior id
19 and indicator = 'N'
20 start with parent_id is null
21 ;
22 begin
23 runstats_pkg.rs_start;
24 for r in c1 loop null; end loop;
25 runstats_pkg.rs_middle;
26 for r in c2 loop null; end loop;
27 runstats_pkg.rs_stop;
28 end;
29 /
Run1 ran in 0 hsecs
Run2 ran in 10 hsecs
run 1 ran in 0% of the time
Name Run1 Run2 Diff
STAT...HSC Heap Segment Block 16 15 -1
STAT...db block changes 48 47 -1
STAT...consistent gets - exami 9 8 -1
STAT...db block gets from cach 32 33 1
STAT...db block gets 32 33 1
STAT...redo subscn max counts 0 1 1
STAT...redo ordering marks 0 1 1
STAT...redo entries 16 15 -1
STAT...calls to kcmgas 0 1 1
STAT...calls to kcmgcs 29 28 -1
STAT...free buffer requested 0 1 1
STAT...Heap Segment Array Inse 16 15 -1
STAT...consistent changes 32 31 -1
STAT...heap block compress 9 8 -1
STAT...parse time cpu 1 0 -1
STAT...buffer is pinned count 1 0 -1
STAT...session cursor cache co 1 0 -1
STAT...sql area evicted 1 0 -1
LATCH.undo global data 11 10 -1
LATCH.SQL memory manager worka 3 5 2
LATCH.messages 0 2 2
LATCH.OS process allocation 0 2 2
LATCH.simulator hash latch 20 23 3
LATCH.object queue header oper 4 1 -3
STAT...workarea executions - o 10 6 -4
STAT...table fetch by rowid 15 10 -5
STAT...index scans kdiixs1 6 0 -6
LATCH.row cache objects 280 274 -6
STAT...sorts (memory) 8 2 -6
STAT...CPU used by this sessio 2 11 9
STAT...Elapsed Time 1 11 10
STAT...recursive cpu usage 2 12 10
STAT...no work - consistent re 300 284 -16
STAT...buffer is not pinned co 36 20 -16
STAT...session logical reads 354 337 -17
STAT...consistent gets from ca 313 296 -17
STAT...consistent gets from ca 322 304 -18
LATCH.shared pool 186 168 -18
STAT...consistent gets 322 304 -18
LATCH.shared pool simulator 23 4 -19
LATCH.cache buffers chains 785 740 -45
STAT...undo change vector size 3,500 3,420 -80
STAT...redo size 4,652 4,560 -92
STAT...session uga memory 0 -65,488 -65,488
STAT...session pga memory 0 -65,536 -65,536
STAT...sorts (rows) 12 100,001 99,989
Run1 latches total versus runs -- difference and pct
Run1 Run2 Diff Pct
1,467 1,384 -83 106.00%
PL/SQL procedure successfully completed
The major difference is the number of rows sorted! CONNECT BY NO FILTERING WITH START-WITH sorts all 100K rows. This is a surprise, because normally when you sort, you use memory from the PGA workarea, which shows up in the memory statistics (OMem, 1Mem, Used-Mem) of your execution plan. But the no filtering plan did not show those statistics. I have no explanation for this phenomenon yet.
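One place to dig further, sketched here, is V$SQL_WORKAREA, which lists the work areas per plan operation for a cursor still in the shared pool; the sql_id below is the one of the no filtering cursor shown above:

```sql
select operation_type
,      operation_id
,      last_memory_used
from   v$sql_workarea
where  sql_id = '3fcr31tp83by9'
/
```

If the sort doesn't show up there either, it apparently isn't executed as a regular SQL work area.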
Let's zoom in on the sorting:
SQL> select sn.name
2 , ms.value
3 from v$mystat ms
4 , v$statname sn
5 where ms.statistic# = sn.statistic#
6 and sn.name like '%sort%'
7 /
NAME VALUE
----------------------- ----------
sorts (memory) 2278
sorts (disk) 0
sorts (rows) 9425510
3 rows selected.
SQL> select id
2 , parent_id
3 , sys_connect_by_path(id,'->') scbp
4 from t
5 connect by parent_id = prior id
6 and indicator = 'N'
7 start with parent_id is null
8 /
ID PARENT_ID SCBP
---------- ---------- --------------------------------------------------
0 ->0
3 0 ->0->3
33 3 ->0->3->33
333 33 ->0->3->33->333
3333 333 ->0->3->33->333->3333
33333 3333 ->0->3->33->333->3333->33333
6 rows selected.
SQL> select sn.name
2 , ms.value
3 from v$mystat ms
4 , v$statname sn
5 where ms.statistic# = sn.statistic#
6 and sn.name like '%sort%'
7 /
NAME VALUE
----------------------- ----------
sorts (memory) 2286
sorts (disk) 0
sorts (rows) 9425522
3 rows selected.
SQL> select /*+ no_connect_by_filtering */ id
2 , parent_id
3 , sys_connect_by_path(id,'->') scbp
4 from t
5 connect by parent_id = prior id
6 and indicator = 'N'
7 start with parent_id is null
8 /
ID PARENT_ID SCBP
---------- ---------- --------------------------------------------------
0 ->0
3 0 ->0->3
33 3 ->0->3->33
333 33 ->0->3->33->333
3333 333 ->0->3->33->333->3333
33333 3333 ->0->3->33->333->3333->33333
6 rows selected.
SQL> select sn.name
2 , ms.value
3 from v$mystat ms
4 , v$statname sn
5 where ms.statistic# = sn.statistic#
6 and sn.name like '%sort%'
7 /
NAME VALUE
----------------------- ----------
sorts (memory) 2288
sorts (disk) 0
sorts (rows) 9525523
3 rows selected.
So CONNECT BY WITH FILTERING did 8 sorts (2286 - 2278) and sorted 12 rows (9425522 - 9425510), whereas CONNECT BY NO FILTERING WITH START-WITH did 2 sorts (2288 - 2286) and sorted 100,001 rows (9525523 - 9425522).
And finally, I promised to explain why the first two queries of this blogpost are identical yet show different execution plans. The reason is simple: the first one was executed on 10.2.0.4 and the second one on 11.2.0.2.
Posted by Rob van Wijk at 11:50 PM (6 comments)
Sunday, September 25, 2011
EXISTS
Most of us know the SQL keyword EXISTS as the condition with a subquery that you use in a WHERE clause. But if you look at the documentation of EXISTS, you'll see that it says nothing about using it only in a WHERE clause. It's just a sort of function that accepts a query as input and returns "TRUE if a subquery returns at least one row". The SQL language doesn't know about booleans; it calls these "conditions". And according to another part of the documentation, about SQL conditions: You can use a condition in the WHERE clause of these statements:
DELETE
SELECT
UPDATE
You can use a condition in any of these clauses of the SELECT statement:
WHERE
START WITH
CONNECT BY
HAVING
So you might be tempted to believe it is not possible to use EXISTS in the SELECT clause. However, a CASE expression, more specifically the searched case expression, expects a "condition" after the WHEN, as you can see in the syntax diagram. That means we can put an EXISTS condition after the WHEN.
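In its simplest form that looks like this (a sketch; the parent and child table and column names are purely illustrative):

```sql
select p.id
,      case
         when exists ( select null
                       from   child c
                       where  c.parent_id = p.id
                     )
         then 'yes'
         else 'no'
       end as has_children
from   parent p
/
```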
An example of this came up recently when a colleague asked me to help with a query in his conversion project. His setup resembles this test case:
SQL> create table t1 (id)
2 as
3 select level
4 from dual
5 connect by level <= 4
6 /
Table created.
SQL> alter table t1 add constraint t1_pk primary key (id)
2 /
Table altered.
SQL> exec dbms_stats.gather_table_stats(user,'t1',cascade=>true)
PL/SQL procedure successfully completed.
SQL> create table t2 (id, t1_id, status, filler)
2 as
3 select 1, 1, 'w', lpad('*',1000,'*') from dual union all
4 select 2, 2, 'x', lpad('*',1000,'*') from dual union all
5 select 3, 2, 'y', lpad('*',1000,'*') from dual union all
6 select 4, 3, 'w', lpad('*',1000,'*') from dual union all
7 select 5, 3, 'x', lpad('*',1000,'*') from dual union all
8 select 6, 3, 'z', lpad('*',1000,'*') from dual union all
9 select 7, 4, 'w', lpad('*',1000,'*') from dual union all
10 select 8, 4, 'z', lpad('*',1000,'*') from dual
11 /
Table created.
SQL> alter table t2 add constraint t2_pk primary key (id)
2 /
Table altered.
SQL> alter table t2 add constraint t2_t1_fk foreign key (t1_id) references t1(id)
2 /
Table altered.
SQL> create index t2_i on t2 (status,t1_id)
2 /
Index created.
SQL> exec dbms_stats.gather_table_stats(user,'t2',cascade=>true)
PL/SQL procedure successfully completed.
Two tables with a master-detail relationship. The question was to come up with a query that selects all rows of t1 and, if there is at least one row in t2 with status x or y, prints 'yes' in an extra column, and 'no' otherwise.
The first query I came up with used a simple outer join, removing the duplicates with a DISTINCT.
SQL> select /*+ gather_plan_statistics */
2 distinct t1.id
3 , nvl2(t2.id,'yes','no') exists_t2
4 from t1
5 left outer join t2
6 on ( t1.id = t2.t1_id
7 and t2.status in ('x','y')
8 )
9 order by t1.id
10 /
ID EXISTS_T2
----- ----------
1 no
2 yes
3 yes
4 no
4 rows selected.
And it is done in 4 consistent gets using this plan:
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 2mtksgp0pzx59, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ distinct t1.id ,
nvl2(t2.id,'yes','no') exists_t2 from t1 left outer join t2
on ( t1.id = t2.t1_id and t2.status in ('x','y')
) order by t1.id
Plan hash value: 1021492722
------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 4 |00:00:00.01 | 4 | | | |
| 1 | SORT UNIQUE | | 1 | 4 | 4 |00:00:00.01 | 4 | 2048 | 2048 | 2048 (0)|
| 2 | MERGE JOIN OUTER | | 1 | 4 | 5 |00:00:00.01 | 4 | | | |
| 3 | INDEX FULL SCAN | T1_PK | 1 | 4 | 4 |00:00:00.01 | 1 | | | |
|* 4 | SORT JOIN | | 4 | 4 | 3 |00:00:00.01 | 3 | 2048 | 2048 | 2048 (0)|
| 5 | INLIST ITERATOR | | 1 | | 3 |00:00:00.01 | 3 | | | |
| 6 | TABLE ACCESS BY INDEX ROWID| T2 | 2 | 4 | 3 |00:00:00.01 | 3 | | | |
|* 7 | INDEX RANGE SCAN | T2_I | 2 | 4 | 3 |00:00:00.01 | 2 | | | |
------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("T1"."ID"="T2"."T1_ID")
filter("T1"."ID"="T2"."T1_ID")
7 - access(("T2"."STATUS"='x' OR "T2"."STATUS"='y'))
29 rows selected.
Later, I offered an alternative, after realizing I can also use EXISTS in the SELECT clause, like this:
SQL> select /*+ gather_plan_statistics */
2 t1.id
3 , case
4 when exists
5 ( select ''
6 from t2
7 where t2.t1_id = t1.id
8 and t2.status in ('x','y')
9 )
10 then 'yes'
11 else 'no'
12 end exists_t2
13 from t1
14 order by t1.id
15 /
ID EXISTS_T2
----- ----------
1 no
2 yes
3 yes
4 no
4 rows selected.
So that's proof that you can use an EXISTS condition in a SELECT clause. But this query uses 8 consistent gets (6+2):
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
SQL_ID b13u3f5m9ybuu, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ t1.id , case
when exists ( select '' from t2 where
t2.t1_id = t1.id and t2.status in ('x','y') )
then 'yes' else 'no' end exists_t2 from t1 order
by t1.id
Plan hash value: 1623664937
-------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 4 |00:00:00.01 | 2 |
| 1 | INLIST ITERATOR | | 4 | | 2 |00:00:00.01 | 6 |
|* 2 | INDEX RANGE SCAN| T2_I | 6 | 1 | 2 |00:00:00.01 | 6 |
| 3 | INDEX FULL SCAN | T1_PK | 1 | 4 | 4 |00:00:00.01 | 2 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access((("T2"."STATUS"='x' OR "T2"."STATUS"='y')) AND
"T2"."T1_ID"=:B1)
25 rows selected.
So with this data set it's not really interesting, although the plan has fewer operations. With other data distributions it can be useful, though. For example, after blowing up the t2 table:
SQL> insert into t2
2 select n * 1000 + t2.id
3 , t2.t1_id
4 , t2.status
5 , t2.filler
6 from t2
7 cross join (select level n from dual connect by level <= 1000)
8 /
8000 rows created.
SQL> exec dbms_stats.gather_table_stats(user,'t2',cascade=>true,method_opt=>'for all columns size auto')
PL/SQL procedure successfully completed.
Now the left outer join with the DISTINCT has to join the 4 rows of t1 with 3003 rows of t2, and the query uses 3017 consistent gets:
SQL> select /*+ gather_plan_statistics */
2 distinct t1.id
3 , nvl2(t2.id,'yes','no') exists_t2
4 from t1
5 left outer join t2
6 on ( t1.id = t2.t1_id
7 and t2.status in ('x','y')
8 )
9 order by t1.id
10 /
ID EXISTS_T2
----- ----------
1 no
2 yes
3 yes
4 no
4 rows selected.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 2mtksgp0pzx59, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ distinct t1.id ,
nvl2(t2.id,'yes','no') exists_t2 from t1 left outer join t2
on ( t1.id = t2.t1_id and t2.status in ('x','y')
) order by t1.id
Plan hash value: 1021492722
------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 4 |00:00:00.01 | 3017 | | | |
| 1 | SORT UNIQUE | | 1 | 4 | 4 |00:00:00.01 | 3017 | 2048 | 2048 | 2048 (0)|
| 2 | MERGE JOIN OUTER | | 1 | 4 | 3005 |00:00:00.01 | 3017 | | | |
| 3 | INDEX FULL SCAN | T1_PK | 1 | 4 | 4 |00:00:00.01 | 1 | | | |
|* 4 | SORT JOIN | | 4 | 4 | 3003 |00:00:00.01 | 3016 | 115K| 115K| 102K (0)|
| 5 | INLIST ITERATOR | | 1 | | 3003 |00:00:00.01 | 3016 | | | |
| 6 | TABLE ACCESS BY INDEX ROWID| T2 | 2 | 4 | 3003 |00:00:00.01 | 3016 | | | |
|* 7 | INDEX RANGE SCAN | T2_I | 2 | 4 | 3003 |00:00:00.01 | 13 | | | |
------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("T1"."ID"="T2"."T1_ID")
filter("T1"."ID"="T2"."T1_ID")
7 - access(("T2"."STATUS"='x' OR "T2"."STATUS"='y'))
29 rows selected.
And the second query is now a whole lot more efficient, with only 15 consistent gets:
SQL> select /*+ gather_plan_statistics */
2 t1.id
3 , case
4 when exists
5 ( select ''
6 from t2
7 where t2.t1_id = t1.id
8 and t2.status in ('x','y')
9 )
10 then 'yes'
11 else 'no'
12 end exists_t2
13 from t1
14 order by t1.id
15 /
ID EXISTS_T2
----- ----------
1 no
2 yes
3 yes
4 no
4 rows selected.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
SQL_ID b13u3f5m9ybuu, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ t1.id , case
when exists ( select '' from t2 where
t2.t1_id = t1.id and t2.status in ('x','y') )
then 'yes' else 'no' end exists_t2 from t1 order
by t1.id
Plan hash value: 1623664937
-------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 4 |00:00:00.01 | 2 |
| 1 | INLIST ITERATOR | | 4 | | 2 |00:00:00.01 | 13 |
|* 2 | INDEX RANGE SCAN| T2_I | 6 | 1 | 2 |00:00:00.01 | 13 |
| 3 | INDEX FULL SCAN | T1_PK | 1 | 4 | 4 |00:00:00.01 | 2 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access((("T2"."STATUS"='x' OR "T2"."STATUS"='y')) AND
"T2"."T1_ID"=:B1)
25 rows selected.
The query selects the four rows of t1 via the INDEX FULL SCAN in line 3. The six starts of line 2 are explained by the INLIST ITERATOR, which splits the t2 access into one probe for status = 'x' and one for status = 'y'. It searches for an 'x' for all four rows of t1. For t1.id 2 and 3 it finds an 'x' value, so it doesn't need to look for a 'y' value. Only for t1.id 1 and 4 does it need to search for a 'y' value.
Note that the data distribution is not the only factor: the index on (status,t1_id) is needed as well to make a difference here.
I guess I just want to say that there is an alternative you might not have thought of initially, and it can be useful in certain situations.
PS: Something completely different: Kamran Agayev has posted an interview with me.
Posted by Rob van Wijk at 12:03 PM | 2 comments
Tuesday, July 12, 2011
Partitioned outer join bug
I ran into a strange bug while constructing an example of how a partitioned outer join works. I used a setup resembling a situation I recently consulted about, involving time series data. The table can hold multiple time series whose data comes from multiple sources. The data itself contains volumes that are reported for every 5 minutes. The table looks like this:
SQL> create table tradevolumes
2 ( timeseries_id number(4)
3 , tradedate date
4 , source_id number(4)
5 , volume number(4)
6 , constraint tradevolumes_pk primary key (timeseries_id,tradedate,source_id)
7 ) organization index
8 /
Table created.
An index organized table, which we fill with some data: 2 time series from 3 sources, and 108 five-minute periods:
SQL> insert into tradevolumes
2 select 1 + mod(level-1,2)
3 , trunc(sysdate) + numtodsinterval(5*mod(trunc((level-1)/2),108),'minute')
4 , ceil(level/(2 * 108))
5 , ceil(dbms_random.value(0,1000))
6 from dual
7 connect by level <= 2 * 108 * 3
8 /
648 rows created.
A common problem for timeseries data is missing data. I'll simulate that by deleting several rows:
SQL> delete tradevolumes
2 where ( tradedate
3 between trunc(sysdate) + interval '35' minute
4 and trunc(sysdate) + interval '50' minute
5 or (timeseries_id = 1 and source_id = 3 and volume < 30)
6 or (timeseries_id = 2 and source_id = 2 and tradedate > trunc(sysdate) + interval '20' minute)
7 )
8 /
127 rows deleted.
And now gather some basic statistics for the table:
SQL> exec dbms_stats.gather_table_stats(user,'tradevolumes')
PL/SQL procedure successfully completed.
To get an idea of the data that is now in the table:
SQL> select *
2 from tradevolumes
3 order by timeseries_id
4 , tradedate
5 , source_id
6 /
TIMESERIES_ID TRADEDATE SOURCE_ID VOLUME
------------- ------------------- ---------- ----------
1 12-07-2011 00:00:00 1 400
1 12-07-2011 00:00:00 2 265
1 12-07-2011 00:00:00 3 307
1 12-07-2011 00:05:00 1 223
1 12-07-2011 00:05:00 2 975
1 12-07-2011 00:05:00 3 173
1 12-07-2011 00:10:00 1 794
1 12-07-2011 00:10:00 2 17
1 12-07-2011 00:10:00 3 468
1 12-07-2011 00:15:00 1 552
1 12-07-2011 00:15:00 2 979
1 12-07-2011 00:15:00 3 84
1 12-07-2011 00:20:00 1 73
1 12-07-2011 00:20:00 2 710
1 12-07-2011 00:20:00 3 54
1 12-07-2011 00:25:00 1 271
1 12-07-2011 00:25:00 2 643
1 12-07-2011 00:25:00 3 443
1 12-07-2011 00:30:00 1 408
1 12-07-2011 00:30:00 2 526
1 12-07-2011 00:30:00 3 725
1 12-07-2011 00:55:00 1 13
1 12-07-2011 00:55:00 2 943
1 12-07-2011 00:55:00 3 319
... skipped a lot of rows here ...
1 12-07-2011 08:50:00 1 239
1 12-07-2011 08:50:00 2 502
1 12-07-2011 08:50:00 3 297
1 12-07-2011 08:55:00 1 620
1 12-07-2011 08:55:00 2 531
1 12-07-2011 08:55:00 3 377
2 12-07-2011 00:00:00 1 778
2 12-07-2011 00:00:00 2 302
2 12-07-2011 00:00:00 3 959
2 12-07-2011 00:05:00 1 920
2 12-07-2011 00:05:00 2 827
2 12-07-2011 00:05:00 3 395
... skipped a lot of rows here ...
2 12-07-2011 08:45:00 1 495
2 12-07-2011 08:45:00 3 519
2 12-07-2011 08:50:00 1 556
2 12-07-2011 08:50:00 3 193
2 12-07-2011 08:55:00 1 368
2 12-07-2011 08:55:00 3 18
521 rows selected.
To show how a partitioned outer join works, my plan is to add all missing key combinations to the result set for the first hour. So the query only uses the times between 0:00 and 0:55, and we'll ignore the rows between 1:00 and 8:55. This is the data for the first hour:
SQL> select *
2 from tradevolumes
3 where tradedate between trunc(sysdate) and trunc(sysdate) + interval '55' minute
4 order by timeseries_id
5 , tradedate
6 , source_id
7 /
TIMESERIES_ID TRADEDATE SOURCE_ID VOLUME
------------- ------------------- ---------- ----------
1 12-07-2011 00:00:00 1 400
1 12-07-2011 00:00:00 2 265
1 12-07-2011 00:00:00 3 307
1 12-07-2011 00:05:00 1 223
1 12-07-2011 00:05:00 2 975
1 12-07-2011 00:05:00 3 173
1 12-07-2011 00:10:00 1 794
1 12-07-2011 00:10:00 2 17
1 12-07-2011 00:10:00 3 468
1 12-07-2011 00:15:00 1 552
1 12-07-2011 00:15:00 2 979
1 12-07-2011 00:15:00 3 84
1 12-07-2011 00:20:00 1 73
1 12-07-2011 00:20:00 2 710
1 12-07-2011 00:20:00 3 54
1 12-07-2011 00:25:00 1 271
1 12-07-2011 00:25:00 2 643
1 12-07-2011 00:25:00 3 443
1 12-07-2011 00:30:00 1 408
1 12-07-2011 00:30:00 2 526
1 12-07-2011 00:30:00 3 725
1 12-07-2011 00:55:00 1 13
1 12-07-2011 00:55:00 2 943
1 12-07-2011 00:55:00 3 319
2 12-07-2011 00:00:00 1 778
2 12-07-2011 00:00:00 2 302
2 12-07-2011 00:00:00 3 959
2 12-07-2011 00:05:00 1 920
2 12-07-2011 00:05:00 2 827
2 12-07-2011 00:05:00 3 395
2 12-07-2011 00:10:00 1 534
2 12-07-2011 00:10:00 2 156
2 12-07-2011 00:10:00 3 2
2 12-07-2011 00:15:00 1 642
2 12-07-2011 00:15:00 2 278
2 12-07-2011 00:15:00 3 76
2 12-07-2011 00:20:00 1 472
2 12-07-2011 00:20:00 2 210
2 12-07-2011 00:20:00 3 549
2 12-07-2011 00:25:00 1 349
2 12-07-2011 00:25:00 3 332
2 12-07-2011 00:30:00 1 371
2 12-07-2011 00:30:00 3 396
2 12-07-2011 00:55:00 1 408
2 12-07-2011 00:55:00 3 818
45 rows selected.
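As a quick refresher before looking at the result: a partitioned outer join logically splits one side of the join into partitions by the PARTITION BY columns and outer joins each partition separately, so every row of the other side appears once per partition. The general shape is as follows (table and column names are hypothetical):

```sql
-- Every day in calendar appears once per product partition of sales;
-- days missing from a partition get null measure columns.
select c.day
     , s.product
     , s.amount
  from calendar c
  left outer join sales s
       partition by (s.product)
    on (c.day = s.day)
/
```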
All 12 five-minute periods should be present for the 2 time series and 3 sources, so we need 72 rows. Now watch the bug with a standard partitioned outer join:
SQL> with all_tradedates as
2 ( select trunc(sysdate) + numtodsinterval(5*(level-1),'minute') tradedate
3 from dual
4 connect by level <= 12
5 )
6 select v.timeseries_id
7 , a.tradedate
8 , v.source_id
9 , v.volume
10 from all_tradedates a
11 left outer join tradevolumes v
12 partition by (v.timeseries_id,v.source_id)
13 on (a.tradedate = v.tradedate)
14 order by v.timeseries_id
15 , a.tradedate
16 , v.source_id
17 /
TIMESERIES_ID TRADEDATE SOURCE_ID VOLUME
------------- ------------------- ---------- ----------
1 12-07-2011 00:00:00 1 400
1 12-07-2011 00:00:00 2 265
1 12-07-2011 00:00:00 3 307
1 12-07-2011 00:05:00 1 400
1 12-07-2011 00:05:00 2 265
1 12-07-2011 00:05:00 3 307
1 12-07-2011 00:10:00 1 400
1 12-07-2011 00:10:00 2 265
1 12-07-2011 00:10:00 3 307
1 12-07-2011 00:15:00 1 400
1 12-07-2011 00:15:00 2 265
1 12-07-2011 00:15:00 3 307
1 12-07-2011 00:20:00 1 400
1 12-07-2011 00:20:00 2 265
1 12-07-2011 00:20:00 3 307
1 12-07-2011 00:25:00 1 400
1 12-07-2011 00:25:00 2 265
1 12-07-2011 00:25:00 3 307
1 12-07-2011 00:30:00 1 400
1 12-07-2011 00:30:00 2 265
1 12-07-2011 00:30:00 3 307
1 12-07-2011 00:35:00 1 400
1 12-07-2011 00:35:00 2 265
1 12-07-2011 00:35:00 3 307
1 12-07-2011 00:40:00 1 400
1 12-07-2011 00:40:00 2 265
1 12-07-2011 00:40:00 3 307
1 12-07-2011 00:45:00 1 400
1 12-07-2011 00:45:00 2 265
1 12-07-2011 00:45:00 3 307
1 12-07-2011 00:50:00 1 400
1 12-07-2011 00:50:00 2 265
1 12-07-2011 00:50:00 3 307
1 12-07-2011 00:55:00 1 400
1 12-07-2011 00:55:00 2 265
1 12-07-2011 00:55:00 3 307
2 12-07-2011 00:00:00 1 778
2 12-07-2011 00:00:00 2 302
2 12-07-2011 00:00:00 3 959
2 12-07-2011 00:05:00 1 778
2 12-07-2011 00:05:00 2 302
2 12-07-2011 00:05:00 3 959
2 12-07-2011 00:10:00 1 778
2 12-07-2011 00:10:00 2 302
2 12-07-2011 00:10:00 3 959
2 12-07-2011 00:15:00 1 778
2 12-07-2011 00:15:00 2 302
2 12-07-2011 00:15:00 3 959
2 12-07-2011 00:20:00 1 778
2 12-07-2011 00:20:00 2 302
2 12-07-2011 00:20:00 3 959
2 12-07-2011 00:25:00 1 778
2 12-07-2011 00:25:00 2 302
2 12-07-2011 00:25:00 3 959
2 12-07-2011 00:30:00 1 778
2 12-07-2011 00:30:00 2 302
2 12-07-2011 00:30:00 3 959
2 12-07-2011 00:35:00 1 778
2 12-07-2011 00:35:00 2 302
2 12-07-2011 00:35:00 3 959
2 12-07-2011 00:40:00 1 778
2 12-07-2011 00:40:00 2 302
2 12-07-2011 00:40:00 3 959
2 12-07-2011 00:45:00 1 778
2 12-07-2011 00:45:00 2 302
2 12-07-2011 00:45:00 3 959
2 12-07-2011 00:50:00 1 778
2 12-07-2011 00:50:00 2 302
2 12-07-2011 00:50:00 3 959
2 12-07-2011 00:55:00 1 778
2 12-07-2011 00:55:00 2 302
2 12-07-2011 00:55:00 3 959
72 rows selected.
The query returns 72 rows as expected, but look at the volume column. The made-up rows should have a null volume. And there's more: the rows that did exist and weren't made up now show the volume of the "00:00:00" row instead of their own value...
You might ask: why did you create 108 five-minute periods? Because I've found that with 107 periods or fewer the bug doesn't appear; with 108 or more it creeps up. Here is the same sequence with a table containing 107 periods:
SQL> drop table tradevolumes purge
2 /
Table dropped.
SQL> create table tradevolumes
2 ( timeseries_id number(4)
3 , tradedate date
4 , source_id number(4)
5 , volume number(4)
6 , constraint tradevolumes_pk primary key (timeseries_id,tradedate,source_id)
7 ) organization index
8 /
Table created.
SQL> insert into tradevolumes
2 select 1 + mod(level-1,2)
3 , trunc(sysdate) + numtodsinterval(5*mod(trunc((level-1)/2),107),'minute')
4 , ceil(level/(2 * 107))
5 , ceil(dbms_random.value(0,1000))
6 from dual
7 connect by level <= 2 * 107 * 3
8 /
642 rows created.
SQL> delete tradevolumes
2 where ( tradedate
3 between trunc(sysdate) + interval '35' minute
4 and trunc(sysdate) + interval '50' minute
5 or (timeseries_id = 1 and source_id = 3 and volume < 30)
6 or (timeseries_id = 2 and source_id = 2 and tradedate > trunc(sysdate) + interval '20' minute)
7 )
8 /
128 rows deleted.
SQL> exec dbms_stats.gather_table_stats(user,'tradevolumes')
PL/SQL procedure successfully completed.
SQL> select *
2 from tradevolumes
3 order by timeseries_id
4 , tradedate
5 , source_id
6 /
TIMESERIES_ID TRADEDATE SOURCE_ID VOLUME
------------- ------------------- ---------- ----------
1 12-07-2011 00:00:00 1 159
1 12-07-2011 00:00:00 2 333
1 12-07-2011 00:00:00 3 696
1 12-07-2011 00:05:00 1 674
1 12-07-2011 00:05:00 2 932
... skipped a lot of rows here ...
2 12-07-2011 08:45:00 1 136
2 12-07-2011 08:45:00 3 618
2 12-07-2011 08:50:00 1 705
2 12-07-2011 08:50:00 3 106
514 rows selected.
SQL> with all_tradedates as
2 ( select trunc(sysdate) + numtodsinterval(5*(level-1),'minute') tradedate
3 from dual
4 connect by level <= 12
5 )
6 select v.timeseries_id
7 , a.tradedate
8 , v.source_id
9 , v.volume
10 from all_tradedates a
11 left outer join tradevolumes v
12 partition by (v.timeseries_id,v.source_id)
13 on (a.tradedate = v.tradedate)
14 order by v.timeseries_id
15 , a.tradedate
16 , v.source_id
17 /
TIMESERIES_ID TRADEDATE SOURCE_ID VOLUME
------------- ------------------- ---------- ----------
1 12-07-2011 00:00:00 1 159
1 12-07-2011 00:00:00 2 333
1 12-07-2011 00:00:00 3 696
1 12-07-2011 00:05:00 1 674
1 12-07-2011 00:05:00 2 932
1 12-07-2011 00:05:00 3
1 12-07-2011 00:10:00 1 315
1 12-07-2011 00:10:00 2 872
1 12-07-2011 00:10:00 3 778
1 12-07-2011 00:15:00 1 398
1 12-07-2011 00:15:00 2 272
1 12-07-2011 00:15:00 3 239
1 12-07-2011 00:20:00 1 546
1 12-07-2011 00:20:00 2 863
1 12-07-2011 00:20:00 3 839
1 12-07-2011 00:25:00 1 558
1 12-07-2011 00:25:00 2 576
1 12-07-2011 00:25:00 3 296
1 12-07-2011 00:30:00 1 767
1 12-07-2011 00:30:00 2 957
1 12-07-2011 00:30:00 3 935
1 12-07-2011 00:35:00 1
1 12-07-2011 00:35:00 2
1 12-07-2011 00:35:00 3
1 12-07-2011 00:40:00 1
1 12-07-2011 00:40:00 2
1 12-07-2011 00:40:00 3
1 12-07-2011 00:45:00 1
1 12-07-2011 00:45:00 2
1 12-07-2011 00:45:00 3
1 12-07-2011 00:50:00 1
1 12-07-2011 00:50:00 2
1 12-07-2011 00:50:00 3
1 12-07-2011 00:55:00 1 398
1 12-07-2011 00:55:00 2 110
1 12-07-2011 00:55:00 3 629
2 12-07-2011 00:00:00 1 615
2 12-07-2011 00:00:00 2 924
2 12-07-2011 00:00:00 3 526
2 12-07-2011 00:05:00 1 147
2 12-07-2011 00:05:00 2 589
2 12-07-2011 00:05:00 3 779
2 12-07-2011 00:10:00 1 554
2 12-07-2011 00:10:00 2 108
2 12-07-2011 00:10:00 3 9
2 12-07-2011 00:15:00 1 601
2 12-07-2011 00:15:00 2 111
2 12-07-2011 00:15:00 3 526
2 12-07-2011 00:20:00 1 418
2 12-07-2011 00:20:00 2 818
2 12-07-2011 00:20:00 3 578
2 12-07-2011 00:25:00 1 824
2 12-07-2011 00:25:00 2
2 12-07-2011 00:25:00 3 112
2 12-07-2011 00:30:00 1 649
2 12-07-2011 00:30:00 2
2 12-07-2011 00:30:00 3 689
2 12-07-2011 00:35:00 1
2 12-07-2011 00:35:00 2
2 12-07-2011 00:35:00 3
2 12-07-2011 00:40:00 1
2 12-07-2011 00:40:00 2
2 12-07-2011 00:40:00 3
2 12-07-2011 00:45:00 1
2 12-07-2011 00:45:00 2
2 12-07-2011 00:45:00 3
2 12-07-2011 00:50:00 1
2 12-07-2011 00:50:00 2
2 12-07-2011 00:50:00 3
2 12-07-2011 00:55:00 1 651
2 12-07-2011 00:55:00 2
2 12-07-2011 00:55:00 3 641
72 rows selected.
And here the results are correct. And when I add the rows for 08:55:00 afterwards:
SQL> insert into tradevolumes
2 select timeseries_id
3 , tradedate + interval '5' minute
4 , source_id
5 , 1000
6 from tradevolumes
7 where tradedate = (select max(tradedate) from tradevolumes)
8 /
4 rows created.
SQL> commit
2 /
Commit complete.
SQL> exec dbms_stats.gather_table_stats(user,'tradevolumes')
PL/SQL procedure successfully completed.
SQL> with all_tradedates as
2 ( select trunc(sysdate) + numtodsinterval(5*(level-1),'minute') tradedate
3 from dual
4 connect by level <= 12
5 )
6 select v.timeseries_id
7 , a.tradedate
8 , v.source_id
9 , v.volume
10 from all_tradedates a
11 left outer join tradevolumes v
12 partition by (v.timeseries_id,v.source_id)
13 on (a.tradedate = v.tradedate)
14 order by v.timeseries_id
15 , a.tradedate
16 , v.source_id
17 /
TIMESERIES_ID TRADEDATE SOURCE_ID VOLUME
------------- ------------------- ---------- ----------
1 12-07-2011 00:00:00 1 159
1 12-07-2011 00:00:00 2 333
1 12-07-2011 00:00:00 3 696
1 12-07-2011 00:05:00 1 674
1 12-07-2011 00:05:00 2 932
1 12-07-2011 00:05:00 3
1 12-07-2011 00:10:00 1 315
1 12-07-2011 00:10:00 2 872
1 12-07-2011 00:10:00 3 778
1 12-07-2011 00:15:00 1 398
1 12-07-2011 00:15:00 2 272
1 12-07-2011 00:15:00 3 239
1 12-07-2011 00:20:00 1 546
1 12-07-2011 00:20:00 2 863
1 12-07-2011 00:20:00 3 839
1 12-07-2011 00:25:00 1 558
1 12-07-2011 00:25:00 2 576
1 12-07-2011 00:25:00 3 296
1 12-07-2011 00:30:00 1 767
1 12-07-2011 00:30:00 2 957
1 12-07-2011 00:30:00 3 935
1 12-07-2011 00:35:00 1
1 12-07-2011 00:35:00 2
1 12-07-2011 00:35:00 3
1 12-07-2011 00:40:00 1
1 12-07-2011 00:40:00 2
1 12-07-2011 00:40:00 3
1 12-07-2011 00:45:00 1
1 12-07-2011 00:45:00 2
1 12-07-2011 00:45:00 3
1 12-07-2011 00:50:00 1
1 12-07-2011 00:50:00 2
1 12-07-2011 00:50:00 3
1 12-07-2011 00:55:00 1 398
1 12-07-2011 00:55:00 2 110
1 12-07-2011 00:55:00 3 629
2 12-07-2011 00:00:00 1 615
2 12-07-2011 00:00:00 2 924
2 12-07-2011 00:00:00 3 526
2 12-07-2011 00:05:00 1 147
2 12-07-2011 00:05:00 2 589
2 12-07-2011 00:05:00 3 779
2 12-07-2011 00:10:00 1 554
2 12-07-2011 00:10:00 2 108
2 12-07-2011 00:10:00 3 9
2 12-07-2011 00:15:00 1 601
2 12-07-2011 00:15:00 2 111
2 12-07-2011 00:15:00 3 526
2 12-07-2011 00:20:00 1 418
2 12-07-2011 00:20:00 2 818
2 12-07-2011 00:20:00 3 578
2 12-07-2011 00:25:00 1 824
2 12-07-2011 00:25:00 2
2 12-07-2011 00:25:00 3 112
2 12-07-2011 00:30:00 1 649
2 12-07-2011 00:30:00 2
2 12-07-2011 00:30:00 3 689
2 12-07-2011 00:35:00 1
2 12-07-2011 00:35:00 2
2 12-07-2011 00:35:00 3
2 12-07-2011 00:40:00 1
2 12-07-2011 00:40:00 2
2 12-07-2011 00:40:00 3
2 12-07-2011 00:45:00 1
2 12-07-2011 00:45:00 2
2 12-07-2011 00:45:00 3
2 12-07-2011 00:50:00 1
2 12-07-2011 00:50:00 2
2 12-07-2011 00:50:00 3
2 12-07-2011 00:55:00 1 651
2 12-07-2011 00:55:00 2
2 12-07-2011 00:55:00 3 641
72 rows selected.
And now it works!
And using 108 periods with a regular heap table isn't a problem either:
SQL> drop table tradevolumes purge
2 /
Table dropped.
SQL> create table tradevolumes
2 ( timeseries_id number(4)
3 , tradedate date
4 , source_id number(4)
5 , volume number(4)
6 , constraint tradevolumes_pk primary key (timeseries_id,tradedate,source_id)
7 )
8 /
Table created.
SQL> insert into tradevolumes
2 select 1 + mod(level-1,2)
3 , trunc(sysdate) + numtodsinterval(5*mod(trunc((level-1)/2),108),'minute')
4 , ceil(level/(2 * 108))
5 , ceil(dbms_random.value(0,1000))
6 from dual
7 connect by level <= 2 * 108 * 3
8 /
648 rows created.
SQL> delete tradevolumes
2 where ( tradedate
3 between trunc(sysdate) + interval '35' minute
4 and trunc(sysdate) + interval '50' minute
5 or (timeseries_id = 1 and source_id = 3 and volume < 30)
6 or (timeseries_id = 2 and source_id = 2 and tradedate > trunc(sysdate) + interval '20' minute)
7 )
8 /
126 rows deleted.
SQL> exec dbms_stats.gather_table_stats(user,'tradevolumes')
PL/SQL procedure successfully completed.
SQL> select *
2 from tradevolumes
3 order by timeseries_id
4 , tradedate
5 , source_id
6 /
TIMESERIES_ID TRADEDATE SOURCE_ID VOLUME
------------- ------------------- ---------- ----------
1 12-07-2011 00:00:00 1 200
1 12-07-2011 00:00:00 2 157
1 12-07-2011 00:00:00 3 950
1 12-07-2011 00:05:00 1 707
1 12-07-2011 00:05:00 2 932
1 12-07-2011 00:05:00 3 258
... skipped a lot of rows here ...
2 12-07-2011 08:50:00 1 61
2 12-07-2011 08:50:00 3 974
2 12-07-2011 08:55:00 1 692
2 12-07-2011 08:55:00 3 714
522 rows selected.
SQL> with all_tradedates as
2 ( select trunc(sysdate) + numtodsinterval(5*(level-1),'minute') tradedate
3 from dual
4 connect by level <= 12
5 )
6 select v.timeseries_id
7 , a.tradedate
8 , v.source_id
9 , v.volume
10 from all_tradedates a
11 left outer join tradevolumes v
12 partition by (v.timeseries_id,v.source_id)
13 on (a.tradedate = v.tradedate)
14 order by v.timeseries_id
15 , a.tradedate
16 , v.source_id
17 /
TIMESERIES_ID TRADEDATE SOURCE_ID VOLUME
------------- ------------------- ---------- ----------
1 12-07-2011 00:00:00 1 200
1 12-07-2011 00:00:00 2 157
1 12-07-2011 00:00:00 3 950
1 12-07-2011 00:05:00 1 707
1 12-07-2011 00:05:00 2 932
1 12-07-2011 00:05:00 3 258
1 12-07-2011 00:10:00 1 586
1 12-07-2011 00:10:00 2 193
1 12-07-2011 00:10:00 3 753
1 12-07-2011 00:15:00 1 287
1 12-07-2011 00:15:00 2 420
1 12-07-2011 00:15:00 3 113
1 12-07-2011 00:20:00 1 971
1 12-07-2011 00:20:00 2 93
1 12-07-2011 00:20:00 3 967
1 12-07-2011 00:25:00 1 685
1 12-07-2011 00:25:00 2 63
1 12-07-2011 00:25:00 3 832
1 12-07-2011 00:30:00 1 471
1 12-07-2011 00:30:00 2 764
1 12-07-2011 00:30:00 3 277
1 12-07-2011 00:35:00 1
1 12-07-2011 00:35:00 2
1 12-07-2011 00:35:00 3
1 12-07-2011 00:40:00 1
1 12-07-2011 00:40:00 2
1 12-07-2011 00:40:00 3
1 12-07-2011 00:45:00 1
1 12-07-2011 00:45:00 2
1 12-07-2011 00:45:00 3
1 12-07-2011 00:50:00 1
1 12-07-2011 00:50:00 2
1 12-07-2011 00:50:00 3
1 12-07-2011 00:55:00 1 332
1 12-07-2011 00:55:00 2 909
1 12-07-2011 00:55:00 3 873
2 12-07-2011 00:00:00 1 865
2 12-07-2011 00:00:00 2 982
2 12-07-2011 00:00:00 3 290
2 12-07-2011 00:05:00 1 936
2 12-07-2011 00:05:00 2 651
2 12-07-2011 00:05:00 3 141
2 12-07-2011 00:10:00 1 843
2 12-07-2011 00:10:00 2 818
2 12-07-2011 00:10:00 3 218
2 12-07-2011 00:15:00 1 85
2 12-07-2011 00:15:00 2 530
2 12-07-2011 00:15:00 3 650
2 12-07-2011 00:20:00 1 475
2 12-07-2011 00:20:00 2 988
2 12-07-2011 00:20:00 3 771
2 12-07-2011 00:25:00 1 422
2 12-07-2011 00:25:00 2
2 12-07-2011 00:25:00 3 48
2 12-07-2011 00:30:00 1 88
2 12-07-2011 00:30:00 2
2 12-07-2011 00:30:00 3 991
2 12-07-2011 00:35:00 1
2 12-07-2011 00:35:00 2
2 12-07-2011 00:35:00 3
2 12-07-2011 00:40:00 1
2 12-07-2011 00:40:00 2
2 12-07-2011 00:40:00 3
2 12-07-2011 00:45:00 1
2 12-07-2011 00:45:00 2
2 12-07-2011 00:45:00 3
2 12-07-2011 00:50:00 1
2 12-07-2011 00:50:00 2
2 12-07-2011 00:50:00 3
2 12-07-2011 00:55:00 1 793
2 12-07-2011 00:55:00 2
2 12-07-2011 00:55:00 3 601
72 rows selected.
The above was all executed on version 11.2.0.2. I also tried it on 10.2.0.4 and witnessed exactly the same phenomenon there; the only difference is that the boundary value is not 107/108 but 80/81.
I did some more basic investigation of space occupation and statistics, but I haven't found a clue so far. I'm wondering if I'm missing something obvious. So if you have more information about affected versions or the cause, I'd like to know.
Posted by Rob van Wijk at 10:00 PM | 4 comments
Monday, July 11, 2011
Tutorials zur SQL Model Clause
Marcus Matzberger has translated five of my posts related to the SQL Model Clause for the German-speaking Oracle community. Those posts now contain links to the German versions, and I've listed them here for convenience as well.
SQL Model Clause Tutorial Part One
SQL Model Clause Tutorial Part Two
SQL Model Clause Tutorial Part Three
Calculating probabilities with N throws of a die
Choosing Between SQL and PL/SQL
Thanks Marcus!
Posted by Rob van Wijk at 9:59 PM | 0 comments
Saturday, June 11, 2011
Back from Tallinn, Estonia
This morning I arrived back from a trip to Tallinn. Oracle Estonia had given me the opportunity to present my SQL Masterclass seminar at their training center in Tallinn on Thursday and Friday. Thanks to everyone who spent two days listening to me. Here is a short story about my trip, including some photos.
I arrived at Tallinn airport around 1PM local time on Wednesday. My hotel room was located at the 18th floor and I had this splendid view on the old city of Tallinn:
I spent the rest of the afternoon taking a city tour by bus, learning a few things about the history of Tallinn and strolling for an hour or two through the old city. Here are a few of the photos that I took during that stroll:
The seminar itself was on Thursday and Friday. I had spent a huge amount of time preparing it. One of my goals was to present as many recognizable situations as possible that the attendees could apply in their daily jobs, so I was very pleased to receive feedback that they indeed found several immediately applicable queries. There is still room for improvement, though, as a few parts were too short (I had planned one and a half hours per part) and I had overlooked an error or two in my scripts. Hopefully I will get a chance to fix those and present the seminar again somewhere. The feeling at the end was very positive, though. The PowerPoint slides and the scripts can be found on the "Presentations and papers" page.
This is a picture of the building where Oracle Estonia and its training center are located:
PS: As expected, I learnt a couple of new things during the preparation of the seminar: a bug with partitioned outer joins and something about connect-by filtering. I will blog about them soon.
Posted by Rob van Wijk at 11:22 PM | 0 comments