At CIBER we are very proud to announce that Cary Millsap will give his one-day seminar Mastering Oracle Trace Data in the Netherlands. The event will take place at the Carlton President Hotel in Utrecht on Wednesday, May 23. You can register and read more about this event here.
The seminar is aimed at DBAs, database application developers, data warehouse specialists and anyone who cares about the speed of a database application. I'm sure that, among many other things, you'll leave the seminar with a very clear mindset about performance. If you are in doubt whether you should come, please read one of his many excellent papers, Thinking Clearly About Performance.
It's an opportunity you don't want to miss. Hope to see you there soon!
Wednesday, March 28, 2012
Mastering Oracle Trace Data with Cary Millsap
Posted by Rob van Wijk at 10:42 PM (2 comments)
Tuesday, March 27, 2012
Third OGh APEX Day
Yesterday was the third annual APEX day, organized by the Dutch Oracle user group OGh. It's the biggest APEX-only event in the world, I've been told, with approximately 280 attendees. Learco Brizzi, Marti Koppelmans and I were very proud to again have a great lineup of presenters and presentations.
The day started with a keynote by Patrick Wolf, showing and telling about a lot of new 4.2 features. Of course, he could not be sure that every single feature will eventually make it into the 4.2 release. The APEX team has focused on mobile development. I was most impressed by the demo showing how relatively easy it will be to develop a great looking iPhone application: write a query, select a list type, and voilà. And I think this is why the Oracle community likes APEX so much: it makes developing applications easy. It reminded me of this presentation by Kathy Sierra: it makes you feel awesome.
Next up were the three parallel tracks. I heard quite a lot of people saying they were having a hard time making choices between great presentations, and I was no exception to that rule.
I saw Roel Hartman's "5 Cool Things you can do with HTML 5". A great presentation, especially the first real slide :-); I loved the websockets demo with Beaconpush, and I learned a lot.
The next presentation was John Scott's "APEX 4 - Error Handling enhancements". Having seen quite a number of John's presentations in the past, I know he always takes a subject to the next level, with last year's translation plugin as a highlight. This time he showed how easy it is to automatically log any errors in JIRA, so users don't have to report errors anymore: you already know about them. Unfortunately, his last demo didn't work. He wanted to show how to automatically add a screenshot from the user's browser to the JIRA ticket.
The fourth presentation was Sergei Martens' "Building desktop-like web applications with Ext JS & Apex". He was very enthusiastic about APEX applications using Ext JS, but he didn't quite succeed in explaining to me which aspects he liked so much. I agree some of the built-in themes are quite boring, but I don't think the video showed a "sexy" theme. To me, a great design is a simple design, but that's a matter of personal preference. Next, he had severe trouble with a very slow Windows (a pleonasm?) laptop. This must be a presenter's nightmare, especially since he seemed well prepared, but it was out of his control. He handled the situation remarkably well, though.
The last presentation of the day was Margreet den Hartigh's and Alex Nuijten's "From Zero to APEX": a customer story about a Uniface application that was rebuilt using APEX, by a team of Uniface developers who not only had to learn APEX, but also PL/SQL, HTML, CSS, Subversion, Javascript and a lot more. Almost all Dutch municipalities use this application. It looks like they could gain a lot more by consolidating the databases of all municipalities into a single hosted database, with some partitioning, VPD and the resource manager. The cost of ownership would decrease dramatically that way. Free tip from me :-)
A nice dinner with several visitors and presenters ended a great APEX day.
Posted by Rob van Wijk at 9:38 PM (0 comments)
Sunday, March 25, 2012
Connect By Filtering
A hierarchical query is typically executed using a plan that starts with the operation CONNECT BY WITH FILTERING, which has two child operations. The first child operation implements the START WITH clause, and the second child operation contains a step called CONNECT BY PUMP, implementing the recursive part of the query. Here is an example of such a plan, using the well-known hierarchical query on table EMP:
SQL> select lpad(' ', 2 * level - 2, ' ') || ename as ename
2 , level
3 , job
4 , deptno
5 from emp
6 connect by mgr = prior empno
7 start with mgr is null
8 /
ENAME LEVEL JOB DEPTNO
-------------------- ---------- --------------------------- ----------
KING 1 PRESIDENT 10
JONES 2 MANAGER 20
SCOTT 3 ANALYST 20
ADAMS 4 CLERK 20
FORD 3 ANALYST 20
SMITH 4 CLERK 20
BLAKE 2 MANAGER 30
ALLEN 3 SALESMAN 30
WARD 3 SALESMAN 30
MARTIN 3 SALESMAN 30
TURNER 3 SALESMAN 30
JAMES 3 CLERK 30
CLARK 2 MANAGER 10
MILLER 3 CLERK 10
14 rows selected.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
SQL_ID d2c7xqxbr112u, child number 0
-------------------------------------
select lpad(' ', 2 * level - 2, ' ') || ename as ename , level , job , deptno from emp connect by
mgr = prior empno start with mgr is null
Plan hash value: 1869448388
--------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------
|* 1 | CONNECT BY WITH FILTERING| | 1 | | 14 |00:00:00.02 | 15 | 6 | 2048 | 2048 | 2048 (0)|
|* 2 | TABLE ACCESS FULL | EMP | 1 | 1 | 1 |00:00:00.01 | 3 | 6 | | | |
|* 3 | HASH JOIN | | 4 | | 13 |00:00:00.01 | 12 | 0 | 1452K| 1452K| 853K (0)|
| 4 | CONNECT BY PUMP | | 4 | | 14 |00:00:00.01 | 0 | 0 | | | |
| 5 | TABLE ACCESS FULL | EMP | 4 | 2 | 56 |00:00:00.01 | 12 | 0 | | | |
--------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("MGR"=PRIOR NULL)
2 - filter("MGR" IS NULL)
3 - access("MGR"=PRIOR NULL)
24 rows selected.
You can see a great and more detailed explanation of connect by with filtering here on Christian Antognini's blog.
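Conceptually, this plan does the same work you could spell out yourself with the recursive subquery factoring clause: the START WITH child is the anchor member, and the join with CONNECT BY PUMP is the recursive member. A hedged sketch of that equivalent (11.2 syntax; the output order may differ unless you add a SEARCH DEPTH FIRST clause):

```sql
with hierarchy (ename, empno, job, deptno, lvl) as
( -- anchor member: corresponds to the START WITH child operation
  select ename, empno, job, deptno, 1
  from   emp
  where  mgr is null
  union all
  -- recursive member: corresponds to the join with CONNECT BY PUMP
  select e.ename, e.empno, e.job, e.deptno, h.lvl + 1
  from   hierarchy h
  join   emp e
  on     e.mgr = h.empno
)
select lpad(' ', 2 * lvl - 2, ' ') || ename as ename
,      lvl
,      job
,      deptno
from   hierarchy
/
```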
When I was researching the new recursive subquery factoring clause one and a half years ago, comparing a standard hierarchical query on EMP using recursive subquery factoring with one using the good old connect by, I stumbled upon a new optimizer algorithm for implementing recursive queries:
SQL> select lpad(' ', 2 * level - 2, ' ') || ename as ename
2 , level
3 , job
4 , deptno
5 from emp
6 connect by mgr = prior empno
7 start with mgr is null
8 /
ENAME LEVEL JOB DEPTNO
-------------------- ---------- --------- ----------
KING 1 PRESIDENT 10
JONES 2 MANAGER 20
SCOTT 3 ANALYST 20
ADAMS 4 CLERK 20
FORD 3 ANALYST 20
SMITH 4 CLERK 20
BLAKE 2 MANAGER 30
ALLEN 3 SALESMAN 30
WARD 3 SALESMAN 30
MARTIN 3 SALESMAN 30
TURNER 3 SALESMAN 30
JAMES 3 CLERK 30
CLARK 2 MANAGER 10
MILLER 3 CLERK 10
14 rows selected.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
SQL_ID d2c7xqxbr112u, child number 0
-------------------------------------
select lpad(' ', 2 * level - 2, ' ') || ename as ename , level
, job , deptno from emp connect by mgr = prior empno
start with mgr is null
Plan hash value: 763482334
-------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
-------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 14 |00:00:00.02 | 6 | 6 |
|* 1 | CONNECT BY NO FILTERING WITH START-WITH| | 1 | | 14 |00:00:00.02 | 6 | 6 |
| 2 | TABLE ACCESS FULL | EMP | 1 | 14 | 14 |00:00:00.02 | 6 | 6 |
-------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("MGR"=PRIOR NULL)
filter("MGR" IS NULL)
22 rows selected.
You might wonder what I did to make two exactly identical queries use different execution plans, but I'll address that later. First, I'd like to show that there are two optimizer hints available with which you can control which algorithm the optimizer uses:
SQL> select *
2 from v$sql_hint
3 where name like '%CONNECT_BY_FILTERING%'
4 /
NAME SQL_FEATURE CLASS
----------------------- ------------ -----------------------
INVERSE TARGET_LEVEL PROPERTY VERSION VERSION_OUTLINE
----------------------- ------------ ---------- ---------- ---------------
CONNECT_BY_FILTERING QKSFM_ALL CONNECT_BY_FILTERING
NO_CONNECT_BY_FILTERING 2 16 10.2.0.2 10.2.0.2
NO_CONNECT_BY_FILTERING QKSFM_ALL CONNECT_BY_FILTERING
CONNECT_BY_FILTERING 2 16 10.2.0.2 10.2.0.2
2 rows selected.
And this surprised me. As the version column suggests, the no_connect_by_filtering hint and the accompanying new algorithm were already introduced in version 10.2.0.2! I checked with my old 10.2.0.4 database, and the algorithm is indeed present and usable there:
SQL> select version
2 from v$instance
3 /
VERSION
---------------------------------------------------
10.2.0.4.0
1 row selected.
SQL> select /*+ no_connect_by_filtering gather_plan_statistics */
2 lpad(' ', 2 * level - 2, ' ') || ename as ename
3 , level
4 , job
5 , deptno
6 from emp
7 connect by mgr = prior empno
8 start with mgr is null
9 /
ENAME LEVEL JOB DEPTNO
-------------------- ---------- --------------------------- ----------
KING 1 PRESIDENT 10
JONES 2 MANAGER 20
SCOTT 3 ANALYST 20
ADAMS 4 CLERK 20
FORD 3 ANALYST 20
SMITH 4 CLERK 20
BLAKE 2 MANAGER 30
ALLEN 3 SALESMAN 30
WARD 3 SALESMAN 30
MARTIN 3 SALESMAN 30
TURNER 3 SALESMAN 30
JAMES 3 CLERK 30
CLARK 2 MANAGER 10
MILLER 3 CLERK 10
14 rows selected.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 39kr5s8dxz7j0, child number 0
-------------------------------------
select /*+ no_connect_by_filtering gather_plan_statistics */ lpad(' ', 2 * level - 2, '
') || ename as ename , level , job , deptno from emp connect by mgr = prior
empno start with mgr is null
Plan hash value: 763482334
----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------------
|* 1 | CONNECT BY NO FILTERING WITH START-WITH| | 1 | | 14 |00:00:00.01 | 3 |
| 2 | TABLE ACCESS FULL | EMP | 1 | 14 | 14 |00:00:00.01 | 3 |
----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("MGR"=PRIOR NULL)
filter("MGR" IS NULL)
21 rows selected.
But you need the no_connect_by_filtering hint in version 10.2.0.4 for this query. If you do not provide the hint, this is the result:
SQL> select /*+ gather_plan_statistics */
2 lpad(' ', 2 * level - 2, ' ') || ename as ename
3 , level
4 , job
5 , deptno
6 from emp
7 connect by mgr = prior empno
8 start with mgr is null
9 /
ENAME LEVEL JOB DEPTNO
-------------------- ---------- --------------------------- ----------
KING 1 PRESIDENT 10
JONES 2 MANAGER 20
SCOTT 3 ANALYST 20
ADAMS 4 CLERK 20
FORD 3 ANALYST 20
SMITH 4 CLERK 20
BLAKE 2 MANAGER 30
ALLEN 3 SALESMAN 30
WARD 3 SALESMAN 30
MARTIN 3 SALESMAN 30
TURNER 3 SALESMAN 30
JAMES 3 CLERK 30
CLARK 2 MANAGER 10
MILLER 3 CLERK 10
14 rows selected.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 6zhtnf720u0bm, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ lpad(' ', 2 * level - 2, ' ') || ename as ename , level
, job , deptno from emp connect by mgr = prior empno start with mgr is null
Plan hash value: 1869448388
-----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------
|* 1 | CONNECT BY WITH FILTERING| | 1 | | 14 |00:00:00.01 | 15 | 2048 | 2048 | 2048 (0)|
|* 2 | TABLE ACCESS FULL | EMP | 1 | 1 | 1 |00:00:00.01 | 3 | | | |
|* 3 | HASH JOIN | | 4 | | 13 |00:00:00.01 | 12 | 1452K| 1452K| 843K (0)|
| 4 | CONNECT BY PUMP | | 4 | | 14 |00:00:00.01 | 0 | | | |
| 5 | TABLE ACCESS FULL | EMP | 4 | 2 | 56 |00:00:00.01 | 12 | | | |
-----------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("MGR"=PRIOR NULL)
2 - filter("MGR" IS NULL)
3 - access("MGR"=PRIOR NULL)
24 rows selected.
That explains why I didn't see CONNECT BY NO FILTERING WITH START-WITH earlier. It seems Oracle adjusted the cost calculation of connect by queries somewhere between 10.2.0.4 and 11.2.0.2. Just look at the cost in both execution plans on 10.2.0.4, using a regular explain plan statement and "select * from table(dbms_xplan.display)":
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 50 | 3 (0)| 00:00:01 |
|* 1 | CONNECT BY WITH FILTERING| | | | | |
|* 2 | TABLE ACCESS FULL | EMP | 1 | 29 | 3 (0)| 00:00:01 |
|* 3 | HASH JOIN | | | | | |
| 4 | CONNECT BY PUMP | | | | | |
| 5 | TABLE ACCESS FULL | EMP | 2 | 50 | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 350 | 3 (0)| 00:00:01 |
|* 1 | CONNECT BY NO FILTERING WITH START-WITH| | | | | |
| 2 | TABLE ACCESS FULL | EMP | 14 | 350 | 3 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
The cost of 3 is due to the full table scan of EMP, and no additional cost is added for the hierarchical query.
These are the plans from 11.2.0.2:
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 156 | 15 (20)| 00:00:01 |
|* 1 | CONNECT BY WITH FILTERING| | | | | |
|* 2 | TABLE ACCESS FULL | EMP | 1 | 25 | 4 (0)| 00:00:01 |
|* 3 | HASH JOIN | | 2 | 76 | 9 (12)| 00:00:01 |
| 4 | CONNECT BY PUMP | | | | | |
|* 5 | TABLE ACCESS FULL | EMP | 13 | 325 | 4 (0)| 00:00:01 |
----------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 728 | 5 (20)| 00:00:01 |
|* 1 | CONNECT BY NO FILTERING WITH START-WITH| | | | | |
| 2 | TABLE ACCESS FULL | EMP | 14 | 350 | 4 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
The numbers from 11.2.0.2 show more sophistication than just the cost of the table scan. The optimizer can't know how many levels deep the data goes, but version 10.2.0.4 apparently assumed one level and left the total cost at 3. I'm curious in which version between 10.2.0.4 and 11.2.0.2 this cost calculation changed. If anyone reading this has a version in between and would like to check, please let me know in the comments. My guess would be that 11.2.0.1 introduced the cost change.
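Checking on your own version takes just a couple of statements (a sketch, assuming the standard SCOTT.EMP table is available):

```sql
explain plan for
select ename
,      level
from   emp
connect by mgr = prior empno
start with mgr is null
/
select * from table(dbms_xplan.display)
/
```

If the plan shows CONNECT BY NO FILTERING WITH START-WITH with a cost above that of the plain full table scan, your version already contains the adjusted cost calculation.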
What does CONNECT BY NO FILTERING WITH START-WITH do?
Let's explore this using the following table:
SQL> create table t (id, parent_id, value, indicator)
2 as
3 select level - 1
4 , case level when 1 then null else trunc((level-1)/10) end
5 , round(dbms_random.value * 1000)
6 , case mod(level,10) when 4 then 'N' else 'Y' end
7 from dual
8 connect by level <= 100000
9 /
Table created.
SQL> alter table t
2 add constraint cbt_pk
3 primary key (id)
4 /
Table altered.
SQL> create index i1 on t (parent_id,indicator)
2 /
Index created.
SQL> exec dbms_stats.gather_table_stats(user,'t',cascade=>true)
The data is tree shaped: the root node has nine child nodes, and every other parent node has exactly ten. One tenth of the rows, those with an id ending in the digit 3, have their indicator column set to 'N'. This query makes it clearer what the data looks like:
SQL> select *
2 from t
3 where id < 24 or id > 99997
4 order by id
5 /
ID PARENT_ID VALUE I
---------- ---------- ---------- -
0 656 Y
1 0 289 Y
2 0 365 Y
3 0 644 N
4 0 364 Y
5 0 841 Y
6 0 275 Y
7 0 529 Y
8 0 500 Y
9 0 422 Y
10 1 598 Y
11 1 104 Y
12 1 467 Y
13 1 296 N
14 1 105 Y
15 1 220 Y
16 1 692 Y
17 1 793 Y
18 1 29 Y
19 1 304 Y
20 2 467 Y
21 2 716 Y
22 2 837 Y
23 2 432 N
99998 9999 609 Y
99999 9999 24 Y
26 rows selected.
When I hear the word "filter", I almost immediately associate it with a WHERE clause. But a WHERE clause in a connect by query is not what is meant by connect by filtering. The documentation states:
Oracle processes hierarchical queries as follows:
1. A join, if present, is evaluated first, whether the join is specified in the FROM clause or with WHERE clause predicates.
2. The CONNECT BY condition is evaluated.
3. Any remaining WHERE clause predicates are evaluated.
So a WHERE clause predicate is evaluated AFTER the connect by has done its job. You can see that happening here:
SQL> explain plan
2 for
3 select id
4 , parent_id
5 , sys_connect_by_path(id,'->') scbp
6 from t
7 where indicator = 'N'
8 connect by parent_id = prior id
9 start with parent_id is null
10 /
Explained.
SQL> select * from table(dbms_xplan.display)
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2502271019
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 11 | 319 | 164 (3)| 00:00:02 |
|* 1 | FILTER | | | | | |
|* 2 | CONNECT BY WITH FILTERING | | | | | |
|* 3 | TABLE ACCESS FULL | T | 1 | 11 | 80 (2)| 00:00:01 |
| 4 | NESTED LOOPS | | 10 | 240 | 82 (2)| 00:00:01 |
| 5 | CONNECT BY PUMP | | | | | |
| 6 | TABLE ACCESS BY INDEX ROWID| T | 10 | 110 | 2 (0)| 00:00:01 |
|* 7 | INDEX RANGE SCAN | I1 | 10 | | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("INDICATOR"='N')
2 - access("PARENT_ID"=PRIOR "ID")
3 - filter("PARENT_ID" IS NULL)
7 - access("PARENT_ID"="connect$_by$_pump$_002"."prior id ")
22 rows selected.
The "indicator = 'N'" predicate is at step 1, which is executed after the CONNECT BY WITH FILTERING at step 2. Note that although this query is executed in 11.2.0.2, the optimizer has chosen the old CONNECT BY WITH FILTERING.
Connect by filtering means filtering inside the CONNECT BY clause itself. Note that this gives a semantically different query: a predicate inside the CONNECT BY prunes whole subtrees during the recursion, while a WHERE clause predicate only removes rows after the hierarchy has been built. Here is an example using the predicate "indicator = 'N'" inside the CONNECT BY clause:
SQL> select id
2 , parent_id
3 , sys_connect_by_path(id,'->') scbp
4 from t
5 connect by parent_id = prior id
6 and indicator = 'N'
7 start with parent_id is null
8 /
ID PARENT_ID SCBP
---------- ---------- --------------------------------------------------
0 ->0
3 0 ->0->3
33 3 ->0->3->33
333 33 ->0->3->33->333
3333 333 ->0->3->33->333->3333
33333 3333 ->0->3->33->333->3333->33333
6 rows selected.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
SQL_ID dzkjzrrzgnvd5, child number 0
-------------------------------------
select id , parent_id , sys_connect_by_path(id,'->') scbp
from t connect by parent_id = prior id and indicator = 'N'
start with parent_id is null
Plan hash value: 3164577763
---------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 6 |00:00:00.01 | 294 | | | |
|* 1 | CONNECT BY WITH FILTERING | | 1 | | 6 |00:00:00.01 | 294 | 2048 | 2048 | 2048 (0)|
|* 2 | TABLE ACCESS FULL | T | 1 | 1 | 1 |00:00:00.01 | 277 | | | |
| 3 | NESTED LOOPS | | 6 | 5 | 5 |00:00:00.01 | 17 | | | |
| 4 | CONNECT BY PUMP | | 6 | | 6 |00:00:00.01 | 0 | | | |
| 5 | TABLE ACCESS BY INDEX ROWID| T | 6 | 5 | 5 |00:00:00.01 | 17 | | | |
|* 6 | INDEX RANGE SCAN | I1 | 6 | 5 | 5 |00:00:00.01 | 12 | | | |
---------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("PARENT_ID"=PRIOR NULL)
2 - filter("PARENT_ID" IS NULL)
6 - access("PARENT_ID"="connect$_by$_pump$_002"."prior id " AND "INDICATOR"='N')
27 rows selected.
In the A-Rows column, you can see that the connect by filtering was effective here: only the necessary rows were read. And this is the key difference between the two connect by algorithms: with CONNECT BY WITH FILTERING, you can filter within each recursion step, whereas CONNECT BY NO FILTERING WITH START-WITH has to read everything, perform the recursion in memory, and then return the result. With this example, the latter is much less efficient:
SQL> select /*+ no_connect_by_filtering */ id
2 , parent_id
3 , sys_connect_by_path(id,'->') scbp
4 from t
5 connect by parent_id = prior id
6 and indicator = 'N'
7 start with parent_id is null
8 /
ID PARENT_ID SCBP
---------- ---------- --------------------------------------------------
0 ->0
3 0 ->0->3
33 3 ->0->3->33
333 33 ->0->3->33->333
3333 333 ->0->3->33->333->3333
33333 3333 ->0->3->33->333->3333->33333
6 rows selected.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 3fcr31tp83by9, child number 0
-------------------------------------
select /*+ no_connect_by_filtering */ id , parent_id ,
sys_connect_by_path(id,'->') scbp from t connect by parent_id =
prior id and indicator = 'N' start with parent_id is null
Plan hash value: 2303479083
----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 6 |00:00:00.14 | 277 |
|* 1 | CONNECT BY NO FILTERING WITH START-WITH| | 1 | | 6 |00:00:00.14 | 277 |
| 2 | TABLE ACCESS FULL | T | 1 | 100K| 100K|00:00:00.01 | 277 |
----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("PARENT_ID"=PRIOR NULL)
filter("PARENT_ID" IS NULL)
22 rows selected.
100K rows were read, and the A-Time was 0.14 seconds instead of 0.01 seconds. I wondered where those 0.14 seconds went, since the plan shows they were NOT spent on the full table scan. Using Tom Kyte's runstats_pkg reveals this:
SQL> declare
2 cursor c1
3 is
4 select /*+ connect_by_filtering */ id
5 , parent_id
6 , sys_connect_by_path(id,'->') scbp
7 from t
8 connect by parent_id = prior id
9 and indicator = 'N'
10 start with parent_id is null
11 ;
12 cursor c2
13 is
14 select /*+ no_connect_by_filtering */ id
15 , parent_id
16 , sys_connect_by_path(id,'->') scbp
17 from t
18 connect by parent_id = prior id
19 and indicator = 'N'
20 start with parent_id is null
21 ;
22 begin
23 runstats_pkg.rs_start;
24 for r in c1 loop null; end loop;
25 runstats_pkg.rs_middle;
26 for r in c2 loop null; end loop;
27 runstats_pkg.rs_stop;
28 end;
29 /
Run1 ran in 0 hsecs
Run2 ran in 10 hsecs
run 1 ran in 0% of the time
Name Run1 Run2 Diff
STAT...HSC Heap Segment Block 16 15 -1
STAT...db block changes 48 47 -1
STAT...consistent gets - exami 9 8 -1
STAT...db block gets from cach 32 33 1
STAT...db block gets 32 33 1
STAT...redo subscn max counts 0 1 1
STAT...redo ordering marks 0 1 1
STAT...redo entries 16 15 -1
STAT...calls to kcmgas 0 1 1
STAT...calls to kcmgcs 29 28 -1
STAT...free buffer requested 0 1 1
STAT...Heap Segment Array Inse 16 15 -1
STAT...consistent changes 32 31 -1
STAT...heap block compress 9 8 -1
STAT...parse time cpu 1 0 -1
STAT...buffer is pinned count 1 0 -1
STAT...session cursor cache co 1 0 -1
STAT...sql area evicted 1 0 -1
LATCH.undo global data 11 10 -1
LATCH.SQL memory manager worka 3 5 2
LATCH.messages 0 2 2
LATCH.OS process allocation 0 2 2
LATCH.simulator hash latch 20 23 3
LATCH.object queue header oper 4 1 -3
STAT...workarea executions - o 10 6 -4
STAT...table fetch by rowid 15 10 -5
STAT...index scans kdiixs1 6 0 -6
LATCH.row cache objects 280 274 -6
STAT...sorts (memory) 8 2 -6
STAT...CPU used by this sessio 2 11 9
STAT...Elapsed Time 1 11 10
STAT...recursive cpu usage 2 12 10
STAT...no work - consistent re 300 284 -16
STAT...buffer is not pinned co 36 20 -16
STAT...session logical reads 354 337 -17
STAT...consistent gets from ca 313 296 -17
STAT...consistent gets from ca 322 304 -18
LATCH.shared pool 186 168 -18
STAT...consistent gets 322 304 -18
LATCH.shared pool simulator 23 4 -19
LATCH.cache buffers chains 785 740 -45
STAT...undo change vector size 3,500 3,420 -80
STAT...redo size 4,652 4,560 -92
STAT...session uga memory 0 -65,488 -65,488
STAT...session pga memory 0 -65,536 -65,536
STAT...sorts (rows) 12 100,001 99,989
Run1 latches total versus runs -- difference and pct
Run1 Run2 Diff Pct
1,467 1,384 -83 106.00%
PL/SQL procedure successfully completed
The major difference is the number of rows sorted! CONNECT BY NO FILTERING WITH START-WITH sorts all 100K rows. This is a surprise, because normally when you sort, you use memory from the PGA workarea, which shows up in the memory statistics (OMem, 1Mem, Used-Mem) of your execution plan. But the no filtering plan did not show those statistics. I have no explanation for this phenomenon yet.
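One place to dig further, sketched here, is V$SQL_WORKAREA, which lists the work areas per plan operation for a cursor still in the shared pool; the sql_id below is the one of the no filtering cursor shown above:

```sql
select operation_type
,      operation_id
,      last_memory_used
from   v$sql_workarea
where  sql_id = '3fcr31tp83by9'
/
```

If the sort doesn't show up there either, it apparently isn't executed as a regular SQL work area.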
Let's zoom in on the sorting:
SQL> select sn.name
2 , ms.value
3 from v$mystat ms
4 , v$statname sn
5 where ms.statistic# = sn.statistic#
6 and sn.name like '%sort%'
7 /
NAME VALUE
----------------------- ----------
sorts (memory) 2278
sorts (disk) 0
sorts (rows) 9425510
3 rows selected.
SQL> select id
2 , parent_id
3 , sys_connect_by_path(id,'->') scbp
4 from t
5 connect by parent_id = prior id
6 and indicator = 'N'
7 start with parent_id is null
8 /
ID PARENT_ID SCBP
---------- ---------- --------------------------------------------------
0 ->0
3 0 ->0->3
33 3 ->0->3->33
333 33 ->0->3->33->333
3333 333 ->0->3->33->333->3333
33333 3333 ->0->3->33->333->3333->33333
6 rows selected.
SQL> select sn.name
2 , ms.value
3 from v$mystat ms
4 , v$statname sn
5 where ms.statistic# = sn.statistic#
6 and sn.name like '%sort%'
7 /
NAME VALUE
----------------------- ----------
sorts (memory) 2286
sorts (disk) 0
sorts (rows) 9425522
3 rows selected.
SQL> select /*+ no_connect_by_filtering */ id
2 , parent_id
3 , sys_connect_by_path(id,'->') scbp
4 from t
5 connect by parent_id = prior id
6 and indicator = 'N'
7 start with parent_id is null
8 /
ID PARENT_ID SCBP
---------- ---------- --------------------------------------------------
0 ->0
3 0 ->0->3
33 3 ->0->3->33
333 33 ->0->3->33->333
3333 333 ->0->3->33->333->3333
33333 3333 ->0->3->33->333->3333->33333
6 rows selected.
SQL> select sn.name
2 , ms.value
3 from v$mystat ms
4 , v$statname sn
5 where ms.statistic# = sn.statistic#
6 and sn.name like '%sort%'
7 /
NAME VALUE
----------------------- ----------
sorts (memory) 2288
sorts (disk) 0
sorts (rows) 9525523
3 rows selected.
So CONNECT BY WITH FILTERING did 8 sorts (2286 - 2278) and sorted 12 rows (9425522 - 9425510), whereas CONNECT BY NO FILTERING WITH START-WITH did 2 sorts (2288 - 2286) and sorted 100,001 rows (9525523 - 9425522).
And finally, I promised to explain why the first two queries of this blogpost are identical yet show different execution plans. The reason is simple: the first one was executed on 10.2.0.4 and the second one on 11.2.0.2.
Posted by Rob van Wijk at 11:50 PM (6 comments)
Sunday, September 25, 2011
EXISTS
Most of us know the SQL keyword EXISTS as the condition with a subquery that you use in a WHERE clause. But if you look at the documentation of EXISTS, you'll see that it says nothing about using it only in a WHERE clause. It's just a sort of function that accepts a query as input and returns "TRUE if a subquery returns at least one row". The SQL language doesn't know about booleans; it calls these "conditions". And according to another part of the documentation, about SQL conditions: You can use a condition in the WHERE clause of these statements:
DELETE
SELECT
UPDATE
You can use a condition in any of these clauses of the SELECT statement:
WHERE
START WITH
CONNECT BY
HAVING
So you might be tempted to believe it is not possible to use EXISTS in the SELECT clause. However, a CASE expression, more specifically the searched case expression, expects a "condition" after the WHEN, as you can see in the syntax diagram. That means we can put an EXISTS condition after the WHEN.
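In its simplest form that looks like this (a sketch; the parent and child table and column names are purely illustrative):

```sql
select p.id
,      case
         when exists ( select null
                       from   child c
                       where  c.parent_id = p.id
                     )
         then 'yes'
         else 'no'
       end as has_children
from   parent p
/
```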
An example of this came up recently when a colleague asked me to help with a query in his conversion project. His setup resembles this test case:
SQL> create table t1 (id)
2 as
3 select level
4 from dual
5 connect by level <= 4
6 /
Table created.
SQL> alter table t1 add constraint t1_pk primary key (id)
2 /
Table altered.
SQL> exec dbms_stats.gather_table_stats(user,'t1',cascade=>true)
PL/SQL procedure successfully completed.
SQL> create table t2 (id, t1_id, status, filler)
2 as
3 select 1, 1, 'w', lpad('*',1000,'*') from dual union all
4 select 2, 2, 'x', lpad('*',1000,'*') from dual union all
5 select 3, 2, 'y', lpad('*',1000,'*') from dual union all
6 select 4, 3, 'w', lpad('*',1000,'*') from dual union all
7 select 5, 3, 'x', lpad('*',1000,'*') from dual union all
8 select 6, 3, 'z', lpad('*',1000,'*') from dual union all
9 select 7, 4, 'w', lpad('*',1000,'*') from dual union all
10 select 8, 4, 'z', lpad('*',1000,'*') from dual
11 /
Table created.
SQL> alter table t2 add constraint t2_pk primary key (id)
2 /
Table altered.
SQL> alter table t2 add constraint t2_t1_fk foreign key (t1_id) references t1(id)
2 /
Table altered.
SQL> create index t2_i on t2 (status,t1_id)
2 /
Index created.
SQL> exec dbms_stats.gather_table_stats(user,'t2',cascade=>true)
PL/SQL procedure successfully completed.
Two tables with a master-detail relationship. The question was to come up with a query that selects all rows of t1 and, if there is at least one row in t2 with status x or y, prints 'yes' in an extra column, and 'no' otherwise.
The first query I came up with used a simple outer join, removing the duplicates with a DISTINCT.
SQL> select /*+ gather_plan_statistics */
2 distinct t1.id
3 , nvl2(t2.id,'yes','no') exists_t2
4 from t1
5 left outer join t2
6 on ( t1.id = t2.t1_id
7 and t2.status in ('x','y')
8 )
9 order by t1.id
10 /
ID EXISTS_T2
----- ----------
1 no
2 yes
3 yes
4 no
4 rows selected.
And it is done in 4 consistent gets using this plan:
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 2mtksgp0pzx59, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ distinct t1.id ,
nvl2(t2.id,'yes','no') exists_t2 from t1 left outer join t2
on ( t1.id = t2.t1_id and t2.status in ('x','y')
) order by t1.id
Plan hash value: 1021492722
------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 4 |00:00:00.01 | 4 | | | |
| 1 | SORT UNIQUE | | 1 | 4 | 4 |00:00:00.01 | 4 | 2048 | 2048 | 2048 (0)|
| 2 | MERGE JOIN OUTER | | 1 | 4 | 5 |00:00:00.01 | 4 | | | |
| 3 | INDEX FULL SCAN | T1_PK | 1 | 4 | 4 |00:00:00.01 | 1 | | | |
|* 4 | SORT JOIN | | 4 | 4 | 3 |00:00:00.01 | 3 | 2048 | 2048 | 2048 (0)|
| 5 | INLIST ITERATOR | | 1 | | 3 |00:00:00.01 | 3 | | | |
| 6 | TABLE ACCESS BY INDEX ROWID| T2 | 2 | 4 | 3 |00:00:00.01 | 3 | | | |
|* 7 | INDEX RANGE SCAN | T2_I | 2 | 4 | 3 |00:00:00.01 | 2 | | | |
------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("T1"."ID"="T2"."T1_ID")
filter("T1"."ID"="T2"."T1_ID")
7 - access(("T2"."STATUS"='x' OR "T2"."STATUS"='y'))
29 rows selected.
Later, I offered an alternative, after realizing I can also use EXISTS in the SELECT clause, like this:
SQL> select /*+ gather_plan_statistics */
2 t1.id
3 , case
4 when exists
5 ( select ''
6 from t2
7 where t2.t1_id = t1.id
8 and t2.status in ('x','y')
9 )
10 then 'yes'
11 else 'no'
12 end exists_t2
13 from t1
14 order by t1.id
15 /
ID EXISTS_T2
----- ----------
1 no
2 yes
3 yes
4 no
4 rows selected.
So that's proof that you can use an EXISTS condition in a SELECT clause. But this query uses 8 consistent gets (6+2):
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
SQL_ID b13u3f5m9ybuu, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ t1.id , case
when exists ( select '' from t2 where
t2.t1_id = t1.id and t2.status in ('x','y') )
then 'yes' else 'no' end exists_t2 from t1 order
by t1.id
Plan hash value: 1623664937
-------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 4 |00:00:00.01 | 2 |
| 1 | INLIST ITERATOR | | 4 | | 2 |00:00:00.01 | 6 |
|* 2 | INDEX RANGE SCAN| T2_I | 6 | 1 | 2 |00:00:00.01 | 6 |
| 3 | INDEX FULL SCAN | T1_PK | 1 | 4 | 4 |00:00:00.01 | 2 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access((("T2"."STATUS"='x' OR "T2"."STATUS"='y')) AND
"T2"."T1_ID"=:B1)
25 rows selected.
So with this data set it's not really interesting, although the plan has fewer operations. With other data distributions it can be useful, though. For example, after blowing up the t2 table:
SQL> insert into t2
2 select n * 1000 + t2.id
3 , t2.t1_id
4 , t2.status
5 , t2.filler
6 from t2
7 cross join (select level n from dual connect by level <= 1000)
8 /
8000 rows created.
SQL> exec dbms_stats.gather_table_stats(user,'t2',cascade=>true,method_opt=>'for all columns size auto')
PL/SQL procedure successfully completed.
Now the left outer join with the DISTINCT has to join the 4 rows of t1 with 3003 rows of t2, and the query uses 3017 consistent gets:
SQL> select /*+ gather_plan_statistics */
2 distinct t1.id
3 , nvl2(t2.id,'yes','no') exists_t2
4 from t1
5 left outer join t2
6 on ( t1.id = t2.t1_id
7 and t2.status in ('x','y')
8 )
9 order by t1.id
10 /
ID EXISTS_T2
----- ----------
1 no
2 yes
3 yes
4 no
4 rows selected.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 2mtksgp0pzx59, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ distinct t1.id ,
nvl2(t2.id,'yes','no') exists_t2 from t1 left outer join t2
on ( t1.id = t2.t1_id and t2.status in ('x','y')
) order by t1.id
Plan hash value: 1021492722
------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 4 |00:00:00.01 | 3017 | | | |
| 1 | SORT UNIQUE | | 1 | 4 | 4 |00:00:00.01 | 3017 | 2048 | 2048 | 2048 (0)|
| 2 | MERGE JOIN OUTER | | 1 | 4 | 3005 |00:00:00.01 | 3017 | | | |
| 3 | INDEX FULL SCAN | T1_PK | 1 | 4 | 4 |00:00:00.01 | 1 | | | |
|* 4 | SORT JOIN | | 4 | 4 | 3003 |00:00:00.01 | 3016 | 115K| 115K| 102K (0)|
| 5 | INLIST ITERATOR | | 1 | | 3003 |00:00:00.01 | 3016 | | | |
| 6 | TABLE ACCESS BY INDEX ROWID| T2 | 2 | 4 | 3003 |00:00:00.01 | 3016 | | | |
|* 7 | INDEX RANGE SCAN | T2_I | 2 | 4 | 3003 |00:00:00.01 | 13 | | | |
------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("T1"."ID"="T2"."T1_ID")
filter("T1"."ID"="T2"."T1_ID")
7 - access(("T2"."STATUS"='x' OR "T2"."STATUS"='y'))
29 rows selected.
And the second query is now a whole lot more efficient, with only 15 consistent gets:
SQL> select /*+ gather_plan_statistics */
2 t1.id
3 , case
4 when exists
5 ( select ''
6 from t2
7 where t2.t1_id = t1.id
8 and t2.status in ('x','y')
9 )
10 then 'yes'
11 else 'no'
12 end exists_t2
13 from t1
14 order by t1.id
15 /
ID EXISTS_T2
----- ----------
1 no
2 yes
3 yes
4 no
4 rows selected.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
SQL_ID b13u3f5m9ybuu, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ t1.id , case
when exists ( select '' from t2 where
t2.t1_id = t1.id and t2.status in ('x','y') )
then 'yes' else 'no' end exists_t2 from t1 order
by t1.id
Plan hash value: 1623664937
-------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 4 |00:00:00.01 | 2 |
| 1 | INLIST ITERATOR | | 4 | | 2 |00:00:00.01 | 13 |
|* 2 | INDEX RANGE SCAN| T2_I | 6 | 1 | 2 |00:00:00.01 | 13 |
| 3 | INDEX FULL SCAN | T1_PK | 1 | 4 | 4 |00:00:00.01 | 2 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access((("T2"."STATUS"='x' OR "T2"."STATUS"='y')) AND
"T2"."T1_ID"=:B1)
25 rows selected.
The query selects the four rows of t1 via the INDEX FULL SCAN in line 3. The six starts of line 2 are explained by the INLIST ITERATOR, which splits the t2 access into one probe for status = 'x' and one for status = 'y'. It searches for an 'x' for all four rows of t1. For t1.id 2 and 3 it finds an 'x' value, so it doesn't need to look for a 'y' value. Only for t1.id 1 and 4 does it need to search for a 'y' value.
Note that the data distribution is not the only factor: the index on (status,t1_id) is needed as well to make a difference here.
I guess I just want to say that there is an alternative you might not have thought of initially, and it can be useful in certain situations.
PS: Something completely different: Kamran Agayev has posted an interview with me.
Posted by Rob van Wijk at 12:03 PM | 2 comments
Tuesday, July 12, 2011
Partitioned outer join bug
I ran into a strange bug while constructing an example of how a partitioned outer join works. I used a setup resembling a situation I recently consulted about, involving time series data. The table can hold multiple time series whose data comes from multiple sources. The data itself contains volumes that are reported for every 5 minutes. The table looks like this:
SQL> create table tradevolumes
2 ( timeseries_id number(4)
3 , tradedate date
4 , source_id number(4)
5 , volume number(4)
6 , constraint tradevolumes_pk primary key (timeseries_id,tradedate,source_id)
7 ) organization index
8 /
Table created.
An index organized table, which we fill with some data: 2 time series from 3 sources, and 108 five-minute periods:
SQL> insert into tradevolumes
2 select 1 + mod(level-1,2)
3 , trunc(sysdate) + numtodsinterval(5*mod(trunc((level-1)/2),108),'minute')
4 , ceil(level/(2 * 108))
5 , ceil(dbms_random.value(0,1000))
6 from dual
7 connect by level <= 2 * 108 * 3
8 /
648 rows created.
A common problem for timeseries data is missing data. I'll simulate that by deleting several rows:
SQL> delete tradevolumes
2 where ( tradedate
3 between trunc(sysdate) + interval '35' minute
4 and trunc(sysdate) + interval '50' minute
5 or (timeseries_id = 1 and source_id = 3 and volume < 30)
6 or (timeseries_id = 2 and source_id = 2 and tradedate > trunc(sysdate) + interval '20' minute)
7 )
8 /
127 rows deleted.
And now gather some basic statistics for the table:
SQL> exec dbms_stats.gather_table_stats(user,'tradevolumes')
PL/SQL procedure successfully completed.
To get an idea of the data that is now in the table:
SQL> select *
2 from tradevolumes
3 order by timeseries_id
4 , tradedate
5 , source_id
6 /
TIMESERIES_ID TRADEDATE SOURCE_ID VOLUME
------------- ------------------- ---------- ----------
1 12-07-2011 00:00:00 1 400
1 12-07-2011 00:00:00 2 265
1 12-07-2011 00:00:00 3 307
1 12-07-2011 00:05:00 1 223
1 12-07-2011 00:05:00 2 975
1 12-07-2011 00:05:00 3 173
1 12-07-2011 00:10:00 1 794
1 12-07-2011 00:10:00 2 17
1 12-07-2011 00:10:00 3 468
1 12-07-2011 00:15:00 1 552
1 12-07-2011 00:15:00 2 979
1 12-07-2011 00:15:00 3 84
1 12-07-2011 00:20:00 1 73
1 12-07-2011 00:20:00 2 710
1 12-07-2011 00:20:00 3 54
1 12-07-2011 00:25:00 1 271
1 12-07-2011 00:25:00 2 643
1 12-07-2011 00:25:00 3 443
1 12-07-2011 00:30:00 1 408
1 12-07-2011 00:30:00 2 526
1 12-07-2011 00:30:00 3 725
1 12-07-2011 00:55:00 1 13
1 12-07-2011 00:55:00 2 943
1 12-07-2011 00:55:00 3 319
... skipped a lot of rows here ...
1 12-07-2011 08:50:00 1 239
1 12-07-2011 08:50:00 2 502
1 12-07-2011 08:50:00 3 297
1 12-07-2011 08:55:00 1 620
1 12-07-2011 08:55:00 2 531
1 12-07-2011 08:55:00 3 377
2 12-07-2011 00:00:00 1 778
2 12-07-2011 00:00:00 2 302
2 12-07-2011 00:00:00 3 959
2 12-07-2011 00:05:00 1 920
2 12-07-2011 00:05:00 2 827
2 12-07-2011 00:05:00 3 395
... skipped a lot of rows here ...
2 12-07-2011 08:45:00 1 495
2 12-07-2011 08:45:00 3 519
2 12-07-2011 08:50:00 1 556
2 12-07-2011 08:50:00 3 193
2 12-07-2011 08:55:00 1 368
2 12-07-2011 08:55:00 3 18
521 rows selected.
To show how a partitioned outer join works, my plan is to add all missing key combinations to the result set for the first hour. So the query only uses the times between 0:00 and 0:55, and we'll ignore the rows between 1:00 and 8:55. This is the data for the first hour:
SQL> select *
2 from tradevolumes
3 where tradedate between trunc(sysdate) and trunc(sysdate) + interval '55' minute
4 order by timeseries_id
5 , tradedate
6 , source_id
7 /
TIMESERIES_ID TRADEDATE SOURCE_ID VOLUME
------------- ------------------- ---------- ----------
1 12-07-2011 00:00:00 1 400
1 12-07-2011 00:00:00 2 265
1 12-07-2011 00:00:00 3 307
1 12-07-2011 00:05:00 1 223
1 12-07-2011 00:05:00 2 975
1 12-07-2011 00:05:00 3 173
1 12-07-2011 00:10:00 1 794
1 12-07-2011 00:10:00 2 17
1 12-07-2011 00:10:00 3 468
1 12-07-2011 00:15:00 1 552
1 12-07-2011 00:15:00 2 979
1 12-07-2011 00:15:00 3 84
1 12-07-2011 00:20:00 1 73
1 12-07-2011 00:20:00 2 710
1 12-07-2011 00:20:00 3 54
1 12-07-2011 00:25:00 1 271
1 12-07-2011 00:25:00 2 643
1 12-07-2011 00:25:00 3 443
1 12-07-2011 00:30:00 1 408
1 12-07-2011 00:30:00 2 526
1 12-07-2011 00:30:00 3 725
1 12-07-2011 00:55:00 1 13
1 12-07-2011 00:55:00 2 943
1 12-07-2011 00:55:00 3 319
2 12-07-2011 00:00:00 1 778
2 12-07-2011 00:00:00 2 302
2 12-07-2011 00:00:00 3 959
2 12-07-2011 00:05:00 1 920
2 12-07-2011 00:05:00 2 827
2 12-07-2011 00:05:00 3 395
2 12-07-2011 00:10:00 1 534
2 12-07-2011 00:10:00 2 156
2 12-07-2011 00:10:00 3 2
2 12-07-2011 00:15:00 1 642
2 12-07-2011 00:15:00 2 278
2 12-07-2011 00:15:00 3 76
2 12-07-2011 00:20:00 1 472
2 12-07-2011 00:20:00 2 210
2 12-07-2011 00:20:00 3 549
2 12-07-2011 00:25:00 1 349
2 12-07-2011 00:25:00 3 332
2 12-07-2011 00:30:00 1 371
2 12-07-2011 00:30:00 3 396
2 12-07-2011 00:55:00 1 408
2 12-07-2011 00:55:00 3 818
45 rows selected.
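As a quick refresher before looking at the result: a partitioned outer join logically splits one side of the join into partitions by the PARTITION BY columns and outer joins each partition separately, so every row of the other side appears once per partition. The general shape is as follows (table and column names are hypothetical):

```sql
-- Every day in calendar appears once per product partition of sales;
-- days missing from a partition get null measure columns.
select c.day
     , s.product
     , s.amount
  from calendar c
  left outer join sales s
       partition by (s.product)
    on (c.day = s.day)
/
```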
All 12 five-minute periods should be present for the 2 time series and 3 sources, so we need 72 rows. Now watch the bug with a standard partitioned outer join:
SQL> with all_tradedates as
2 ( select trunc(sysdate) + numtodsinterval(5*(level-1),'minute') tradedate
3 from dual
4 connect by level <= 12
5 )
6 select v.timeseries_id
7 , a.tradedate
8 , v.source_id
9 , v.volume
10 from all_tradedates a
11 left outer join tradevolumes v
12 partition by (v.timeseries_id,v.source_id)
13 on (a.tradedate = v.tradedate)
14 order by v.timeseries_id
15 , a.tradedate
16 , v.source_id
17 /
TIMESERIES_ID TRADEDATE SOURCE_ID VOLUME
------------- ------------------- ---------- ----------
1 12-07-2011 00:00:00 1 400
1 12-07-2011 00:00:00 2 265
1 12-07-2011 00:00:00 3 307
1 12-07-2011 00:05:00 1 400
1 12-07-2011 00:05:00 2 265
1 12-07-2011 00:05:00 3 307
1 12-07-2011 00:10:00 1 400
1 12-07-2011 00:10:00 2 265
1 12-07-2011 00:10:00 3 307
1 12-07-2011 00:15:00 1 400
1 12-07-2011 00:15:00 2 265
1 12-07-2011 00:15:00 3 307
1 12-07-2011 00:20:00 1 400
1 12-07-2011 00:20:00 2 265
1 12-07-2011 00:20:00 3 307
1 12-07-2011 00:25:00 1 400
1 12-07-2011 00:25:00 2 265
1 12-07-2011 00:25:00 3 307
1 12-07-2011 00:30:00 1 400
1 12-07-2011 00:30:00 2 265
1 12-07-2011 00:30:00 3 307
1 12-07-2011 00:35:00 1 400
1 12-07-2011 00:35:00 2 265
1 12-07-2011 00:35:00 3 307
1 12-07-2011 00:40:00 1 400
1 12-07-2011 00:40:00 2 265
1 12-07-2011 00:40:00 3 307
1 12-07-2011 00:45:00 1 400
1 12-07-2011 00:45:00 2 265
1 12-07-2011 00:45:00 3 307
1 12-07-2011 00:50:00 1 400
1 12-07-2011 00:50:00 2 265
1 12-07-2011 00:50:00 3 307
1 12-07-2011 00:55:00 1 400
1 12-07-2011 00:55:00 2 265
1 12-07-2011 00:55:00 3 307
2 12-07-2011 00:00:00 1 778
2 12-07-2011 00:00:00 2 302
2 12-07-2011 00:00:00 3 959
2 12-07-2011 00:05:00 1 778
2 12-07-2011 00:05:00 2 302
2 12-07-2011 00:05:00 3 959
2 12-07-2011 00:10:00 1 778
2 12-07-2011 00:10:00 2 302
2 12-07-2011 00:10:00 3 959
2 12-07-2011 00:15:00 1 778
2 12-07-2011 00:15:00 2 302
2 12-07-2011 00:15:00 3 959
2 12-07-2011 00:20:00 1 778
2 12-07-2011 00:20:00 2 302
2 12-07-2011 00:20:00 3 959
2 12-07-2011 00:25:00 1 778
2 12-07-2011 00:25:00 2 302
2 12-07-2011 00:25:00 3 959
2 12-07-2011 00:30:00 1 778
2 12-07-2011 00:30:00 2 302
2 12-07-2011 00:30:00 3 959
2 12-07-2011 00:35:00 1 778
2 12-07-2011 00:35:00 2 302
2 12-07-2011 00:35:00 3 959
2 12-07-2011 00:40:00 1 778
2 12-07-2011 00:40:00 2 302
2 12-07-2011 00:40:00 3 959
2 12-07-2011 00:45:00 1 778
2 12-07-2011 00:45:00 2 302
2 12-07-2011 00:45:00 3 959
2 12-07-2011 00:50:00 1 778
2 12-07-2011 00:50:00 2 302
2 12-07-2011 00:50:00 3 959
2 12-07-2011 00:55:00 1 778
2 12-07-2011 00:55:00 2 302
2 12-07-2011 00:55:00 3 959
72 rows selected.
The query returns 72 rows as expected, but look at the volume column. The made-up rows should have a null volume. And there's more: the rows that did exist and weren't made up now show the volume of the "00:00:00" row instead of their own value...
You might ask: why did you create 108 five-minute periods? Because I've found that with 107 periods or fewer the bug doesn't appear; with 108 or more it creeps up. Here is the same sequence with a table containing 107 periods:
SQL> drop table tradevolumes purge
2 /
Table dropped.
SQL> create table tradevolumes
2 ( timeseries_id number(4)
3 , tradedate date
4 , source_id number(4)
5 , volume number(4)
6 , constraint tradevolumes_pk primary key (timeseries_id,tradedate,source_id)
7 ) organization index
8 /
Table created.
SQL> insert into tradevolumes
2 select 1 + mod(level-1,2)
3 , trunc(sysdate) + numtodsinterval(5*mod(trunc((level-1)/2),107),'minute')
4 , ceil(level/(2 * 107))
5 , ceil(dbms_random.value(0,1000))
6 from dual
7 connect by level <= 2 * 107 * 3
8 /
642 rows created.
SQL> delete tradevolumes
2 where ( tradedate
3 between trunc(sysdate) + interval '35' minute
4 and trunc(sysdate) + interval '50' minute
5 or (timeseries_id = 1 and source_id = 3 and volume < 30)
6 or (timeseries_id = 2 and source_id = 2 and tradedate > trunc(sysdate) + interval '20' minute)
7 )
8 /
128 rows deleted.
SQL> exec dbms_stats.gather_table_stats(user,'tradevolumes')
PL/SQL procedure successfully completed.
SQL> select *
2 from tradevolumes
3 order by timeseries_id
4 , tradedate
5 , source_id
6 /
TIMESERIES_ID TRADEDATE SOURCE_ID VOLUME
------------- ------------------- ---------- ----------
1 12-07-2011 00:00:00 1 159
1 12-07-2011 00:00:00 2 333
1 12-07-2011 00:00:00 3 696
1 12-07-2011 00:05:00 1 674
1 12-07-2011 00:05:00 2 932
... skipped a lot of rows here ...
2 12-07-2011 08:45:00 1 136
2 12-07-2011 08:45:00 3 618
2 12-07-2011 08:50:00 1 705
2 12-07-2011 08:50:00 3 106
514 rows selected.
SQL> with all_tradedates as
2 ( select trunc(sysdate) + numtodsinterval(5*(level-1),'minute') tradedate
3 from dual
4 connect by level <= 12
5 )
6 select v.timeseries_id
7 , a.tradedate
8 , v.source_id
9 , v.volume
10 from all_tradedates a
11 left outer join tradevolumes v
12 partition by (v.timeseries_id,v.source_id)
13 on (a.tradedate = v.tradedate)
14 order by v.timeseries_id
15 , a.tradedate
16 , v.source_id
17 /
TIMESERIES_ID TRADEDATE SOURCE_ID VOLUME
------------- ------------------- ---------- ----------
1 12-07-2011 00:00:00 1 159
1 12-07-2011 00:00:00 2 333
1 12-07-2011 00:00:00 3 696
1 12-07-2011 00:05:00 1 674
1 12-07-2011 00:05:00 2 932
1 12-07-2011 00:05:00 3
1 12-07-2011 00:10:00 1 315
1 12-07-2011 00:10:00 2 872
1 12-07-2011 00:10:00 3 778
1 12-07-2011 00:15:00 1 398
1 12-07-2011 00:15:00 2 272
1 12-07-2011 00:15:00 3 239
1 12-07-2011 00:20:00 1 546
1 12-07-2011 00:20:00 2 863
1 12-07-2011 00:20:00 3 839
1 12-07-2011 00:25:00 1 558
1 12-07-2011 00:25:00 2 576
1 12-07-2011 00:25:00 3 296
1 12-07-2011 00:30:00 1 767
1 12-07-2011 00:30:00 2 957
1 12-07-2011 00:30:00 3 935
1 12-07-2011 00:35:00 1
1 12-07-2011 00:35:00 2
1 12-07-2011 00:35:00 3
1 12-07-2011 00:40:00 1
1 12-07-2011 00:40:00 2
1 12-07-2011 00:40:00 3
1 12-07-2011 00:45:00 1
1 12-07-2011 00:45:00 2
1 12-07-2011 00:45:00 3
1 12-07-2011 00:50:00 1
1 12-07-2011 00:50:00 2
1 12-07-2011 00:50:00 3
1 12-07-2011 00:55:00 1 398
1 12-07-2011 00:55:00 2 110
1 12-07-2011 00:55:00 3 629
2 12-07-2011 00:00:00 1 615
2 12-07-2011 00:00:00 2 924
2 12-07-2011 00:00:00 3 526
2 12-07-2011 00:05:00 1 147
2 12-07-2011 00:05:00 2 589
2 12-07-2011 00:05:00 3 779
2 12-07-2011 00:10:00 1 554
2 12-07-2011 00:10:00 2 108
2 12-07-2011 00:10:00 3 9
2 12-07-2011 00:15:00 1 601
2 12-07-2011 00:15:00 2 111
2 12-07-2011 00:15:00 3 526
2 12-07-2011 00:20:00 1 418
2 12-07-2011 00:20:00 2 818
2 12-07-2011 00:20:00 3 578
2 12-07-2011 00:25:00 1 824
2 12-07-2011 00:25:00 2
2 12-07-2011 00:25:00 3 112
2 12-07-2011 00:30:00 1 649
2 12-07-2011 00:30:00 2
2 12-07-2011 00:30:00 3 689
2 12-07-2011 00:35:00 1
2 12-07-2011 00:35:00 2
2 12-07-2011 00:35:00 3
2 12-07-2011 00:40:00 1
2 12-07-2011 00:40:00 2
2 12-07-2011 00:40:00 3
2 12-07-2011 00:45:00 1
2 12-07-2011 00:45:00 2
2 12-07-2011 00:45:00 3
2 12-07-2011 00:50:00 1
2 12-07-2011 00:50:00 2
2 12-07-2011 00:50:00 3
2 12-07-2011 00:55:00 1 651
2 12-07-2011 00:55:00 2
2 12-07-2011 00:55:00 3 641
72 rows selected.
And here the results are correct. And when I add the rows for 08:55:00 afterwards:
SQL> insert into tradevolumes
2 select timeseries_id
3 , tradedate + interval '5' minute
4 , source_id
5 , 1000
6 from tradevolumes
7 where tradedate = (select max(tradedate) from tradevolumes)
8 /
4 rows created.
SQL> commit
2 /
Commit complete.
SQL> exec dbms_stats.gather_table_stats(user,'tradevolumes')
PL/SQL procedure successfully completed.
SQL> with all_tradedates as
2 ( select trunc(sysdate) + numtodsinterval(5*(level-1),'minute') tradedate
3 from dual
4 connect by level <= 12
5 )
6 select v.timeseries_id
7 , a.tradedate
8 , v.source_id
9 , v.volume
10 from all_tradedates a
11 left outer join tradevolumes v
12 partition by (v.timeseries_id,v.source_id)
13 on (a.tradedate = v.tradedate)
14 order by v.timeseries_id
15 , a.tradedate
16 , v.source_id
17 /
TIMESERIES_ID TRADEDATE SOURCE_ID VOLUME
------------- ------------------- ---------- ----------
1 12-07-2011 00:00:00 1 159
1 12-07-2011 00:00:00 2 333
1 12-07-2011 00:00:00 3 696
1 12-07-2011 00:05:00 1 674
1 12-07-2011 00:05:00 2 932
1 12-07-2011 00:05:00 3
1 12-07-2011 00:10:00 1 315
1 12-07-2011 00:10:00 2 872
1 12-07-2011 00:10:00 3 778
1 12-07-2011 00:15:00 1 398
1 12-07-2011 00:15:00 2 272
1 12-07-2011 00:15:00 3 239
1 12-07-2011 00:20:00 1 546
1 12-07-2011 00:20:00 2 863
1 12-07-2011 00:20:00 3 839
1 12-07-2011 00:25:00 1 558
1 12-07-2011 00:25:00 2 576
1 12-07-2011 00:25:00 3 296
1 12-07-2011 00:30:00 1 767
1 12-07-2011 00:30:00 2 957
1 12-07-2011 00:30:00 3 935
1 12-07-2011 00:35:00 1
1 12-07-2011 00:35:00 2
1 12-07-2011 00:35:00 3
1 12-07-2011 00:40:00 1
1 12-07-2011 00:40:00 2
1 12-07-2011 00:40:00 3
1 12-07-2011 00:45:00 1
1 12-07-2011 00:45:00 2
1 12-07-2011 00:45:00 3
1 12-07-2011 00:50:00 1
1 12-07-2011 00:50:00 2
1 12-07-2011 00:50:00 3
1 12-07-2011 00:55:00 1 398
1 12-07-2011 00:55:00 2 110
1 12-07-2011 00:55:00 3 629
2 12-07-2011 00:00:00 1 615
2 12-07-2011 00:00:00 2 924
2 12-07-2011 00:00:00 3 526
2 12-07-2011 00:05:00 1 147
2 12-07-2011 00:05:00 2 589
2 12-07-2011 00:05:00 3 779
2 12-07-2011 00:10:00 1 554
2 12-07-2011 00:10:00 2 108
2 12-07-2011 00:10:00 3 9
2 12-07-2011 00:15:00 1 601
2 12-07-2011 00:15:00 2 111
2 12-07-2011 00:15:00 3 526
2 12-07-2011 00:20:00 1 418
2 12-07-2011 00:20:00 2 818
2 12-07-2011 00:20:00 3 578
2 12-07-2011 00:25:00 1 824
2 12-07-2011 00:25:00 2
2 12-07-2011 00:25:00 3 112
2 12-07-2011 00:30:00 1 649
2 12-07-2011 00:30:00 2
2 12-07-2011 00:30:00 3 689
2 12-07-2011 00:35:00 1
2 12-07-2011 00:35:00 2
2 12-07-2011 00:35:00 3
2 12-07-2011 00:40:00 1
2 12-07-2011 00:40:00 2
2 12-07-2011 00:40:00 3
2 12-07-2011 00:45:00 1
2 12-07-2011 00:45:00 2
2 12-07-2011 00:45:00 3
2 12-07-2011 00:50:00 1
2 12-07-2011 00:50:00 2
2 12-07-2011 00:50:00 3
2 12-07-2011 00:55:00 1 651
2 12-07-2011 00:55:00 2
2 12-07-2011 00:55:00 3 641
72 rows selected.
And now it works!
And using 108 periods with a regular heap table isn't a problem either:
SQL> drop table tradevolumes purge
2 /
Table dropped.
SQL> create table tradevolumes
2 ( timeseries_id number(4)
3 , tradedate date
4 , source_id number(4)
5 , volume number(4)
6 , constraint tradevolumes_pk primary key (timeseries_id,tradedate,source_id)
7 )
8 /
Table created.
SQL> insert into tradevolumes
2 select 1 + mod(level-1,2)
3 , trunc(sysdate) + numtodsinterval(5*mod(trunc((level-1)/2),108),'minute')
4 , ceil(level/(2 * 108))
5 , ceil(dbms_random.value(0,1000))
6 from dual
7 connect by level <= 2 * 108 * 3
8 /
648 rows created.
SQL> delete tradevolumes
2 where ( tradedate
3 between trunc(sysdate) + interval '35' minute
4 and trunc(sysdate) + interval '50' minute
5 or (timeseries_id = 1 and source_id = 3 and volume < 30)
6 or (timeseries_id = 2 and source_id = 2 and tradedate > trunc(sysdate) + interval '20' minute)
7 )
8 /
126 rows deleted.
SQL> exec dbms_stats.gather_table_stats(user,'tradevolumes')
PL/SQL procedure successfully completed.
SQL> select *
2 from tradevolumes
3 order by timeseries_id
4 , tradedate
5 , source_id
6 /
TIMESERIES_ID TRADEDATE SOURCE_ID VOLUME
------------- ------------------- ---------- ----------
1 12-07-2011 00:00:00 1 200
1 12-07-2011 00:00:00 2 157
1 12-07-2011 00:00:00 3 950
1 12-07-2011 00:05:00 1 707
1 12-07-2011 00:05:00 2 932
1 12-07-2011 00:05:00 3 258
... skipped a lot of rows here ...
2 12-07-2011 08:50:00 1 61
2 12-07-2011 08:50:00 3 974
2 12-07-2011 08:55:00 1 692
2 12-07-2011 08:55:00 3 714
522 rows selected.
SQL> with all_tradedates as
2 ( select trunc(sysdate) + numtodsinterval(5*(level-1),'minute') tradedate
3 from dual
4 connect by level <= 12
5 )
6 select v.timeseries_id
7 , a.tradedate
8 , v.source_id
9 , v.volume
10 from all_tradedates a
11 left outer join tradevolumes v
12 partition by (v.timeseries_id,v.source_id)
13 on (a.tradedate = v.tradedate)
14 order by v.timeseries_id
15 , a.tradedate
16 , v.source_id
17 /
TIMESERIES_ID TRADEDATE SOURCE_ID VOLUME
------------- ------------------- ---------- ----------
1 12-07-2011 00:00:00 1 200
1 12-07-2011 00:00:00 2 157
1 12-07-2011 00:00:00 3 950
1 12-07-2011 00:05:00 1 707
1 12-07-2011 00:05:00 2 932
1 12-07-2011 00:05:00 3 258
1 12-07-2011 00:10:00 1 586
1 12-07-2011 00:10:00 2 193
1 12-07-2011 00:10:00 3 753
1 12-07-2011 00:15:00 1 287
1 12-07-2011 00:15:00 2 420
1 12-07-2011 00:15:00 3 113
1 12-07-2011 00:20:00 1 971
1 12-07-2011 00:20:00 2 93
1 12-07-2011 00:20:00 3 967
1 12-07-2011 00:25:00 1 685
1 12-07-2011 00:25:00 2 63
1 12-07-2011 00:25:00 3 832
1 12-07-2011 00:30:00 1 471
1 12-07-2011 00:30:00 2 764
1 12-07-2011 00:30:00 3 277
1 12-07-2011 00:35:00 1
1 12-07-2011 00:35:00 2
1 12-07-2011 00:35:00 3
1 12-07-2011 00:40:00 1
1 12-07-2011 00:40:00 2
1 12-07-2011 00:40:00 3
1 12-07-2011 00:45:00 1
1 12-07-2011 00:45:00 2
1 12-07-2011 00:45:00 3
1 12-07-2011 00:50:00 1
1 12-07-2011 00:50:00 2
1 12-07-2011 00:50:00 3
1 12-07-2011 00:55:00 1 332
1 12-07-2011 00:55:00 2 909
1 12-07-2011 00:55:00 3 873
2 12-07-2011 00:00:00 1 865
2 12-07-2011 00:00:00 2 982
2 12-07-2011 00:00:00 3 290
2 12-07-2011 00:05:00 1 936
2 12-07-2011 00:05:00 2 651
2 12-07-2011 00:05:00 3 141
2 12-07-2011 00:10:00 1 843
2 12-07-2011 00:10:00 2 818
2 12-07-2011 00:10:00 3 218
2 12-07-2011 00:15:00 1 85
2 12-07-2011 00:15:00 2 530
2 12-07-2011 00:15:00 3 650
2 12-07-2011 00:20:00 1 475
2 12-07-2011 00:20:00 2 988
2 12-07-2011 00:20:00 3 771
2 12-07-2011 00:25:00 1 422
2 12-07-2011 00:25:00 2
2 12-07-2011 00:25:00 3 48
2 12-07-2011 00:30:00 1 88
2 12-07-2011 00:30:00 2
2 12-07-2011 00:30:00 3 991
2 12-07-2011 00:35:00 1
2 12-07-2011 00:35:00 2
2 12-07-2011 00:35:00 3
2 12-07-2011 00:40:00 1
2 12-07-2011 00:40:00 2
2 12-07-2011 00:40:00 3
2 12-07-2011 00:45:00 1
2 12-07-2011 00:45:00 2
2 12-07-2011 00:45:00 3
2 12-07-2011 00:50:00 1
2 12-07-2011 00:50:00 2
2 12-07-2011 00:50:00 3
2 12-07-2011 00:55:00 1 793
2 12-07-2011 00:55:00 2
2 12-07-2011 00:55:00 3 601
72 rows selected.
The above was all executed on version 11.2.0.2. I also tried it on 10.2.0.4 and witnessed exactly the same phenomenon there; the only difference is that the boundary value is not 107/108 but 80/81.
I did some more basic investigation of space occupation and statistics, but I haven't found a clue so far. I'm wondering if I'm missing something obvious. So if you have more information about affected versions or the cause, I'd like to know.
Posted by Rob van Wijk at 10:00 PM | 4 comments
Monday, July 11, 2011
Tutorials zur SQL Model Clause
Marcus Matzberger has translated five of my posts related to the SQL Model Clause for the German-speaking Oracle community. Those posts now contain links to the German versions, and I've listed them here for convenience as well.
SQL Model Clause Tutorial Part One
SQL Model Clause Tutorial Part Two
SQL Model Clause Tutorial Part Three
Calculating probabilities with N throws of a die
Choosing Between SQL and PL/SQL
Thanks Marcus!
Posted by Rob van Wijk at 9:59 PM | 0 comments
Saturday, June 11, 2011
Back from Tallinn, Estonia
This morning I arrived back from a trip to Tallinn. Oracle Estonia had given me the opportunity to present my SQL Masterclass seminar at their training center in Tallinn on Thursday and Friday. Thanks to everyone who spent two days listening to me. Here is a short story about my trip, including some photos.
I arrived at Tallinn airport around 1PM local time on Wednesday. My hotel room was located at the 18th floor and I had this splendid view on the old city of Tallinn:
I spent the rest of the afternoon taking a city tour by bus, learning a few things about the history of Tallinn and strolling for an hour or two through the old city. Here are a few of the photos that I took during that stroll:
The seminar itself was on Thursday and Friday. I had spent a huge amount of time preparing it. One of my goals was to present as many recognizable situations as possible that the attendees could apply in their daily jobs, so I was very pleased to receive feedback that they indeed found several immediately applicable queries. There is still room for improvement, though, as a few parts were too short (I had planned one and a half hours per part) and I had overlooked an error or two in my scripts. Hopefully I will get a chance to fix those and present the seminar again somewhere. The feeling at the end was very positive, though. The PowerPoint slides and the scripts can be found on the "Presentations and papers" page.
This is a picture of the building where Oracle Estonia and its training center are located:
PS: As expected, I learnt a couple of new things during the preparation of the seminar: a bug with partitioned outer joins and something about connect-by filtering. I will blog about them soon.
Posted by Rob van Wijk at 11:22 PM | 0 comments