About Oracle: 2007

Thursday, December 27, 2007

Top 3 of 2007

I don't know about other countries, but here in the Netherlands every radio station seems to have its own top 50/100/500/2000 of hits. So I decided to go along with the trend and make my own top 3 ... of OTN forum posts that made me laugh in 2007:

1) How much a three year old DBA is worth?

2) PL/SQL New Development Standard: Frameworkia

3) Why Padfield earns twice as much as Robertson

Have fun reading and since this will be my last post for 2007: I wish all readers a happy and healthy 2008.

Monday, December 24, 2007

Check constraints with sysdate

Implementing a check constraint with sysdate in it, is not possible. Not until 11g that is, where you can use a trick to do this using virtual columns. Here is a small demo:

First, let's create a table and try to place a check constraint to check that a date should be a date in the past:

rwijk@ORA11G> create table mytable
  2  ( id     number(4)
  3  , mydate date
  4  )
  5  /

Tabel is aangemaakt.

rwijk@ORA11G> alter table mytable add constraint myck1 check (mydate < sysdate)
  2  /
alter table mytable add constraint myck1 check (mydate < sysdate)
                                                         *
FOUT in regel 1:
.ORA-02436: Datum- of systeemvariabele is onjuist opgegeven in CHECK-beperking.

As expected, ORA-02436 (date or system variable wrongly specified in CHECK constraint) is returned.

Wrapping the check inside a function does not help:

rwijk@ORA11G> create function mydate_in_past
  2  ( p_mydate in mytable.mydate%type
  3  ) return varchar2
  4  is
  5  begin
  6    return case when p_mydate >= sysdate then 'N' else 'Y' end;
  7  end mydate_in_past;
  8  /

Functie is aangemaakt.

rwijk@ORA11G> alter table mytable
  2    add constraint myck1 check (mydate_in_past(mydate) = 'Y')
  3  /
  add constraint myck1 check (mydate_in_past(mydate) = 'Y')
                              *
FOUT in regel 2:
.ORA-00904: "MYDATE_IN_PAST": ongeldige ID

Since a function is not allowed inside a check constraint. So let's add the function as a virtual column to the table:

rwijk@ORA11G> alter table mytable
  2    add (mydate_in_past_ind as (mydate_in_past(mydate)))
  3  /
  add (mydate_in_past_ind as (mydate_in_past(mydate)))
                              *
FOUT in regel 2:
.ORA-30553: De functie is niet deterministisch.

Oracle is clever enough to know that you can only use deterministic functions for virtual columns. And a function using sysdate is never deterministic of course. Validating input of 25-12-2007 will give FALSE as a result, but tomorrow it will be TRUE. But let's try to fool Oracle by just saying the function is deterministic, where it is not:

rwijk@ORA11G> create or replace function mydate_in_past
  2  ( p_mydate in mytable.mydate%type
  3  ) return varchar2 deterministic
  4  is
  5  begin
  6    return case when p_mydate >= sysdate then 'N' else 'Y' end;
  7  end mydate_in_past;
  8  /

Functie is aangemaakt.

rwijk@ORA11G> alter table mytable
  2    add (mydate_in_past_ind as (cast(mydate_in_past(mydate) as varchar2(1))))
  3  /

Tabel is gewijzigd.

rwijk@ORA11G> desc mytable
 Naam                                                                       Null?    Type
 -------------------------------------------------------------------------- -------- --------------------
 ID                                                                                  NUMBER(4)
 MYDATE                                                                              DATE
 MYDATE_IN_PAST_IND                                                                  VARCHAR2(1)

It works! And now you can check whether mydate is a date in the past:

rwijk@ORA11G> alter table mytable
  2    add constraint myck1 check (mydate_in_past_ind = 'Y')
  3  /

Tabel is gewijzigd.

So we have fooled Oracle by saying the function is deterministic and we can reason that in this case it is not bad, because when a row satisfies the constraint, it will keep satisfying the constraint as time passes.

Let's check if the constraint works:

rwijk@ORA11G> insert into mytable (id,mydate) values (1,date '2007-01-01')
  2  /

1 rij is aangemaakt.

rwijk@ORA11G> insert into mytable (id,mydate) values (2,date '2020-01-01')
  2  /
insert into mytable (id,mydate) values (2,date '2020-01-01')
*
FOUT in regel 1:
.ORA-02290: CHECK-beperking (RWIJK.MYCK1) is geschonden.

Great, but you have to be careful.
Watch what happens when checking whether a date is in the future. This cannot ever be a real constraint, as there will always be a time that a date will not be in the future anymore. But using the same technique as above it is possible:

rwijk@ORA11G> rollback
  2  /

Rollback is voltooid.

rwijk@ORA11G> alter table mytable add (mydate2 date)
  2  /

Tabel is gewijzigd.

rwijk@ORA11G> create function mydate_in_future
  2  ( p_mydate in mytable.mydate%type
  3  ) return varchar2 deterministic
  4  is
  5  begin
  6    return case when p_mydate > sysdate then 'Y' else 'N' end;
  7  end mydate_in_future;
  8  /

Functie is aangemaakt.

rwijk@ORA11G> alter table mytable
  2    add (mydate2_in_future_ind as (cast(mydate_in_future(mydate2) as varchar2(1))))
  3  /

Tabel is gewijzigd.

rwijk@ORA11G> alter table mytable
  2    add constraint myck2 check (mydate2_in_future_ind = 'Y')
  3  /

Tabel is gewijzigd.

rwijk@ORA11G> desc mytable
 Naam                                                                       Null?    Type
 -------------------------------------------------------------------------- -------- -----------------
 ID                                                                                  NUMBER(4)
 MYDATE                                                                              DATE
 MYDATE_IN_PAST_IND                                                                  VARCHAR2(1)
 MYDATE2                                                                             DATE
 MYDATE2_IN_FUTURE_IND                                                               VARCHAR2(1)

rwijk@ORA11G> insert into mytable (id,mydate2) values (1,date '2007-01-01')
  2  /
insert into mytable (id,mydate2) values (1,date '2007-01-01')
*
FOUT in regel 1:
.ORA-02290: CHECK-beperking (RWIJK.MYCK2) is geschonden.


rwijk@ORA11G> insert into mytable (id,mydate2) values (2,date '2020-01-01')
  2  /

1 rij is aangemaakt.

So it is quite easy to misuse this feature. But when used carefully it is useful for implementing more business rule declaratively.

Previously, this type of constraint would be implemented using a database trigger. But we all know by now :-) that database triggers are evil. For those of us that don't know this yet, let's do a small performance test, to prove this once again, after dropping all previous objects:

rwijk@ORA11G> create function mydate_in_past
  2  ( p_mydate in date
  3  ) return varchar2 deterministic
  4  is
  5  begin
  6    return case when p_mydate >= sysdate then 'N' else 'Y' end;
  7  end mydate_in_past;
  8  /

Functie is aangemaakt.

rwijk@ORA11G> create table mytable_virtual_column
  2  ( id     number(6)
  3  , mydate date
  4  , mydate_in_past_ind as (cast(mydate_in_past(mydate) as varchar2(1)))
  5  , constraint myck1 check (mydate_in_past_ind = 'Y')
  6  )
  7  /

Tabel is aangemaakt.

rwijk@ORA11G> create table mytable_database_trigger
  2  ( id     number(6)
  3  , mydate date
  4  )
  5  /

Tabel is aangemaakt.

rwijk@ORA11G> create trigger mytrigger
  2  before insert or update on mytable_database_trigger
  3  for each row
  4  begin
  5    if   inserting
  6      or updating('mydate')
  7    then
  8      if :new.mydate >= sysdate
  9      then
 10        raise_application_error(-20000,'mydate must be a date in the past.');
 11      end if;
 12    end if;
 13  end;
 14  /

Trigger is aangemaakt.

rwijk@ORA11G> exec runstats_pkg.rs_start

PL/SQL-procedure is geslaagd.

rwijk@ORA11G> insert into mytable_virtual_column (id,mydate)
  2   select level
  3        , date '2007-01-01'
  4     from dual
  5  connect by level <= 100000
  6  /

100000 rijen zijn aangemaakt.

rwijk@ORA11G> exec runstats_pkg.rs_middle

PL/SQL-procedure is geslaagd.

rwijk@ORA11G> insert into mytable_database_trigger (id,mydate)
  2   select level
  3        , date '2007-01-01'
  4     from dual
  5  connect by level <= 100000
  6  /

100000 rijen zijn aangemaakt.

rwijk@ORA11G> exec runstats_pkg.rs_stop(100)
Run1 draaide in 42 hsecs
Run2 draaide in 172 hsecs
Run1 draaide in 24,42% van de tijd

Naam                                                      Run1        Run2    Verschil
STAT.DB time                                                43         168         125
STAT.CPU used when call started                             42         169         127
STAT.CPU used by this session                               39         169         130
STAT.redo size                                       2,458,396   2,458,560         164
LATCH.object queue header operation                        773         994         221
STAT.free buffer inspected                                   0         250         250

Run1 latches totaal versus run2 -- verschil en percentage
      Run1      Run2  Verschil     Pct
    15,027    15,528       501  96.77%

PL/SQL-procedure is geslaagd.

I repeated the test several times, and got consistent results: using declarative constraints with virtual columns is approximately four times faster than using those dreaded database triggers.

Friday, December 21, 2007

Simulating print_table with unpivot

With the advent of UNPIVOT in Oracle11g, a SQL only solution for printing table contents vertically has been made a little easier. I've used Tom Kyte's print_table function for this purpose occasionally. You can read about this function here on AskTom. The following SQL using unpivot is definitely not easier than issuing

exec print_table('select * from emp')

, so it's just for fun:

rwijk@ORA11G>  select decode
  2          ( col
  3          , 'LINE', '======================'
  4          , rpad(col,8) || ': ' || value
  5          ) "Contents of table EMP"
  6     from ( select to_char(empno) empno
  7                 , ename
  8                 , job
  9                 , to_char(mgr) mgr
 10                 , to_char(hiredate,'dd-mm-yyyy') hiredate
 11                 , to_char(sal) sal
 12                 , to_char(comm) comm
 13                 , to_char(deptno) deptno
 14                 , '=' line
 15              from emp
 16          )
 17          unpivot include nulls
 18          (value for col in (empno,ename,job,mgr,hiredate,sal,comm,deptno,line))
 19  /

Contents of table EMP
--------------------------------------------------
EMPNO   : 7369
ENAME   : SMITH
JOB     : CLERK
MGR     : 7902
HIREDATE: 17-12-1980
SAL     : 800
COMM    :
DEPTNO  : 20
======================
EMPNO   : 7499
ENAME   : ALLEN
JOB     : SALESMAN
MGR     : 7698
HIREDATE: 20-02-1981
SAL     : 1600
COMM    : 300
DEPTNO  : 30
======================
EMPNO   : 7521
ENAME   : WARD
JOB     : SALESMAN
MGR     : 7698
HIREDATE: 22-02-1981
SAL     : 1250
COMM    : 500
DEPTNO  : 30
======================
EMPNO   : 7566
ENAME   : JONES
JOB     : MANAGER
MGR     : 7839
HIREDATE: 02-04-1981
SAL     : 2975
COMM    :
DEPTNO  : 20
======================
EMPNO   : 7654
ENAME   : MARTIN
JOB     : SALESMAN
MGR     : 7698
HIREDATE: 28-09-1981
SAL     : 1250
COMM    : 1400
DEPTNO  : 30
======================
EMPNO   : 7698
ENAME   : BLAKE
JOB     : MANAGER
MGR     : 7839
HIREDATE: 01-05-1981
SAL     : 2850
COMM    :
DEPTNO  : 30
======================
EMPNO   : 7782
ENAME   : CLARK
JOB     : MANAGER
MGR     : 7839
HIREDATE: 09-06-1981
SAL     : 2450
COMM    :
DEPTNO  : 10
======================
EMPNO   : 7788
ENAME   : SCOTT
JOB     : ANALYST
MGR     : 7566
HIREDATE: 09-12-1982
SAL     : 3000
COMM    :
DEPTNO  : 20
======================
EMPNO   : 7839
ENAME   : KING
JOB     : PRESIDENT
MGR     :
HIREDATE: 17-11-1981
SAL     : 5000
COMM    :
DEPTNO  : 10
======================
EMPNO   : 7844
ENAME   : TURNER
JOB     : SALESMAN
MGR     : 7698
HIREDATE: 08-09-1981
SAL     : 1500
COMM    : 0
DEPTNO  : 30
======================
EMPNO   : 7876
ENAME   : ADAMS
JOB     : CLERK
MGR     : 7788
HIREDATE: 12-01-1983
SAL     : 1100
COMM    :
DEPTNO  : 20
======================
EMPNO   : 7900
ENAME   : JAMES
JOB     : CLERK
MGR     : 7698
HIREDATE: 03-12-1981
SAL     : 950
COMM    :
DEPTNO  : 30
======================
EMPNO   : 7902
ENAME   : FORD
JOB     : ANALYST
MGR     : 7566
HIREDATE: 03-12-1981
SAL     : 3000
COMM    :
DEPTNO  : 20
======================
EMPNO   : 7934
ENAME   : MILLER
JOB     : CLERK
MGR     : 7782
HIREDATE: 23-01-1982
SAL     : 1300
COMM    :
DEPTNO  : 10
======================

126 rijen zijn geselecteerd.

Note that I have to convert all columns to varchar2's to be able to unpivot them. Also, the non default "including nulls" is used to explicitly generate rows when the COMM column is null.

Saturday, December 15, 2007

Parallellism in a skip locked scenario

During the Oracle 11g for developers session, more specifically when I was talking about the FOR UPDATE SKIP LOCKED, I received a question about the possibility of implementing parallellism if a batch program used the FOR UPDATE SKIP LOCKED strategy. Would it work to just kick off N processes simultaneously without altering the code?

Normally you would have to group the workload somehow - probably using the NTILE analytic function - and assign a piece of the workload to each of the slaves. But using the FOR UPDATE SKIP LOCKED, if one of the processes locks some rows, the other processes would skip those rows, and lock some others, and so on. Interesting question and I didn't know the answer right away, so I just tested it.

Let's create a table with a processed flag. This one is necessary in a skip locked scenario as you need to be able to track down which rows were processed and which ones were not. You can handle the ones that were skipped next day or next week.

rwijk@ORA11G> create table parallel_skiplocked_test
  2  (id, procesnumber, processed_flag, amount, processed_timestamp)
  3  as
  4   select level
  5        , cast(null as integer)
  6        , 'N'
  7        , cast(null as number(9))
  8        , cast(null as timestamp)
  9     from dual
 10  connect by level <= 100000
 11  /

Tabel is aangemaakt.

rwijk@ORA11G> alter table parallel_skiplocked_test add constraint pst_pk primary key (id)
  2  /

Tabel is gewijzigd.

rwijk@ORA11G> exec dbms_stats.gather_table_stats(user,'parallel_skiplocked_test',cascade=>true)

PL/SQL-procedure is geslaagd.

And create a procedure that processes all rows using the FOR UPDATE SKIP LOCKED clause:

rwijk@ORA11G> create procedure process_all(p_procesnumber in integer)
  2  is
  3    cursor c_pst
  4    is
  5    select id
  6         , amount
  7      from parallel_skiplocked_test
  8     where processed_flag = 'N'
  9           for update skip locked
 10    ;
 11    type tr_pst is record
 12    ( id     parallel_skiplocked_test.id%type
 13    , amount parallel_skiplocked_test.amount%type
 14    );
 15    type t_pst is table of tr_pst;
 16    r_pst t_pst;
 17  begin
 18    open c_pst;
 19    loop
 20      fetch c_pst bulk collect into r_pst limit 100
 21      ;
 22      for i in 1..r_pst.count
 23      loop
 24        r_pst(i).amount := trunc(mod(10000 * i * dbms_random.value,10000));
 25      end loop
 26      ;
 27      forall i in 1..r_pst.count
 28        update parallel_skiplocked_test
 29           set procesnumber = p_procesnumber
 30             , amount = r_pst(i).amount
 31             , processed_flag = 'Y'
 32             , processed_timestamp = systimestamp
 33         where id = r_pst(i).id
 34      ;
 35      exit when c_pst%notfound
 36      ;
 37    end loop;
 38    close c_pst;
 39  end process_all;
 40  /

Procedure is aangemaakt.

Now let's run the procedure in a single process using dbms_job:

rwijk@ORA11G> declare
  2    l_job_id1 binary_integer;
  3  begin
  4    dbms_job.submit(l_job_id1,'process_all(1);');
  5  end;
  6  /

PL/SQL-procedure is geslaagd.

rwijk@ORA11G> commit
  2  /

Commit is voltooid.

This is the result:

rwijk@ORA11G> select procesnumber
  2       , count(*)
  3       , min(processed_timestamp) min_timestamp
  4       , max(processed_timestamp) max_timestamp
  5    from parallel_skiplocked_test
  6   group by procesnumber
  7  /

PROCESNUMBER   COUNT(*) MIN_TIMESTAMP             MAX_TIMESTAMP
------------ ---------- ------------------------- -------------------------
           1     100000 15-12-07 01:04:20,734000  15-12-07 01:04:30,375000

1 rij is geselecteerd.

It takes 10 seconds.
Now let's try with 4 processes simultaneously, after recreating the entire scenario the same way as above:

rwijk@ORA11G> declare
  2    l_job_id1 binary_integer;
  3    l_job_id2 binary_integer;
  4    l_job_id3 binary_integer;
  5    l_job_id4 binary_integer;
  6  begin
  7    dbms_job.submit(l_job_id1,'process_all(1);');
  8    dbms_job.submit(l_job_id2,'process_all(2);');
  9    dbms_job.submit(l_job_id3,'process_all(3);');
 10    dbms_job.submit(l_job_id4,'process_all(4);');
 11  end;
 12  /

PL/SQL-procedure is geslaagd.

rwijk@ORA11G> commit
  2  /

Commit is voltooid.

rwijk@ORA11G> pause

rwijk@ORA11G> select procesnumber
  2       , count(*)
  3       , min(processed_timestamp) min_timestamp
  4       , max(processed_timestamp) max_timestamp
  5    from parallel_skiplocked_test
  6   group by procesnumber
  7  /

PROCESNUMBER   COUNT(*) MIN_TIMESTAMP             MAX_TIMESTAMP
------------ ---------- ------------------------- -------------------------
           1      23555 15-12-07 01:05:05,734000  15-12-07 01:05:16,656000
           2      25445 15-12-07 01:05:05,953000  15-12-07 01:05:17,703000
           4      26400 15-12-07 01:05:05,968000  15-12-07 01:05:16,656000
           3      24600 15-12-07 01:05:05,796000  15-12-07 01:05:16,640000

4 rijen zijn geselecteerd.

And look: it works! All 4 processes did approximately 25,000 rows. However, instead of 1 time 10 seconds, you now consumed 4 times 11 or 12 seconds. Likely because all four processes are hitting the same resources at the same time, causing contention. So it is of no practical use, but still fun to know. And thanks to Tijmen for the interesting question.

Friday, December 14, 2007

Several 11g oddities

In an earlier post about setting up journaling using flashback data archives I mentioned the risk of losing undo information when adding a constraint to the table. During the preparation for the session Oracle 11g for developers I discovered four oddities:

1) Result Cache:

In the concepts manual, there is a picture of the Result Cache in the SGA. The drawing tells us that the Result Cache is a separate memory area, just like you have the Shared Pool, Large Pool or Buffer Cache. However when executing the procedure dbms_result_cache.memory_report, the output shows this:

rwijk@ORA11G> exec dbms_result_cache.memory_report
R e s u l t   C a c h e   M e m o r y   R e p o r t
[Parameters]
Block Size          = 1K bytes
Maximum Cache Size  = 1056K bytes (1056 blocks)
Maximum Result Size = 52K bytes (52 blocks)
[Memory]
Total Memory = 5140 bytes [0.003% of the Shared Pool]

... Fixed Memory = 5140 bytes [0.003% of the Shared Pool]
... Dynamic Memory = 0 bytes [0.000% of the Shared Pool]

Which makes you believe that the Result Cache is part of the Shared Pool. So which one of the two is true? The important part for me to know is that it is in shared memory, but still the difference is strange.

2) Reference partitioning

Suppose the following situation: three tables in a master-detail-detail relationship. The master is range partitioned, the child table is reference partitioned, and the grandchild table is also reference partitioned. When trying to insert a row in the grandchild table that violates the foreign key you'd expect the rather well known error message ORA-02291, integrity constraint violated. The error message you receive however is this one:

ORA-14400: De ingevoerde partitiesleutel wordt niet toegewezen aan een partitie.

(translation: inserted partition key does not map to any partition)

3) Fine Grained Dependency Tracking

This feature reduces the odds that your packages become invalid, by storing the dependencies at the column level instead of table level. It works - I've seen it - but I cannot see the column dependencies anywhere. The obvious place, USER_DEPENDENCIES, does not show any column level information. The documentation does not seem to mention an alternative place where this information is stored. Hopefully I overlooked something, and I would appreciate it if someone would tell me where to find it. And I don't mean some X$-table, but in the regular USER_... views.

4) Native compilation

This one was staring at my face when I developed the demonstration script, but I missed it. Only after I was being notified by a colleague watching the session, I saw it too. When compiling a procedure natively, the PLSQL_OPTIMIZE_LEVEL has to be at least 2 according to the documentation. So in my script I set it to 1, and show that a warning is generated and the procedure is NOT natively compiled. But when querying the USER_PLSQL_OBJECT_SETTINGS view, it says that it is natively compiled:

rwijk@ORA11G> alter session set plsql_optimize_level = 1
  2  /

Sessie is gewijzigd.

rwijk@ORA11G> alter session set plsql_code_type = native
  2  /

Sessie is gewijzigd.

rwijk@ORA11G> create procedure p2
  2  is
  3  begin
  4    for i in 1..100000000
  5    loop
  6      null;
  7    end loop
  8    ;
  9    dbms_output.put_line('p2');
 10  end;
 11  /

SP2-0804: Procedure aangemaakt met compilatiewaarschuwingen

rwijk@ORA11G> show err
Fouten voor PROCEDURE P2:

LINE/COL ERROR
-------- -----------------------------------------------------------------
0/0      PLW-06014: Het genereren van native code is uitgeschakeld omdat
         PLSQL_OPTIMIZE_LEVEL <= 1.

rwijk@ORA11G> select name
  2       , type
  3       , plsql_code_type
  4       , plsql_optimize_level
  5       , plsql_warnings
  6    from user_plsql_object_settings
  7   where name like 'P_'
  8   order by name
  9  /

NAME                           TYPE         PLSQL_CODE_TYPE                PLSQL_OPTIMIZE_LEVEL PLSQL_WARNINGS
------------------------------ ------------ ------------------------------ -------------------- ------------------------------
P1                             PROCEDURE    INTERPRETED                                       2 ENABLE:ALL
P2                             PROCEDURE    NATIVE                                            1 ENABLE:ALL
P3                             PROCEDURE    NATIVE                                            2 ENABLE:ALL

Wednesday, December 12, 2007

Oracle 11g for developers

Today I will be doing an in-house knowledge session about Oracle 11g for Developers. The powerpoint presentation and the demonstration scripts can be downloaded here:

Powerpoint presentatie Oracle 11g voor ontwikkelaars
Bijbehorende demonstratiescripts

Warning: they are in Dutch.

Friday, December 7, 2007

Multicolumn statistics

Of all new features of 11g, there are the ones that are talked about a lot, like the result cache and virtual columns. I haven't seen much about multicolumn statistics yet, apart from appearing in several bullet lists about 11g. However, I have seen several performance issues in the past, that could have benefitted from this feature, so I am quite excited about this one. The feature addresses what Wolfgang Breitling calls "Fallacy II - The Predicate Independence Assumption" in his paper Fallacies of the Cost Based Optimizer. Here is an example of how it works:

Let's create a fake order table:

rwijk@ORA11G> create table orders
  2  as
  3   select level nr
  4        , 'Customer ' || to_char(level) customer_name
  5        , case
  6          when level <= 500 then 'London'
  7          when level <= 550 then 'New York'
  8          when level <= 600 then 'Paris'
  9          when level <= 650 then 'Vienna'
 10          when level <= 700 then 'Moscow'
 11          when level <= 750 then 'Berlin'
 12          when level <= 800 then 'Rome'
 13          when level <= 850 then 'Madrid'
 14          when level <= 900 then 'Brussels'
 15          else 'Amsterdam'
 16          end city
 17        , case
 18          when level <  500 then 'UK'
 19          when level <= 550 then 'USA'
 20          when level <= 600 then 'France'
 21          when level <= 650 then 'Austria'
 22          when level <= 700 then 'Russia'
 23          when level <= 750 then 'Germany'
 24          when level <= 800 then 'Italy'
 25          when level <= 850 then 'Spain'
 26          when level <= 900 then 'Belgium'
 27          else 'Netherlands'
 28          end country
 29     from dual
 30  connect by level <= 1000
 31  /

Tabel is aangemaakt.

And gather statistics normally.

rwijk@ORA11G> exec dbms_stats.gather_table_stats(user,'orders')

PL/SQL-procedure is geslaagd.

Because the case expression in the create table statement may not read easily, I did these select statements to give you a better idea of what is in the table:

rwijk@ORA11G> select city
  2       , count(*)
  3    from orders
  4   group by city
  5   order by city
  6  /

CITY                                    COUNT(*)
--------- --------------------------------------
Amsterdam                                    100
Berlin                                        50
Brussels                                      50
London                                       500
Madrid                                        50
Moscow                                        50
New York                                      50
Paris                                         50
Rome                                          50
Vienna                                        50

10 rijen zijn geselecteerd.

rwijk@ORA11G> select country
  2       , count(*)
  3    from orders
  4   group by country
  5   order by country
  6  /

COUNTRY                                   COUNT(*)
----------- --------------------------------------
Austria                                         50
Belgium                                         50
France                                          50
Germany                                         50
Italy                                           50
Netherlands                                    100
Russia                                          50
Spain                                           50
UK                                             499
USA                                             51

10 rijen zijn geselecteerd.

rwijk@ORA11G> select city
  2       , country
  3       , count(*)
  4    from orders
  5   group by city
  6       , country
  7   order by city
  8       , country
  9  /

CITY      COUNTRY                                   COUNT(*)
--------- ----------- --------------------------------------
Amsterdam Netherlands                                    100
Berlin    Germany                                         50
Brussels  Belgium                                         50
London    UK                                             499
London    USA                                              1
Madrid    Spain                                           50
Moscow    Russia                                          50
New York  USA                                             50
Paris     France                                          50
Rome      Italy                                           50
Vienna    Austria                                         50

11 rijen zijn geselecteerd.

So 10 cities, 10 countries, with a clear dependency between the columns city and country. Although not exactly one-on-one, because - as everybody knows :-) - London is also a town in Ohio, USA. Don't get extracted by this fact too much; it won't play a role in the example.

Ok, so let's have a look at the cardinalities in the following query:

rwijk@ORA11G> explain plan
  2  for
  3  select *
  4    from orders
  5   where city = 'London'
  6     and country = 'UK'
  7  /

Uitleg is gegeven.

rwijk@ORA11G> select * from table(dbms_xplan.display)
  2  /

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------

Plan hash value: 1275100350

----------------------------------------------------------------------------
| Id  | Operation         | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |    10 |   290 |     4   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| ORDERS |    10 |   290 |     4   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("CITY"='London' AND "COUNTRY"='UK')

13 rijen zijn geselecteerd.

Since histograms are not present on the table yet, the cost based optimizer has predicted 10 rows, because the filter predicate city = 'London' has 10 distinct values, and the predicate country = 'UK' also has 10 distinct values. Because of the predicate independence assumption, the cost based optimizer calculates 1000 rows * (1/10) * (1/10) = 10 rows.

Also it doesn't matter which value I put in the query:

rwijk@ORA11G> exec dbms_lock.sleep(1)

PL/SQL-procedure is geslaagd.

rwijk@ORA11G> explain plan
  2  for
  3  select *
  4    from orders
  5   where city = 'London'
  6     and country = 'Netherlands'
  7  /

Uitleg is gegeven.

rwijk@ORA11G> select * from table(dbms_xplan.display)
  2  /

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------

Plan hash value: 1275100350

----------------------------------------------------------------------------
| Id  | Operation         | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |    10 |   290 |     4   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| ORDERS |    10 |   290 |     4   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("CITY"='London' AND "COUNTRY"='Netherlands')

13 rijen zijn geselecteerd.

Even when searching for London in the Netherlands, it predicts 10. We all know that the query would of course return 0 rows, but the cost based optimizer does not.

So let's create histograms on the columns, to have better estimates:

rwijk@ORA11G> exec dbms_stats.gather_table_stats(user,'orders',
method_opt=>'FOR ALL COLUMNS')

PL/SQL-procedure is geslaagd.

rwijk@ORA11G> explain plan
  2  for
  3  select *
  4    from orders
  5   where city = 'London'
  6     and country = 'UK'
  7  /

Uitleg is gegeven.

rwijk@ORA11G> select * from table(dbms_xplan.display)
  2  /

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------

Plan hash value: 1275100350

----------------------------------------------------------------------------
| Id  | Operation         | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |   250 |  7250 |     4   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| ORDERS |   250 |  7250 |     4   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("COUNTRY"='UK' AND "CITY"='London')

13 rijen zijn geselecteerd.

Now the prediction is 250 rows. This is calculated based on the histogram information that tells the cost based optimizer that London occurs 500 times out of 1000 rows and the UK occurs 499 times out of 1000 rows. 1000 rows * (500/1000) * (499/1000) is approximately 250 rows.

And for the query using London in the Netherlands:

rwijk@ORA11G> exec dbms_lock.sleep(1)

PL/SQL-procedure is geslaagd.

rwijk@ORA11G> explain plan
  2  for
  3  select *
  4    from orders
  5   where city = 'London'
  6     and country = 'Netherlands'
  7  /

Uitleg is gegeven.

rwijk@ORA11G> select * from table(dbms_xplan.display)
  2  /

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------

Plan hash value: 1275100350

----------------------------------------------------------------------------
| Id  | Operation         | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |    50 |  1450 |     4   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| ORDERS |    50 |  1450 |     4   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("COUNTRY"='Netherlands' AND "CITY"='London')

13 rijen zijn geselecteerd.

Now 50 rows are calculated, because 1000 rows * (500/1000) * (100/1000) = 50.

The above is the best you could get before 11g. With the use of multicolumn statistics, the row counts will be REALLY accurate:

rwijk@ORA11G> exec dbms_stats.gather_table_stats(user,'orders',
method_opt=>'FOR COLUMNS (city,country)')

PL/SQL-procedure is geslaagd.

rwijk@ORA11G> pause

rwijk@ORA11G> explain plan
  2  for
  3  select *
  4    from orders
  5   where city = 'London'
  6     and country = 'UK'
  7  /

Uitleg is gegeven.

rwijk@ORA11G> select * from table(dbms_xplan.display)
  2  /

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------

Plan hash value: 1275100350

----------------------------------------------------------------------------
| Id  | Operation         | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |   499 | 15469 |     4   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| ORDERS |   499 | 15469 |     4   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("COUNTRY"='UK' AND "CITY"='London')

13 rijen zijn geselecteerd.

499 exactly! And for London in the Netherlands:

rwijk@ORA11G> exec dbms_lock.sleep(1)

PL/SQL-procedure is geslaagd.

rwijk@ORA11G> explain plan
  2  for
  3  select *
  4    from orders
  5   where city = 'London'
  6     and country = 'Netherlands'
  7  /

Uitleg is gegeven.

rwijk@ORA11G> select * from table(dbms_xplan.display)
  2  /

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------

Plan hash value: 1275100350

----------------------------------------------------------------------------
| Id  | Operation         | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |     1 |    31 |     4   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| ORDERS |     1 |    31 |     4   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("COUNTRY"='Netherlands' AND "CITY"='London')

13 rijen zijn geselecteerd.

Since the optimizer won't calculate with 0, it uses 1, but again very accurate! Much more precise cardinality estimates will lead to better decisions regarding whether to use an index or not, for example. It can really make a big difference in a production system. Problem queries can now be tuned more often by analyzing instead of adding hints to your code.

Monday, November 26, 2007

Setting up journaling using flashback data archive

One of 11g new features is Flashback Data Archive, or in marketing terms: Oracle Total Recall. With earlier flashback technologies it was possible to access Oracle's undo information for as long as it was available. With flashback archives you can extend this period for as long as you specify. This makes it possible to implement journaling using this new technique. This blog entry describes how to do this and a number of things you need to be aware of.

Before 11g, journaling was typically implemented using a separate table with the same columns as the base table, plus three additional columns: jn_operation (Insert/Update/Delete), jn_user and jn_datetime. Three after row triggers on the base table made sure every DML was recorded in the journaling table. So let's see how the same can be achieved using flashback archives.

First of all I create a tablespace especially for flashback archives:

rwijk@ORA11G> create tablespace my_tablespace datafile 'extra_file.dat' size 10M
  2  /

Tabelruimte is aangemaakt.

In the tablespace I can define several flashback archives each with its own retention period. For example:

rwijk@ORA11G> create flashback archive flashback_archive_10_years
  2  tablespace my_tablespace
  3  retention 10 year
  4  /

Flashbackarchief is aangemaakt.

rwijk@ORA11G> create flashback archive flashback_archive_5_years
  2  tablespace my_tablespace
  3  retention 5 year
  4  /

Flashbackarchief is aangemaakt.

And I create a table my_emp and attach it to the flashback archive with the 10 year retention period:

rwijk@ORA11G> create table my_emp
  2  ( empno    number(4) not null
  3  , ename    varchar2(10)
  4  , sal      number(7,2)
  5  , deptno   number(2)
  6  ) partition by range (deptno)
  7    ( partition deptno10 values less than (11)
  8    , partition deptno20 values less than (21)
  9    , partition deptno30 values less than (31)
 10    , partition deptnorest values less than (maxvalue)
 11    ) flashback archive flashback_archive_10_years
 12  /

Tabel is aangemaakt.

The range partitioning is not required; I only do this to show one point later on in this blog entry.

The information about the flashback archives is available in the views DBA_FLASHBACK_ARCHIVE, DBA_FLASHBACK_ARCHIVE_TS and DBA_FLASHBACK_ARCHIVE_TABLES:

rwijk@ORA11G> select * from dba_flashback_archive
  2  /

FLASHBACK_ARCHIVE_NAME     FLASHBACK_ARCHIVE# RETENTION_IN_DAYS
-------------------------- ------------------ -----------------
CREATE_TIME                    LAST_PURGE_TIME                STATUS
------------------------------ ------------------------------ -------
FLASHBACK_ARCHIVE_5_YEARS                   2              1825
27-11-07 23:53:51,000000000    27-11-07 23:53:51,000000000

FLASHBACK_ARCHIVE_10_YEARS                  1              3650
27-11-07 23:53:51,000000000    27-11-07 23:53:51,000000000


2 rijen zijn geselecteerd.

rwijk@ORA11G> select * from dba_flashback_archive_ts
  2  /

FLASHBACK_ARCHIVE_NAME     FLASHBACK_ARCHIVE# TABLESPACE_NAME
-------------------------- ------------------ ------------------------------
QUOTA_IN_MB
----------------------------------------
FLASHBACK_ARCHIVE_5_YEARS                   2 MY_TABLESPACE


FLASHBACK_ARCHIVE_10_YEARS                  1 MY_TABLESPACE



2 rijen zijn geselecteerd.

rwijk@ORA11G> select * from dba_flashback_archive_tables
  2  /

TABLE_NAME                     OWNER_NAME
------------------------------ ------------------------------
FLASHBACK_ARCHIVE_NAME     ARCHIVE_TABLE_NAME
-------------------------- ------------------------------
MY_EMP                         RWIJK
FLASHBACK_ARCHIVE_10_YEARS SYS_FBA_HIST_71097


1 rij is geselecteerd.

Now let's do some DML against this new table:

rwijk@ORA11G> insert into my_emp
  2  select empno, ename, sal, deptno from emp
  3  /

14 rijen zijn aangemaakt.

rwijk@ORA11G> commit
  2  /

Commit is voltooid.

rwijk@ORA11G> update my_emp
  2     set sal = sal * 1.1
  3   where deptno = 20
  4  /

5 rijen zijn bijgewerkt.

rwijk@ORA11G> commit
  2  /

Commit is voltooid.

rwijk@ORA11G> update my_emp
  2     set ename = initcap(ename)
  3   where deptno != 30
  4  /

8 rijen zijn bijgewerkt.

rwijk@ORA11G> commit
  2  /

Commit is voltooid.

rwijk@ORA11G> insert into my_emp (empno,ename,sal,deptno)
  2  values (7777,'VAN WIJK', 1600, 40)
  3  /

1 rij is aangemaakt.

rwijk@ORA11G> delete my_emp
  2   where sal = 3300
  3  /

2 rijen zijn verwijderd.

rwijk@ORA11G> commit
  2  /

Commit is voltooid.

And use a Flashback Version Query to retrieve all information:

rwijk@ORA11G> select versions_xid
  2       , versions_startscn
  3       , versions_starttime
  4       , versions_endscn
  5       , versions_endtime
  6       , versions_operation
  7       , empno
  8       , ename
  9       , sal
 10       , deptno
 11    from my_emp versions between scn minvalue and maxvalue
 12   order by empno
 13       , versions_startscn
 14  /

VERSIONS_XID     VERSIONS_STARTSCN VERSIONS_STARTTIME   VERSIONS_ENDSCN
---------------- ----------------- -------------------- ---------------
VERSIONS_ENDTIME     V EMPNO ENAME        SAL DEPTNO
-------------------- - ----- ---------- ----- ------
0A00030079030000           1589358 27-11-07 23:58:39            1589361
27-11-07 23:58:39    I  7369 SMITH        800     20

0400070075030000           1589361 27-11-07 23:58:39            1589365
27-11-07 23:58:39    U  7369 SMITH        880     20

0700110078030000           1589365 27-11-07 23:58:39
                     U  7369 Smith        880     20

0A00030079030000           1589358 27-11-07 23:58:39
                     I  7499 ALLEN       1600     30

0A00030079030000           1589358 27-11-07 23:58:39
                     I  7521 WARD        1250     30

0A00030079030000           1589358 27-11-07 23:58:39            1589361
27-11-07 23:58:39    I  7566 JONES       2975     20

0400070075030000           1589361 27-11-07 23:58:39            1589365
27-11-07 23:58:39    U  7566 JONES       3273     20

0700110078030000           1589365 27-11-07 23:58:39
                     U  7566 Jones       3273     20

0A00030079030000           1589358 27-11-07 23:58:39
                     I  7654 MARTIN      1250     30

0A00030079030000           1589358 27-11-07 23:58:39
                     I  7698 BLAKE       2850     30

09001C00B5040000           1589368 27-11-07 23:58:39
                     I  7777 VAN WIJK    1600     40

0A00030079030000           1589358 27-11-07 23:58:39            1589365
27-11-07 23:58:39    I  7782 CLARK       2450     10

0700110078030000           1589365 27-11-07 23:58:39
                     U  7782 Clark       2450     10

0A00030079030000           1589358 27-11-07 23:58:39            1589361
27-11-07 23:58:39    I  7788 SCOTT       3000     20

0400070075030000           1589361 27-11-07 23:58:39            1589365
27-11-07 23:58:39    U  7788 SCOTT       3300     20

0700110078030000           1589365 27-11-07 23:58:39            1589368
27-11-07 23:58:39    U  7788 Scott       3300     20

09001C00B5040000           1589368 27-11-07 23:58:39
                     D  7788 Scott       3300     20

0A00030079030000           1589358 27-11-07 23:58:39            1589365
27-11-07 23:58:39    I  7839 KING        5000     10

0700110078030000           1589365 27-11-07 23:58:39
                     U  7839 King        5000     10

0A00030079030000           1589358 27-11-07 23:58:39
                     I  7844 TURNER      1500     30

0A00030079030000           1589358 27-11-07 23:58:39            1589361
27-11-07 23:58:39    I  7876 ADAMS       1100     20

0400070075030000           1589361 27-11-07 23:58:39            1589365
27-11-07 23:58:39    U  7876 ADAMS       1210     20

0700110078030000           1589365 27-11-07 23:58:39
                     U  7876 Adams       1210     20

0A00030079030000           1589358 27-11-07 23:58:39
                     I  7900 JAMES        950     30

0A00030079030000           1589358 27-11-07 23:58:39            1589361
27-11-07 23:58:39    I  7902 FORD        3000     20

0400070075030000           1589361 27-11-07 23:58:39            1589365
27-11-07 23:58:39    U  7902 FORD        3300     20

0700110078030000           1589365 27-11-07 23:58:39            1589368
27-11-07 23:58:39    U  7902 Ford        3300     20

09001C00B5040000           1589368 27-11-07 23:58:39
                     D  7902 Ford        3300     20

0A00030079030000           1589358 27-11-07 23:58:39            1589365
27-11-07 23:58:39    I  7934 MILLER      1300     10

0700110078030000           1589365 27-11-07 23:58:39
                     U  7934 Miller      1300     10


30 rijen zijn geselecteerd.

So indeed all journaling information is available here. All, but one: the user who modified the data is not there. So let's audit the table to get this piece of information as well and create a journaling view to join the audit information with the journaling info.

rwijk@ORA11G> audit insert,update,delete on my_emp by access
  2  /

Audit is geslaagd.

rwijk@ORA11G> create view my_emp_jn_v
  2  as
  3  select me.versions_operation jn_operation
  4       , ao.username jn_user
  5       , me.versions_starttime jn_datetime
  6       , me.versions_startscn jn_scn
  7       , me.empno
  8       , me.ename
  9       , me.sal
 10       , me.deptno
 11    from my_emp versions between scn minvalue and maxvalue me
 12       , user_audit_object ao
 13   where me.versions_xid = ao.transactionid (+)
 14  /

View is aangemaakt.

And do some more transactions, including one transaction in another session by a user called SOMEBODY_ELSE:

rwijk@ORA11G> insert into my_emp (empno,ename,sal,deptno)
  2  values (7778, 'ROB', 1700, 40)
  3  /

1 rij is aangemaakt.

rwijk@ORA11G> commit
  2  /

Commit is voltooid.

rwijk@ORA11G> create user somebody_else identified by somebody_else
  2  /

Gebruiker is aangemaakt.

rwijk@ORA11G> grant create session to somebody_else
  2  /

Toekennen is geslaagd.

rwijk@ORA11G> grant delete on my_emp to somebody_else
  2  /

Toekennen is geslaagd.

rwijk@ORA11G> remark  Start another session and connect as somebody_else
rwijk@ORA11G> remark  and issue the following commands:
rwijk@ORA11G> remark
rwijk@ORA11G> remark  delete rwijk.my_emp where empno = 7777
rwijk@ORA11G> remark  /
rwijk@ORA11G> remark  commit
rwijk@ORA11G> remark  /
rwijk@ORA11G> remark  exit
rwijk@ORA11G> pause

rwijk@ORA11G> drop user somebody_else
  2  /

Gebruiker is verwijderd.

rwijk@ORA11G> update my_emp
  2     set sal = 1800
  3   where empno = 7778
  4  /

1 rij is bijgewerkt.

rwijk@ORA11G> commit
  2  /

Commit is voltooid.

And look, the missing piece of information is also here:

rwijk@ORA11G> select *
  2    from my_emp_jn_v
  3   order by empno
  4       , jn_scn
  5  /

J JN_USER       JN_DATETIME           JN_SCN EMPNO ENAME        SAL DEPTNO
- ------------- ------------------- -------- ----- ---------- ----- ------
I               28-11-07 00:05:03    1590771  7369 SMITH        800     20
U               28-11-07 00:05:03    1590774  7369 SMITH        880     20
U               28-11-07 00:05:03    1590778  7369 Smith        880     20
I               28-11-07 00:05:03    1590771  7499 ALLEN       1600     30
I               28-11-07 00:05:03    1590771  7521 WARD        1250     30
I               28-11-07 00:05:03    1590771  7566 JONES       2975     20
U               28-11-07 00:05:03    1590774  7566 JONES       3273     20
U               28-11-07 00:05:03    1590778  7566 Jones       3273     20
I               28-11-07 00:05:03    1590771  7654 MARTIN      1250     30
I               28-11-07 00:05:03    1590771  7698 BLAKE       2850     30
I               28-11-07 00:05:03    1590781  7777 VAN WIJK    1600     40
D SOMEBODY_ELSE 28-11-07 00:05:21    1590838  7777 VAN WIJK    1600     40
I RWIJK         28-11-07 00:05:06    1590799  7778 ROB         1700     40
U RWIJK         28-11-07 00:05:33    1590859  7778 ROB         1800     40
I               28-11-07 00:05:03    1590771  7782 CLARK       2450     10
U               28-11-07 00:05:03    1590778  7782 Clark       2450     10
I               28-11-07 00:05:03    1590771  7788 SCOTT       3000     20
U               28-11-07 00:05:03    1590774  7788 SCOTT       3300     20
U               28-11-07 00:05:03    1590778  7788 Scott       3300     20
D               28-11-07 00:05:03    1590781  7788 Scott       3300     20
I               28-11-07 00:05:03    1590771  7839 KING        5000     10
U               28-11-07 00:05:03    1590778  7839 King        5000     10
I               28-11-07 00:05:03    1590771  7844 TURNER      1500     30
I               28-11-07 00:05:03    1590771  7876 ADAMS       1100     20
U               28-11-07 00:05:03    1590774  7876 ADAMS       1210     20
U               28-11-07 00:05:03    1590778  7876 Adams       1210     20
I               28-11-07 00:05:03    1590771  7900 JAMES        950     30
I               28-11-07 00:05:03    1590771  7902 FORD        3000     20
U               28-11-07 00:05:03    1590774  7902 FORD        3300     20
U               28-11-07 00:05:03    1590778  7902 Ford        3300     20
D               28-11-07 00:05:03    1590781  7902 Ford        3300     20
I               28-11-07 00:05:03    1590771  7934 MILLER      1300     10
U               28-11-07 00:05:03    1590778  7934 Miller      1300     10

33 rijen zijn geselecteerd.

Note that I also used the SCN in the view, to be able to sort even when the transactions occur within one second.

So it looks very promising... well, until you read this warning in the documentation

Remember that DDLs that alter the structure of a table (such as drop/modify column, move table, drop partition, truncate table/partition, and add constraint) invalidate any existing undo data for the table. If you try to retrieve data from a time before such a DDL executed, you will get error ORA-1466.

Now this is a serious showstopper: in my experience the tables that need journaling are also the tables that require regular maintenance. And if that maintenance would invalidate existing undo ...

Let's test it (and here is why I range partitioned the table):

rwijk@ORA11G> alter table my_emp drop column comm
  2  /
alter table my_emp drop column comm
*
FOUT in regel 1:
.ORA-55610: Ongeldig DDL-statement op tabel met historietracking


rwijk@ORA11G> alter table my_emp modify (ename varchar2(50))
  2  /
alter table my_emp modify (ename varchar2(50))
*
FOUT in regel 1:
.ORA-55610: Ongeldig DDL-statement op tabel met historietracking


rwijk@ORA11G> alter table my_emp drop partition deptno10
  2  /
alter table my_emp drop partition deptno10
*
FOUT in regel 1:
.ORA-55610: Ongeldig DDL-statement op tabel met historietracking


rwijk@ORA11G> truncate table my_emp
  2  /
truncate table my_emp
               *
FOUT in regel 1:
.ORA-55610: Ongeldig DDL-statement op tabel met historietracking

So Oracle does not even allow us to do this type of actions, and the existing undo is not harmed:

rwijk@ORA11G> select *
  2    from my_emp_jn_v
  3   order by empno
  4       , jn_scn
  5  /

J JN_USER       JN_DATETIME           JN_SCN EMPNO ENAME        SAL DEPTNO
- ------------- ------------------- -------- ----- ---------- ----- ------
I               28-11-07 00:05:03    1590771  7369 SMITH        800     20
U               28-11-07 00:05:03    1590774  7369 SMITH        880     20
U               28-11-07 00:05:03    1590778  7369 Smith        880     20
I               28-11-07 00:05:03    1590771  7499 ALLEN       1600     30
I               28-11-07 00:05:03    1590771  7521 WARD        1250     30
I               28-11-07 00:05:03    1590771  7566 JONES       2975     20
U               28-11-07 00:05:03    1590774  7566 JONES       3273     20
U               28-11-07 00:05:03    1590778  7566 Jones       3273     20
I               28-11-07 00:05:03    1590771  7654 MARTIN      1250     30
I               28-11-07 00:05:03    1590771  7698 BLAKE       2850     30
I               28-11-07 00:05:03    1590781  7777 VAN WIJK    1600     40
D SOMEBODY_ELSE 28-11-07 00:05:21    1590838  7777 VAN WIJK    1600     40
I RWIJK         28-11-07 00:05:06    1590799  7778 ROB         1700     40
U RWIJK         28-11-07 00:05:33    1590859  7778 ROB         1800     40
I               28-11-07 00:05:03    1590771  7782 CLARK       2450     10
U               28-11-07 00:05:03    1590778  7782 Clark       2450     10
I               28-11-07 00:05:03    1590771  7788 SCOTT       3000     20
U               28-11-07 00:05:03    1590774  7788 SCOTT       3300     20
U               28-11-07 00:05:03    1590778  7788 Scott       3300     20
D               28-11-07 00:05:03    1590781  7788 Scott       3300     20
I               28-11-07 00:05:03    1590771  7839 KING        5000     10
U               28-11-07 00:05:03    1590778  7839 King        5000     10
I               28-11-07 00:05:03    1590771  7844 TURNER      1500     30
I               28-11-07 00:05:03    1590771  7876 ADAMS       1100     20
U               28-11-07 00:05:03    1590774  7876 ADAMS       1210     20
U               28-11-07 00:05:03    1590778  7876 Adams       1210     20
I               28-11-07 00:05:03    1590771  7900 JAMES        950     30
I               28-11-07 00:05:03    1590771  7902 FORD        3000     20
U               28-11-07 00:05:03    1590774  7902 FORD        3300     20
U               28-11-07 00:05:03    1590778  7902 Ford        3300     20
D               28-11-07 00:05:03    1590781  7902 Ford        3300     20
I               28-11-07 00:05:03    1590771  7934 MILLER      1300     10
U               28-11-07 00:05:03    1590778  7934 Miller      1300     10

33 rijen zijn geselecteerd.

Most of the time, maintenance is done by adding one or two columns, with some additional constraints. The adding of a column is not a problem for flashback archives, but adding constraints was also listed as being impossible. Let's verify that as well, by adding a primary key, unique key, foreign key and a check constraint:

rwijk@ORA11G> alter table my_emp add constraint my_pk primary key (empno)
  2  /

Tabel is gewijzigd.

rwijk@ORA11G> alter table my_emp add constraint my_uk1 unique (ename)
  2  /

Tabel is gewijzigd.

rwijk@ORA11G> alter table my_emp add constraint my_fk1 foreign key (deptno) references dept(deptno)
  2  /

Tabel is gewijzigd.

rwijk@ORA11G> alter table my_emp add constraint ck1 check (sal>500)
  2  /

Tabel is gewijzigd.

So, the documentation was wrong? Is it possible after all?

No.

Because the undo has been sneakily invalidated:

rwijk@ORA11G> select *
  2    from my_emp_jn_v
  3   order by empno
  4       , jn_scn
  5  /

J JN_USER       JN_DATETIME           JN_SCN EMPNO ENAME        SAL DEPTNO
- ------------- ------------------- -------- ----- ---------- ----- ------
                                              7369 Smith        880     20
                                              7499 ALLEN       1600     30
                                              7521 WARD        1250     30
                                              7566 Jones       3273     20
                                              7654 MARTIN      1250     30
                                              7698 BLAKE       2850     30
                                              7778 ROB         1800     40
                                              7782 Clark       2450     10
                                              7839 King        5000     10
                                              7844 TURNER      1500     30
                                              7876 Adams       1210     20
                                              7900 JAMES        950     30
                                              7934 Miller      1300     10

13 rijen zijn geselecteerd.

So if you make the mistake of adding a constraint, you lose all journaling information... This sure is something to be aware of. I think the adding of constraints to a table with a flashback archive should also raise an ORA-55610. Hopefully Oracle will fix this in a future release or patch set.

The only way I can think of to keep journaling information while still being able to do the necessary maintenance, is to create a journaling table in which you dump the information from the journaling view before invalidating the existing undo. Far from ideal, but doable as a workaround.

But what do we gain by implementing journaling this way?

The answer is simple: performance. No longer is it necessary to have database triggers. And because database triggers are evil, you win a lot, as can be seen in a little test in which I create 3 tables, one regular table, one table using a flashback archive and auditing, and one regular table with an after row trigger to populate a second journaling table:

rwijk@ORA11G> create flashback archive flashback_archive_10_years
  2  tablespace my_tablespace
  3  retention 10 year
  4  /

Flashbackarchief is aangemaakt.

rwijk@ORA11G> create table my_emp_regular
  2  ( empno    number(6) not null
  3  , ename    varchar2(10)
  4  , sal      number(8,2)
  5  , deptno   number(2)
  6  )
  7  /

Tabel is aangemaakt.

rwijk@ORA11G> create table my_emp_flashback_archive
  2  ( empno    number(6) not null
  3  , ename    varchar2(10)
  4  , sal      number(8,2)
  5  , deptno   number(2)
  6  ) flashback archive flashback_archive_10_years
  7  /

Tabel is aangemaakt.

rwijk@ORA11G> audit insert,update,delete on my_emp_flashback_archive by access
  2  /

Audit is geslaagd.

rwijk@ORA11G> create table my_emp_bad_old_journaling
  2  ( empno    number(6) not null
  3  , ename    varchar2(10)
  4  , sal      number(8,2)
  5  , deptno   number(2)
  6  )
  7  /

Tabel is aangemaakt.

rwijk@ORA11G> create table my_emp_bad_old_journaling_jn
  2  ( jn_operation varchar2(1)
  3  , jn_user      varchar2(30)
  4  , jn_datetime  date
  5  , jn_scn       number
  6  , empno        number(6) not null
  7  , ename        varchar2(10)
  8  , sal          number(8,2)
  9  , deptno       number(2)
 10  )
 11  /

Tabel is aangemaakt.

rwijk@ORA11G> create trigger jn_ari
  2  after insert on my_emp_bad_old_journaling
  3  for each row
  4  begin
  5    insert into my_emp_bad_old_journaling_jn
  6    ( jn_operation
  7    , jn_user
  8    , jn_datetime
  9    , jn_scn
 10    , empno
 11    , ename
 12    , sal
 13    , deptno
 14    )
 15    values
 16    ( 'I'
 17    , user
 18    , sysdate
 19    , to_number(dbms_flashback.get_system_change_number)
 20    , :new.empno
 21    , :new.ename
 22    , :new.sal
 23    , :new.deptno
 24    );
 25  end;
 26  /

Trigger is aangemaakt.

rwijk@ORA11G> set timing on
rwijk@ORA11G> insert into my_emp_regular
  2   select level
  3        , 'EMP' || to_char(level)
  4        , 1000 + level
  5        , 10 * (1 + mod(level,4))
  6     from dual
  7  connect by level <= 100000
  8  /

100000 rijen zijn aangemaakt.

Verstreken: 00:00:00.48
rwijk@ORA11G> insert into my_emp_flashback_archive
  2   select level
  3        , 'EMP' || to_char(level)
  4        , 1000 + level
  5        , 10 * (1 + mod(level,4))
  6     from dual
  7  connect by level <= 100000
  8  /

100000 rijen zijn aangemaakt.

Verstreken: 00:00:00.50
rwijk@ORA11G> insert into my_emp_bad_old_journaling
  2   select level
  3        , 'EMP' || to_char(level)
  4        , 1000 + level
  5        , 10 * (1 + mod(level,4))
  6     from dual
  7  connect by level <= 100000
  8  /

100000 rijen zijn aangemaakt.

Verstreken: 00:00:13.39

As expected, almost no overhead using flashback data archives and therefore a lot faster than using the old fashioned way of journaling. But the maintenance thing described earlier is something to worry about in advance.

Further reading:

Yuri van Buren about Flashback Data Archive

Oracle 11g Total Recall - Flashback in the hands of Database Designers and Application Developers, at last. And: the end of Journalling Tables by Lucas Jellema

Friday, November 16, 2007

SQL Model Unconference

In an "Unconference" session at Oracle Open World, Laurent Schneider did a session about the SQL Model Clause. He posted his demonstration script on his site here.

If you have been reading my SQL Model Clause Tutorial, part one and SQL Model Clause Tutorial, part two, then I can recommend you to run this script as well and see some interesting queries. You only need the EMP and DEPT tables. A nice catch for me was to have a constant in the PARTITION BY clause.

Thursday, November 15, 2007

SAVE EXCEPTIONS

Since version 9 it is possible to do bulk DML using FORALL and use the SAVE EXCEPTIONS clause. It makes sure that all invalid rows - the exceptions - are saved into the sql%bulk_exceptions array, while all valid rows are still processed. This array stores a record for each invalid row, containing an ERROR_INDEX which is the iteration number during the FORALL statement and an ERROR_CODE which corresponds with the Oracle error code.

The error text is not stored. The documentation says:

The individual error messages, or any substitution arguments, are not saved, but the error message text can looked up using ERROR_CODE with SQLERRM ...

Looks reasonable, but in our shop we validate lots of business rules with triggers. When a business rule is violated we do a RAISE_APPLICATION_ERROR(-20000,'APP-12345');

At client side (Webforms) the error message is looked up in the messages table and a friendly message is given. When coding a FORALL with SAVE EXCEPTIONS in such an environment, the error messages become useless as can be seen in the next example:

rwijk@ORA11G> create table mytable
  2  ( id number(4)
  3  , name varchar2(30)
  4  )
  5  /

Tabel is aangemaakt.

rwijk@ORA11G> create trigger mytable_bri
  2  before insert on mytable
  3  for each row
  4  begin
  5    if :new.id = 2
  6    then
  7      raise_application_error(-20000,'APP-12345');
  8    elsif :new.id = 9
  9    then
 10      raise_application_error(-20000,'APP-98765');
 11    end if;
 12  end;
 13  /

Trigger is aangemaakt.

rwijk@ORA11G> alter table mytable add constraint mytable_ck1 check (id <> 6)
  2  /

Tabel is gewijzigd.

rwijk@ORA11G> declare
  2    e_forall_error exception;
  3    pragma exception_init(e_forall_error,-24381)
  4    ;
  5    type t_numbers is table of mytable.id%type;
  6    l_numbers t_numbers := t_numbers(1,2,3,4,5,6,7,8,9,10)
  7    ;
  8  begin
  9    forall i in 1..l_numbers.count save exceptions
 10      insert into mytable
 11      ( id
 12      , name
 13      )
 14      values
 15      ( l_numbers(i)
 16      , 'Name' || to_char(l_numbers(i))
 17      )
 18    ;
 19  exception
 20  when e_forall_error then
 21    for i in 1..sql%bulk_exceptions.count
 22    loop
 23      dbms_output.put_line('SQLCODE: ' || sql%bulk_exceptions(i).error_code);
 24      dbms_output.put_line('SQLERRM: ' || sqlerrm(-sql%bulk_exceptions(i).error_code));
 25      dbms_output.new_line;
 26    end loop;
 27  end;
 28  /
SQLCODE: 20000
SQLERRM: ORA-20000:

SQLCODE: 2290
SQLERRM: ORA-02290: CHECK-beperking (.) is geschonden.

SQLCODE: 20000
SQLERRM: ORA-20000:


PL/SQL-procedure is geslaagd.

rwijk@ORA11G> select id, name from mytable
  2  /

        ID NAME
---------- ------------------------------
         1 Name1
         3 Name3
         4 Name4
         5 Name5
         7 Name7
         8 Name8
        10 Name10

7 rijen zijn geselecteerd.

Note how the SQLERRM message doesn't return anything useful and that the name of the check constraint has disappeared. This is really annoying and can't be circumvented easily in 9i. For better error messages we would have to go back to row by row processing. And that means: very slow.

However, version 10gR2 introduced a feature called DML error logging. I remember reading about it more than two years ago here on Tom Kyte's blog. In this entry and in the documentation you only see examples using SQL, not PL/SQL examples using FORALL. But luckily this works as well:

rwijk@ORA11G> rollback
  2  /

Rollback is voltooid.

rwijk@ORA11G> exec dbms_errlog.create_error_log('mytable')

PL/SQL-procedure is geslaagd.

rwijk@ORA11G> declare
  2    type t_numbers is table of mytable.id%type;
  3    l_numbers t_numbers := t_numbers(1,2,3,4,5,6,7,8,9,10)
  4    ;
  5  begin
  6    forall i in 1..l_numbers.count
  7      insert into mytable
  8      ( id
  9      , name
 10      )
 11      values
 12      ( l_numbers(i)
 13      , 'Name' || to_char(l_numbers(i))
 14      )
 15      log errors reject limit unlimited
 16    ;
 17    for r in
 18    ( select ora_err_number$
 19           , ora_err_mesg$
 20        from err$_mytable
 21    )
 22    loop
 23      dbms_output.put_line('SQLCODE: ' || to_char(r.ora_err_number$));
 24      dbms_output.put_line('SQLERRM: ' || r.ora_err_mesg$);
 25      dbms_output.new_line;
 26    end loop
 27    ;
 28  end;
 29  /
SQLCODE: 20000
SQLERRM: ORA-20000: APP-12345
ORA-06512: in "RWIJK.MYTABLE_BRI", regel 4
ORA-04088: Fout bij uitvoering van trigger 'RWIJK.MYTABLE_BRI'
.


SQLCODE: 2290
SQLERRM: ORA-02290: CHECK-beperking (RWIJK.MYTABLE_CK1) is geschonden.


SQLCODE: 20000
SQLERRM: ORA-20000: APP-98765
ORA-06512: in "RWIJK.MYTABLE_BRI", regel 7
ORA-04088: Fout bij uitvoering van trigger 'RWIJK.MYTABLE_BRI'
.



PL/SQL-procedure is geslaagd.

rwijk@ORA11G> select id, name from mytable
  2  /

        ID NAME
---------- ------------------------------
         1 Name1
         3 Name3
         4 Name4
         5 Name5
         7 Name7
         8 Name8
        10 Name10

7 rijen zijn geselecteerd.

And you do get to see the error messages (APP-12345 and APP-98765) and the name of the check constraint. Unfortunately, our shop still uses 9.2.0.7...

UPDATE

It is worth knowing that there are some restrictions when working with the DML error logging clause. From the 11g documentation (same as in 10gR2 documentation):

Restrictions on DML Error Logging

* The following conditions cause the statement to fail and roll back without invoking the error logging capability:

o Violated deferred constraints.

o Any direct-path INSERT or MERGE operation that raises a unique constraint or index violation.

o Any update operation UPDATE or MERGE that raises a unique constraint or index violation.

* You cannot track errors in the error logging table for LONG, LOB, or object type columns. However, the table that is the target of the DML operation can contain these types of columns.

o If you create or modify the corresponding error logging table so that it contains a column of an unsupported type, and if the name of that column corresponds to an unsupported column in the target DML table, then the DML statement fails at parse time.

o If the error logging table does not contain any unsupported column types, then all DML errors are logged until the reject limit of errors is reached. For rows on which errors occur, column values with corresponding columns in the error logging table are logged along with the control information.

Wednesday, November 14, 2007

Interval based row generation

Well, I didn't have a good name for this common SQL problem, so I decided to call it "interval based row generation". If someone out there knows a better or more common name for the problem, then please say so by leaving a comment.

An example says more than a thousand words, so here is an example of what I mean by the term:

Suppose I have this table:

rwijk@ORA10GR2> create table mytable (code,start_date,end_date,value)
  2  as
  3  select 'group1', date '2007-01-01', date '2007-01-10', 11 from dual union all
  4  select 'group2', date '2007-01-01', date '2007-01-01', 22 from dual union all
  5  select 'group3', date '2007-01-01', date '2007-01-03', 33 from dual
  6  /

Tabel is aangemaakt.

And I want to generate rows for all dates between and including the start_date and the end_date. The query would have to generate 10 rows for group1, 1 row for group2 and 3 rows for group3. I know of six ways to do this type of query. There may be more of course. Here are the six alternatives:

rwijk@ORA10GR2> select /* alternative 1: cartesian product with hard coded upper bound */
  2         m.code
  3       , m.start_date + l as day
  4       , m.value
  5    from mytable m
  6       , (select level-1 l from dual connect by level <= 1000)
  7   where l <= m.end_date - m.start_date
  8   order by m.code
  9       , day
 10  /

CODE       DAY                      VALUE
---------- ------------------- ----------
group1     01-01-2007 00:00:00         11
group1     02-01-2007 00:00:00         11
group1     03-01-2007 00:00:00         11
group1     04-01-2007 00:00:00         11
group1     05-01-2007 00:00:00         11
group1     06-01-2007 00:00:00         11
group1     07-01-2007 00:00:00         11
group1     08-01-2007 00:00:00         11
group1     09-01-2007 00:00:00         11
group1     10-01-2007 00:00:00         11
group2     01-01-2007 00:00:00         22
group3     01-01-2007 00:00:00         33
group3     02-01-2007 00:00:00         33
group3     03-01-2007 00:00:00         33

14 rijen zijn geselecteerd.

rwijk@ORA10GR2> select /* alternative 2: cartesian product with calculated upper bound */
  2         m.code
  3       , m.start_date + l as day
  4       , m.value
  5    from mytable m
  6       , ( select level-1 l
  7             from (select max(end_date-start_date+1) maxinterval from mytable)
  8          connect by level <= maxinterval
  9         )
 10   where l <= m.end_date - m.start_date
 11   order by m.code
 12       , day
 13  /

CODE       DAY                      VALUE
---------- ------------------- ----------
group1     01-01-2007 00:00:00         11
group1     02-01-2007 00:00:00         11
group1     03-01-2007 00:00:00         11
group1     04-01-2007 00:00:00         11
group1     05-01-2007 00:00:00         11
group1     06-01-2007 00:00:00         11
group1     07-01-2007 00:00:00         11
group1     08-01-2007 00:00:00         11
group1     09-01-2007 00:00:00         11
group1     10-01-2007 00:00:00         11
group2     01-01-2007 00:00:00         22
group3     01-01-2007 00:00:00         33
group3     02-01-2007 00:00:00         33
group3     03-01-2007 00:00:00         33

14 rijen zijn geselecteerd.

rwijk@ORA10GR2>  select /* alternative 3: hierarchical query with prior dbms_random.value */
  2          m.code
  3        , m.start_date + level - 1 as day
  4        , m.value
  5     from mytable m
  6  connect by level <= m.end_date - m.start_date + 1
  7      and prior m.code = m.code
  8      and prior dbms_random.value is not null
  9    order by m.code
 10        , day
 11  /

CODE       DAY                      VALUE
---------- ------------------- ----------
group1     01-01-2007 00:00:00         11
group1     02-01-2007 00:00:00         11
group1     03-01-2007 00:00:00         11
group1     04-01-2007 00:00:00         11
group1     05-01-2007 00:00:00         11
group1     06-01-2007 00:00:00         11
group1     07-01-2007 00:00:00         11
group1     08-01-2007 00:00:00         11
group1     09-01-2007 00:00:00         11
group1     10-01-2007 00:00:00         11
group2     01-01-2007 00:00:00         22
group3     01-01-2007 00:00:00         33
group3     02-01-2007 00:00:00         33
group3     03-01-2007 00:00:00         33

14 rijen zijn geselecteerd.

rwijk@ORA10GR2>  select /* alternative 4: hierarchical query with connect_by_root */
  2          m.code
  3        , m.start_date + level - 1 as day
  4        , m.value
  5     from mytable m
  6  connect by connect_by_root m.code = m.code
  7      and level <= m.end_date - m.start_date + 1
  8    order by m.code
  9        , day
 10  /

CODE       DAY                      VALUE
---------- ------------------- ----------
group1     01-01-2007 00:00:00         11
group1     02-01-2007 00:00:00         11
group1     03-01-2007 00:00:00         11
group1     04-01-2007 00:00:00         11
group1     05-01-2007 00:00:00         11
group1     06-01-2007 00:00:00         11
group1     07-01-2007 00:00:00         11
group1     08-01-2007 00:00:00         11
group1     09-01-2007 00:00:00         11
group1     10-01-2007 00:00:00         11
group2     01-01-2007 00:00:00         22
group3     01-01-2007 00:00:00         33
group3     02-01-2007 00:00:00         33
group3     03-01-2007 00:00:00         33

14 rijen zijn geselecteerd.

rwijk@ORA10GR2> select /* alternative 5: Adrian Billington's multiset */
  2         m.code
  3       , m.start_date + (row_number() over (partition by m.code order by null)-1) as day
  4       , m.value
  5    from mytable m
  6       , table
  7         ( cast
  8           ( multiset(select null from dual connect by rownum <= (m.end_date-m.start_date)+1)
  9             as sys.dbms_debug_vc2coll
 10           )
 11         ) t
 12   order by m.code
 13       , day
 14  /

CODE       DAY                      VALUE
---------- ------------------- ----------
group1     01-01-2007 00:00:00         11
group1     02-01-2007 00:00:00         11
group1     03-01-2007 00:00:00         11
group1     04-01-2007 00:00:00         11
group1     05-01-2007 00:00:00         11
group1     06-01-2007 00:00:00         11
group1     07-01-2007 00:00:00         11
group1     08-01-2007 00:00:00         11
group1     09-01-2007 00:00:00         11
group1     10-01-2007 00:00:00         11
group2     01-01-2007 00:00:00         22
group3     01-01-2007 00:00:00         33
group3     02-01-2007 00:00:00         33
group3     03-01-2007 00:00:00         33

14 rijen zijn geselecteerd.

rwijk@ORA10GR2> select /* alternative 6: model clause */
  2         code
  3       , day
  4       , value
  5    from mytable
  6   model
  7         partition by (code,value)
  8         dimension by (0 i)
  9         measures (start_date day, end_date)
 10         rules
 11         ( day[for i from 1 to end_date[0] - day[0] increment 1] = day[0] + cv(i)
 12         )
 13   order by code
 14       , day
 15  /

CODE       DAY                      VALUE
---------- ------------------- ----------
group1     01-01-2007 00:00:00         11
group1     02-01-2007 00:00:00         11
group1     03-01-2007 00:00:00         11
group1     04-01-2007 00:00:00         11
group1     05-01-2007 00:00:00         11
group1     06-01-2007 00:00:00         11
group1     07-01-2007 00:00:00         11
group1     08-01-2007 00:00:00         11
group1     09-01-2007 00:00:00         11
group1     10-01-2007 00:00:00         11
group2     01-01-2007 00:00:00         22
group3     01-01-2007 00:00:00         33
group3     02-01-2007 00:00:00         33
group3     03-01-2007 00:00:00         33

14 rijen zijn geselecteerd.

I noticed something remarkable when running the same queries on my Oracle11g database. Alternatives 1, 2, 5 and 6 ran as expected, but 3 & 4 did not work anymore:

rwijk@ORA11G>  select /* alternative 3: hierarchical query with prior dbms_random.value */
  2          m.code
  3        , m.start_date + level - 1 as day
  4        , m.value
  5     from mytable m
  6  connect by level <= m.end_date - m.start_date + 1
  7      and prior m.code = m.code
  8      and prior dbms_random.value is not null
  9    order by m.code
 10        , day
 11  /
   from mytable m
        *
FOUT in regel 5:
.ORA-01436: CONNECT BY-lus in gebruikersgegevens.


rwijk@ORA11G>  select /* alternative 4: hierarchical query with connect_by_root */
  2          m.code
  3        , m.start_date + level - 1 as day
  4        , m.value
  5     from mytable m
  6  connect by connect_by_root m.code = m.code
  7      and level <= m.end_date - m.start_date + 1
  8    order by m.code
  9        , day
 10  /
connect by connect_by_root m.code = m.code
           *
FOUT in regel 6:
.ORA-30007: Operator CONNECT BY ROOT wordt niet ondersteund in de voorwaarde START WITH of  CONNECT BY.

Alternative 3 uses a trick: line 8 prevented the ORA-01436 connect by loop in user data error in versions 9 and 10, but not anymore in 11.

Alternative 4 shows a new error message introduced in Oracle11g: ORA-30007. Apparently the use of the connect_by_root function has been restricted more. I don't know the reason why, although Oracle probably will have a good reason for the restriction.

In this recent OTN-thread Adrian Billington (of the excellent oracle-developer.net site) showed me alternative 5, which I had not seen before. He also did a small performance test showing that alternative 5 is slightly better than alternative 6, the model clause solution. I decided to investigate a little more by tracing/tkprof and by comparing latches with Tom Kyte's runstats_pkg.

Here are the results of a test with this table:

create table mytable (code,start_date,end_date,value)
as
 select 'group' || to_char(level)
      , trunc(sysdate)
      , trunc(sysdate) + power(ceil(dbms_random.value(0,30)),2)
      , 'value' || to_char(level)
   from dual
connect by level <= 100
/

Tkprof on Oracle11g shows:

********************************************************************************

select /* alternative 1: cartesian product with hard coded upper bound */
       m.code
     , m.start_date + l as day
     , m.value
  from mytable m
     , (select level-1 l from dual connect by level <= 1000)
 where l <= m.end_date - m.start_date
 order by m.code
     , day

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          1          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch     2415      0.31       0.29          0       3000          0       36197
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     2417      0.31       0.29          0       3001          0       36197

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 81  

Rows     Row Source Operation
-------  ---------------------------------------------------
  36197  SORT ORDER BY (cr=3000 pr=0 pw=0 time=806 us cost=6 size=395 card=5)
  36197   NESTED LOOPS  (cr=3000 pr=0 pw=0 time=2220 us cost=5 size=395 card=5)
   1000    VIEW  (cr=0 pr=0 pw=0 time=67 us cost=2 size=13 card=1)
   1000     CONNECT BY WITHOUT FILTERING (cr=0 pr=0 pw=0 time=27 us)
      1      FAST DUAL  (cr=0 pr=0 pw=0 time=0 us cost=2 size=0 card=1)
  36197    TABLE ACCESS FULL MYTABLE (cr=3000 pr=0 pw=0 time=4259 us cost=3 size=330 card=5)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                    2415        0.00          0.00
  SQL*Net message from client                  2415        0.00          0.52
********************************************************************************

select /* alternative 2: cartesian product with calculated upper bound */
       m.code
     , m.start_date + l as day
     , m.value
  from mytable m
     , ( select level-1 l
           from (select max(end_date-start_date+1) maxinterval from mytable)
        connect by level <= maxinterval
       )
 where l <= m.end_date - m.start_date
 order by m.code
     , day

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.00          0          3          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch     2415      0.26       0.28          0       2706          0       36197
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     2417      0.28       0.29          0       2709          0       36197

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 81  

Rows     Row Source Operation
-------  ---------------------------------------------------
  36197  SORT ORDER BY (cr=2706 pr=0 pw=0 time=797 us cost=7 size=395 card=5)
  36197   NESTED LOOPS  (cr=2706 pr=0 pw=0 time=2251 us cost=6 size=395 card=5)
    901    VIEW  (cr=3 pr=0 pw=0 time=66 us cost=3 size=13 card=1)
    901     CONNECT BY WITHOUT FILTERING (cr=3 pr=0 pw=0 time=28 us)
      1      VIEW  (cr=3 pr=0 pw=0 time=0 us cost=3 size=13 card=1)
      1       SORT AGGREGATE (cr=3 pr=0 pw=0 time=0 us)
    100        TABLE ACCESS FULL MYTABLE (cr=3 pr=0 pw=0 time=2 us cost=3 size=1800 card=100)
  36197    TABLE ACCESS FULL MYTABLE (cr=2703 pr=0 pw=0 time=4266 us cost=3 size=330 card=5)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                    2415        0.00          0.00
  SQL*Net message from client                  2415        0.00          0.52
********************************************************************************

select /* alternative 5: Adrian Billington's multiset */
       m.code
     , m.start_date + (row_number() over (partition by m.code order by null)-1) as day
     , m.value
  from mytable m
     , table
       ( cast
         ( multiset(select null from dual connect by rownum <= (m.end_date-m.start_date)+1)
           as sys.dbms_debug_vc2coll
         )
       ) t
 order by m.code
     , day

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          1          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch     2415      0.35       0.36          0          3          0       36197
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     2417      0.35       0.36          0          4          0       36197

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 81  

Rows     Row Source Operation
-------  ---------------------------------------------------
  36197  SORT ORDER BY (cr=3 pr=0 pw=0 time=749 us cost=28444 size=53908800 card=816800)
  36197   WINDOW SORT (cr=3 pr=0 pw=0 time=902 us cost=28444 size=53908800 card=816800)
  36197    NESTED LOOPS  (cr=3 pr=0 pw=0 time=4381 us cost=2726 size=53908800 card=816800)
    100     TABLE ACCESS FULL MYTABLE (cr=3 pr=0 pw=0 time=4 us cost=3 size=6600 card=100)
  36197     COLLECTION ITERATOR SUBQUERY FETCH (cr=0 pr=0 pw=0 time=4111 us)
  36197      COUNT  (cr=0 pr=0 pw=0 time=2600 us)
  36197       CONNECT BY WITHOUT FILTERING (cr=0 pr=0 pw=0 time=962 us)
    100        FAST DUAL  (cr=0 pr=0 pw=0 time=0 us cost=2 size=0 card=1)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                    2415        0.00          0.00
  SQL*Net message from client                  2415        0.00          0.53
********************************************************************************

select /* alternative 6: model clause */
       code
     , day
     , value
  from mytable
 model
       partition by (code,value)
       dimension by (0 i)
       measures (start_date day, end_date)
       rules
       ( day[for i from 1 to end_date[0] - day[0] increment 1] = day[0] + cv(i)
       )
 order by code
     , day

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          1          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch     2415      0.46       0.51          0          3          0       36197
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     2417      0.46       0.51          0          4          0       36197

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 81  

Rows     Row Source Operation
-------  ---------------------------------------------------
  36197  SORT ORDER BY (cr=3 pr=0 pw=0 time=771 us cost=4 size=6600 card=100)
  36197   SQL MODEL ORDERED (cr=3 pr=0 pw=0 time=697 us cost=4 size=6600 card=100)
    100    TABLE ACCESS FULL MYTABLE (cr=3 pr=0 pw=0 time=3 us cost=3 size=6600 card=100)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                    2415        0.00          0.00
  SQL*Net message from client                  2415        0.00          0.53

********************************************************************************

So, the clunky cartesian joins - alternatives 1 and 2 - seem to outperform the multiset and model clause solutions. These alternatives just generate all rows possibly needed and filter out the unneeded ones.

Next I compared alternative 2 with 5 using runstats-pkg:

rwijk@ORA11G> exec runstats_pkg.rs_stop(50)
Run1 draaide in 78 hsecs
Run2 draaide in 86 hsecs
Run1 draaide in 90,7% van de tijd
 
Naam                                                      Run1        Run2    Verschil
LATCH.shared pool                                          156         222          66
LATCH.enqueues                                              80          14         -66
LATCH.enqueue hash chains                                   82          15         -67
STAT.undo change vector size                             2,732       2,804          72
STAT.redo size                                           3,656       3,740          84
STAT.sorts (memory)                                          2         102         100
STAT.workarea executions - optimal                           6         107         101
STAT.no work - consistent read gets                        910           9        -901
STAT.table scan blocks gotten                              902           1        -901
STAT.table scans (short tables)                            902           1        -901
STAT.calls to get snapshot scn: kcmgss                   1,811           9      -1,802
LATCH.cache buffers chains                               2,856         163      -2,693
STAT.session logical reads                               2,753          53      -2,700
STAT.consistent gets                                     2,727          26      -2,701
STAT.consistent gets from cache                          2,727          26      -2,701
STAT.consistent gets from cache (fastpath)               2,720          18      -2,702
STAT.sorts (rows)                                       36,207      72,512      36,305
STAT.table scan rows gotten                             90,200         100     -90,100
 
Run1 latches totaal versus run2 -- verschil en percentage
      Run1      Run2  Verschil     Pct
     8,692     5,907    -2,785 147.15%

So alternative 2 is faster wall clock wise, but is less scalabe than alternative 5.

Now comparing 5 and 6:

rwijk@ORA11G> exec runstats_pkg.rs_stop(50)
Run1 draaide in 73 hsecs
Run2 draaide in 89 hsecs
Run1 draaide in 82,02% van de tijd
 
Naam                                                      Run1        Run2    Verschil
STAT.undo change vector size                             2,868       2,804         -64
LATCH.SQL memory manager workarea list latch                74           9         -65
STAT.bytes received via SQL*Net from client             24,549      24,484         -65
LATCH.enqueues                                               2          76          74
LATCH.enqueue hash chains                                    3          78          75
STAT.redo size                                           3,876       3,784         -92
STAT.workarea executions - optimal                         107           6        -101
STAT.sorts (memory)                                        102           1        -101
STAT.sorts (rows)                                       62,756      31,328     -31,428
 
Run1 latches totaal versus run2 -- verschil en percentage
      Run1      Run2  Verschil     Pct
     4,700     4,811       111  97.69%

Conclusion is that Adrian's multiset solution is the close winner. The cartesian product solution will appeal more when the difference in intervals is small and the percentage of filtered rows is small. The model clause solution only wins in readability I guess, although some of you probably won't agree :-)

About Oracle

Pages