Pages

Wednesday, January 30, 2008

Date format element 'ww'

We all know what the 'ww' date format element does, right? It provides the week number such that January 1 is always in week 1. The 'iw' element is a little different in that January 1 can be week 53 of the previous year if it is a Friday, Saturday or Sunday.

Today, my boss - who has worked with Oracle for 18 years now - was surprised to see that Monday, January 28 2008 is week 4 and Tuesday, January 29 2008 is week 5. He asked me to have a look and I was surprised too. My boss and me both seem to have made the same false assumption for all these years. It is the assumption that a week runs from Monday until Sunday, and that all days in between have the same week number. In our assumption week 1 of 2008 contains 6 days from Tuesday January 1 2008 until Sunday January 6 2008. And week 2 would again be "normal" and runs from Monday January 7 2008 and Sunday January 13 2008.

So I ran a few queries like this to verify this assumption:

rwijk@ORA11G> alter session set nls_date_language = 'american'
2 /

Sessie is gewijzigd.

rwijk@ORA11G> select to_char
2 ( date '2003-12-20' + level
3 , 'day yyyy-mm-dd yyyy ww iyyy iw'
4 )
5 from dual
6 connect by level <= 42
7 /

TO_CHAR(DATE'2003-12-20'+LEVEL,'DAYYYY
--------------------------------------
sunday 2003-12-21 2003 51 2003 51
monday 2003-12-22 2003 51 2003 52
tuesday 2003-12-23 2003 51 2003 52
wednesday 2003-12-24 2003 52 2003 52
thursday 2003-12-25 2003 52 2003 52
friday 2003-12-26 2003 52 2003 52
saturday 2003-12-27 2003 52 2003 52
sunday 2003-12-28 2003 52 2003 52
monday 2003-12-29 2003 52 2004 01
tuesday 2003-12-30 2003 52 2004 01
wednesday 2003-12-31 2003 53 2004 01
thursday 2004-01-01 2004 01 2004 01
friday 2004-01-02 2004 01 2004 01
saturday 2004-01-03 2004 01 2004 01
sunday 2004-01-04 2004 01 2004 01
monday 2004-01-05 2004 01 2004 02
tuesday 2004-01-06 2004 01 2004 02
wednesday 2004-01-07 2004 01 2004 02
thursday 2004-01-08 2004 02 2004 02
friday 2004-01-09 2004 02 2004 02
saturday 2004-01-10 2004 02 2004 02
sunday 2004-01-11 2004 02 2004 02
monday 2004-01-12 2004 02 2004 03
tuesday 2004-01-13 2004 02 2004 03
wednesday 2004-01-14 2004 02 2004 03
thursday 2004-01-15 2004 03 2004 03
friday 2004-01-16 2004 03 2004 03
saturday 2004-01-17 2004 03 2004 03
sunday 2004-01-18 2004 03 2004 03
monday 2004-01-19 2004 03 2004 04
tuesday 2004-01-20 2004 03 2004 04
wednesday 2004-01-21 2004 03 2004 04
thursday 2004-01-22 2004 04 2004 04
friday 2004-01-23 2004 04 2004 04
saturday 2004-01-24 2004 04 2004 04
sunday 2004-01-25 2004 04 2004 04
monday 2004-01-26 2004 04 2004 05
tuesday 2004-01-27 2004 04 2004 05
wednesday 2004-01-28 2004 04 2004 05
thursday 2004-01-29 2004 05 2004 05
friday 2004-01-30 2004 05 2004 05
saturday 2004-01-31 2004 05 2004 05

42 rijen zijn geselecteerd.

rwijk@ORA11G> select to_char
2 ( date '2007-12-20' + level
3 , 'day yyyy-mm-dd yyyy ww iyyy iw'
4 )
5 from dual
6 connect by level <= 42
7 /

TO_CHAR(DATE'2007-12-20'+LEVEL,'DAYYYY
--------------------------------------
friday 2007-12-21 2007 51 2007 51
saturday 2007-12-22 2007 51 2007 51
sunday 2007-12-23 2007 51 2007 51
monday 2007-12-24 2007 52 2007 52
tuesday 2007-12-25 2007 52 2007 52
wednesday 2007-12-26 2007 52 2007 52
thursday 2007-12-27 2007 52 2007 52
friday 2007-12-28 2007 52 2007 52
saturday 2007-12-29 2007 52 2007 52
sunday 2007-12-30 2007 52 2007 52
monday 2007-12-31 2007 53 2008 01
tuesday 2008-01-01 2008 01 2008 01
wednesday 2008-01-02 2008 01 2008 01
thursday 2008-01-03 2008 01 2008 01
friday 2008-01-04 2008 01 2008 01
saturday 2008-01-05 2008 01 2008 01
sunday 2008-01-06 2008 01 2008 01
monday 2008-01-07 2008 01 2008 02
tuesday 2008-01-08 2008 02 2008 02
wednesday 2008-01-09 2008 02 2008 02
thursday 2008-01-10 2008 02 2008 02
friday 2008-01-11 2008 02 2008 02
saturday 2008-01-12 2008 02 2008 02
sunday 2008-01-13 2008 02 2008 02
monday 2008-01-14 2008 02 2008 03
tuesday 2008-01-15 2008 03 2008 03
wednesday 2008-01-16 2008 03 2008 03
thursday 2008-01-17 2008 03 2008 03
friday 2008-01-18 2008 03 2008 03
saturday 2008-01-19 2008 03 2008 03
sunday 2008-01-20 2008 03 2008 03
monday 2008-01-21 2008 03 2008 04
tuesday 2008-01-22 2008 04 2008 04
wednesday 2008-01-23 2008 04 2008 04
thursday 2008-01-24 2008 04 2008 04
friday 2008-01-25 2008 04 2008 04
saturday 2008-01-26 2008 04 2008 04
sunday 2008-01-27 2008 04 2008 04
monday 2008-01-28 2008 04 2008 05
tuesday 2008-01-29 2008 05 2008 05
wednesday 2008-01-30 2008 05 2008 05
thursday 2008-01-31 2008 05 2008 05

42 rijen zijn geselecteerd.


So week 1, according to the 'ww' date format element, always runs from January 1 until January 7. And in 2004, a week ran from Thursday until Wednesday and in 2008 it runs from Tuesday until Monday.

As said, this was not what I expected. Also, in 2007, where January 1 was a Monday, the behaviour described above was hidden.

Yet, the documentation is very clear about this subject, as can be seen by these quotes:

The week numbers returned by the WW format mask are calculated according to the following algorithm: int(dayOfYear+6)/7. This algorithm does not follow the ISO standard (2015, 1992-06-15).


WW No Week of year (1-53) where week 1 starts on the first day of the year and continues to the seventh day of the year.


Which again shows that in our business one should never assume, but always read the documentation and verify by testing.

Saturday, January 26, 2008

Sequence within parent

The sequence within parent column is the column in a detail table that, together with the foreign key column to the master table, constitutes a unique key of the detail table. And sometimes even the primary key if you are not into surrogate keys. A small example of a sequence within parent:

rwijk@ORA11G> create table orders
2 ( id number(10) not null primary key
3 , customer_id number(10) not null
4 , order_date date not null
5 )
6 /

Tabel is aangemaakt.

rwijk@ORA11G> create table order_lines
2 ( order_id number(10) not null references orders(id)
3 , sequence_within_parent number(6) not null
4 , product_id number(10) not null
5 , quantity number(6) not null
6 , price_per_piece number(10,2) not null
7 , primary key (order_id,sequence_within_parent)
8 )
9 /

Tabel is aangemaakt.


The purpose is to let the sequence within parent column start with the value 1 for each order_id and increase it by 1 for each subsequent order_line within the same order. In the days of Designer/2000 you could mark a column with a so-called "Autogen Type" as a sequence within parent, and the Oracle*Forms Generator of Designer would conveniently generate a Pre-Insert trigger at the block level containing a call to a function containing this piece of code:

DECLARE
CURSOR C IS
SELECT nvl(max(SEQUENCE_WITHIN_PARENT),0) + 1
FROM ORDER_LINES OLE
WHERE OLE.ORDER_ID = P_ORDER_ID;
BEGIN
OPEN C;
FETCH C INTO P_SEQUENCE_WITHIN_PARENT;
CLOSE C;
END;


But the problem with this code is that it won't function in a multi user environment, because if this code is executed in two different sessions for the same order_id, both sessions would get the same sequence_within_parent. One would succeed and the other would fail because of the unique index on (order_id,sequence_within_parent). Also, and probably more important, the sequence within parent makes it necessary to execute a select statement to be able to insert a new row. Which harms performance.

So why use the sequence within parent? Are you really going to query the third order_line for order_id 1? I doubt it. In my experience, the sequence within parent is often lightheartedly introduced by the designer "just to be able to have an easy unique identifier". The only case I can see where this column is useful, is when it is important to be able to quickly search for the n-th order_line, and when the total number of detail records within one master record can be a big number. And as said, this case is very rare.

Suppose the tables above are filled like this:

rwijk@ORA11G> insert into orders
2 select 1, 1234, date '2008-01-01' from dual union all
3 select 2, 5678, date '2008-01-20' from dual union all
4 select 3, 1234, date '2008-02-01' from dual
5 /

3 rijen zijn aangemaakt.

rwijk@ORA11G> insert into order_lines
2 select 1, 1, 987, 10, 25 from dual union all
3 select 1, 2, 876, 1, 500 from dual union all
4 select 1, 3, 765, 50, 2 from dual union all
5 select 2, 1, 654, 1, 750 from dual union all
6 select 3, 1, 99, 1, 22.95 from dual union all
7 select 3, 2, 98, 1, 67.99 from dual union all
8 select 3, 3, 97, 5, 9.99 from dual union all
9 select 3, 4, 96, 1, 19.99 from dual union all
10 select 3, 5, 95, 10, 24.95 from dual
11 /

9 rijen zijn aangemaakt.


An insert, placed in an api of course, would be something like this:

rwijk@ORA11G> insert into order_lines
2 ( order_id
3 , sequence_within_parent
4 , product_id
5 , quantity
6 , price_per_piece
7 )
8 values
9 ( 3
10 , (select nvl(max(sequence_within_parent),0)+1 from order_lines where order_id=3)
11 , 90
12 , 1
13 , 4.99
14 )
15 /

1 rij is aangemaakt.

rwijk@ORA11G> select *
2 from order_lines
3 where order_id = 3
4 /

ORDER_ID SEQUENCE_WITHIN_PARENT PRODUCT_ID QUANTITY PRICE_PER_PIECE
---------- ---------------------- ---------- ---------- ---------------
3 1 99 1 22,95
3 2 98 1 67,99
3 3 97 5 9,99
3 4 96 1 19,99
3 5 95 10 24,95
3 6 90 1 4,99

6 rijen zijn geselecteerd.


The alternative is to use a sequence based id column and have a unique key - the unique identifier in ERD terms - on (order_id,product_id). Like this:

rwijk@ORA11G> drop table order_lines
2 /

Tabel is verwijderd.

rwijk@ORA11G> create table order_lines
2 ( id number(10) not null primary key
3 , order_id number(10) not null references orders(id)
4 , product_id number(10) not null
5 , quantity number(6) not null
6 , price_per_piece number(10,2) not null
7 , unique (order_id,product_id)
8 )
9 /

Tabel is aangemaakt.

rwijk@ORA11G> insert into order_lines
2 select 1, 1, 987, 10, 25 from dual union all
3 select 2, 1, 876, 1, 500 from dual union all
4 select 3, 1, 765, 50, 2 from dual union all
5 select 4, 2, 654, 1, 750 from dual union all
6 select 5, 3, 99, 1, 22.95 from dual union all
7 select 6, 3, 98, 1, 67.99 from dual union all
8 select 7, 3, 97, 5, 9.99 from dual union all
9 select 8, 3, 96, 1, 19.99 from dual union all
10 select 9, 3, 95, 10, 24.95 from dual
11 /

9 rijen zijn aangemaakt.

rwijk@ORA11G> create sequence order_lines_seq start with 10 increment by 1
2 /

Reeks is aangemaakt.


The insert now becomes much easier:

rwijk@ORA11G> insert into order_lines
2 ( id
3 , order_id
4 , product_id
5 , quantity
6 , price_per_piece
7 )
8 values
9 ( order_lines_seq.nextval
10 , 3
11 , 90
12 , 1
13 , 4.99
14 )
15 /

1 rij is aangemaakt.


It is simple, scalable and always works in a multi-user environment.
And in case the end user, REALLY likes to see a "sequence within parent" column in his gui, then you just define a view like this:

rwijk@ORA11G> create view v_order_lines
2 as
3 select order_id
4 , row_number() over (partition by order_id order by id) sequence_within_parent
5 , product_id
6 , quantity
7 , price_per_piece
8 from order_lines
9 /

View is aangemaakt.

rwijk@ORA11G> select *
2 from v_order_lines
3 where order_id = 3
4 /

ORDER_ID SEQUENCE_WITHIN_PARENT PRODUCT_ID QUANTITY PRICE_PER_PIECE
---------- ---------------------- ---------- ---------- ---------------
3 1 99 1 22,95
3 2 98 1 67,99
3 3 97 5 9,99
3 4 96 1 19,99
3 5 95 10 24,95
3 6 90 1 4,99

6 rijen zijn geselecteerd.

Thursday, January 17, 2008

Flexible attributes and performance

The client I currently work for has quite a few special tables containing flexible attributes. These tables have one foreign key to another table, the base table. The attributes in the table containing the flexibal attributes conceptually belong to the base table, but their attribute values can change over time - the table contains a startdate and an enddate. And it is important that one can see the value of those attributes at any given time.

Today I was working on optimizing a large query that involved (among many others...) a base table of 4M rows, and a flexible attributes table of 40M rows containing approximately 20 flexible attributes. The query needed the current value of 7 of those flexible attributes. I did some testing about what would be the best approach to query all those values. And because I think it is a case that is certainly not rare, it is now here on my blog.

Here is a test case to simulate the situation:

rwijk@ORA11G> create table base (id, description)
2 as
3 select level
4 , 'description' || to_char(level)
5 from dual
6 connect by level <= 10000
7 /

Tabel is aangemaakt.

rwijk@ORA11G> create table flexattributes_of_base
2 (id,attribute,base_id,value,startdate,enddate)
3 as
4 select level
5 , 'attribute' || to_char(mod(level-1,20))
6 , ceil(level/40)
7 , to_char(trunc(dbms_random.value(1,11)))
8 , case trunc(mod(level-1,40)/20)
9 when 0 then date '2007-01-01'
10 else date '2008-01-01'
11 end
12 , case trunc(mod(level-1,40)/20)
13 when 0 then date '2008-01-01'
14 end
15 from dual
16 connect by level <= 400000
17 /

Tabel is aangemaakt.

rwijk@ORA11G> remark
rwijk@ORA11G> remark Insert attribute20 with only non-active values
rwijk@ORA11G> remark
rwijk@ORA11G> insert into flexattributes_of_base
2 select 400000 + level
3 , 'attribute20'
4 , level
5 , 'a'
6 , date '2007-01-01'
7 , date '2008-01-01'
8 from dual
9 connect by level <= 10000
10 /

10000 rijen zijn aangemaakt.

rwijk@ORA11G> remark
rwijk@ORA11G> remark Insert attribute21 with more than one active value per base_id,
rwijk@ORA11G> remark which is garbage of course,
rwijk@ORA11G> remark
rwijk@ORA11G> insert into flexattributes_of_base
2 select 410000 + level
3 , 'attribute21'
4 , mod(level-1,10000) + 1
5 , 'b'
6 , case
7 when level <= 10000 then date '2007-01-01'
8 else date '2008-01-01'
9 end
10 , null
11 from dual
12 connect by level <= 20000
13 /

20000 rijen zijn aangemaakt.

rwijk@ORA11G> remark
rwijk@ORA11G> remark Insert attribute22 only for base_id 1
rwijk@ORA11G> remark
rwijk@ORA11G> insert into flexattributes_of_base
2 values
3 ( 430001
4 , 'attribute22'
5 , 1
6 , 'c'
7 , date '2007-01-01'
8 , null
9 )
10 /

1 rij is aangemaakt.

rwijk@ORA11G> alter table base add constraint base_pk primary key (id)
2 /

Tabel is gewijzigd.

rwijk@ORA11G> alter table flexattributes_of_base
2 add constraint flex_pk primary key (id)
3 /

Tabel is gewijzigd.

rwijk@ORA11G> alter table flexattributes_of_base
2 add constraint flex_uk1 unique (attribute,base_id,startdate)
3 /

Tabel is gewijzigd.

rwijk@ORA11G> exec dbms_stats.gather_table_stats(user,'base',cascade=>true)

PL/SQL-procedure is geslaagd.

rwijk@ORA11G> begin
2 dbms_stats.gather_table_stats
3 ( user
4 , 'flexattributes_of_base'
5 , cascade=>true
6 , method_opt=>'FOR ALL INDEXED COLUMNS'
7 );
8 end;
9 /

PL/SQL-procedure is geslaagd.


So in this testcase, the base table contains 10,000 rows, and the flex_attributes_of_base table contains 430,001 rows. There are 23 flexible attributes ('attribute0' until 'attribute22'). Every record in the base table has two records for each of the first 20 attributes (0-19). One for 2007-01-01 until 2008-01-01 and one for 2008-01-01 and upwards. The next three attributes are slightly different - you may even call it buggy - just to see if the queries output the right data in case:

- an attribute does not have a current value because earlier records all have an enddate (attribute20)

- an attribute has more than one current value, in which case the query should display the one with the most recent startdate (attribute21)

- an attribute that only base_id 1 has a value for (attribute22)

Let's issue a very simple but typical query against the two tables to retrieve the base record with id 1234 and its current value for attribute 11:

rwijk@ORA11G> select b.id
2 , b.description
3 , f11.value
4 from base b
5 , flexattributes_of_base f11
6 where b.id = 1234
7 and b.id = f11.base_id (+)
8 and f11.attribute (+) = 'attribute11'
9 and f11.enddate is null
10 and not exists
11 ( select 'more recent startdate'
12 from flexattributes_of_base f11_2
13 where f11_2.attribute = f11.attribute
14 and f11_2.base_id = f11.base_id
15 and f11_2.startdate > f11.startdate
16 )
17 /

ID DESCRIPTION VALUE
-------- --------------------------------------------------- -----
1234 description1234 3

1 rij is geselecteerd.


Note that the exists-subquery is necessary in case of attribute21, when more than one row exists with an empty enddate.

And if you also want the current value of attribute22, you just repeat the same pattern:

rwijk@ORA11G> select b.id
2 , b.description
3 , f11.value
4 , f22.value
5 from base b
6 , flexattributes_of_base f11
7 , flexattributes_of_base f22
8 where b.id = 1234
9 and b.id = f11.base_id (+)
10 and b.id = f22.base_id (+)
11 and f11.attribute (+) = 'attribute11'
12 and f22.attribute (+) = 'attribute22'
13 and f11.enddate is null
14 and f22.enddate is null
15 and not exists
16 ( select 'more recent startdate'
17 from flexattributes_of_base f11_2
18 where f11_2.attribute = f11.attribute
19 and f11_2.base_id = f11.base_id
20 and f11_2.startdate > f11.startdate
21 )
22 and not exists
23 ( select 'more recent startdate'
24 from flexattributes_of_base f22_2
25 where f22_2.attribute = f22.attribute
26 and f22_2.base_id = f22.base_id
27 and f22_2.startdate > f22.startdate
28 )
29 /

ID DESCRIPTION VALUE VALUE
-------- --------------------------------------------------- ----- -----
1234 description1234 3

1 rij is geselecteerd.


As said these queries are quite standard, so when they had to query the entire base table with 7 attributes (i'll use 5 below), they did it like this initially:

rwijk@ORA11G> select b.id
2 , f19.value f19_value
3 , f20.value f20_value
4 , f21.value f21_value
5 , f22.value f22_value
6 , f23.value f23_value
7 from base b
8 , flexattributes_of_base f19
9 , flexattributes_of_base f20
10 , flexattributes_of_base f21
11 , flexattributes_of_base f22
12 , flexattributes_of_base f23
13 where b.id = f19.base_id (+)
14 and b.id = f20.base_id (+)
15 and b.id = f21.base_id (+)
16 and b.id = f22.base_id (+)
17 and b.id = f23.base_id (+)
18 and f19.attribute (+) = 'attribute19'
19 and f20.attribute (+) = 'attribute20'
20 and f21.attribute (+) = 'attribute21'
21 and f22.attribute (+) = 'attribute22'
22 and f23.attribute (+) = 'attribute23'
23 and f19.enddate (+) is null
24 and f20.enddate (+) is null
25 and f21.enddate (+) is null
26 and f22.enddate (+) is null
27 and f23.enddate (+) is null
28 and not exists
29 ( select 'more recent startdate'
30 from flexattributes_of_base f19_2
31 where f19_2.attribute = f19.attribute
32 and f19_2.base_id = f19.base_id
33 and f19_2.startdate > f19.startdate
34 )
35 and not exists
36 ( select 'more recent startdate'
37 from flexattributes_of_base f20_2
38 where f20_2.attribute = f20.attribute
39 and f20_2.base_id = f20.base_id
40 and f20_2.startdate > f20.startdate
41 )
42 and not exists
43 ( select 'more recent startdate'
44 from flexattributes_of_base f21_2
45 where f21_2.attribute = f21.attribute
46 and f21_2.base_id = f21.base_id
47 and f21_2.startdate > f21.startdate
48 )
49 and not exists
50 ( select 'more recent startdate'
51 from flexattributes_of_base f22_2
52 where f22_2.attribute = f22.attribute
53 and f22_2.base_id = f22.base_id
54 and f22_2.startdate > f22.startdate
55 )
56 and not exists
57 ( select 'more recent startdate'
58 from flexattributes_of_base f23_2
59 where f23_2.attribute = f23.attribute
60 and f23_2.base_id = f23.base_id
61 and f23_2.startdate > f23.startdate
62 )
63 order by b.id
64 /


But here the same table/index is accessed 10 times. One developer noticed that and wrote another query to address this problem. He was able to get rid of the exists-subquery using the row_number analytic function, like this:

rwijk@ORA11G> select b.id
2 , f19.value f19_value
3 , f20.value f20_value
4 , f21.value f21_value
5 , f22.value f22_value
6 , f23.value f23_value
7 from base b
8 , ( select base_id
9 , value
10 , row_number() over (partition by base_id order by startdate desc) rn
11 from flexattributes_of_base
12 where attribute = 'attribute19'
13 and enddate is null
14 ) f19
15 , ( select base_id
16 , value
17 , row_number() over (partition by base_id order by startdate desc) rn
18 from flexattributes_of_base
19 where attribute = 'attribute20'
20 and enddate is null
21 ) f20
22 , ( select base_id
23 , value
24 , row_number() over (partition by base_id order by startdate desc) rn
25 from flexattributes_of_base
26 where attribute = 'attribute21'
27 and enddate is null
28 ) f21
29 , ( select base_id
30 , value
31 , row_number() over (partition by base_id order by startdate desc) rn
32 from flexattributes_of_base
33 where attribute = 'attribute22'
34 and enddate is null
35 ) f22
36 , ( select base_id
37 , value
38 , enddate
39 , row_number() over (partition by base_id order by startdate desc) rn
40 from flexattributes_of_base
41 where attribute = 'attribute23'
42 ) f23
43 where b.id = f19.base_id (+)
44 and b.id = f20.base_id (+)
45 and b.id = f21.base_id (+)
46 and b.id = f22.base_id (+)
47 and b.id = f23.base_id (+)
48 and (f19.rn = 1 or f19.rn is null)
49 and (f20.rn = 1 or f20.rn is null)
50 and (f21.rn = 1 or f21.rn is null)
51 and (f22.rn = 1 or f22.rn is null)
52 and (f23.rn = 1 or f23.rn is null)
53 order by b.id
54 /


This query is a bit better, but still accesses the table 5 times. The number of table scans was reduced to only one, by using a technique that resembles the well known pivot trick:

rwijk@ORA11G> select b.id
2 , f.f19_value
3 , f.f20_value
4 , f.f21_value
5 , f.f22_value
6 , f.f23_value
7 from base b
8 , ( select base_id
9 , max(decode(attribute,'attribute19',value))
10 keep
11 (dense_rank last
12 order by decode(attribute,'attribute19',startdate) nulls first
13 ) f19_value
14 , max(decode(attribute,'attribute20',value))
15 keep
16 (dense_rank last
17 order by decode(attribute,'attribute20',startdate) nulls first
18 ) f20_value
19 , max(decode(attribute,'attribute21',value))
20 keep
21 (dense_rank last
22 order by decode(attribute,'attribute21',startdate) nulls first
23 ) f21_value
24 , max(decode(attribute,'attribute22',value))
25 keep
26 (dense_rank last
27 order by decode(attribute,'attribute22',startdate) nulls first
28 ) f22_value
29 , max(decode(attribute,'attribute23',value))
30 keep
31 (dense_rank last
32 order by decode(attribute,'attribute23',startdate) nulls first
33 ) f23_value
34 from flexattributes_of_base
35 where attribute in ('attribute19','attribute20','attribute21','attribute22','attribute23')
36 and enddate is null
37 group by base_id
38 ) f
39 where b.id = f.base_id (+)
40 order by b.id
41 /


The tkprof output of the three queries:

Query1:

call     count       cpu    elapsed       disk      query    current        rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 1.76 1.77 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 668 1.85 5.95 9622 39430 0 10000
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 670 3.62 7.72 9622 39430 0 10000

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 81

Rows Row Source Operation
------- ---------------------------------------------------
10000 SORT ORDER BY (cr=39430 pr=9622 pw=9622 time=218 us cost=9881 size=2274972 card=7738)
10000 HASH JOIN ANTI (cr=39430 pr=9622 pw=9622 time=217 us cost=9389 size=2274972 card=7738)
20000 HASH JOIN RIGHT OUTER (cr=37401 pr=9622 pw=9622 time=1172 us cost=7990 size=2199150 card=8145)
20000 TABLE ACCESS FULL FLEXATTRIBUTES_OF_BASE (cr=2392 pr=72 pw=72 time=361 us cost=669 size=35972 card=1058)
10000 HASH JOIN ANTI (cr=35009 pr=9550 pw=9550 time=198 us cost=7320 size=1922220 card=8145)
10000 HASH JOIN RIGHT OUTER (cr=32980 pr=9550 pw=9550 time=870 us cost=5939 size=1817688 card=8574)
10000 TABLE ACCESS FULL FLEXATTRIBUTES_OF_BASE (cr=2389 pr=2386 pw=2386 time=669 us cost=669 size=34374 card=1011)
10000 HASH JOIN ANTI (cr=30591 pr=7164 pw=7164 time=189 us cost=5270 size=1526172 card=8574)
10000 HASH JOIN RIGHT OUTER (cr=28562 pr=7164 pw=7164 time=830 us cost=3909 size=1389850 card=9025)
0 TABLE ACCESS FULL FLEXATTRIBUTES_OF_BASE (cr=2389 pr=2386 pw=2386 time=0 us cost=669 size=18054 card=531)
10000 HASH JOIN ANTI (cr=26173 pr=4778 pw=4778 time=180 us cost=3239 size=1083000 card=9025)
10000 HASH JOIN RIGHT OUTER (cr=24144 pr=4777 pw=4777 time=830 us cost=1900 size=912000 card=9500)
0 TABLE ACCESS FULL FLEXATTRIBUTES_OF_BASE (cr=2389 pr=2386 pw=2386 time=0 us cost=669 size=9010 card=265)
10000 HASH JOIN ANTI (cr=21755 pr=2391 pw=2391 time=175 us cost=1231 size=589000 card=9500)
10000 HASH JOIN RIGHT OUTER (cr=2414 pr=2386 pw=2386 time=979 us cost=676 size=380000 card=10000)
1 TABLE ACCESS FULL FLEXATTRIBUTES_OF_BASE (cr=2389 pr=2386 pw=2386 time=0 us cost=669 size=9010 card=265)
10000 INDEX FAST FULL SCAN BASE_PK (cr=25 pr=0 pw=0 time=199 us cost=7 size=40000 card=10000)(object id 72460)
430001 INDEX FAST FULL SCAN FLEX_UK1 (cr=19341 pr=5 pw=5 time=7911 us cost=551 size=10320024 card=430001)(object id 72462)
430001 INDEX FAST FULL SCAN FLEX_UK1 (cr=2029 pr=1 pw=1 time=7335 us cost=551 size=10320024 card=430001)(object id 72462)
430001 INDEX FAST FULL SCAN FLEX_UK1 (cr=2029 pr=0 pw=0 time=7321 us cost=551 size=10320024 card=430001)(object id 72462)
430001 INDEX FAST FULL SCAN FLEX_UK1 (cr=2029 pr=0 pw=0 time=7338 us cost=551 size=10320024 card=430001)(object id 72462)
430001 INDEX FAST FULL SCAN FLEX_UK1 (cr=2029 pr=0 pw=0 time=7420 us cost=551 size=10320024 card=430001)(object id 72462)


Query2:

call     count       cpu    elapsed       disk      query    current        rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.32 0.34 0 0 0 0
Execute 1 0.01 0.00 0 0 0 0
Fetch 668 2.15 2.21 2 105894 0 10000
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 670 2.50 2.55 2 105894 0 10000

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 81

Rows Row Source Operation
------- ---------------------------------------------------
10000 SORT ORDER BY (cr=105894 pr=2 pw=2 time=200 us cost=1588 size=211 card=1)
10000 FILTER (cr=105894 pr=2 pw=2 time=19897 us)
20000 NESTED LOOPS OUTER (cr=105894 pr=2 pw=2 time=34524 us cost=1587 size=211 card=1)
10000 FILTER (cr=65792 pr=2 pw=2 time=12472 us)
10000 NESTED LOOPS OUTER (cr=65792 pr=2 pw=2 time=11293 us cost=1582 size=172 card=1)
10000 FILTER (cr=34710 pr=2 pw=2 time=5387 us)
10000 NESTED LOOPS OUTER (cr=34710 pr=2 pw=2 time=4959 us cost=1572 size=266 card=2)
10000 FILTER (cr=4809 pr=2 pw=2 time=570 us)
10000 HASH JOIN OUTER (cr=4809 pr=2 pw=2 time=210 us cost=1397 size=3290 card=35)
10000 FILTER (cr=2417 pr=2 pw=2 time=1142 us)
10000 HASH JOIN RIGHT OUTER (cr=2417 pr=2 pw=2 time=821 us cost=677 size=32725 card=595)
1 VIEW (cr=2392 pr=2 pw=2 time=0 us cost=670 size=10335 card=265)
1 WINDOW SORT (cr=2392 pr=2 pw=2 time=0 us cost=670 size=9010 card=265)
1 TABLE ACCESS FULL FLEXATTRIBUTES_OF_BASE (cr=2392 pr=2 pw=2 time=0 us cost=669 size=9010 card=265)
10000 INDEX FAST FULL SCAN BASE_PK (cr=25 pr=0 pw=0 time=179 us cost=7 size=160000 card=10000)(object id 72460)
0 VIEW (cr=2392 pr=0 pw=0 time=0 us cost=719 size=207012 card=5308)
0 WINDOW SORT (cr=2392 pr=0 pw=0 time=0 us cost=719 size=180472 card=5308)
0 TABLE ACCESS FULL FLEXATTRIBUTES_OF_BASE (cr=2392 pr=0 pw=0 time=0 us cost=667 size=180472 card=5308)
0 VIEW PUSHED PREDICATE (cr=29901 pr=0 pw=0 time=0 us cost=5 size=39 card=1)
0 WINDOW SORT (cr=29901 pr=0 pw=0 time=0 us cost=5 size=34 card=1)
0 TABLE ACCESS BY INDEX ROWID FLEXATTRIBUTES_OF_BASE (cr=29901 pr=0 pw=0 time=0 us cost=4 size=34 card=1)
10000 INDEX RANGE SCAN FLEX_UK1 (cr=20053 pr=0 pw=0 time=0 us cost=3 size=0 card=1)(object id 72462)
10000 VIEW PUSHED PREDICATE (cr=31082 pr=0 pw=0 time=0 us cost=5 size=39 card=1)
10000 WINDOW SORT (cr=31082 pr=0 pw=0 time=0 us cost=5 size=34 card=1)
10000 TABLE ACCESS BY INDEX ROWID FLEXATTRIBUTES_OF_BASE (cr=31082 pr=0 pw=0 time=0 us cost=4 size=34 card=1)
20000 INDEX RANGE SCAN FLEX_UK1 (cr=20107 pr=0 pw=0 time=25499 us cost=3 size=0 card=2)(object id 72462)
20000 VIEW PUSHED PREDICATE (cr=40102 pr=0 pw=0 time=75748 us cost=5 size=39 card=1)
20000 WINDOW SORT (cr=40102 pr=0 pw=0 time=34334 us cost=5 size=34 card=1)
20000 TABLE ACCESS BY INDEX ROWID FLEXATTRIBUTES_OF_BASE (cr=40102 pr=0 pw=0 time=93885 us cost=4 size=34 card=1)
20000 INDEX RANGE SCAN FLEX_UK1 (cr=20103 pr=0 pw=0 time=26785 us cost=3 size=0 card=2)(object id 72462)


Query 3:

call     count       cpu    elapsed       disk      query    current        rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 668 0.42 0.41 41 3076 0 10000
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 670 0.43 0.41 41 3076 0 10000

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 81

Rows Row Source Operation
------- ---------------------------------------------------
10000 MERGE JOIN OUTER (cr=3076 pr=41 pw=41 time=1922 us cost=763 size=1180000 card=10000)
10000 INDEX FULL SCAN BASE_PK (cr=684 pr=0 pw=0 time=222 us cost=21 size=40000 card=10000)(object id 72460)
10000 SORT JOIN (cr=2392 pr=41 pw=41 time=0 us cost=742 size=307344 card=2696)
10000 VIEW (cr=2392 pr=41 pw=41 time=553 us cost=670 size=307344 card=2696)
10000 SORT GROUP BY (cr=2392 pr=41 pw=41 time=225 us cost=670 size=91664 card=2696)
30001 TABLE ACCESS FULL FLEXATTRIBUTES_OF_BASE (cr=2392 pr=41 pw=41 time=1291 us cost=669 size=106420 card=3130)


The query runtime was drastically reduced by this trick. Not exactly rocket science, but very effective.

Wednesday, January 9, 2008

8 things about me

This tagging thing has reached me as well. I saw it coming to me via Eddie Awad, who tagged Laurent Schneider, who tagged Chen Shapira, who tagged me. So now it's my turn. Although my initial intention with this blog was that it would only be "About Oracle", I'll make an exception now.

So here are 8 random things about me:

1) I was born in december 1972 and have one younger sister.

2) My acting career started and ended with a performance at elementary school where I played a tree.

3) I love listening to music, especially from bands like Sigur Ros, Pink Floyd, Mogwai, Radiohead, Pixies, Godspeed You! Black Emperor, Dead Can Dance, Joy Division and many more.

4) It keeps surprising me that advertising pays off. Personally I tend to stop buying products after seeing some annoying ad. And why spamming apparently works is really beyond me.

5) I love to watch the Tour de France, the biggest road cycling event, despite all the fuzz about dope these days. I organize a little betting pool with family, friends and colleagues. The calculation and administration of this is, of course, done with Oracle and a handful of SQL*Plus scripts. Each year I promise myself to make it easier for the participants by setting up a little ApEx site, but each year I start too late and end up unfinished.

6) I have played chess from the age of 19 up until 3 years ago, and still do occasionally on the internet. I like to play gambits, especially the Budapester and King's Gambit. It led me to one local championship of Gouda, but only because the strongest player failed to play enough games that year.

7) I've been to the USA in 1998, where I drove around in a rented car through Florida for one week. Ten years later I hope to visit the USA again for Oracle Open World in San Francisco.

8) I'm married and have two sons of 2 years old and almost 1 year old.

And now it's my turn to tag (in random order):
Tonguc Yilmaz,
Volder,
Ronald Rood,
Nicolas Gasparotto,
cd and
Adrian Billington.

Thursday, January 3, 2008

One oddity explained

A few weeks ago I wrote about four oddities I encountered when doing a little research into several 11g features in this blog post. And this morning I read a very thorough article about the result cache here on oracle-developer.net, which addresses the first oddity by this sentence:

The cache is allocated directly from the shared pool but is maintained separately.


Although Adrian does not state how he came to know this, it sounds as a simple but plausible explanation of why the Concepts Manual shows the result cache as an area separate from the shared pool and why the dbms_result_cache.memory_report tells us it is memory from the shared pool.

At the end of the article, three links to Pythian blog posts are shown. In short, they are a warning that the result cache may not be as scalable as one might hope. Because I enjoyed reading those posts as well, here are the links: post 1, post 2 and post 3.