Saturday, July 19, 2008

Scalability of dbms_lock.request

For some time now, I have recommended to use dbms_lock.request when implementing business rules correctly. In short, the idea behind it is to serialize access when validating some entity rule when the rows belong to the same parent record. For example, think of a business rule that checks for an overlap of periods. An alternative for dbms_lock.request is to physically lock the parent row. But sometimes there is no real parent row to lock, so you can resort to the dbms_lock.request procedure that allows you to lock a number. Your only remaining challenge then is to determine which rows should map to the same number.

A few weeks ago, a colleague was implementing my suggestion and noted performance problems when doing a single update statement that updated a lot of rows. The performance problems disappeared when commenting out the lock procedure. I have used this method several times now and never experienced problems with it. Probably because I never processed real large sets with it. Also, when doing row-by-row processing on a large set, the process itself is slow and the fact that dbms_lock does not scale, is masked by the inefficiency of row-by-row-processing. So, to show what I mean, I will process a large set with a single SQL statement. For each row, a trigger locks based on the parent id. First things first, here are the tables:

rwijk@ORA11G> create table parent (id,name)
2 as
3 select level
4 , 'name' || to_char(level)
5 from dual
6 connect by level <= 100000
7 /

Tabel is aangemaakt.

rwijk@ORA11G> alter table parent add primary key (id)
2 /

Tabel is gewijzigd.

rwijk@ORA11G> create table child (id,parent_id,startdate,enddate)
2 as
3 select level
4 , ceil(level/10)
5 , date '1999-01-01' + numtoyminterval(mod(level,10),'year')
6 , date '1999-12-31' + numtoyminterval(mod(level,10),'year')
7 from dual
8 connect by level <= 1000000
9 /

Tabel is aangemaakt.

rwijk@ORA11G> alter table child add primary key (id)
2 /

Tabel is gewijzigd.

rwijk@ORA11G> alter table child add foreign key (parent_id) references parent(id)
2 /

Tabel is gewijzigd.

rwijk@ORA11G> create index i1 on child(parent_id)
2 /

Index is aangemaakt.

rwijk@ORA11G> exec dbms_stats.gather_table_stats(user,'parent',cascade=>true)

PL/SQL-procedure is geslaagd.

rwijk@ORA11G> exec dbms_stats.gather_table_stats(user,'child',cascade=>true)

PL/SQL-procedure is geslaagd.

rwijk@ORA11G> select * from child where id between 1 and 12 order by id
2 /

ID PARENT_ID STARTDATE ENDDATE
---------- ---------- ------------------- -------------------
1 1 01-01-2000 00:00:00 31-12-2000 00:00:00
2 1 01-01-2001 00:00:00 31-12-2001 00:00:00
3 1 01-01-2002 00:00:00 31-12-2002 00:00:00
4 1 01-01-2003 00:00:00 31-12-2003 00:00:00
5 1 01-01-2004 00:00:00 31-12-2004 00:00:00
6 1 01-01-2005 00:00:00 31-12-2005 00:00:00
7 1 01-01-2006 00:00:00 31-12-2006 00:00:00
8 1 01-01-2007 00:00:00 31-12-2007 00:00:00
9 1 01-01-2008 00:00:00 31-12-2008 00:00:00
10 1 01-01-1999 00:00:00 31-12-1999 00:00:00
11 2 01-01-2000 00:00:00 31-12-2000 00:00:00
12 2 01-01-2001 00:00:00 31-12-2001 00:00:00

12 rijen zijn geselecteerd.

So 100,000 parent records, each with 10 child records resulting in a total of 1,000,000 child records. Next, create a package with two variants of serializing at the parent level, one by using dbms_lock.request and the other by the parent record. The package also includes a procedure that prints the elapsed time and some text:

rwijk@ORA11G> create package mypkg
2 as
3 procedure lck (p_parent_id in child.parent_id%type);
4 procedure lck_real_parent (p_parent_id in child.parent_id%type);
5 procedure print_time_and_text (p_text in varchar2);
6 procedure reset_timestamp;
7 end;
8 /

Package is aangemaakt.

rwijk@ORA11G> create package body mypkg
2 as
3 l_previous_timestamp timestamp
4 ;
5 procedure lck (p_parent_id in child.parent_id%type)
6 is
7 l_lock_dummy integer;
8 begin
9 l_lock_dummy := dbms_lock.request
10 ( release_on_commit => true
11 , id => dbms_utility.get_hash_value
12 ( name => 'child' || to_char(p_parent_id)
13 , base => 1
14 , hash_size => power(2,30)
15 )
16 );
17 end lck
18 ;
19 procedure lck_real_parent (p_parent_id in child.parent_id%type)
20 is
21 l_dummy parent.id%type;
22 begin
23 select id
24 into l_dummy
25 from parent
26 where id = p_parent_id
27 for update of id
28 ;
29 end lck_real_parent
30 ;
31 procedure print_time_and_text (p_text in varchar2)
32 is
33 cn_new_timestamp timestamp := systimestamp;
34 begin
35 dbms_output.put_line
36 ( to_char(cn_new_timestamp,'hh24:mi:ss.ff') ||
37 ': ' ||
38 p_text || ' ' ||
39 (cn_new_timestamp - l_previous_timestamp)
40 );
41 l_previous_timestamp := cn_new_timestamp
42 ;
43 end print_time_and_text
44 ;
45 procedure reset_timestamp
46 is
47 begin
48 l_previous_timestamp := null;
49 end reset_timestamp
50 ;
51 end mypkg;
52 /

Package-body is aangemaakt.

Now I will run three variants:
1) without locking, to know how much time the statement itself takes
2) including locking of the parent, implemented with a dbms_lock.request
3) including locking of the parent, by locking the real parent record
For each variant one single update statement updates 1,000,000 rows. The elapsed time is reported after each 50,000 records.

rwijk@ORA11G> create trigger mytrg
2 before update on child
3 for each row
4 begin
5 if mod(:new.id,50000) = 0
6 then
7 mypkg.print_time_and_text(lpad(:new.id,7));
8 end if;
9 end;
10 /

Trigger is aangemaakt.

rwijk@ORA11G> begin
2 mypkg.print_time_and_text(' start');
3 update child
4 set enddate = enddate - 1
5 ;
6 end;
7 /
06:42:33.671000000: start
06:42:35.843000000: 50000 +000000000 00:00:02.172000000
06:42:38.671000000: 100000 +000000000 00:00:02.828000000
06:42:40.828000000: 150000 +000000000 00:00:02.157000000
06:42:43.531000000: 200000 +000000000 00:00:02.703000000
06:42:46.656000000: 250000 +000000000 00:00:03.125000000
06:42:49.531000000: 300000 +000000000 00:00:02.875000000
06:42:51.609000000: 350000 +000000000 00:00:02.078000000
06:42:54.437000000: 400000 +000000000 00:00:02.828000000
06:42:56.796000000: 450000 +000000000 00:00:02.359000000
06:42:59.453000000: 500000 +000000000 00:00:02.657000000
06:43:02.031000000: 550000 +000000000 00:00:02.578000000
06:43:05.109000000: 600000 +000000000 00:00:03.078000000
06:43:07.703000000: 650000 +000000000 00:00:02.594000000
06:43:10.625000000: 700000 +000000000 00:00:02.922000000
06:43:13.203000000: 750000 +000000000 00:00:02.578000000
06:43:15.781000000: 800000 +000000000 00:00:02.578000000
06:43:18.234000000: 850000 +000000000 00:00:02.453000000
06:43:21.156000000: 900000 +000000000 00:00:02.922000000
06:43:23.953000000: 950000 +000000000 00:00:02.797000000
06:43:26.703000000: 1000000 +000000000 00:00:02.750000000

PL/SQL-procedure is geslaagd.

rwijk@ORA11G> rollback
2 /

Rollback is voltooid.

So updating 1,000,000 rows like this takes between 2,0 seconds and 3,1 seconds. Let's round it to 2,5 seconds per 50,000 rows. Now let's include the lck procedure with dbms_lock.request:

rwijk@ORA11G> exec mypkg.reset_timestamp

PL/SQL-procedure is geslaagd.

rwijk@ORA11G> create or replace trigger mytrg
2 before update on child
3 for each row
4 begin
5 mypkg.lck(:new.parent_id)
6 ;
7 if mod(:new.id,50000) = 0
8 then
9 mypkg.print_time_and_text(lpad(:new.id,7));
10 end if;
11 end;
12 /

Trigger is aangemaakt.

rwijk@ORA11G> begin
2 mypkg.print_time_and_text(' start');
3 update child
4 set enddate = enddate - 1
5 ;
6 end;
7 /
06:45:12.156000000: start
06:45:15.375000000: 50000 +000000000 00:00:03.219000000
06:45:21.296000000: 100000 +000000000 00:00:05.921000000
06:45:31.375000000: 150000 +000000000 00:00:10.079000000
06:45:45.093000000: 200000 +000000000 00:00:13.718000000
06:46:01.515000000: 250000 +000000000 00:00:16.422000000
06:46:21.234000000: 300000 +000000000 00:00:19.719000000
06:46:43.609000000: 350000 +000000000 00:00:22.375000000
06:47:08.796000000: 400000 +000000000 00:00:25.187000000
06:47:36.531000000: 450000 +000000000 00:00:27.735000000
06:48:06.828000000: 500000 +000000000 00:00:30.297000000
06:48:40.546000000: 550000 +000000000 00:00:33.718000000
06:49:16.875000000: 600000 +000000000 00:00:36.329000000
06:49:56.328000000: 650000 +000000000 00:00:39.453000000
06:50:38.109000000: 700000 +000000000 00:00:41.781000000
06:51:23.734000000: 750000 +000000000 00:00:45.625000000
06:52:11.093000000: 800000 +000000000 00:00:47.359000000
06:53:02.765000000: 850000 +000000000 00:00:51.672000000
06:53:56.796000000: 900000 +000000000 00:00:54.031000000
06:54:52.265000000: 950000 +000000000 00:00:55.469000000
06:55:50.218000000: 1000000 +000000000 00:00:57.953000000

PL/SQL-procedure is geslaagd.

rwijk@ORA11G> rollback
2 /

Rollback is voltooid.

A linear slowdown can be seen here. It start off pretty efficient, but for every 100,000 rows there is approximately 6 seconds slowdown. The last 50,000 rows take 1 minute!

Here is the last variant which uses the lck_real_parent procedure:

rwijk@ORA11G> exec mypkg.reset_timestamp

PL/SQL-procedure is geslaagd.

rwijk@ORA11G> create or replace trigger mytrg
2 before update on child
3 for each row
4 begin
5 mypkg.lck_real_parent(:new.parent_id)
6 ;
7 if mod(:new.id,50000) = 0
8 then
9 mypkg.print_time_and_text(lpad(:new.id,7));
10 end if;
11 end;
12 /

Trigger is aangemaakt.

rwijk@ORA11G> begin
2 mypkg.print_time_and_text(' start');
3 update child
4 set enddate = enddate - 1
5 ;
6 end;
7 /
07:07:24.781000000: start
07:07:30.171000000: 50000 +000000000 00:00:05.390000000
07:07:36.046000000: 100000 +000000000 00:00:05.875000000
07:07:41.515000000: 150000 +000000000 00:00:05.469000000
07:07:47.640000000: 200000 +000000000 00:00:06.125000000
07:07:53.234000000: 250000 +000000000 00:00:05.594000000
07:07:59.125000000: 300000 +000000000 00:00:05.891000000
07:08:05.062000000: 350000 +000000000 00:00:05.937000000
07:08:10.921000000: 400000 +000000000 00:00:05.859000000
07:08:16.703000000: 450000 +000000000 00:00:05.782000000
07:08:22.593000000: 500000 +000000000 00:00:05.890000000
07:08:28.156000000: 550000 +000000000 00:00:05.563000000
07:08:33.953000000: 600000 +000000000 00:00:05.797000000
07:08:39.703000000: 650000 +000000000 00:00:05.750000000
07:08:46.031000000: 700000 +000000000 00:00:06.328000000
07:08:51.796000000: 750000 +000000000 00:00:05.765000000
07:08:57.828000000: 800000 +000000000 00:00:06.032000000
07:09:03.453000000: 850000 +000000000 00:00:05.625000000
07:09:09.562000000: 900000 +000000000 00:00:06.109000000
07:09:15.015000000: 950000 +000000000 00:00:05.453000000
07:09:20.859000000: 1000000 +000000000 00:00:05.844000000

PL/SQL-procedure is geslaagd.


This one is less efficient for smaller sets, but at least this solution is constant and therefore far more scalable than the previous one. The point where the dbms_lock.request procedure becomes slower seems to be a bit below 100,000 rows. This may be high enough for most tables to stick with dbms_lock.request, but it's always good to know this scalability issue.

My colleague noted that this behaviour is documented, but I could not find it in the documentation about dbms_lock. Turned out he saw the following text of the package itself:

rwijk@ORA11G> select text
2 from all_source
3 where owner = 'SYS'
4 and name = 'DBMS_LOCK'
5 order by line
6 /

TEXT
----------------------------------------------------------------------------------------------

...
package dbms_lock is
...
---------------
-- LIMITATIONS
--
-- The implementation does not support large numbers of locks efficiently.
-- A few hundred locks per session should be the limit.


So it is a known fact already. Too bad I just found out about it now ...

6 comments:

  1. Does this only happend when there are a lot of locks?

    ReplyDelete
  2. Yes, that was my point. The more locks with dbms_lock.request, the longer it takes.

    ReplyDelete
  3. I'm pretty sure that Oracle documentation for this package indicates that it is not efficient for a large number of locks. I have used this package extensively to implement mutual exclusion locks for processes that need to be serialized so the number of locks is always very low and it works quite well for this purpose.

    ReplyDelete
  4. Right, I solved this by reducing the number of locks being acquired. This works well in situations with low oder medium concurrency, but you may get more spurious locks and deadlocks:

    l_lock_dummy := dbms_lock.request ( release_on_commit => true, id => dbms_utility.get_hash_value( name => 'child' || to_char(MOD(p_parent_id,100)), base => 1, hash_size => power(2,30) ) );

    100 works well, there is nearly no penalty compared to a single lock.

    ReplyDelete
  5. It should be noted that dbms_lock.request is unscalable only within a transaction or a session (depending on the value of release_on_commit).

    If you start a transaction and then execute a for-loop requesting a big amount (e.g. 10.000) of locks in release_on_commit mode you can see a linear growth of the execution time needed for every next portion of locks (just as shown above in the article). Actually there's a clear quadratic dependence of the total exec time to the number of locks: T = k*N^2.

    But when you keep the main transaction uncomitted and start a new autonomous transaction you'll get exactly the same gradual growth of execution time starting from zero. Thus the locks in the new transaction are independent (unaffected) in sense of scalability from those in the main transaction.

    The same effect can be observed if you request locks with any value of release_on_commit parameter within different sessions.

    The case when you request non-release-on-commit locks within one session but different transaction looks more complex, but its scalability is somewhat better than when you request locks within one transaction and worse than when you request them within defferent sessions.

    So when you have lots of concurrent sessions accessing not very big amount of resources you can consider dbms_lock.request as quite a scalable locking mechanism.

    P.S. Based on personal observings. I don't know how Oracle stores the information about locks in its memory, so in some case dbms_lock.request's behavior may turn out to be totally unscalable.

    ReplyDelete
  6. Thanks Victor for sharing your insights here.

    ReplyDelete