For the parallellization to work, the groups need to be balanced. If, say, all large tables are in one group, the parallellization won't help much, because the total time wouldn't be reduced much. So the question is how to distribute the tables evenly where all groups approximately have the same total size in number of bytes. Another requirement, which came later, was to also take into account the total number of tables, and distribute the number of tables evenly as well. The schema contained 90,000 tables from a standard software application, of which most tables were empty.
Ronald supplied a create table script for me to test with, simulating the content of his dba_tables dictionary view, which is the source for his query. The test table contains 14,999 records. Here is the table and a select statement that gives you an idea what's in the table.
SQL> desc mytables Name Null? Type ------------------------------------------------------ -------- -------------- TABLE_NAME VARCHAR2(30) NUM_ROWS NUMBER AVG_ROW_LEN NUMBER SQL> select rownum 2 , nvl(num_rows,0) * nvl(avg_row_len,0) bytes 3 from mytables 4 order by bytes desc 5 / ROWNUM BYTES --------------- --------------- 1 29387074200 2 28471135400 3 23974896400 4 18589737600 5 17953177900 6 17014479300 7 10880526800 8 8832810600 9 8372641700 10 7888944000 11 7527559700 12 6314082900 13 5814484500 14 5452809600 15 5194260000 16 5160845400 17 4323377800 18 4245004800 19 4226310600 20 4196381000 ... 2256 4 2257 4 2258 3 2259 3 2260 3 2261 3 2262 3 2263 2 2264 2 2265 2 2266 2 2267 2 2268 2 2269 2 2270 2 2271 2 2272 2 2273 0 2274 0 2275 0 2276 0 ... 14995 0 14996 0 14997 0 14998 0 14999 0 14999 rows selected.
So there are a few large tables, the largest one being approximately 29GB. And most of the tables, 12727, are empty.
The algorithm I chose to distribute the tables in N groups, is the simplest one I could think of, and goes schematically like this:
- order all tables by size in descending order
- place the N largest tables in groups 1 .. N
- iterate over the tables with a non-empty size in descending order and add the table to the first group encountered whose size is below the running average size of the groups
- iterate over the empty tables and add the table to the first group encountered whose total number of tables is below the average number of tables per group
Below is the SQL statement used, with lines 59-64 being different, just to show the distribution better. The original statement would just contain a "select * from t" there.
SQL> var NUMBER_OF_GROUPS number SQL> exec :NUMBER_OF_GROUPS := 10 PL/SQL procedure successfully completed. SQL> with t as 2 ( select table_name 3 , orig_bytes 4 , grp 5 from ( select table_name 6 , nvl(num_rows,0) * nvl(avg_row_len,0) bytes 7 , count(*) over () cnt 8 from mytables 9 ) 10 model 11 dimension by (row_number() over (order by bytes desc) i) 12 measures 13 ( table_name 14 , bytes 15 , bytes orig_bytes 16 , row_number() over (order by bytes desc) grp 17 , cnt 18 , sum(bytes) over (order by bytes desc) running_sum 19 , 1 aantal 20 ) 21 rules iterate (100000) until (iteration_number+1 >= cnt[1] - :AANTAL_GROEPEN) 22 ( grp[:AANTAL_GROEPEN+1+iteration_number] 23 = case 24 when bytes[:AANTAL_GROEPEN+1+iteration_number] > 0 then 25 case 26 when bytes[1] <= running_sum[cv()-1] / :AANTAL_GROEPEN then 1 27 when bytes[2] <= running_sum[cv()-1] / :AANTAL_GROEPEN then 2 28 when bytes[3] <= running_sum[cv()-1] / :AANTAL_GROEPEN then 3 29 when bytes[4] <= running_sum[cv()-1] / :AANTAL_GROEPEN then 4 30 when bytes[5] <= running_sum[cv()-1] / :AANTAL_GROEPEN then 5 31 when bytes[6] <= running_sum[cv()-1] / :AANTAL_GROEPEN then 6 32 when bytes[7] <= running_sum[cv()-1] / :AANTAL_GROEPEN then 7 33 when bytes[8] <= running_sum[cv()-1] / :AANTAL_GROEPEN then 8 34 when bytes[9] <= running_sum[cv()-1] / :AANTAL_GROEPEN then 9 35 when bytes[10] <= running_sum[cv()-1] / :AANTAL_GROEPEN then 10 36 end 37 else -- lege tabellen 38 case 39 when aantal[1] <= (:AANTAL_GROEPEN+1+iteration_number ) / :AANTAL_GROEPEN then 1 40 when aantal[2] <= (:AANTAL_GROEPEN+1+iteration_number ) / :AANTAL_GROEPEN then 2 41 when aantal[3] <= (:AANTAL_GROEPEN+1+iteration_number ) / :AANTAL_GROEPEN then 3 42 when aantal[4] <= (:AANTAL_GROEPEN+1+iteration_number ) / :AANTAL_GROEPEN then 4 43 when aantal[5] <= (:AANTAL_GROEPEN+1+iteration_number ) / :AANTAL_GROEPEN then 5 44 when aantal[6] <= (:AANTAL_GROEPEN+1+iteration_number ) / :AANTAL_GROEPEN then 6 45 when aantal[7] <= (:AANTAL_GROEPEN+1+iteration_number ) / :AANTAL_GROEPEN then 7 46 when aantal[8] <= (:AANTAL_GROEPEN+1+iteration_number ) / :AANTAL_GROEPEN then 8 47 when aantal[9] <= (:AANTAL_GROEPEN+1+iteration_number ) / :AANTAL_GROEPEN then 9 48 when aantal[10] <= (:AANTAL_GROEPEN+1+iteration_number ) / :AANTAL_GROEPEN then 10 49 end 50 end 51 , bytes[grp[:AANTAL_GROEPEN+1+iteration_number]] 52 = bytes[cv()] + bytes[:AANTAL_GROEPEN+1+iteration_number] 53 , aantal[grp[:AANTAL_GROEPEN+1+iteration_number]] 54 = aantal[cv()] + 1 55 ) 56 ) 57 select grp 58 , sum(orig_bytes) 59 , count(*) 60 from t 61 group by grp 62 order by grp 63 / GRP SUM(ORIG_BYTES) COUNT(*) --------------- --------------- --------------- 1 29387074200 1500 2 28471135400 1500 3 27978114708 1500 4 27978114707 1500 5 27978114707 1500 6 27978114708 1500 7 27978114706 1500 8 27978114706 1500 9 27978114707 1500 10 26076134516 1499 10 rows selected.
A few explanatory remarks. I needed some extra measures to help with the calculation of the average size of the groups:
- the running sum of bytes
- the current size of all groups
- the total number of elements
The whole statement is a tad ugly because the case expression has 10 repeating WHEN..THEN constructions, but it does the job and it allows you to play with the NUMBER_OF_GROUPS variable, which needs to be between 1 and 10. But, it is another nice use case for the SQL model clause, and it might come in handy for other DBA's or developers who need a single query to distribute their data evenly.
UPDATE:
Of course it has turned out the problem isn't new.
Here is a blogpost from Brendan Furey about the same subject.
And here is the AskTom thread he refers to with several model clause queries and one using recursive subquery factoring.
Hello Rob,
ReplyDeleteVery nice SELECT :) :)
I just asked myself whether using ROWNUM in a SELECT with "ORDER BY bytes DESC" will ensure you that the MODEL iterations do indeed process the rows
in "bytes DESC" order, since ROWNUM values are assigned *BEFORE* the
"ORDER BY bytes" happen.
So, isn't it necessary to use ROW_NUMBER() OVER(ORDER BY bytes DESC)
instead of ROWNUM in the inner SELECT ?
Of course, this is the case only if processing the tables in descending size order
by the MODEL iterations is indeed essential for the entire algorithm.
Thanks a lot & Best Regards,
Iudith Mentzel
Oracle developer & PL/SQL Challenge player
Haifa, Israel
Hello iudith,
ReplyDeleteBy processing the tables in descending size order, you considerably raise the odds that the distribution will be spread more evenly. So yes, it is quite essential.
For my colleague, I used a statement that had a (select rownum ... from (select ... order by bytes desc)), but for this blog post I thought tidying up my statement and lose a subquery would not affect the outcome. I was wrong thinking that, but just by accident the outcome wasn't affected because the rows in the table were pre-sorted.
So thank you very much for pointing it out. I've corrected the SQL statement.
Regards,
Rob.