This morning I saw an e-mail with the title "SQL puzzle?" in my inbox. That's a good title to make me read that one first. The poster was stunned by the results of the following two queries:
SQL> SELECT COUNT(DISTINCT bdl.nrl_id)
2 FROM a_table bdl
3 WHERE bdl.periode >= to_date('20081023092701','yyyymmddhh24miss')
4 ;
COUNT(DISTINCTBDL.NRL_ID)
-------------------------
61111
1 rij is geselecteerd.
SQL> SELECT COUNT(*)
2 FROM (SELECT DISTINCT bdl.nrl_id
3 FROM a_table bdl
4 WHERE bdl.periode >= to_date('20081023092701','yyyymmddhh24miss')
5 )
6 ;
COUNT(*)
----------
61112
1 rij is geselecteerd.
Of course it's possible that a record was inserted in the meantime, but that was not the case here.
The explanation was quite simple. See this example:
rwijk@ORA11GR1> select comm from emp
2 /
COMM
----------
300
500
1400
0
14 rijen zijn geselecteerd.
rwijk@ORA11GR1> select distinct comm from emp
2 /
COMM
----------
1400
500
300
0
5 rijen zijn geselecteerd.
rwijk@ORA11GR1> select count(*) from (select distinct comm from emp)
2 /
COUNT(*)
----------
5
1 rij is geselecteerd.
rwijk@ORA11GR1> select count(distinct comm) from emp
2 /
COUNT(DISTINCTCOMM)
-------------------
4
1 rij is geselecteerd.
Again those pesky null values. From the documentation:
"All aggregate functions except COUNT(*), GROUPING, and GROUPING_ID ignore nulls."A distinct doesn't ignore nulls, and the count(*) on top of it doesn't either.
At first glance, a "select count(distinct ...)" appears equal to a "select count(*) from (select distinct ...)", but as shown: it is not.
UPDATE
I forgot to mention that a "select count(comm) from (select distinct comm...)" [note: count(comm) instead of count(*)] is equal to a "select count(distinct comm) ...":
rwijk@ORA11GR1> select count(comm) from (select distinct comm from emp);
COUNT(COMM)
-----------
4
1 rij is geselecteerd.
great Post thank u a lot!
ReplyDeleteThis comment has been removed by the author.
ReplyDelete