About Oracle: COUNT DISTINCT

This morning I saw an e-mail with the title "SQL puzzle?" in my inbox. That's a good title to make me read that one first. The poster was stunned by the results of the following two queries:

SQL> SELECT COUNT(DISTINCT bdl.nrl_id)
  2  FROM   a_table bdl
  3  WHERE  bdl.periode >= to_date('20081023092701','yyyymmddhh24miss')
  4  ;

COUNT(DISTINCTBDL.NRL_ID)
-------------------------
                    61111

1 rij is geselecteerd.

SQL> SELECT COUNT(*)
  2  FROM  (SELECT DISTINCT bdl.nrl_id
  3         FROM   a_table bdl
  4         WHERE  bdl.periode >= to_date('20081023092701','yyyymmddhh24miss')
  5        )
  6  ;

  COUNT(*)
----------
     61112

1 rij is geselecteerd.

Of course it's possible that a record was inserted in the meantime, but that was not the case here.

The explanation was quite simple. See this example:

rwijk@ORA11GR1> select comm from emp
  2  /

      COMM
----------

       300
       500

      1400




         0





14 rijen zijn geselecteerd.

rwijk@ORA11GR1> select distinct comm from emp
  2  /

      COMM
----------

      1400
       500
       300
         0

5 rijen zijn geselecteerd.

rwijk@ORA11GR1> select count(*) from (select distinct comm from emp)
  2  /

  COUNT(*)
----------
         5

1 rij is geselecteerd.

rwijk@ORA11GR1> select count(distinct comm) from emp
  2  /

COUNT(DISTINCTCOMM)
-------------------
                  4

1 rij is geselecteerd.

Again those pesky null values. From the documentation:

"All aggregate functions except COUNT(*), GROUPING, and GROUPING_ID ignore nulls."

A distinct doesn't ignore nulls, and the count(*) on top of it doesn't either.

At first glance, a "select count(distinct ...)" appears equal to a "select count(*) from (select distinct ...)", but as shown: it is not.

UPDATE

I forgot to mention that a "select count(comm) from (select distinct comm...)" [note: count(comm) instead of count(*)] is equal to a "select count(distinct comm) ...":

rwijk@ORA11GR1> select count(comm) from (select distinct comm from emp);

COUNT(COMM)
-----------
          4

1 rij is geselecteerd.

About Oracle

Pages

Friday, January 30, 2009

COUNT DISTINCT

UPDATE

2 comments: