Monday, September 14, 2009

Check constraints and AND

Remember the opening statement of the MV error series? It was "Never put more than one error condition into one error message". Two months later I realized I didn't practice what I preach myself.

I'm currently busy developing a Tour de France application at home. The application has to store the standings. After each stage, a daily and overall standing becomes available for each of the following standing types: Individual, point, mountain and team. I'm storing standing lines and I chose a supertype implementation: all in one table. Which means the table looks something like this:

rwijk@ORA11GR1> create table standing_lines
2 ( stage_id number(6) not null
3 , standing_type varchar2(2) not null
4 , overall_or_daily varchar2(1) not null
5 , position number(3) not null
6 , rider_id number(6)
7 , team_id number(6)
8 , duration interval day(2) to second(0)
9 , number_of_points number(4)
10 , constraint pk primary key (stage_id,standing_type,overall_or_daily,position)
11 , constraint ck1 check (standing_type in ('IN','PO','MO','TE'))
12 , constraint ck2 check (overall_or_daily in ('D','O'))
13 )
14 /

Table created.

Note that I deliberately left out some unique, foreign key and other check constraints. For each standing type, different columns should be mandatory and other columns should stay null. For example: an individual standing line should contain a rider_id and a duration, and no team_id and no number_of_points. So it seems logical to code some additional check constraints like this:

rwijk@ORA11GR1> alter table standing_lines add constraint ck3 check
2 ( standing_type != 'IN'
3 or ( rider_id is not null
4 and team_id is null
5 and duration is not null
6 and number_of_points is null
7 )
8 )
9 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck4 check
2 ( standing_type != 'PO'
3 or ( rider_id is not null
4 and team_id is null
5 and duration is null
6 and number_of_points is not null
7 )
8 )
9 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck5 check
2 ( standing_type != 'MO'
3 or ( rider_id is not null
4 and team_id is null
5 and duration is null
6 and number_of_points is not null
7 )
8 )
9 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck6 check
2 ( standing_type != 'TE'
3 or ( rider_id is null
4 and team_id is not null
5 and duration is not null
6 and number_of_points is null
7 )
8 )
9 /

Table altered.

But when I started coding routines to read some clob content pasted from the official Tour de France site, parsing all standing lines, and inserting the result into the standing_lines table, the inevitable occurred: it didn't work the first time. I received an error message like this:

ORA-02290: check constraint (RWIJK.CK6) violated

So I looked up the definition of this check constraint and all I could say was that I was trying to insert a team standing line and I gave rider_id a value, or I left team_id null, or I left duration null, or I gave number_of_points a value. That's four options. And it's certainly not one error condition per error message. To resolve this error, I first redefined my check constraints and split them into four each.

The first rewrite is by applying the distributivity rule: A or (B and C) == (A or B) and (A or C). In Oracle check constraint logic, it looks like this:

rwijk@ORA11GR1> alter table standing_lines drop constraint ck3
2 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines drop constraint ck4
2 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines drop constraint ck5
2 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines drop constraint ck6
2 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck3 check
2 ( ( standing_type != 'IN' or rider_id is not null )
3 and ( standing_type != 'IN' or team_id is null )
4 and ( standing_type != 'IN' or duration is not null )
5 and ( standing_type != 'IN' or number_of_points is null )
6 )
7 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck4 check
2 ( ( standing_type != 'PO' or rider_id is not null )
3 and ( standing_type != 'PO' or team_id is null )
4 and ( standing_type != 'PO' or duration is null )
5 and ( standing_type != 'PO' or number_of_points is not null )
6 )
7 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck5 check
2 ( ( standing_type != 'MO' or rider_id is not null )
3 and ( standing_type != 'MO' or team_id is null )
4 and ( standing_type != 'MO' or duration is null )
5 and ( standing_type != 'MO' or number_of_points is not null )
6 )
7 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck6 check
2 ( ( standing_type != 'TE' or rider_id is null )
3 and ( standing_type != 'TE' or team_id is not null )
4 and ( standing_type != 'TE' or duration is not null )
5 and ( standing_type != 'TE' or number_of_points is null )
6 )
7 /

Table altered.

And now it becomes clear we can split them into four parts like this:

rwijk@ORA11GR1> alter table standing_lines drop constraint ck3
2 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines drop constraint ck4
2 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines drop constraint ck5
2 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines drop constraint ck6
2 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck3 check
2 ( standing_type != 'IN' or rider_id is not null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck4 check
2 ( standing_type != 'IN' or team_id is null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck5 check
2 ( standing_type != 'IN' or duration is not null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck6 check
2 ( standing_type != 'IN' or number_of_points is null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck7 check
2 ( standing_type != 'PO' or rider_id is not null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck8 check
2 ( standing_type != 'PO' or team_id is null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck9 check
2 ( standing_type != 'PO' or duration is null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck10 check
2 ( standing_type != 'PO' or number_of_points is not null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck11 check
2 ( standing_type != 'MO' or rider_id is not null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck12 check
2 ( standing_type != 'MO' or team_id is null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck13 check
2 ( standing_type != 'MO' or duration is null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck14 check
2 ( standing_type != 'MO' or number_of_points is not null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck15 check
2 ( standing_type != 'TE' or rider_id is null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck16 check
2 ( standing_type != 'TE' or team_id is not null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck17 check
2 ( standing_type != 'TE' or duration is not null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck18 check
2 ( standing_type != 'TE' or number_of_points is null )
3 /

Table altered.

I reran my code and the error message became:

ORA-02290: check constraint (RWIJK.CK17) violated

Ah! I tried to insert a team standing line, but failed to give duration a value. Now that's more like it.

This is a very specific example, but you can say in general:
Whenever you have a check constraint and a top-level AND operator, you should split the constraint.

To know whether your expressions contains a top-level AND operator, it's essential you can calculate with boolean expressions. I mentioned the distributivity rule already above, and in that link you find some more useful rules. And I find I apply De Morgan's laws very often as well. These allow you to rewrite your expressions containing less mandatory brackets, so the AND and OR operators will go up in the hierarchy.

Whether you want to apply this rule for a BETWEEN operator (which also contains an AND :-) ) is subjective.

4 comments:

  1. Hi Rob,

    Wouldn't it be simpler if you use the new (11g) LOG ERRORS clause in your INSERT ? I have not tested this feature yet but it might be interesting in your case.

    ReplyDelete
  2. Could not have said this better.
    I preach this in my book. And not only for tuple-constraints....

    Nice post!

    Toon

    ReplyDelete
  3. Hi Vincent,

    I'm not sure why you mention the LOG ERRORS clause. The post is not about collecting all errors and having an INSERT succeed despite the errors. It is about being able to know exactly what went wrong when an insert fails on a check constraint violation. Or am I missing something?

    Regards,
    Rob.

    ReplyDelete