Or: the consequences of not using many-to-many relationships
Is Moroccan food African or Mediterranean? We designed a simple hierarchical model of cuisines that grouped them by broad geographic area. The purpose of the model was solely to populate a configurable, organized list for a select menu, where smaller areas were visually grouped.
Keeping things simple, we created a cuisine table with id
, name
, and parent
columns:
Column | Type | Modifiers
-----------+----------+--------------------------
id | integer | primary key
name | string | not null
parent_id | integer | foreign key to cuisine(id)
Indexes:
"cuisine_pkey" PRIMARY KEY, btree (id)
"cuisine_name" UNIQUE CONSTRAINT, btree (name)
Foreign-key constraints:
"cuisine_parent_id_fkey" FOREIGN KEY (parent_id) REFERENCES cuisine(id)
Cuisine names could not be duplicated, and each cuisine had zero or one parents. Which worked great, until we realized that Morocco was in the data list twice.
We had a couple options: create a cuisine join table that allowed for many parents of a single cuisine, or allow duplicates in the name field. The latter option seemed far more expeditious, so we changed the unique index to:
"cuisine_name_parent_id_key" UNIQUE CONSTRAINT, btree (name, parent_id)
Which again worked, until we realized that SQL doesn’t apply the unique constraint if parent_id
is NULL, and so duplicate names that didn’t have a parent were being allowed.
Digging in deeper, we discovered partial indexes in PostgeSQL, which can apply a unique constraint under certain conditions determined by a WHERE clause. We now have two unique indexes applied to the name column
:
"cuisine_name_parent_id_key" UNIQUE CONSTRAINT, btree (name, parent_id) "cuisine_name_key" UNIQUE, btree (value) WHERE parent_id IS NULL
And thus we avoid an internecine food fight (at the cost of reduced SQL transportability).