Design pattern: repeated attributes (the phone book)

The contact manager example from our preceeding discussion of subkeys is also an excellent illustration of another problem that is found in many database designs.

• Obviously, the contacts database will need to store phone numbers in addition to addresses. A typical simplistic model, even after fixing the zip code problem, might look something like this:

Contact phonebook class diagram

• Again, this might seem like a reasonable design until you look at the data (omitting the street and zip to reduce table width):

Contact phones
firstNamelastNamehomePhoneworkPhonecellPhonefaxpager
GeorgeBarnes562-874-1234310-999-3628
SusanNoble562-975-3388714-847-3366
ErwinStar714-997-5885714-997-2428
AliceBuck562-577-1200562-561-1921
FrankBorders714-968-8201
HannaDiedrich562-786-7727

• There are at least two very large problems here:

- None of our contacts are going to have all of the phone numbers that we've provided for. In fact, most contacts will have only one or two numbers—leaving most of these fields blank, or NULL. Remember that NULL is a special constant value in database systems that means “this field doesn’t have any value assigned to it.” It’s not the same as a zero length string or the number zero. In general, we want to eliminate unnecessary NULL values that might occur as a result of our design—and the NULL values in this table are definitely unnecessary.

- No matter how many phone number fields we provide (five of them here), sooner or later someone will think of another kind of phone number that we need. With the model above, we would have to actually change the table structure to add another field. Any information like this that might change should be represented by data in a table rather than by the table structure.

• In fact, we don’t have five single attributes here. We have a repeated attribute, phone, that also has an attribute of its own that tells us what type of number it is (home, work, cell, and so on). In effect, it’s a class within a class. Some database textbooks call this structure a weak entity, since it can’t exist without the parent entity type.

• In UML, we can show multiplicity of attributes the same way we show multiplicity of an association (for example [0..*]). We can also show the data type of an attribute, which in this case is a structure (PhoneNumber). We have listed the structure attributes below in parentheses.

Contact phonebook class diagram, revised

If we try to represent information this way in the Contacts table, we’ll end up with a subkey that we obviously don’t want. We have to create a new table in much the same way as we did for the zip locations.

1. Remove all of the phone number fields from the Contacts relation. Create a new scheme that has the attributes of the PhoneNumber structure (phoneType and number).

2. The Contacts relation has now become a parent, so we should add a surrogate key, contactID. Copy this into the new scheme, so that we can associate the phone number with the person it belongs to. There is now a one-to-many relationship between Contacts and PhoneNumbers. Notice that this is the opposite of the relationship between Contacts and ZipLocations.

3. To identify each phone number, we need to know at least who it belongs to and what type it is. However, to allow for any combination of contacts, phone types, and numbers, we will use all three attributes of the new scheme together as the primary key. It’s not a parent, so we aren’t concerned about size of the PK.

Contact phonebook relation scheme

• The Contacts table now looks like it did before we added the phone numbers (with the addition of the contactID). The new PhoneNumbers table can be joined to the Contacts on matching contactID pk-fk pairs to provide all of the information that we had before.

Contacts
contactidfirstnamelastnamestreetzipcode
1639GeorgeBarnes1254 Bellflower90840
5629SusanNoble1515 Palo Verde90840
3388ErwinStar17022 Brookhurst92708
5772AliceBuck3884 Atherton90836
1911FrankBorders10200 Slater92708
4848HannaDiedrich1699 Studebaker90840

Phone numbers
contactidphonetypenumber
1639Home562-874-1234
1639Cell310-999-3628
5629Home562-975-3388
5629Work714-847-3366
3388Fax714-997-5885
3388Pager714-997-2428
5772Work562-577-1200
5772Cell562-561-1921
1911Home714-968-8201
4848Cell562-786-7727

Employee dependents

The modeling technique shown above is useful where the parent class has relatively few attributes and the repeated attribute has only one or a very few attributes of its own. However, you can also model the repeated attribute as a separate class in the UML diagram. One classic textbook example is an employee database. The employee class represents “a person who works for our company”; each employee has zero or more dependents. The dependent is conceptually a repeated attribute of the employee, but can be described separately as “a person who is related to the employee and may receive health care or other benefits based on this relationship.” We can represent this fact in the class diagram:

Employee dependent class diagram

• The relation scheme is a standard one-to-many; the PK of the many-side relation will have to include both the fk from the parent and one or more local attributes to guarantee uniqueness.

Employee dependent relation scheme