How to Model Inheritance in a Relational Database

In the process of designing our entity relationship diagram for a database, we may find that attributes of two or more entities overlap. In this case, we may create a subtype of the parent entity that contains distinct attributes and supertype entity. The article presents how to model inheritance and what are the features of each solution.

In the process of designing our entity relationship diagram for a database, we may find that attributes of two or more entities overlap, meaning that these entities seem very similar but still have a few differences. In this case, we may create a subtype of the parent entity that contains distinct attributes. A parent entity becomes a supertype that has a relationship with one or more subtypes.

First, let’s take a closer look at a simple class diagram.

The UML symbol for a subclass association is an open arrowhead that points to the parent class.

The subclass association line is labeled with specialization constraints. Constraints are described along two dimensions:

  1. incomplete/complete
    • In an incomplete specialization only some instances of the parent class are specialized (have unique attributes). Other instances of the parent class have only the common attributes.
    • In a complete specialization, every instance of the parent class has one or more unique attributes that are not common to the parent class.
  2. disjoint/overlapping
    • In a disjoint specialization, an object could be a member of only one specialized subclass.
    • In an overlapping specialization, an object could be a member of more than one specialized subclass.

The following diagram presents class Client.

In class Client we distinguish two subtypes: Individual and Company. This specialization is disjoint (client can be an individual or a company) and complete (these are all possible subtypes for supertype).

Inheritance

Let’s model this situation and discuss the results (I will use Vertabelo, our online database modeling tool).

One table implementation

In a one table implementation, table client has attributes of both types.

The diagram below shows the table client and two views: individual and company:

In this implementation:

  • Access to supertype rows is optimal (it is simple to have a list of all clients, it’s not necessary to make costly joins)
  • Effectiveness problem with access to subtype rows. Some rows have to be discarded, because one table contains rows for all (in this case two) subtypes.
  • There has to be an additional attribute to specify the subtype (attribute ‘type’ may have value ‘i’ for ‘individual’ and ‘c’ for ‘company’ and no other value).
  • It’s easy to change the object’s subtype (we have to change ‘type’ from ‘i’ to ‘c’ or the other way around).
  • Many attributes are subtype-specific, and these columns have null values on rows that apply to other subtype’s attributes.

Two-table implementation

In a two-table implementation, we create a table for each of the subtypes. Each table gets a column for all attributes of the supertype and also a column for each attribute belonging to the subtype. Access to information in this situation is limited, that’s why it is important to create a view that is the union of the tables. We can add an additional attribute called ‘type’ that describes the subtype.

The diagram below presents two tables, individual and company, and a view (the blue one) called client.

The view’s script is as follows:

However, the example above generates some problems:

  • problem with implementation UID for supertype. There is no common ID.
  • if we want to change object’s type, for example, company to client, we need to do many INSERT, DELETE operations

Three-table implementation

In a third solution we create a single table client_t for the parent table, containing common attributes for all subtypes, and tables for each subtype (individual_t and company_t) where the primary key in client_t (base table) determines foreign keys in dependent tables. There are three views: client, individual and company.

In this situation:

  • it’s necessary to make (too) many (too) costly joins,
  • changing the subtype is not effective,
  • there is no problem with UIDs, because UIDs for subtypes are determined by the supertype’s UID.
  • It’s very easy to search against all subtypes (dependent table’s foreign key is also a primary key)

Tools in this post

Redgate Data Modeler

Design, update, and manage database schemas

Find out more