Using Workflow Patterns to Manage the State of Any Entity

Have you ever come across a situation where you need to manage the state of an entity that changes over time? There are many examples out there. Let’s start with an easy one: merging customer records.

Have you ever come across a situation where you need to manage the state of an entity that changes over time? There are many examples out there. Let’s start with an easy one: merging customer records.

Suppose we are merging lists of customers from two different sources. We could have any of the following states arise: Duplicates Identified – the system has found two potentially duplicate entities; Confirmed Duplicates – a user validates the two entities are indeed duplicates; or Confirmed Unique – the user decides the two entities are unique. In any of these situations, the user only has a yes-no decision to make.

But what about more complex situations? Is there a way to define the actual workflow between states? Read on…

How Things Can Easily Go Wrong

Many organizations need to manage job applications. In a simple model, you could have a table called JOB_APPLICATION, and you could track the state of the application using a reference data table containing values like these:

Application Status
APPLICATION_RECEIVED
APPLICATION_UNDER_REVIEW
APPLICATION_REJECTED
INVITED_TO_INTERVIEW
INVITATION_DECLINED
INVITATION_ACCEPTED
INTERVIEW_PASSED
INTERVIEW_FAILED
REFERENCES_SOUGHT
REFERENCES_ACCEPTABLE
REFERENCES_UNACCEPTABLE
JOB_OFFER_MADE
JOB_OFFER_ACCEPTED
JOB_OFFER_DECLINED
APPLICATION_CLOSED

These values can be selected in any order at any time. It relies on end-users to ensure that a logical and correct selection is made at each stage. Nothing prohibits an illogical sequence of states.

For example, let’s say that an application has been rejected. The current status would obviously be APPLICATION_REJECTED. There’s nothing that can be done at the application level to prevent an inexperienced user from subsequently selecting INVITED_TO_INTERVIEW or some other illogical state.

What’s needed is something to guide the user into selecting the next logical state, something that defines a logical workflow.

And what if you have different requirements for different types of job applications? For example, some jobs may require the applicant to take an aptitude test. Sure, you can add more values to the list to cover these, but there’s nothing in the current design that prevents the end-user from making an incorrect selection for the type of application in question. The reality is that there are different workflows for different contexts.

Another point to think about: are the listed options really all states? Or are some in fact outcomes? For example, the offer of a job can be accepted or rejected by the applicant. Therefore, JOB_OFFER_MADE really has two outcomes: JOB_OFFER_ACCEPTED and JOB_OFFER_DECLINED.

Another outcome could be that a job offer is withdrawn. You may want to record the reason why it was withdrawn using a qualifier. If you just add these reasons to the above list, nothing guides the end-user into making logical selections.

So really, the more complex the states, outcomes, and qualifiers become, the more you need to define the workflow of a process.

Organizing Processes, States, and Outcomes

It’s important to understand what’s going on with your data before you attempt to model it. You may at first be inclined to think that there is a strict hierarchy of types here: processstateoutcome.

When we look more closely at the above example, we see that the INVITED_TO_INTERVIEW and the JOB_OFFER_MADE states share the same possible outcomes, namely ACCEPTED and DECLINED. This tells us there is a many-to-many relationship between states and outcomes. This is often true for other states, outcomes, and qualifiers.

At a conceptual level, then, this is what is actually going on with our metadata:

Class conceptual

If you were to transform this model to the physical world using the standard approach, you would have tables called PROCESS, STATE, OUTCOME, and QUALIFIER; you would also need to have intermediate tables between them – PROCESS_STATE, STATE_OUTCOME, and OUTCOME_QUALIFIERto resolve the many-to-many relationships. This complicates the design.

While the logical hierarchy of levels (process → state → outcome → qualifier) must be maintained, there is a simpler way to physically organize our metadata.

The Workflow Pattern

The diagram below defines the main components of a workflow database model:

The yellow tables on the left contain workflow metadata, and the blue tables on the right contain business data.

The first thing to point out is that any entity can be managed without requiring major changes to this model. The YOUR_ENTITIY_TO_MANAGE table is the one under workflow management. In terms of our example, this would be the JOB_APPLICATION table.

Next, we simply need to add the wf_state_type_process_id column to whatever table we want to manage. This column points to the actual workflow process being used to manage the entity. This is not strictly a foreign key column, but it allows us to quickly query WORKFLOW_STATE_TYPE for the correct process. The table that will contain the state history is MANAGED_ENTITY_STATE. Again, you would choose your own specific table name here and modify it for your own requirements.

The Metadata

The different levels of workflow are defined in WORKFLOW_LEVEL_TYPE. This table contains the following:

Type Key Description
PROCESS High level workflow process.
STATE A state in the process.
OUTCOME How a state ends, its outcome.
QUALIFIER An optional, more detailed qualifier for an outcome.

WORKFLOW_STATE_TYPE and WORKFLOW_STATE_HIERARCHY form a classic Bill of Materials (BOM) structure. This structure, which is very descriptive of an actual manufacturing bill of materials, is quite common in data modelling. It can define hierarchies or be applied to many recursive situations. We’re going to make use of it here to define our logical hierarchy of processes, states, outcomes, and optional qualifiers.

Before we can define a hierarchy, we need to define the individual components. These are our basic building blocks. I’m just going to reference these by TYPE_KEY (which is unique) for the sake of brevity. For our example, we have:

Workflow Level Type Workflow State Type.Type Key
OUTCOME PASSED
OUTCOME FAILED
OUTCOME ACCEPTED
OUTCOME DECLINED
OUTCOME CANDIDATE_CANCELLED
OUTCOME EMPLOYER_CANCELLED
OUTCOME REJECTED
OUTCOME EMPLOYER_WITHDRAWN
OUTCOME NO_SHOW
OUTCOME HIRED
OUTCOME NOT_HIRED
STATE APPLICATION_RECEIVED
STATE APPLICATION_REVIEW
STATE INVITED_TO_INTERVIEW
STATE INTERVIEW
STATE TEST_APTITUDE
STATE SEEK_REFERENCES
STATE MAKE_OFFER
STATE APPLICATION_CLOSED
PROCESS STANDARD_JOB_APPLICATION
PROCESS TECHNICAL_JOB_APPLICATION

Now we can start to define our hierarchy. This is where we take our building blocks and define our structure. For each state, we define the possible outcomes. In fact, it’s a rule of this workflow system that each state must end with an outcome:

Parent Type – STATES Child Type – OUTCOMES
APPLICATION_RECEIVED ACCEPTED
APPLICATION_RECEIVED REJECTED
APPLICATION_REVIEW PASSED
APPLICATION_REVIEW FAILED
INVITED_TO_INTERVIEW ACCEPTED
INVITED_TO_INTERVIEW DECLINED
INTERVIEW PASSED
INTERVIEW FAILED
INTERVIEW CANDIDATE_CANCELLED
INTERVIEW NO_SHOW
MAKE_OFFER ACCEPTED
MAKE_OFFER DECLINED
SEEK_REFERENCES PASSED
SEEK_REFERENCES FAILED
APPLICATION_CLOSED HIRED
APPLICATION_CLOSED NOT_HIRED
TEST_APTITUDE PASSED
TEST_APTITUDE FAILED

Our processes are simply a set of states that each exist for a period of time. In the table below they are presented in a logical order, but this does not define the actual order of processing.

Parent Type – PROCESSES Child Type – STATES
STANDARD_JOB_APPLICATION APPLICATION_RECEIVED
STANDARD_JOB_APPLICATION APPLICATION_REVIEW
STANDARD_JOB_APPLICATION INVITED_TO_INTERVIEW
STANDARD_JOB_APPLICATION INTERVIEW
STANDARD_JOB_APPLICATION MAKE_OFFER
STANDARD_JOB_APPLICATION SEEK_REFERENCES
STANDARD_JOB_APPLICATION APPLICATION_CLOSED
TECHNICAL_JOB_APPLICATION APPLICATION_RECEIVED
TECHNICAL_JOB_APPLICATION APPLICATION_REVIEW
TECHNICAL_JOB_APPLICATION INVITED_TO_INTERVIEW
TECHNICAL_JOB_APPLICATION TEST_APTITUDE
TECHNICAL_JOB_APPLICATION INTERVIEW
TECHNICAL_JOB_APPLICATION MAKE_OFFER
TECHNICAL_JOB_APPLICATION SEEK_REFERENCES
TECHNICAL_JOB_APPLICATION APPLICATION_CLOSED

There’s an important point to make regarding a BOM hierarchy. Just as a physical bill of materials defines assemblies and sub-assemblies down to the smallest components, we have a similar arrangement in our hierarchy. This means that we get to reuse ‘assemblies’ and ‘sub-assemblies’.

By way of example: Both the STANDARD_JOB_APPLICATION and TECHNICAL_JOB_APPLICATION processes have the INTERVIEW state. In turn, the INTERVIEW state has the PASSED, FAILED, CANDIDATE_CANCELLED, and NO_SHOW outcomes defined for it.

When you use a state in a process, you automatically get its child outcomes with it because it’s already an assembly. This means that the same outcomes exist for both types of job application at the INTERVIEW stage. If you want different interview outcomes for different types of job applications, you need to define, say, TECHNICAL_INTERVIEW and STANDARD_INTERVIEW states that each have their own specific outcomes.

In this example, the only difference between the two types of job applications is that a technical job application includes an aptitude test.

Before You Go

Part 1 of this two-part article has introduced the workflow database pattern. It has shown how you can incorporate it to manage the lifecycle of any entity in your database.

Part 2 will show you how to define the actual workflow using additional configuration tables. This is where the user will be presented with allowable next steps. We’ll also demonstrate a technique for getting around the strict reuse of ‘assemblies’ and ‘sub-assemblies’ in BOMs.

Tools in this post

Redgate Data Modeler

Design, update, and manage database schemas

Find out more