Data Modelling Status

Justin James
3 min readOct 17, 2021

Some of the most-terrible data models I see on a regular basis happened because the data model started to evolve, and no one said, “Wait! We can’t merely add new fields, we need to do something differently!” One of the common examples of this is when we see an Entity filled with Boolean Attributes that represent different stages, or gates, of a process.

This is an example I threw together in my Personal Environment, but I see stuff like this pretty often

The are some advantages and disadvantages to this kind of data model, but truly the advantages are so meager and the disadvantages so strong, this is an anti-pattern.

Pros

  • Back in the days of Application Objects being part of licensing, Attributes were “free” and Static Entities counted towards the license.
  • Because these usually start as one or two Attributes, then grow and evolve into quite a few, switching to a reference to a Static Entity record that represents status would require a data migration, and dependent apps to be updated to use the new status reference.
  • If you need to display “this is what has already been done” or “these steps have not yet been done”, you have a convenient set of Boolean Attributes to show it.

Cons

  • Indexing in the database will be… interesting.
  • Queries can be quite slow because they need to filter on tons of Attributes.
  • Adding a new flag Attribute means having to find all of the logic in the system that derives status, and update it.
  • It is way too easy to make a mistake somewhere with the logic and have one place that fails for some use cases, and you end up with random failures that are hard to track down.
  • You end up writing “what’s my status?” functions.

The bottom line is that the advantages are mostly those of convenience, while the disadvantages are going to cause performance problems, difficulty in maintenance, and technical debt. Yikes!

Mitigation

“OK Justin, I’m convinced! I’ll do this going forwards, I promise. But I already did this on an Entity, how do I get out of this mess?” Glad you asked! It is not too hard to fix this:

  1. Add a Static Entity for each of the statuses. You may want to consider adding Boolean Attributes to the Entity for the various “gates” or factors that can play into the status, that way if you need to display or filter on “this has been done” you can join to the Static Entity and pull those Booleans from there. An Order field may be helpful if you want to show the steps in order as well. I often end up adding application-specific logic/data to the Static Entity as well.
  2. Add the needed status records to the Static Entity.
  3. Add a reference to the Static Entity in the data Entity, and make it mandatory. It is my strong recommendation to add a Description telling the developer that this field gets set by the CRUD Wrappers, and to never try setting it in code.
  4. In our CRUD Wrappers, add logic for the status. The Upsert will want to have the logic for determining what the status value will be; I like to put this after the validations but before the “is this a new record or an existing record?” decision. The Remove may want to change a status, or perhaps the GetCanRemove will want to block removal if the status has a certain value.
  5. Write a “When Published” Timer to go through all existing records with an empty StatusId, determine the correct Status, and save it. I will typically do this in a way that bypasses the CRUD Wrapper, because I don’t want to change the Updated On/Updated By metadata with a data migration.
  6. Update all dependent code to use the new Status information.
  7. Once everything has tested clear and been deployed to Production, then you can remove the existing Boolean Attributes.

And there you go! This is one of the easiest refactorings you can do to a data model, yet it yields big rewards in terms of fewer bugs, easier code writing, and better performance.

J.Ja

--

--