top of page

Choosing the Right Data Model: Balancing Performance, Maintenance, and Reporting Capabilities

  • Lionel
  • Mar 23, 2023
  • 3 min read

When designing a data architecture, one of the key considerations for data architects is determining the best approach for handling current and historical data. This decision can significantly impact the overall performance, maintenance, and reporting capabilities of the data system. It's essential to carefully weigh the factors involved and choose the approach that best aligns with the business requirements and priorities.


In designing the ultimate data model for reporting analytics, data architects often face the fundamental consideration of deciding between a single data store for both current and historical data or separate Operational Data Stores (ODS) for current data and Data Warehouses for historical data.


Recently, I encountered a discussion around this topic when faced with a requirement to report project status stored in Microsoft Project Online (cloud) with combined data from an on-prem application. The user wanted to see the most updated project status information (schedule, risk, and issue) within 15 minutes of it being updated. To achieve this, we pulled data from Microsoft Project Online multiple times an hour, ensuring that we always had the most current project status information.


In addition, there was also a requirement for weekly and monthly project status reports. For this historical purpose, we took weekly and monthly snapshots of the data, so we had a historical view of how projects progressed over time.


So, what factors should we consider when deciding whether to have one data store for both current and historical data or to have separate ODS and Data Warehouse structures?


Let's first look at the option of having a single data model.


Pros:

One of the primary reasons for people choosing this solution is to have one data store, simplifying maintenance by having only one data model. This is true, but only if there are no performance issues with updating the data warehouse multiple times an hour.

A second reason is that users only have one source of data. However, this may not be an advantage if users require separate current and historical views. Additionally, there are ways to combine two data stores into one.


Cons:

One obvious drawback is that two data sets with different timing requirements are mixed in one. Reporting for current data is more time-critical, which is why it is updated multiple times an hour, while reporting for historical data only needs to be updated once a week.

Another drawback is that processing historical data is more process-intensive, as you need to figure out the logic of slowly changing dimensions multiple times an hour, which is unnecessary for current data.


Now let's consider having separate models for current and historical data.


Pros:

For data modelers, this is an obvious solution. The requirement for current data is an ODS, while historical data is stored in Data Warehouses. Therefore, this calls for two different data structures. This allows for a separation of processing for current data from historical data. Since there is less processing for current data, it can be done faster and more frequently, while processing historical data can be done once a week.


Cons:

There are two different models to maintain, so there are two different data loads to develop: one for current data and the other for historical data.

In conclusion, the decision to have a combined or separate data model ultimately depends on the business requirements. The factors to consider include timing requirements, processing intensity, and maintenance complexity. As data architects, we must carefully evaluate these factors and choose the best solution to meet our clients' needs.


Ready to take your data architecture to the next level? Contact us today to learn how we can help you unlock valuable business insights with the right data model.




Comments


Commenting has been turned off.
SumeruData Main Logo 800x600.jpg
Business Intelligence and Data Analytics Consulting 

925-385-8168

2710 Crow Canyon Rd #1036

San Ramon, CA 94583

Follow Us On :

  • Linkedin
bottom of page