
Data Solution Design Patterns

Implementation and automation for a flexible data solution

Training with Roelant Vos


Register now!


"For a data warehouse, we do not have enough time."

… sounds familiar?

Automation and code generation enable faster, more flexible data solution implementation. Learn the revolutionary approach for a fully automated solution from Roelant Vos.

  • Implement a Persistent Staging Area
  • Apply hybrid modeling techniques, based on Data Vault
  • Define robust patterns for data logistics, suitable for code generation
  • Define a metadata model for automation, code generation, and virtualization
  • Apply DevOps, testing, orchestration, and control frameworks
  • Ensure that the delivered data meets the consumers’ expectations

This practical design and implementation training provides you with everything you need to build and maintain an automated data solution, from start to finish.


What can Data Solution Automation offer?


Working with data can be complex, and often the ‘right’ answer for the purpose is the result of a series of iterations where business subject matter experts (SMEs) and data professionals collaborate.

This is an iterative process by its very nature. Even with the best effort and available knowledge, the resulting data model will be subject to the progressive understanding that is inherent in working with data.

In other words, the data solution model is not something you can always get right in one go. In fact, it can take a long time for a model to stabilise, and in today's fast-paced environments this may never happen at all.

Choosing the right design patterns for your data solution helps maintain both the mindset and the capability for the solution to keep evolving with the business and the technology, and to reduce technical debt on an ongoing basis.

This mindset also enables some truly fascinating opportunities, such as the ability to maintain version control over the data model, the design metadata, and their relationship; to represent the entire data solution as it was at a certain point in time; or even to allow different data models for different business domains.

This idea, combined with the capability to automatically (re)deploy different structures and interpretations of data, as well as the data logistics to populate or deliver these, is what we call ‘Data Solution Virtualisation’.

The idea of an automated virtual data solution was conceived while working on improvements for generating Data Warehouse loading processes. It is, in a way, an evolution in ETL generation thinking. Combining Data Vault with a Persistent Staging Area (PSA) provides additional functionality because it allows the designer to refactor all, or parts, of the solution.

Being able to deliver a virtual data solution provides options. It does not mean you have to virtualise the entire solution; you can pick and choose which approach works best for a given scenario, and change technologies and models over time.

To allow ideas to grow, creators need an immediate connection to what they are creating. This means that, as a creator, you need to be able to see directly what the effect of your changes is on what you are working on.

This is what the virtual data solution, as a concept and mindset, intends to enable: a direct connection to data that supports any kind of exploration and encourages creativity in using it.

Thinking of Data Warehousing in terms of virtualisation is, in essence, about following the guiding principle of establishing a direct connection to data. It is about seeking simplification, and about continually removing barriers to delivering data and information. It is about enabling ideas to flourish, because data can be made available for any kind of discovery or assertion.

Virtual Data Warehousing is the ability to present data for consumption directly from a raw data store by leveraging data warehouse loading patterns, information models and architecture. In many data solutions, it is already considered a best practice to be able to ‘virtualise’ Data Marts in a similar way. The Virtual Data Warehouse takes this approach one step further by allowing the entire data solution to be refactored based on the original raw transactions.

This ability requires a Persistent Staging Area (PSA), also known as a Persistent Historized Data Store, where data is stored exactly as it was received, at the lowest level of detail. If data is retained this way, everything you do with your data can always be repeated at any time – deterministically. In the best implementations, the virtual data solution allows you to work at the level of simple metadata mappings, modelling, and the interpretation of "business logic", abstracting away the more technical details.
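
As a minimal sketch of this idea (the table and column names below are illustrative assumptions, not taken from the training material), a PSA table on SQL Server could look as follows. Because records are only ever added, never updated or deleted, a simple filter on the load timestamp is enough to reproduce any earlier state of the data, deterministically:

    -- Illustrative PSA table: every record is kept exactly as received,
    -- keyed by its source natural key and the moment it was loaded.
    CREATE TABLE psa.CustomerSource (
        CustomerCode    NVARCHAR(50)  NOT NULL, -- source natural key
        LoadDateTime    DATETIME2(7)  NOT NULL, -- when the record arrived
        SourceRowHash   CHAR(64)      NOT NULL, -- full-row hash for change detection
        CustomerName    NVARCHAR(200) NULL,     -- attributes, stored as received
        CustomerSegment NVARCHAR(50)  NULL,
        CONSTRAINT PK_CustomerSource PRIMARY KEY (CustomerCode, LoadDateTime)
    );

    -- Deterministic replay: reconstruct the data as it was known at any
    -- earlier point in time, simply by filtering on the load timestamp.
    DECLARE @PointInTime DATETIME2(7) = '2023-01-01';

    SELECT CustomerCode, CustomerName, CustomerSegment
    FROM (
        SELECT src.*,
               ROW_NUMBER() OVER (PARTITION BY CustomerCode
                                  ORDER BY LoadDateTime DESC) AS RowOrder
        FROM psa.CustomerSource AS src
        WHERE LoadDateTime <= @PointInTime
    ) AS History
    WHERE RowOrder = 1;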

A virtual data solution is not the same as data virtualisation. These two concepts are fundamentally different. Data virtualisation, by most definitions, is the provision of unified direct access to data across many ‘disparate’ data stores.

It is a way to access and combine data without having to physically move the data across environments. Data virtualisation does not, however, focus on loading patterns, data architecture, or modelling.

The virtual data solution, on the other hand, is a flexible and manageable approach towards solving data integration and time variance topics using data warehouse concepts, essentially providing a defined schema-on-read.

The Virtual Data Warehouse is enabled by virtue of combining the principles of data logistics generation, hybrid data warehouse modelling concepts and a Persistent Staging Area (PSA). It is a way to create a more direct connection to the data because changes made in the metadata and models can be immediately represented in the information delivery.
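
To make this concrete, a hedged sketch that continues the illustrative PSA example from earlier: a Data Vault style Satellite can be defined as a view over the PSA rather than as a physically loaded table. Since the timeline is computed at query time, regenerating this (typically generated) definition from updated metadata changes the delivered result immediately, without reloading any data:

    -- Illustrative virtual Satellite: change history is derived directly
    -- from the PSA at query time, so a regenerated definition takes
    -- effect immediately.
    CREATE VIEW dv.SAT_Customer AS
    SELECT
        HASHBYTES('SHA2_256', UPPER(CustomerCode)) AS CustomerHashKey,
        LoadDateTime                               AS EffectiveDateTime,
        LEAD(LoadDateTime, 1, '9999-12-31')
            OVER (PARTITION BY CustomerCode
                  ORDER BY LoadDateTime)           AS ExpiryDateTime,
        SourceRowHash,
        CustomerName,
        CustomerSegment
    FROM psa.CustomerSource;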

Persisting data in a more traditional Data Warehouse sense always remains an option, and may be required to deliver the intended performance. The deterministic nature of a Virtual Data Warehouse allows for dynamic switching between physical and virtual structures, depending on the requirements.

In many cases, this mix of physical and virtual objects in the Data Warehouse itself changes over time, as business focus shifts. A good approach is to ‘start virtual’, and persist where required.
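
As a small illustration of this switch (again using the assumed example objects sketched above), ‘persisting where required’ can be as simple as materialising the view into a table. Because the logic is deterministic, the physical copy can be dropped and rebuilt, or replaced by the view again, at any time:

    -- Persist the virtual Satellite as a physical table where performance
    -- requires it; the view remains the single definition of the logic.
    -- (Assumes a dv_persisted schema exists to hold persisted objects.)
    SELECT *
    INTO dv_persisted.SAT_Customer
    FROM dv.SAT_Customer;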


Download Brochure

Your Trainer

Roelant Vos has been active in Data Warehousing (DWH) and Business Intelligence (BI) for more than 20 years, and is well known as an expert in the Data Vault community.

For more than 10 years, he has been sharing his ideas, tips, and thoughts on his blog roelantvos.com.

Having worked as a software developer, consultant, trainer, and decision maker in the corporate world, Roelant has observed data management from distinctly different points of view.

The common theme has always been a passion for automation, code generation, reusable patterns, and model-driven design – the key to making data solutions manageable and flexible.

His focus is now on providing training, consultancy, and open-source software development to make the delivery of robust data solutions easier. As part of this, he initiated the Data-Solution-Automation-Engine on GitHub.

You want to ...

  • Learn what kind of solution architecture supports flexible data delivery that can evolve with the business
  • Fully understand the concepts behind the essential data loading patterns, what options can be considered, and how to implement these
  • Leverage generation techniques for data logistics (‘ETL’), to be able to spend more time on more value-adding work such as data modelling and improving data delivery
  • Work on a Do-It-Yourself (DIY) data solution framework, or have adopted a Data Warehouse Automation (DWA) product and seek a deeper understanding of the patterns and modelling approaches used
  • Get a full overview of all components that are necessary for a robust and manageable data solution

This course covers advanced modelling and implementation techniques, and applies to a wide range of data professionals including Data Warehouse professionals, data modellers, architects, and data engineers.

Prerequisites

  • Sufficient understanding of English (the course language is English)
  • Understanding of data engineering, for example Data Warehousing and ETL development
  • Knowledge of SQL (e.g. joining, window functions)
  • Some scripting / programming experience
  • Familiarity with data modeling techniques for data warehousing (e.g. Dimensional Modeling, Ensemble Logical Modelling techniques including Data Vault)

Is this for me?

By adopting hybrid / Ensemble Logical Model patterns (e.g. Data Vault) on top of a Persistent Staging Area (PSA) – a historised record of all original transactions – an unparalleled level of flexibility in implementing and maintaining a data solution can be achieved. The repetitive aspects of data preparation are reduced, and it becomes easier to adjust the solution to ever-changing business and technical requirements.

These patterns are seemingly straightforward – almost deceptively so.

But, in fact, every pattern requires far-reaching considerations at a technical and conceptual level to truly match the business expectations.

Data Vault modeling provides elegant features to manage complexity, but success still depends on correct modelling of the data and correct application of the patterns. Leveraging data logistics (‘ETL’) generation and virtualization techniques allows for a great degree of flexibility, because you can quickly refactor and test different modelling approaches to understand which one best fits your use case.

This enables you to spend more time on more value-adding work such as improving the data models and delivery of data.

This advanced training is relevant for anyone seeking to understand how to leverage ‘model-driven-design’ and ‘pattern-based code-generation’ techniques to accelerate development. The content applies to a wide range of data professionals including Data Warehouse specialists, data modellers and architects as well as data engineers and data integration developers.

Flexible design and implementation

The intent of the training is to cover the architecture and concepts for a flexible data solution, with a focus on ‘deep diving’ into the patterns and practical implementation techniques as quickly as possible.

To facilitate this, the training discusses the implementation of the main Data Vault modeling concepts including their various edge-cases and considerations. The mechanisms to deliver information for consumption by business users (i.e. ‘marts’) will also be covered, including details on how to produce the ‘right’ information by implementing business logic and managing multiple timelines for reporting (‘bitemporal’).

The training provides tools and configurations which you can use to start automating your own development – or understand the approaches used in commercial ‘off-the-shelf’ software so that these can be fully utilized.

Training content and schedule

Day 1

  • Pattern-based design
  • Data solution architecture
  • Data staging concepts
  • Modeling concepts
  • Introducing design metadata
  • Code generation

Day 2

  • Core Business Concept pattern
  • Natural Business Relationships pattern
  • Context pattern & historization
  • Control framework
  • Testing
  • Technical considerations
  • Orchestration, workflows, and parallelism
  • DevOps and versioning

Day 3

  • Temporality concepts
  • Data delivery for consumption
  • Application of business logic
  • Completing the solution

Download Course Module Overview

Practical content

The training also provides an opportunity to get ‘hands on’ with some of the frameworks that are necessary to deliver a robust, manageable, and flexible solution.

This is done through short exercises as part of the regular content, during training hours.

These exercises use the Microsoft stack (SQL Server, Windows), but the content, approach and templates apply equally to other environments.

By following the exercises, we will go through:

  • Setting up a new data automation environment
  • Defining source-to-target mappings
  • Generating data logistics code, and
  • Running and testing the solution in various ways

Needed Software

As part of the practical content, we use the following software:

We’ll configure these tools as part of the workshop.

Training available world-wide

This training is offered globally on a regular basis, and can be arranged in-house on request to meet specific company objectives. It is also possible to organise online coaching, spread across multiple sessions, following the outline of the training or focusing on specific content. Please have a look at the dates or contact us.

Dates & Prices

Coaching
  • Flexible coaching support and quality assurance for individuals or small teams
  • Contact: info@dwhpatterns.com
  • Price on request

In-house
  • Price on request

Registration





If you have any further questions, please contact us:

info@dwhpatterns.com


Copyright: Roelant Vos
