Modelling

Data Modelling

Data modelling is the process of designing, schematizing, and diagramming a faithful and informative representation of your organizational activity. Building and maintaining an effective data model is essential to a healthy and efficient enterprise information system. We view data modelling as an act of communication that answers the question: "What should the data tell the data consumer?" Raw data by itself is little more than noise, so you must focus it into a clear picture of organizational activity and present that picture in a way that end users can easily consume, understand, and act on.

Building an effective data model requires a deep understanding of the data you're working with, the technology you're building on, and the business itself. Building logically complex, robust, and complete data models is a time- and logic-intensive undertaking that often requires input from stakeholders up and down the organization.

A data model is a product of the data, and the data are a product of your business activity. The business drives the data, which drives the model, which informs management, who then make business decisions. This is an essential feedback loop for data-driven organizations. An effective data model enables understanding, knowledge generation, and decisive action. An ineffective or malformed model only sows confusion and slows down decision making.

Businesses are increasingly looking to gather as much data as possible while still acting as nimbly as possible. This requires new methodologies, new tools, and new cultural approaches to engineering. We believe this trend is leading to a paradigm shift in how businesses work with data, which we call DAAS (Data As A Service). Although the concept of DAAS has been around for a while, it was not fully realized until cloud data warehouses matured.

Before cloud data warehouses, most businesses invested in large enterprise RDBMS platforms. These were usually large Kimball-style warehouses that fed into OLAP cubes for data modelling. However, this approach is ill-suited to cloud technology because cloud technology, which is more or less based upon Hadoop architecture, does not process data in the same way as an RDBMS. Cloud data warehouses do not operate as a single monolithic data structure; they are distributed systems that separate compute from storage. This fact alone is reason enough to completely rethink the way we process data in the cloud compared to monolithic RDBMS data warehouses.

All of these changes, from DAAS to cloud architecture, are coalescing into a need to completely rethink data architecture, management, processing, and delivery. Our aim is not only to stay ahead of these trends but, we hope, to help lead them as well.