According to Gartner, a modern data warehouse is mission-critical for enterprise analytics, including artificial intelligence/machine learning (AI/ML). The goal is to deliver a single source of truth to the enterprise, thereby simplifying data integration and data governance. Even with the evolution of several big data analytics platforms, the data warehouse is still considered the best way to support a wide range of data capabilities and complex business use cases.
Business requirements for the modern data warehouse
Business requirements for data are continuously growing, as is the demand for data availability at different levels of detail and in data models that meet specific user or application needs. The data warehouse has become much more than a large data store for management reporting and KPIs, driven both by increased organizational data maturity and by the growing number of available data sources, internal and external.
To meet the overall organizational requirements, not just for data delivery, but also for regulatory compliance and budget concerns, four key abilities need to be considered for a modern data warehouse.
- Agility (smoothness)
The ability to deliver a project in small iterative cycles, piece by piece. It also includes the ability to adapt efficiently to changes within given constraints, i.e., time, resources, and budget.
- Productivity (delivery/time)
The ability to utilize resources efficiently for value-adding activities, producing a high throughput of deliverables (data items). Productivity should span the entire life-cycle, from development to continuous operation.
- Robustness (stability)
The ability to continuously maintain a stable and predictable solution with minimal downtime, even as the solution changes over time, resulting in minimal time spent on operational tasks. Robustness is key to trust in data.
- Risk reduction (safety)
The solution should be designed to handle changes in personnel and data sources, and to conform to regulatory requirements. Data should be traceable, and documentation should be autogenerated so that it is always up to date.
The inevitable compromise
Data integration tools that enable ELT or ETL are typically designed to be generic, handling any form of data integration. Being a Jack of all trades, these tools risk being Master of none when it comes to data warehouse development and operation. They leave it to the data warehouse developers or data engineers to define the processes in the data life-cycle and to determine precisely how each process is performed, including which practices to apply and how to apply them.
Given the constraints of a given project and solution (budget, time, resources, strategy/goals), developers have to make compromises when working with these generic ETL tools. Typically, one must decide which of the four key abilities above weighs most heavily, as designing for all of them will break the constraints.
If robustness and data quality are the most important factors, one must typically start by defining project or solution standards, patterns, and approaches for each process in the development life-cycle before delivering any business value. These are often maintained as separate documents or as a custom-built solution framework. Since that framework is not managed as part of the data integration software, it becomes the responsibility of the organization, i.e. the developers, to maintain and evolve it over time. This reduces agility and increases risk. The reverse also applies: if agility is the priority, robustness will suffer. The solution becomes increasingly complex and lacks standardization, which also increases risk and decreases productivity.
Automation to mitigate the compromise
A Data Warehouse Automation (DWA) tool does not merely automate certain tasks within a development project; it addresses the entire life-cycle of the data warehouse (or platform) solution, from initial development through DevOps, operation and maintenance, change-handling, and expansion.
Looking at the four key abilities of a data warehouse, we will now explain how automation with Xpert BI mitigates the compromises. It is the metadata-based foundation of Xpert BI, combined with built-in best practices and a flexible yet controllable architecture, that enables agility, productivity, robustness, and risk reduction to coexist.
Agility – With Xpert BI it is easy to both change and expand the solution. Functionalities such as metadata checks, a 'one-click' approach to adding new data, change-analysis through the dependency graph, and end-to-end lineage are important here.
Productivity – Xpert BI handles the repeatable, metadata-driven tasks and lets developers work at a higher level of abstraction, which increases productivity in both the development and operation phases. Xpert BI manages table generation, data loads and their dependencies, end-to-end technical documentation, and out-of-the-box load optimization, freeing developers to focus on the business logic and data models that meet end-user requirements.
Robustness – Xpert BI enables change and growth with minimal impact on operations and solution complexity. Standardized solution design principles, data source management, load execution and logging, built-in DataOps, and a migration wizard are some of the functionalities that provide robustness automatically.
Risk reduction – With Xpert BI, all developers work within the same design principles. This reduces both the risk of dependence on key personnel and risks to overall solution quality, maintainability, and longevity.
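The metadata-driven pattern behind these abilities can be sketched generically. The example below is an illustrative toy, not Xpert BI's actual implementation or API: all table names, columns, and functions are hypothetical. It shows three core DWA ideas, generating DDL from metadata instead of hand-writing it, deriving a valid load-execution order from declared dependencies, and performing change-analysis over the same dependency graph.

```python
# Illustrative sketch of metadata-driven automation.
# Hypothetical tables and functions; not Xpert BI's actual implementation.
from graphlib import TopologicalSorter

# Metadata describing tables: columns and load dependencies.
TABLES = {
    "stg_customer": {
        "columns": {"customer_id": "INT", "name": "VARCHAR(100)"},
        "depends_on": [],
    },
    "stg_order": {
        "columns": {"order_id": "INT", "customer_id": "INT",
                    "amount": "DECIMAL(12,2)"},
        "depends_on": [],
    },
    "dw_customer_sales": {
        "columns": {"customer_id": "INT", "total_amount": "DECIMAL(12,2)"},
        "depends_on": ["stg_customer", "stg_order"],
    },
}

def generate_ddl(table: str) -> str:
    """Generate CREATE TABLE DDL from metadata instead of writing it by hand."""
    cols = ",\n  ".join(f"{c} {t}" for c, t in TABLES[table]["columns"].items())
    return f"CREATE TABLE {table} (\n  {cols}\n);"

def load_order() -> list[str]:
    """Derive a valid load-execution order from the dependency metadata."""
    graph = {t: set(meta["depends_on"]) for t, meta in TABLES.items()}
    return list(TopologicalSorter(graph).static_order())

def impacted_by(changed: str) -> set[str]:
    """Change-analysis: find all downstream tables affected by a change."""
    impacted: set[str] = set()
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for t, meta in TABLES.items():
            if current in meta["depends_on"] and t not in impacted:
                impacted.add(t)
                frontier.append(t)
    return impacted

if __name__ == "__main__":
    print(generate_ddl("dw_customer_sales"))
    print("Load order:", load_order())
    print("Impacted by stg_order change:", impacted_by("stg_order"))
```

The point of the sketch is the single source of metadata: DDL, execution order, and impact analysis are all derived from one declaration, so a schema change is made in one place and propagated everywhere, which is what enables agility and robustness to coexist.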
When the data warehouse delivers on business requirements not only the first time but every time, the solution becomes more valuable and yields a higher return on investment. We also see that successful data delivery almost always inspires new use cases and innovation, whether using data in new ways or adding more data to the warehouse or platform, increasing the value of the solution even further. In this way, automation creates a positive spiral of data utilization.
Elevate your ambitions
The box of ‘Limited Resources’ above encompasses, in practice, the factors that determine an organization's ambitions for its data warehouse, data platform, and digitalization efforts. How these factors are utilized is up to each organization. We encourage every organization to investigate and challenge its design principles and the trade-offs made within the solution.
At BI Builders we have extensive experience from numerous data warehouse projects. In our experience, at least 80% of the business value of a data warehouse is delivered over time through change rather than based on the original requirements. Thus, adopting agile principles to efficiently handle changes while ensuring enduring productivity, robustness, and minimized risks is key to success with a data warehouse.
From our experience, a data warehouse automation tool will let you elevate your ambitions for your data initiatives, enabling data consumption and analysis in a wider organizational scope.