Data Mesh vs Azure – Theory vs practice
Use the tag Data Mesh vs Azure to follow this blog series.
As a reminder, the four data mesh principals:
- domain-oriented decentralised data ownership and architecture.
- data as a product.
- self-serve data infrastructure as a platform.
- federated computational governance.
Source reference: https://martinfowler.com/articles/data-mesh-principles.html
A Quick Recap
In Part 1, Part 2 and Part 3 of this blog series we defined our Azure data mesh nodes and edges. With the current conclusion that Azure Resource Groups can house our data product nodes within the mesh and for our edges (interfaces) we’ve established the following working definitions and potential Azure resources:
- Primary – data integration/exchange. Provided by a SQL Endpoint.
- Secondary – operational reporting and logging. Provided by Azure Log Analytics as a minimum.
- Tertiary – resource (network) connectivity. Delivered using Azure VNet Peering in a hub/spoke setup.
Then in Part 4 of the series we explored and concluded how we could template and control the deployment of our nodes using Azure DevOps, Azure Bicep and VS Code.
In part 5, we defined the hierarchy of data domains vs data products and aligned this thinking to Azure Subscriptions vs Azure Resource Groups.
In part 6, we explored the problem and current solution thinking regarding the multi plane components that could be used to deliver an analytical reporting model. With some questions left unanswered in terms of technology (at the moment).
For part 7 of this series, I want to explore what else could be delivered in our Azure Data Mesh if we continue our established thinking around the planes of interaction for our data products. As with part 6, we are still missing good Azure Resources that can deployed for certain situations. However, I want to frontload some concepts now, so we are ready if/when a suitable technical answer arrives in the cloud.
This post is therefore going to break my mandate slightly of the theory vs the practice because when we come to implementation, for what I am proposing there isn’t currently a packaged product offering.
What are you talking about Paul?!
Ok. This is what I am thinking about and why I needed a blog post to organise my thoughts. And as before, wrong answers are going to be part of the learning here as we figure these things out.
Introducing the SaaS Plane
In my current data mesh logical view, we have the IaaS plane, and the PaaS plane. We explored these in the context of our data product interfaces in part 3 of the Data Mesh vs Azure blog series. All fine, and I stand by those conclusions, (in my opinion) this remains a good logical separation of the resources we deploy in our potential Azure data mesh architecture.
What I now want to draw is the next logical plane of interaction for this technology stack. The SaaS plane.
To reflect, our data mesh theory from martinfowler.com talks about:
- A data infrastructure plane
- A data product developer experience plane
- A mesh supervision plane
A like this, but we need to make it real.
In my version, we align at the infrastructure level (to a degree). We align regarding the data interfaces delivered by the ‘data product developer experience’, simplified to a SQL endpoint in my Azure implementation. However, the ‘mesh supervision plane’, remains a little fluffy and slightly abstract. What is it for and how should we use it?
Let us start with an update to the picture, adding the SaaS plane:
Note: the icon I’ve used in the SaaS plane has been borrowed from the old Power BI Visio stencil. It seemed like a good fit for now. Don’t read too much into it.
This drawing hopefully helps with our understanding and what I’ve been banging on about and how those planes of my delivered data mesh architecture fit. Of course, drawing them on the right-hand side doesn’t really work when I already have the PaaS and IaaS planes featured in the main stack. So, to improve accuracy this could be drawn like the below picture. To fully inform the intent of the SaaS plane (20% image transparency to save the day).
A SaaS plane that correctly sits on top of, and potentially across all other Azure data mesh Resources. A plane that abstracts the rich de-centralised mesh eco-system to a level that doesn’t require specialised engineering skills.
SaaS Plane Capabilities
Inspired by how Microsoft talk about Azure Synapse Analytics as the single pane of glass, or the unified experience for analytics, I think we could use this to inform what our SaaS plane brings in terms of capabilities.
A Synapse single pane of glass to a data mesh single plane user experience, if we want to play with words 🙂
Anyway, lets continue, making it real. What should the SaaS plane have, I’m going to start with the following three capabilities:
- A Mesh Marketplace – building on our catalogue of assets introduced in part 4 of the series. Let’s now extend this potential “user frontend” to include our data assets, our models, information about the contents of our data products. Even data lineage and entity diagrams. All things that you might see via Azure Purview or another governance tool. But going beyond the initial technical offerings and making this a landing page to search our entire data mesh architecture, enriched with a knowledge graph for navigation.
- A Self-Service Analytics Canvas – building on the thinking from part 5 of the series. Data exploration needs to be easy. Sure, via SQL endpoints from the PaaS plane, but better. Afterall, writing SQL could be considered a specialised skill so we need something else. For me, this should be natural language. An environment with enough metadata and business logic baked it that allows users to answer questions articulated through speech. The technology is emerging to deliver this, but, hasn’t really taken off in data analytics yet. I think mainly because we have/are still busy handling the ‘big data’ inputs and haven’t got to a state of data outputs. Other than creating dashboards.
- An Onboarding Framework – lets continue the thinking regarding the delivery of data products as a service. The SaaS plane isn’t just about our architecture outputs, it should be used to onboard new data products as well. From the marketplace, why not allow new data mesh nodes to be provisioned into a chosen domain. This means we can rapidly scale and apply all of our mesh product governance at deployment time. A data product defined by a framework and established as a series of drop-down menu’s used to set configuration. Data product (software) as a service.
Because we like pictures…
As always friends, thoughts and debate welcome here.
SaaS Plane Initial Conclusions
If the ultimate goals of the data mesh is to scale and reduce our reliance on technical skills. The only way I currently see us getting there is by reaching a SaaS plane state of deliverables.
The highly technical skills build everything we need underneath and even the SaaS plane frontend itself. But once done, velocity and time to insight will be greatly increased alongside the ability to scale the mesh to all areas of the business in a highly connected platform.
Sounds great, but I’m not here just to blow smoke. I will be building this too and putting it into practice. Technology choices for the SaaS plane to be confirmed. Concepts front loaded.
Many thanks for reading.