Data mesh promises domain autonomy: each team owns, publishes, and operates its data products as first-class API surfaces. The principle works cleanly in greenfield analytics platforms. It fractures the moment a domain data product contains PHI, personal data covered by GDPR, or financial records subject to SOX. The problem is not that data mesh is wrong. The problem is that compliance obligations are cross-cutting concerns that the domain ownership model does not natively handle.
Where the Domain Model Breaks
Consider a healthcare organisation implementing data mesh. The clinical domain owns patient encounter records. The billing domain processes those records to generate claims. The analytics domain queries both to produce population health reports for CMS. HIPAA's minimum necessary standard (45 CFR §164.502(b)) requires that uses, disclosures, and requests of PHI be limited to the minimum necessary to accomplish the intended purpose. In a pure data mesh model, where domain data products are self-describing and independently queryable, enforcing minimum necessary across domain boundaries requires policy infrastructure the mesh doesn't provide by default.
The same pattern repeats in financial services. A trading domain data product may contain individual transaction records that the risk domain aggregates and the compliance domain audits. SOX Section 404 requires management to assess, and auditors to attest to, internal control over financial reporting, which in practice means documenting the controls over financial reporting data end-to-end. A data mesh where each domain defines its own access controls produces a fragmented control environment that makes SOX ITGC documentation nearly impossible without a centralised policy layer.
The core architectural insight: in regulated industries, data mesh requires a compliance domain that is orthogonal to — and has authority over — all other domains. This compliance domain does not own data products. It owns policies, enforces them at the mesh fabric level, and produces the audit evidence that regulators require.
The Compliance Domain Architecture
Implementing a compliance domain in data mesh requires three components: a policy decision point (PDP) that all domain data product access goes through, an attribute-based access control (ABAC) model that encodes regulatory requirements as machine-readable policies, and a centralised audit log bus that every domain data product writes to on every access. The PDP must understand regulatory context — not just "is this user authorised" but "does this access satisfy minimum necessary for the stated purpose." This is not a standard IAM function. It requires custom policy logic.
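A minimal sketch of the purpose-aware decision described above may help make it concrete. It is written in plain Python rather than a real policy engine, and every name in it (`MINIMUM_NECESSARY`, `AccessRequest`, `decide`, the purposes and field names) is a hypothetical illustration, not part of any product or regulation. The in-memory `AUDIT_LOG` list stands in for the centralised audit log bus.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical minimum-necessary scopes, keyed by (data product, stated purpose).
# In practice these would be maintained by the compliance domain.
MINIMUM_NECESSARY = {
    ("clinical.encounters", "claims_processing"): {
        "patient_id", "encounter_date", "procedure_code",
    },
    ("clinical.encounters", "population_health_reporting"): {
        "encounter_date", "diagnosis_code",
    },
}


@dataclass(frozen=True)
class AccessRequest:
    subject: str          # who is asking
    data_product: str     # which domain data product
    purpose: str          # stated purpose-of-access claim
    fields: frozenset     # fields requested


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str
    excess_fields: frozenset = frozenset()


# Stand-in for the centralised audit log bus (assumption: append-only).
AUDIT_LOG: list = []


def decide(request: AccessRequest) -> Decision:
    """Allow only if every requested field is inside the scope for the purpose."""
    scope = MINIMUM_NECESSARY.get((request.data_product, request.purpose))
    if scope is None:
        decision = Decision(False, "no minimum-necessary scope defined for this purpose")
    elif request.fields - scope:
        decision = Decision(False, "request exceeds minimum necessary",
                            request.fields - scope)
    else:
        decision = Decision(True, "within minimum necessary")

    # Every decision, allow or deny, produces audit evidence.
    AUDIT_LOG.append({
        "at": datetime.now(timezone.utc).isoformat(),
        "subject": request.subject,
        "data_product": request.data_product,
        "purpose": request.purpose,
        "fields": sorted(request.fields),
        "allowed": decision.allowed,
        "reason": decision.reason,
    })
    return decision
```

The point of the sketch is the shape of the question: the decision depends on the stated purpose and the requested fields, not only on the identity of the caller, and denials are logged alongside approvals.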
The ABAC model must encode regulation-specific attributes. For HIPAA: PHI classification labels on every field, purpose-of-access claims in every request, minimum-necessary scope definitions per use case. For GDPR Article 5: data minimisation constraints, purpose limitation enforcement, and retention period metadata. For PCI DSS: cardholder data environment (CDE) boundary enforcement. These attributes must be maintained as data products evolve — which means compliance domain ownership of the data catalogue schema, not just data product schemas.
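One way to picture these attributes is as per-field metadata in the catalogue, with a check that flags untagged or incompletely tagged fields. The sketch below is an assumption-laden illustration: the `FieldTag` structure, the classification labels, and `catalogue_violations` are all hypothetical names, not the schema of Apache Atlas, DataHub, or any other catalogue.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical classification labels that trigger regulatory handling.
REGULATED = {"PHI", "PERSONAL_DATA", "CARDHOLDER_DATA"}


@dataclass(frozen=True)
class FieldTag:
    name: str
    classification: Optional[str] = None   # None means "not yet tagged"
    allowed_purposes: tuple = ()            # purpose limitation
    retention_days: Optional[int] = None    # retention period metadata


def catalogue_violations(product: str, fields: list) -> list:
    """Return human-readable problems for one data product's catalogue entry."""
    problems = []
    for f in fields:
        if f.classification is None:
            problems.append(f"{product}.{f.name}: missing classification label")
            continue
        if f.classification in REGULATED:
            if not f.allowed_purposes:
                problems.append(f"{product}.{f.name}: regulated field has no allowed purposes")
            if f.retention_days is None:
                problems.append(f"{product}.{f.name}: regulated field has no retention period")
    return problems
```

Running a check like this whenever a data product is published or changed is one way to keep the attributes current as products evolve.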
Tooling That Supports Regulated Data Mesh
- Apache Atlas or DataHub for data catalogue with compliance attribute tagging
- Open Policy Agent (OPA) as the PDP — Rego policies can encode HIPAA minimum necessary logic, GDPR purpose limitation, and PCI CDE boundaries
- Immutable audit log infrastructure (e.g., Apache Kafka with an S3 sink writing to buckets protected by S3 Object Lock for WORM retention) attached to every domain data product API
- Data product schema governance requiring compliance domain sign-off on schema changes that affect regulated fields
- Automated PHI/PII discovery tooling (Presidio, AWS Comprehend Medical) to flag new regulated fields before they cross domain boundaries untagged
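The schema governance item in the list above can be sketched as a simple gate in a deployment pipeline. This is a hypothetical illustration, not a feature of any listed tool: the function name `regulated_changes`, the schema representation (field name mapped to a type string), and the field names are all assumptions. The `regulated` set is assumed to come from catalogue tags plus whatever the discovery tooling has flagged.

```python
def regulated_changes(old_schema: dict, new_schema: dict, regulated: set) -> list:
    """List (field, change) pairs that touch regulated fields.

    Schemas map field name -> type string. Changes to fields outside
    `regulated` are ignored, so domains keep autonomy over everything else.
    """
    changes = []
    for name in sorted(set(old_schema) | set(new_schema)):
        if name not in regulated:
            continue
        if name not in old_schema:
            changes.append((name, "added"))
        elif name not in new_schema:
            changes.append((name, "removed"))
        elif old_schema[name] != new_schema[name]:
            changes.append((name, "type_changed"))
    return changes


def requires_compliance_signoff(old_schema: dict, new_schema: dict, regulated: set) -> bool:
    """A schema change needs sign-off only if it affects a regulated field."""
    return bool(regulated_changes(old_schema, new_schema, regulated))
```

A gate of this kind leaves unregulated schema changes entirely with the owning domain and routes only the regulated subset to the compliance domain.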
The Cross-Domain Lineage Requirement
HIPAA, GDPR, and PCI DSS each impose obligations (audit controls, records of processing, and cardholder data flow documentation respectively) that in practice demand demonstrable data lineage: the ability to show where regulated data originated, where it flowed, and who accessed it. In a data mesh, lineage crosses domain boundaries and involves data products that may be versioned independently. The compliance domain must maintain a cross-domain lineage graph, not just per-domain lineage. OpenLineage (an open standard hosted by the LF AI & Data Foundation) provides an event-based lineage model that works with Spark, dbt, Airflow, and custom pipelines, and can be extended to tag events with regulatory metadata. Building OpenLineage integration into the data mesh fabric as a deployment requirement, not an afterthought, is what makes cross-domain lineage auditable.
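To illustrate what a cross-domain lineage graph buys the compliance domain, here is a small in-memory sketch. It does not use the OpenLineage client or event format; `LineageGraph` and the dataset names are hypothetical, and a real deployment would presumably build the graph from collected lineage events rather than manual `add_edge` calls.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Directed graph of data products, with regulatory tags on source datasets."""

    def __init__(self):
        self._downstream = defaultdict(set)
        self._upstream = defaultdict(set)
        self._tags = defaultdict(set)

    def tag(self, dataset: str, *regulations: str) -> None:
        self._tags[dataset].update(regulations)

    def add_edge(self, source: str, target: str) -> None:
        """Record that `target` is derived from `source`."""
        self._downstream[source].add(target)
        self._upstream[target].add(source)

    @staticmethod
    def _walk(start: str, edges: dict) -> set:
        # Breadth-first traversal; `seen` guards against cycles.
        seen, queue = set(), deque([start])
        while queue:
            node = queue.popleft()
            for nxt in edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        seen.discard(start)
        return seen

    def downstream_of(self, dataset: str) -> set:
        """Everywhere this dataset's contents may have flowed."""
        return self._walk(dataset, self._downstream)

    def upstream_of(self, dataset: str) -> set:
        """Every dataset this one was derived from, across domains."""
        return self._walk(dataset, self._upstream)

    def inherited_tags(self, dataset: str) -> set:
        """Own tags plus tags of all ancestors, regardless of owning domain."""
        tags = set(self._tags.get(dataset, ()))
        for ancestor in self.upstream_of(dataset):
            tags |= self._tags.get(ancestor, set())
        return tags
```

With edges such as `clinical.encounters` to `billing.claims` to `analytics.population_health`, a tag applied once in the clinical domain is visible on the analytics product two domains away, which is the property per-domain lineage cannot provide.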