Feature Store
A centralized data platform that manages the creation, storage, sharing, and serving of ML features for both model training and online inference.
A Feature Store is a specialized data system that sits between raw data sources and machine learning models, managing the engineering, storage, retrieval, and governance of features — the transformed and aggregated data attributes used to train and serve ML models. Without a feature store, data scientists in different teams repeatedly engineer the same features from raw data, leading to duplicated work, inconsistent feature definitions, and dangerous discrepancies between features computed at training time and those computed at inference time — a class of errors known as training-serving skew.
Feature stores address two fundamentally different serving patterns. The offline store holds historical feature values used for model training and batch inference. It is typically implemented on a data warehouse or data lake — BigQuery, Snowflake, S3-backed Parquet — and optimized for high-throughput retrieval of large feature datasets keyed by entity and time. The online store holds current feature values for low-latency real-time inference, implemented on key-value stores such as Redis, DynamoDB, or Cassandra, and optimized for single-entity lookups measured in milliseconds. A feature store coordinates writes to both stores, ensuring consistency, and provides a unified API for reading features regardless of whether the request is for training data or real-time serving.
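The dual-store pattern above can be sketched in a few lines. This is a toy in-memory model, not a real feature store API: a single write path appends every value to an offline (historical) store and keeps the online store at the latest value, so the two read paths stay consistent. All class and method names here are illustrative.

```python
from collections import defaultdict
from datetime import datetime

class MiniFeatureStore:
    """Toy sketch of the dual-store pattern: one write path feeds both
    an append-only offline store (training) and a latest-value online
    store (serving). Names are illustrative, not a real product API."""

    def __init__(self):
        self.offline = defaultdict(list)  # entity_id -> [(ts, features), ...]
        self.online = {}                  # entity_id -> (ts, latest features)

    def write(self, entity_id, features, ts):
        # A single write path keeps both stores consistent.
        self.offline[entity_id].append((ts, features))
        if entity_id not in self.online or ts >= self.online[entity_id][0]:
            self.online[entity_id] = (ts, features)

    def get_online_features(self, entity_id):
        # Single-entity, low-latency lookup for real-time inference.
        return self.online[entity_id][1]

    def get_historical_features(self, entity_id, as_of):
        # Point-in-time correct retrieval for building training datasets:
        # return the feature values as they stood at `as_of`.
        rows = [f for ts, f in sorted(self.offline[entity_id]) if ts <= as_of]
        return rows[-1] if rows else None

store = MiniFeatureStore()
store.write("cust_1", {"avg_txn_30d": 42.0}, datetime(2024, 1, 1))
store.write("cust_1", {"avg_txn_30d": 55.5}, datetime(2024, 2, 1))
serving_view = store.get_online_features("cust_1")                        # latest values
training_view = store.get_historical_features("cust_1", datetime(2024, 1, 15))  # as of Jan 15
```

Production systems implement the two stores on very different backends (warehouse vs. key-value store), but the contract is the same: one write path, two read paths.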
Feature reuse is one of the most significant value propositions of a feature store. When a feature — say, a customer's average transaction value over the last 30 days — is engineered once, validated, and published to the feature store, every team building models that could benefit from that feature can discover and use it without re-engineering it. This not only reduces work but also ensures that all models using the same feature are computing it identically, eliminating a source of model discrepancy that can be difficult to debug. Feature catalogs with search, documentation, lineage, and ownership metadata make this discovery and reuse practical at scale.
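A minimal sketch of the catalog idea: a feature is registered once, with its metadata and its single shared transformation, and any team retrieves that definition instead of re-deriving it. The registry structure, field names, and owner address below are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class FeatureDefinition:
    # Catalog metadata makes the feature discoverable and auditable.
    name: str
    description: str
    owner: str
    transform: Callable  # the one validated computation everyone shares

REGISTRY = {}  # hypothetical in-process stand-in for a feature catalog

def register(feature: FeatureDefinition):
    REGISTRY[feature.name] = feature

def avg_txn(txns: List[float]) -> float:
    """Single shared implementation (here simplified to a plain mean)."""
    return sum(txns) / len(txns) if txns else 0.0

register(FeatureDefinition(
    name="customer_avg_txn_30d",
    description="Mean transaction value over the trailing 30 days",
    owner="payments-ml@example.com",  # illustrative owner
    transform=avg_txn,
))

# Two different teams look up the same definition, so their models
# compute the feature identically instead of each re-engineering it.
feature = REGISTRY["customer_avg_txn_30d"]
value = feature.transform([10.0, 20.0, 30.0])
```

The point of the indirection is that the transformation lives in exactly one place; changing or fixing it propagates to every consuming model.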
From a compliance standpoint, feature stores provide critical infrastructure for model explainability and auditability. When a regulatory body or internal audit team investigates a model decision, the feature store's historical log can reconstruct exactly what feature values were presented to the model at the time of inference. This point-in-time feature reconstruction is essential for meeting requirements under GDPR's right to explanation, the EU AI Act's transparency provisions, and financial services regulations requiring explainable credit and underwriting decisions. Feature lineage — the chain of transformations from raw data to model-ready feature — also helps demonstrate compliance with data minimization principles, showing that only necessary data is used.
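Point-in-time reconstruction reduces to an as-of lookup over the historical log: for each feature, find the last recorded value with a timestamp at or before the decision. The log layout and timestamps below are invented for illustration; real feature stores persist this history in the offline store.

```python
import bisect

# Append-only historical log, keyed by (entity, feature_name), with each
# series sorted by timestamp. Structure and values are illustrative only.
log = {
    ("cust_1", "credit_utilization"): [(100, 0.42), (200, 0.55), (300, 0.61)],
    ("cust_1", "avg_txn_30d"):        [(150, 38.0), (250, 44.5)],
}

def feature_as_of(entity, feature, decision_ts):
    """Return the value the model would have seen at decision_ts."""
    series = log[(entity, feature)]
    timestamps = [ts for ts, _ in series]
    # Last entry with ts <= decision_ts, or None if none existed yet.
    i = bisect.bisect_right(timestamps, decision_ts) - 1
    return series[i][1] if i >= 0 else None

# Reconstruct the full feature vector behind a decision made at ts=260,
# e.g. to answer an audit or GDPR inquiry about that specific output.
decision_features = {
    name: feature_as_of("cust_1", name, 260)
    for (_, name) in log
}
```

Here `decision_features` recovers `credit_utilization` as 0.55 (written at ts=200, before the later 0.61 update) and `avg_txn_30d` as 44.5, which is exactly the evidence an auditor needs: the inputs as of the decision, not as of today.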
Compliance-Native Architecture Guide
Design principles and a structured checklist for building software that is compliant by default — not compliant by retrofit. Covers data architecture, access controls, audit trails, and vendor due diligence.