Sharan Initiatives — AI, Finance, Photography & More

The Problem That Feature Stores Solve

Without a feature store, every team recomputes the same features independently. The fraud team computes 'transactions in last 30 days'. The recommendations team computes 'purchases in last 30 days'. These are essentially the same feature, but they are computed separately, stored separately, and potentially defined inconsistently (for example, with different window boundaries or filters). A feature store centralizes feature computation, storage, and serving: one team defines the feature, and everyone else reads it from the store. More importantly, because training and serving both read from the same feature definitions, training-time and serving-time features are computed the same way, which removes a major source of training-serving skew.
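To make the skew argument concrete, here is a minimal sketch in plain Python, not tied to any feature store product. The function name `transactions_last_30d` and the sample timestamps are hypothetical; the point is only that training and serving call one shared definition instead of two separately written ones.

```python
from datetime import datetime, timedelta


def transactions_last_30d(timestamps, as_of):
    """Count transactions in the 30 days up to and including `as_of`."""
    window_start = as_of - timedelta(days=30)
    return sum(1 for ts in timestamps if window_start < ts <= as_of)


# Hypothetical transaction history for one user.
history = [
    datetime(2024, 1, 2),
    datetime(2024, 1, 20),
    datetime(2024, 2, 10),
    datetime(2024, 3, 1),
]

# Training: compute the feature as of a past label timestamp,
# so no transactions from after that moment leak into the value.
train_value = transactions_last_30d(history, as_of=datetime(2024, 2, 15))

# Serving: compute the feature as of request time with the same function.
serve_value = transactions_last_30d(history, as_of=datetime(2024, 3, 20))

print(train_value, serve_value)
```

If the fraud and recommendations teams each wrote their own version of this function, a small difference such as `<` versus `<=` on the window boundary would be enough to make the two values disagree.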

Core concepts

- **Feature view**: A logical group of features from one data source (e.g., user_features from the users table)
- **Online store**: Low-latency key-value store (Redis, DynamoDB) for serving predictions in real time
- **Offline store**: Column-oriented store (Parquet, BigQuery) for training data retrieval
- **Materialization**: The process of computing features and loading them into the stores
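The four concepts above can be sketched as a toy in-memory store. This is an illustration only, not the API of any real feature store: `FeatureView`, `ToyFeatureStore`, and the `txn_count_30d` feature are hypothetical names, the offline store is a plain list, and the online store is a dict.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureView:
    name: str          # logical group, e.g. "user_features"
    entity_key: str    # column identifying the entity, e.g. "user_id"
    features: list     # feature column names in this view


@dataclass
class ToyFeatureStore:
    offline: list = field(default_factory=list)  # full history, used for training
    online: dict = field(default_factory=dict)   # latest values, used for serving

    def ingest(self, view, rows):
        """Append historical rows for a feature view to the offline store."""
        for row in rows:
            self.offline.append({"view": view.name, **row})

    def materialize(self, view):
        """Copy the most recent values per entity into the online store."""
        rows = [r for r in self.offline if r["view"] == view.name]
        for row in sorted(rows, key=lambda r: r["event_time"]):
            key = (view.name, row[view.entity_key])
            self.online[key] = {f: row[f] for f in view.features}

    def get_online_features(self, view, entity_id):
        """Key-value lookup, as a serving path would do at prediction time."""
        return self.online.get((view.name, entity_id))


user_features = FeatureView("user_features", "user_id", ["txn_count_30d"])
store = ToyFeatureStore()
store.ingest(user_features, [
    {"user_id": 1, "event_time": 1, "txn_count_30d": 4},
    {"user_id": 1, "event_time": 2, "txn_count_30d": 7},
    {"user_id": 2, "event_time": 1, "txn_count_30d": 0},
])
store.materialize(user_features)
latest = store.get_online_features(user_features, 1)
print(latest)
```

The offline list keeps every row so that training can reconstruct past values, while the online dict keeps only the newest row per entity so that a lookup is a single key access.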

Model Registry with MLflow

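MLflow's model registry tracks registered models, their versions, and each version's stage transitions (for example, Staging → Production). Experiment runs, parameters, and metrics are recorded through MLflow's tracking API, and a model logged in a run can be registered from there.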

```python
import mlflow
import mlflow.sklearn

mlflow.set_tracking_uri("http://mlflow-server:5000")
mlflow.set_experiment("fraud-detection-v2")

with mlflow.start_run():
    # Train your model (train_model, X_train, and y_train are placeholders
    # for your own training code and data)
    model = train_model(X_train, y_train)

    # Log parameters and metrics
    mlflow.log_params({"model_type": "xgboost", "n_estimators": 200})
    mlflow.log_metrics({"auc": 0.934, "f1": 0.891})

    # Log the model artifact and register it as a new version of "fraud-detector"
    mlflow.sklearn.log_model(
        model,
        artifact_path="model",
        registered_model_name="fraud-detector",
    )

# Transition to production (after human review).
# The version number is illustrative; use the version created by the run above.
# Note: recent MLflow releases deprecate stages in favor of model version aliases.
client = mlflow.MlflowClient()
client.transition_model_version_stage(
    name="fraud-detector",
    version=3,
    stage="Production",
)
```
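To show the bookkeeping a registry performs without needing a tracking server, here is a toy sketch in plain Python. It is not MLflow's implementation: `ToyModelRegistry` and its methods are hypothetical, and archiving the previous Production version on promotion is an assumption made for this example (in MLflow that behavior is opt-in).

```python
class ToyModelRegistry:
    """Tracks version numbers and stages per registered model name."""

    def __init__(self):
        self._versions = {}  # model name -> list of {"version", "stage"}

    def register(self, name):
        """Create the next version for `name`; new versions start in stage "None"."""
        versions = self._versions.setdefault(name, [])
        entry = {"version": len(versions) + 1, "stage": "None"}
        versions.append(entry)
        return entry["version"]

    def transition(self, name, version, stage, archive_existing=True):
        """Move one version to `stage`, optionally archiving the current holder."""
        entries = self._versions[name]
        target = next(e for e in entries if e["version"] == version)
        if archive_existing:
            for entry in entries:
                if entry is not target and entry["stage"] == stage:
                    entry["stage"] = "Archived"
        target["stage"] = stage

    def versions_in_stage(self, name, stage):
        return [e["version"] for e in self._versions[name] if e["stage"] == stage]


registry = ToyModelRegistry()
for _ in range(3):
    registry.register("fraud-detector")

registry.transition("fraud-detector", version=2, stage="Production")
registry.transition("fraud-detector", version=3, stage="Staging")
# After review, promote version 3; version 2 is archived in this sketch.
registry.transition("fraud-detector", version=3, stage="Production")

production = registry.versions_in_stage("fraud-detector", "Production")
archived = registry.versions_in_stage("fraud-detector", "Archived")
print(production, archived)
```

Keeping exactly one version in Production makes it unambiguous which model a serving process should load by stage name.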
