Sharan Initiatives — AI, Finance, Photography & More

Three Levels of Pipeline Testing

ML pipelines are harder to test than regular software because 'correct' is often probabilistic. You can still catch most bugs with three levels of tests:

Level 1 — Unit tests (fast, no data)

Test each transformation function in isolation with tiny synthetic DataFrames. Verify shapes, column names, dtypes, and that known inputs produce known outputs. These run in milliseconds and should be in your pre-commit hook.

Level 2 — Data validation tests (medium, sample data)

Use Great Expectations or Pandera to assert schema contracts: no unexpected nulls, values within expected ranges, no duplicate IDs, referential integrity. Run these on every pipeline execution.
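To make the contract concrete, here is a minimal, library-agnostic sketch in plain pandas of the checks such a schema would encode. It is not the Great Expectations or Pandera API. The function name `validate_transactions`, the `transaction_id` column, the `known_user_ids` lookup set, and the upper bound on `amount` are all illustrative assumptions.

```python
import pandas as pd


def validate_transactions(df: pd.DataFrame, known_user_ids: set) -> list:
    """Return a list of contract violations; an empty list means the data passed."""
    # Hypothetical schema: these column names are assumptions for illustration.
    required = ["transaction_id", "user_id", "amount"]
    missing = [col for col in required if col not in df.columns]
    if missing:
        return [f"missing columns: {missing}"]

    errors = []

    # No unexpected nulls in required columns.
    null_counts = df[required].isna().sum()
    for col, count in null_counts[null_counts > 0].items():
        errors.append(f"{col}: {count} null value(s)")

    # Values within the expected range (nulls are reported above, not here).
    # The upper bound of 1,000,000 is an assumed business limit.
    amount = df["amount"]
    out_of_range = amount.notna() & ~amount.between(0, 1_000_000)
    if out_of_range.any():
        errors.append(f"amount: {int(out_of_range.sum())} value(s) out of range")

    # No duplicate IDs.
    duplicates = df["transaction_id"].duplicated()
    if duplicates.any():
        errors.append(f"transaction_id: {int(duplicates.sum())} duplicate(s)")

    # Referential integrity: every user_id must exist in the reference set.
    unknown = ~df["user_id"].isin(list(known_user_ids))
    if unknown.any():
        errors.append(f"user_id: {int(unknown.sum())} unknown reference(s)")

    return errors
```

Calling this at the start of every pipeline run and failing when the returned list is non-empty gives the same fail-early behaviour as a schema library, which would express the same rules declaratively.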

Level 3 — Integration tests (slow, real pipeline run)

Run the full pipeline end-to-end on a small, representative slice of real data. Verify that the output has the expected shape and statistical properties (for example, mean and standard deviation within expected bounds).
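The statistical check at the end of an integration run might look like the sketch below. The helper name `check_output_stats` and its parameters are hypothetical, and the bounds you pass in would come from what you have observed on past healthy runs.

```python
import pandas as pd


def check_output_stats(output: pd.DataFrame, column: str,
                       mean_bounds: tuple, std_bounds: tuple,
                       min_rows: int) -> None:
    """Raise AssertionError if the pipeline output falls outside expected bounds."""
    # Shape check: the slice should not have been silently filtered away.
    assert len(output) >= min_rows, (
        f"expected at least {min_rows} rows, got {len(output)}"
    )

    # Statistical checks: compare summary statistics against known-good ranges.
    mean = output[column].mean()
    std = output[column].std()
    assert mean_bounds[0] <= mean <= mean_bounds[1], (
        f"{column} mean {mean:.3f} outside {mean_bounds}"
    )
    assert std_bounds[0] <= std <= std_bounds[1], (
        f"{column} std {std:.3f} outside {std_bounds}"
    )
```

In an integration test you would run the pipeline on the data slice and then call, for instance, `check_output_stats(result, "amount_log", (0.5, 1.5), (0.5, 1.5), min_rows=3)`. Bounds that are too tight make the test flaky, so it is usually safer to start wide and narrow them over time.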

Writing Pipeline Unit Tests with pytest

```python
import pandas as pd
import pytest
from your_pipeline import transform  # placeholder module name; import your own function


@pytest.fixture
def sample_df():
    # Tiny synthetic input, including the edge case amount == 0.
    return pd.DataFrame({
        "user_id": [1, 2, 3],
        "amount": [10.0, 0.0, 500.0],
        "timestamp": ["2024-01-01", "2024-01-02", "2024-01-03"],
    })


def test_transform_adds_expected_columns(sample_df):
    result = transform(sample_df)
    assert "amount_log" in result.columns
    assert "date" in result.columns


def test_transform_no_nulls(sample_df):
    # First .all() reduces each column, second .all() reduces across columns.
    result = transform(sample_df)
    assert result.notna().all().all()


def test_transform_amount_log_non_negative(sample_df):
    result = transform(sample_df)
    assert (result["amount_log"] >= 0).all()


def test_transform_rejects_negative_amounts():
    # Invalid input should fail loudly rather than produce NaN downstream.
    bad_df = pd.DataFrame({"user_id": [1], "amount": [-5.0], "timestamp": ["2024-01-01"]})
    with pytest.raises(AssertionError):
        transform(bad_df)
```
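The tests above import `transform` without showing it. For reference, here is one hypothetical implementation that would satisfy all four tests. The real function in your pipeline will differ; the use of `np.log1p` for `amount_log` and a parsed datetime for `date` are assumptions chosen to match what the tests check.

```python
import numpy as np
import pandas as pd


def transform(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical transform matching the behaviour the tests above expect."""
    # Fail fast on invalid input; the last test expects an AssertionError.
    assert (df["amount"] >= 0).all(), "amount must be non-negative"

    # Work on a copy so the caller's DataFrame is not mutated.
    out = df.copy()

    # log1p(0) == 0, so zero amounts stay finite and non-negative.
    out["amount_log"] = np.log1p(out["amount"])

    # Parse the ISO date strings and drop any time-of-day component.
    out["date"] = pd.to_datetime(out["timestamp"]).dt.normalize()
    return out
```

Note that `assert` statements are skipped when Python runs with the `-O` flag. If the pipeline may run that way, raising `ValueError` explicitly is more robust, and the last test would then use `pytest.raises(ValueError)` instead.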
