Practical techniques to reduce model bias using pre-, in-, and post-processing methods
Bias can be reduced at three points in the ML pipeline. Each has different trade-offs between effectiveness, implementation cost, and accuracy impact.
Pre-processing: resample, reweight, or transform the training data before fitting the model. Examples: oversample underrepresented groups, drop sensitive attributes and their close proxies, or reweight samples to equalize group representation. Advantage: model-agnostic, since any learner can consume the adjusted data. Disadvantage: may discard useful information.
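As a concrete illustration of reweighting, here is a minimal sketch of one common scheme, usually attributed to Kamiran and Calders, in which each sample gets the weight P(group) * P(label) / P(group, label). This is a specific choice swapped in for illustration, not necessarily the variant the text has in mind; the function name `reweighing_weights` and the toy data are hypothetical.

```python
from collections import Counter

def reweighing_weights(groups, labels):
    """Per-sample weight w(g, y) = P(g) * P(y) / P(g, y).

    Under these weights, group membership and label are statistically
    independent in the weighted training set.
    """
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return [
        (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]

# Toy data (made up): group "a" is mostly positive, group "b" mostly negative.
groups = ["a"] * 6 + ["b"] * 4
labels = [1, 1, 1, 1, 0, 0, 1, 0, 0, 0]
weights = reweighing_weights(groups, labels)

def weighted_positive_rate(group):
    """Weighted share of positive labels within one group."""
    num = sum(w for g, y, w in zip(groups, labels, weights) if g == group and y == 1)
    den = sum(w for g, w in zip(groups, weights) if g == group)
    return num / den

print(weighted_positive_rate("a"), weighted_positive_rate("b"))
```

Weights of this kind can typically be passed to an estimator through a `sample_weight` argument at fit time, which is what keeps the approach model-agnostic.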
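In-processing: add a fairness constraint or adversarial penalty to the training loss, so the model is optimized for accuracy and fairness jointly. Example: adversarial debiasing trains a classifier while simultaneously training an adversary to predict the sensitive attribute from the classifier's representations; the classifier is penalized whenever the adversary succeeds. Advantage: optimizes the fairness-accuracy trade-off directly. Disadvantage: requires modifying the training procedure, so it is model-specific.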
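The idea of a fairness term in the loss can be sketched with a much simpler stand-in than adversarial debiasing: logistic regression trained by gradient descent on log-loss plus a penalty on the squared gap in mean predicted score between two groups (a demographic-parity-style penalty). Everything here is an assumption made for illustration: the function names, the synthetic data, and the values of `lam`, `lr`, and `epochs`.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_penalized_logreg(X, y, s, lam, lr=0.1, epochs=3000):
    """Gradient descent on: mean log-loss + lam * gap**2,
    where gap = mean score of group 1 - mean score of group 0."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    g1 = s == 1
    n1, n0 = g1.sum(), (~g1).sum()
    for _ in range(epochs):
        p = sigmoid(X @ w + b)
        # Gradient of the mean log-loss with respect to the logits.
        grad_z = (p - y) / n
        # Gradient of the fairness penalty with respect to the logits.
        gap = p[g1].mean() - p[~g1].mean()
        dgap_dz = p * (1.0 - p) * np.where(g1, 1.0 / n1, -1.0 / n0)
        grad_z = grad_z + lam * 2.0 * gap * dgap_dz
        w -= lr * (X.T @ grad_z)
        b -= lr * grad_z.sum()
    return w, b

def mean_score_gap(X, s, w, b):
    """Absolute difference in mean predicted score between the two groups."""
    p = sigmoid(X @ w + b)
    return abs(p[s == 1].mean() - p[s == 0].mean())

# Synthetic data (made up): feature x1 is correlated with the group s.
rng = np.random.default_rng(0)
n = 2000
s = rng.integers(0, 2, size=n)
x1 = rng.normal(size=n) + s
x2 = rng.normal(size=n)
X = np.column_stack([x1, x2])
y = (x1 + x2 + 0.5 * rng.normal(size=n) > 0.5).astype(float)

w0, b0 = fit_penalized_logreg(X, y, s, lam=0.0)   # unconstrained baseline
w1, b1 = fit_penalized_logreg(X, y, s, lam=10.0)  # with fairness penalty
gap_base = mean_score_gap(X, s, w0, b0)
gap_fair = mean_score_gap(X, s, w1, b1)
print(f"Mean score gap without penalty: {gap_base:.3f}")
print(f"Mean score gap with penalty:    {gap_fair:.3f}")
```

Raising `lam` shrinks the gap further at the cost of a higher log-loss, which is the trade-off an in-processing method has to tune.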
Post-processing: adjust model outputs or decision thresholds after training. Example: use different classification thresholds for different groups to equalize false positive rates. Advantage: requires no retraining. Disadvantage: group membership must be known at serving time.
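from fairlearn.metrics import equalized_odds_difference
from fairlearn.postprocessing import ThresholdOptimizer
from sklearn.linear_model import LogisticRegression
# Assumes X_train, X_test, y_train, y_test and the sensitive-feature arrays
# S_train, S_test are already defined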
# Train base model (unconstrained)
base_model = LogisticRegression()
base_model.fit(X_train, y_train)
# Post-processing: optimize thresholds to satisfy equalized odds
mitigated_model = ThresholdOptimizer(
    estimator=base_model,
    constraints="equalized_odds",
    objective="accuracy_score",
    predict_method="predict_proba",
)
mitigated_model.fit(X_train, y_train, sensitive_features=S_train)
# Predictions now approximately satisfy equalized odds; the thresholds were
# fit on training data, so the gap on test data may not be exactly zero
y_pred_mitigated = mitigated_model.predict(X_test, sensitive_features=S_test)
# Equalized odds difference: the larger of the between-group TPR and FPR gaps (0 means parity)
eod_before = equalized_odds_difference(
    y_test, base_model.predict(X_test), sensitive_features=S_test
)
eod_after = equalized_odds_difference(
    y_test, y_pred_mitigated, sensitive_features=S_test
)
print(f"Equalized odds difference before: {eod_before:.3f}")
print(f"Equalized odds difference after: {eod_after:.3f}")
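To make the per-group threshold idea from the post-processing description concrete, here is a minimal hand-rolled sketch that picks one threshold per group so each group's false positive rate lands near a common target. It is a simplified illustration, not a description of what `ThresholdOptimizer` does internally; the helper names, the synthetic scores, and the 10% target are all assumptions. A real pipeline would choose the thresholds on held-out validation data rather than on the data used for evaluation.

```python
import numpy as np

def false_positive_rate(y_true, y_pred):
    """Fraction of true negatives that were predicted positive."""
    return float(y_pred[y_true == 0].mean())

def threshold_for_fpr(scores, y_true, target_fpr):
    """Threshold at the (1 - target_fpr) quantile of the negative-class scores."""
    return float(np.quantile(scores[y_true == 0], 1.0 - target_fpr))

# Synthetic data (made up): scores run systematically higher for group 1.
rng = np.random.default_rng(0)
n = 2000
group = rng.integers(0, 2, size=n)
y_true = rng.integers(0, 2, size=n)
scores = rng.normal(loc=0.3 + 0.3 * y_true + 0.15 * group, scale=0.1)

def fpr_gap(thresholds):
    """Absolute FPR difference between the groups for the given per-group thresholds."""
    rates = []
    for g in (0, 1):
        m = group == g
        y_pred = (scores[m] > thresholds[g]).astype(int)
        rates.append(false_positive_rate(y_true[m], y_pred))
    return abs(rates[0] - rates[1])

# One shared threshold versus one threshold per group.
gap_shared = fpr_gap({0: 0.5, 1: 0.5})
per_group = {
    g: threshold_for_fpr(scores[group == g], y_true[group == g], 0.10)
    for g in (0, 1)
}
gap_per_group = fpr_gap(per_group)
print(f"FPR gap with a shared threshold:    {gap_shared:.3f}")
print(f"FPR gap with per-group thresholds:  {gap_per_group:.3f}")
```

Note that applying `per_group` at prediction time requires knowing each individual's group, which is the serving-time disadvantage mentioned above.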