Sharan Initiatives — AI, Finance, Photography & More

The Four Core Fairness Definitions

There are dozens of proposed fairness metrics, but four are most commonly used in practice. Understanding what each measures — and what it doesn't — is essential.

Demographic parity (statistical parity)

P(Ŷ=1 | A=0) = P(Ŷ=1 | A=1)

The positive prediction rate should be equal across the groups defined by a protected attribute A (e.g., race, gender). It is easy to measure, but it ignores whether the base rates of the true outcome differ between groups. It is appropriate when you believe qualification rates should be equal across groups.
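To make the definition concrete, here is a minimal sketch in plain Python that computes the gap in positive prediction rates between groups. The function names and the toy data are assumptions made up for illustration; they do not come from any library.

```python
def selection_rate(y_pred, groups, group):
    """Fraction of positive predictions among members of `group`."""
    preds = [p for p, g in zip(y_pred, groups) if g == group]
    return sum(preds) / len(preds)


def demographic_parity_gap(y_pred, groups):
    """Largest difference in selection rate between any two groups."""
    rates = [selection_rate(y_pred, groups, g) for g in sorted(set(groups))]
    return max(rates) - min(rates)


# Toy data, invented for illustration: binary predictions and group labels.
dp_pred = [1, 0, 1, 1, 0, 0, 1, 0]
dp_groups = [0, 0, 0, 0, 1, 1, 1, 1]

# Group 0 is selected 3 times out of 4, group 1 once out of 4.
dp_gap = demographic_parity_gap(dp_pred, dp_groups)
print(dp_gap)  # 0.5
```

A gap of 0 means demographic parity holds exactly; in practice you would choose a tolerance rather than demand equality.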

Equalized odds

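P(Ŷ=1 | Y=1, A=0) = P(Ŷ=1 | Y=1, A=1) AND P(Ŷ=1 | Y=0, A=0) = P(Ŷ=1 | Y=0, A=1)

Both true positive rates and false positive rates should be equal across groups. Unlike demographic parity, this conditions on the true outcome Y, so it tolerates different positive prediction rates when base rates differ; it is a different requirement rather than a strictly stronger one. This is the error-rate standard ProPublica applied in its analysis of COMPAS.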
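The two conditions can be checked separately. The sketch below, again plain Python with hypothetical function names and invented toy data, computes the per-group true positive and false positive rates and reports the largest gap in each.

```python
def conditional_positive_rate(y_true, y_pred, groups, group, label):
    """P(pred=1 | true=label, A=group): TPR when label=1, FPR when label=0."""
    preds = [
        p for t, p, g in zip(y_true, y_pred, groups)
        if g == group and t == label
    ]
    return sum(preds) / len(preds)


def equalized_odds_gaps(y_true, y_pred, groups):
    """Return (TPR gap, FPR gap) across all groups."""
    names = sorted(set(groups))
    tprs = [conditional_positive_rate(y_true, y_pred, groups, g, 1) for g in names]
    fprs = [conditional_positive_rate(y_true, y_pred, groups, g, 0) for g in names]
    return max(tprs) - min(tprs), max(fprs) - min(fprs)


# Toy data, invented for illustration.
eo_true = [1, 1, 0, 0, 1, 1, 0, 0]
eo_pred = [1, 1, 1, 0, 1, 0, 0, 0]
eo_groups = [0, 0, 0, 0, 1, 1, 1, 1]

# Group 0: TPR 1.0, FPR 0.5. Group 1: TPR 0.5, FPR 0.0.
tpr_gap, fpr_gap = equalized_odds_gaps(eo_true, eo_pred, eo_groups)
print(tpr_gap, fpr_gap)  # 0.5 0.5
```

Equalized odds holds only when both gaps are (approximately) zero; a model can equalize one error rate while leaving the other unequal.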

Calibration

P(Y=1 | Ŷ=p, A=0) = P(Y=1 | Ŷ=p, A=1) = p

Here Ŷ denotes the predicted score rather than a binary decision. A prediction of 70% risk should mean a 70% probability of the outcome, regardless of group membership. This is the standard Northpointe applied in defending COMPAS.
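A rough way to inspect calibration by group is to compare each predicted score with the observed outcome rate among people who received that score. The sketch below groups by exact score value for simplicity, which is an assumption that only suits toy data; real scores are usually continuous and would be binned first. The function name and the data are hypothetical.

```python
from collections import defaultdict


def observed_rate_by_score(y_true, scores, groups):
    """Observed outcome rate for each (group, score) cell."""
    cells = defaultdict(list)
    for t, s, g in zip(y_true, scores, groups):
        cells[(g, s)].append(t)
    return {cell: sum(ts) / len(ts) for cell, ts in cells.items()}


# Toy data, invented for illustration: everyone receives a score of 0.5.
cal_true = [1, 0, 1, 0, 1, 1, 1, 0]
cal_scores = [0.5] * 8
cal_groups = [0, 0, 0, 0, 1, 1, 1, 1]

cal_table = observed_rate_by_score(cal_true, cal_scores, cal_groups)
for (group, score), rate in sorted(cal_table.items()):
    print(f"group={group} score={score} observed={rate}")
# group=0 score=0.5 observed=0.5
# group=1 score=0.5 observed=0.75
```

In this toy example the score is calibrated for group 0 (a 0.5 score matches a 0.5 outcome rate) but understates risk for group 1, so calibration across groups fails.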

Individual fairness

Similar individuals should receive similar predictions. Requires a similarity metric over individuals — which is itself contentious.
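One common way to formalize this is a Lipschitz condition: the difference between two individuals' predictions should be bounded by a constant times their distance under the chosen similarity metric. The sketch below checks that condition pairwise. The L1 distance, the constant of 1.0, the function names, and the data are all assumptions chosen for illustration; picking the metric is exactly the contentious part.

```python
from itertools import combinations


def l1_distance(x, y):
    """Hypothetical similarity metric: sum of absolute feature differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def lipschitz_violations(features, scores, distance, constant=1.0):
    """Pairs (i, j) whose score difference exceeds constant * distance."""
    violations = []
    for i, j in combinations(range(len(features)), 2):
        allowed = constant * distance(features[i], features[j])
        if abs(scores[i] - scores[j]) > allowed:
            violations.append((i, j))
    return violations


# Toy data, invented for illustration.
if_features = [(0.25, 0.5), (0.25, 0.75), (1.0, 0.0)]
if_scores = [0.25, 0.75, 0.5]

# Individuals 0 and 1 are 0.25 apart but their scores differ by 0.5.
if_violations = lipschitz_violations(if_features, if_scores, l1_distance)
print(if_violations)  # [(0, 1)]
```

The pairwise loop is quadratic in the number of individuals, so on a real dataset you would sample pairs or restrict the check to near neighbors.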

Measuring Fairness with Fairlearn

python