Sharan Initiatives — AI, Finance, Photography & More

The Tuning Ladder

It is easy to burn hundreds of random trials before landing on a good configuration. A better approach is to work up the tuning ladder: start cheap, get directional signal, then invest compute in the most promising region of the search space.
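The ladder can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not a recipe: `evaluate` is a hypothetical stand-in for a real training run, its `budget` argument mimics how short or subsampled runs give noisier scores, and the search space bounds and trial counts are made up for the example.

```python
import math
import random

def evaluate(config, budget, rng):
    """Hypothetical stand-in for a training run; higher is better.

    The optimum is placed at lr=0.05, depth=6. Smaller budgets add more
    noise, mimicking short runs or subsampled data.
    """
    lr_term = -(math.log10(config["lr"]) - math.log10(0.05)) ** 2
    depth_term = -0.02 * (config["depth"] - 6) ** 2
    noise = rng.gauss(0.0, 0.3 / budget)
    return lr_term + depth_term + noise

def sample(space, rng):
    """Draw one configuration: log-uniform lr, uniform integer depth."""
    lo, hi = space["lr"]
    lr = 10 ** rng.uniform(math.log10(lo), math.log10(hi))
    depth = rng.randint(*space["depth"])
    return {"lr": lr, "depth": depth}

def narrow(top_configs):
    """Shrink the space to the bounding box of the best configurations."""
    lrs = [c["lr"] for c in top_configs]
    depths = [c["depth"] for c in top_configs]
    return {"lr": (min(lrs), max(lrs)), "depth": (min(depths), max(depths))}

def tuning_ladder(seed=0):
    rng = random.Random(seed)
    space = {"lr": (1e-4, 1.0), "depth": (2, 12)}

    # Rung 1: many cheap, noisy trials for directional signal.
    cheap = [sample(space, rng) for _ in range(40)]
    cheap.sort(key=lambda c: evaluate(c, budget=1, rng=rng), reverse=True)
    promising = narrow(cheap[:8])

    # Rung 2: fewer, expensive trials inside the promising region only.
    scored = []
    for _ in range(10):
        config = sample(promising, rng)
        scored.append((evaluate(config, budget=10, rng=rng), config))
    best_score, best = max(scored, key=lambda pair: pair[0])
    return promising, best, best_score

promising, best, best_score = tuning_ladder()
print("Promising region:", promising)
print("Best config:", best, "score:", round(best_score, 3))
```

The second rung spends ten times the budget per trial, but only inside the region the cheap trials pointed to, which is the trade the ladder is meant to capture.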

Tuning methods ranked by efficiency

- **Grid search**: Exhaustive but exponentially expensive. Only for 1-2 hyperparameters with few options.
- **Random search**: 60-70% of grid search quality at 10% of the cost. Good default starting point.
- **Bayesian optimization**: Builds a probabilistic model of the objective and samples intelligently. 3-5x more efficient than random for expensive training runs.
- **Population-based training (PBT)**: Evolves multiple training runs in parallel. State-of-the-art for very long training jobs.
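To make the cost gap between the first two rungs concrete, here is a small sketch that counts grid configurations and draws a random sample about a tenth that size. The search space is hypothetical (five hyperparameters with four candidate values each), and the snippet only compares trial counts; it says nothing about the quality of the configurations found.

```python
import math
import random

# Hypothetical search space: five hyperparameters, four candidate values each.
grid = {
    "learning_rate": [0.001, 0.01, 0.1, 0.3],
    "max_depth": [3, 5, 7, 9],
    "n_estimators": [50, 100, 200, 400],
    "subsample": [0.5, 0.7, 0.85, 1.0],
    "min_samples_split": [2, 5, 10, 20],
}

def grid_size(grid):
    """Number of runs an exhaustive grid search needs: product of option counts."""
    return math.prod(len(values) for values in grid.values())

def random_configs(grid, n_trials, seed=0):
    """Draw n_trials configurations, picking each value independently at random."""
    rng = random.Random(seed)
    return [
        {name: rng.choice(values) for name, values in grid.items()}
        for _ in range(n_trials)
    ]

n_grid = grid_size(grid)
print(n_grid)  # 4**5 = 1024

# Random search with about 10% of the grid's budget.
trials = random_configs(grid, n_trials=n_grid // 10)
print(len(trials))  # 102
```

Adding a sixth hyperparameter with four options would quadruple the grid to 4096 runs, while the random search budget stays whatever you set it to.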

Bayesian Optimization with Optuna

Optuna is a widely used library for Bayesian hyperparameter optimization. It integrates with PyTorch, scikit-learn, XGBoost, and more.

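```python
import optuna
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Stand-in data so the snippet runs end to end; swap in your own training set.
X_train, y_train = make_classification(n_samples=1000, n_features=20, random_state=42)

def objective(trial):
    # Each trial samples one configuration from the search space below.
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 50, 500),
        # log=True samples the learning rate uniformly on a log scale.
        "learning_rate": trial.suggest_float("learning_rate", 1e-4, 0.3, log=True),
        "max_depth": trial.suggest_int("max_depth", 3, 10),
        "subsample": trial.suggest_float("subsample", 0.5, 1.0),
        "min_samples_split": trial.suggest_int("min_samples_split", 2, 20),
    }
    clf = GradientBoostingClassifier(**params, random_state=42)
    # Mean 3-fold cross-validated ROC AUC is the value Optuna maximizes.
    score = cross_val_score(clf, X_train, y_train, cv=3, scoring="roc_auc").mean()
    return score

study = optuna.create_study(direction="maximize")
# n_jobs=-1 runs trials in parallel threads, one per CPU core.
study.optimize(objective, n_trials=50, n_jobs=-1)

print(f"Best AUC: {study.best_value:.4f}")
print(f"Best params: {study.best_params}")
```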

Example output (abridged; exact values depend on the data and the random seed):

[I 2024-01-15 10:23:41] Trial 0 finished with value: 0.8834
[I 2024-01-15 10:23:42] Trial 1 finished with value: 0.8912
...
Best AUC: 0.9241
Best params: {'n_estimators': 287, 'learning_rate': 0.0412, 'max_depth': 6, ...}
