ā±ļø 75 min

Model Deployment Strategies

Deploy ML models to production environments

Introduction to Model Deployment

Moving from Jupyter notebooks to production requires careful planning and infrastructure.

**Deployment Pipeline:**
1. **Model Training**: Train and validate the model
2. **Model Serialization**: Save the model to disk
3. **API Development**: Create an endpoint to serve predictions
4. **Containerization**: Package the model and its dependencies
5. **Deployment**: Deploy to a cloud/server environment
6. **Monitoring**: Track performance and errors

**Deployment Patterns:**

**Batch Prediction**
- Process large datasets offline
- Scheduled jobs (daily, hourly)
- Use case: Recommendation systems, fraud detection

**Real-time Prediction**
- Low latency (<100ms)
- REST API or gRPC
- Use case: Chatbots, real-time pricing

**Edge Deployment**
- Run on device (mobile, IoT)
- Reduced latency, improved privacy
- Use case: Face recognition, voice assistants

**Streaming**
- Process continuous data streams
- Kafka, Kinesis
- Use case: Anomaly detection, monitoring
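The batch prediction pattern above can be sketched as a scheduled offline job: load a saved model, score a file of records, and write predictions back out. The file names and the trivial threshold "model" here are hypothetical placeholders; this is a minimal sketch of the job shape, not a production scorer:

```python
import json

# Stand-in "model": in a real job this would be loaded from disk
# (e.g. joblib.load("model.joblib")); here it is a simple threshold rule.
def predict(record):
    score = 0.7 * record["amount"] + 0.3 * record["num_items"]
    return {"id": record["id"], "score": score, "flagged": score > 50}

def run_batch_job(input_path, output_path):
    """Read one JSON record per line, score it, write one result per line."""
    with open(input_path) as src, open(output_path, "w") as dst:
        for line in src:
            result = predict(json.loads(line))
            dst.write(json.dumps(result) + "\n")

# Simulate one scheduled run over a small input file
with open("batch_input.jsonl", "w") as f:
    f.write(json.dumps({"id": 1, "amount": 80.0, "num_items": 2}) + "\n")
    f.write(json.dumps({"id": 2, "amount": 10.0, "num_items": 1}) + "\n")

run_batch_job("batch_input.jsonl", "batch_output.jsonl")

with open("batch_output.jsonl") as f:
    for line in f:
        print(line.strip())
```

In a real pipeline the same function would run under a scheduler (cron, Airflow) on the daily or hourly cadence described above.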

Model Serialization

Save and load trained models:

python
import torch
import torch.nn as nn
import pickle
import joblib
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
import json
import os

# 1. PyTorch Model Serialization
class SimpleNN(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super().__init__()
        self.network = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, output_dim)
        )
    
    def forward(self, x):
        return self.network(x)

# Create and save PyTorch model
model = SimpleNN(10, 64, 2)
model_path = "pytorch_model.pth"

# Save the entire model (ties the saved file to this exact class definition)
torch.save(model, model_path)
print(f"✓ Saved complete PyTorch model to {model_path}")

# Save state dict (recommended: smaller and robust to code refactoring)
torch.save(model.state_dict(), "pytorch_state_dict.pth")
print("✓ Saved PyTorch state dict")

# Load model (weights_only=False is needed to unpickle a full model object;
# since PyTorch 2.6, torch.load defaults to weights_only=True)
loaded_model = torch.load(model_path, weights_only=False)
loaded_model.eval()
print("✓ Loaded PyTorch model")

# 2. Scikit-learn Model Serialization
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
sklearn_model = LogisticRegression()
sklearn_model.fit(X, y)

# Save with joblib (preferred for sklearn; efficient with large numpy arrays)
joblib.dump(sklearn_model, "sklearn_model.joblib")
print("\n✓ Saved sklearn model with joblib")

# Save with pickle
with open("sklearn_model.pkl", "wb") as f:
    pickle.dump(sklearn_model, f)
print("✓ Saved sklearn model with pickle")

# Load model
loaded_sklearn = joblib.load("sklearn_model.joblib")
print("✓ Loaded sklearn model")

# 3. Save Model Metadata
metadata = {
    "model_type": "LogisticRegression",
    "features": ["feature_" + str(i) for i in range(10)],
    "training_date": "2026-01-18",
    "accuracy": 0.95,
    "version": "1.0.0",
    "preprocessing": {
        "scaler": "StandardScaler",
        "missing_value_strategy": "mean_imputation"
    }
}

with open("model_metadata.json", "w") as f:
    json.dump(metadata, f, indent=2)
print(f"\nāœ“ Saved model metadata")

# 4. Model Versioning Best Practices
def save_versioned_model(model, version, base_dir="models"):
    """Save model with version control"""
    os.makedirs(base_dir, exist_ok=True)
    
    # Save model
    model_path = f"{base_dir}/model_v{version}.joblib"
    joblib.dump(model, model_path)
    
    # Save metadata (timestamp and metrics are hard-coded for illustration;
    # a real pipeline would record datetime.now().isoformat() and measured metrics)
    metadata = {
        "version": version,
        "timestamp": "2026-01-18T10:30:00",
        "model_path": model_path,
        "metrics": {"accuracy": 0.95, "f1": 0.93}
    }
    
    metadata_path = f"{base_dir}/model_v{version}_metadata.json"
    with open(metadata_path, "w") as f:
        json.dump(metadata, f, indent=2)
    
    return model_path, metadata_path

model_path, meta_path = save_versioned_model(sklearn_model, "1.2.0")
print("\n✓ Saved versioned model:")
print(f"  Model: {model_path}")
print(f"  Metadata: {meta_path}")

# 5. Model Comparison
print("\n--- Serialization Methods ---")
formats = {
    "PyTorch (.pth)": "Complete model or state dict",
    "Joblib": "Fast, sklearn optimized",
    "Pickle": "General Python, less efficient",
    "ONNX": "Cross-framework, production",
    "TorchScript": "PyTorch production deployment",
    "TensorFlow SavedModel": "TF serving format"
}

for format_name, description in formats.items():
    print(f"  {format_name}: {description}")

print("\n✓ Model serialization complete!")
Output:
✓ Saved complete PyTorch model to pytorch_model.pth
✓ Saved PyTorch state dict
✓ Loaded PyTorch model

✓ Saved sklearn model with joblib
✓ Saved sklearn model with pickle
✓ Loaded sklearn model

✓ Saved model metadata

✓ Saved versioned model:
  Model: models/model_v1.2.0.joblib
  Metadata: models/model_v1.2.0_metadata.json

--- Serialization Methods ---
  PyTorch (.pth): Complete model or state dict
  Joblib: Fast, sklearn optimized
  Pickle: General Python, less efficient
  ONNX: Cross-framework, production
  TorchScript: PyTorch production deployment
  TensorFlow SavedModel: TF serving format

✓ Model serialization complete!
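One practical extension of the versioning scheme above is recording a checksum of the saved artifact in the metadata, so a deployment can verify that the file it loads is the exact one that was validated. This is a minimal stdlib sketch; the file names are illustrative, and a real pipeline would hash the actual joblib or .pth file:

```python
import hashlib
import json

def file_sha256(path):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Pretend this is a serialized model artifact
with open("model_v1.2.0.bin", "wb") as f:
    f.write(b"serialized-model-bytes")

# Record the checksum alongside the other metadata
metadata = {"version": "1.2.0", "sha256": file_sha256("model_v1.2.0.bin")}
with open("model_v1.2.0_metadata.json", "w") as f:
    json.dump(metadata, f, indent=2)

# At load time, refuse to serve an artifact that does not match the recorded hash
with open("model_v1.2.0_metadata.json") as f:
    recorded = json.load(f)["sha256"]
assert file_sha256("model_v1.2.0.bin") == recorded
print("✓ Artifact checksum verified")
```

This guards against a stale or corrupted file being promoted to the serving environment.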

Cloud Deployment Options

**Major Cloud Platforms:**

**AWS SageMaker**
- Managed ML service
- Built-in algorithms and frameworks
- Auto-scaling endpoints
- Model monitoring

**Google Cloud AI Platform**
- Vertex AI for MLOps
- TensorFlow optimization
- AutoML capabilities
- BigQuery ML integration

**Azure ML**
- Designer for no-code ML
- MLflow integration
- Kubernetes deployment
- Real-time and batch inference

**Deployment Strategies:**

**Canary Deployment**
- Route a small % of traffic to the new model
- Monitor performance
- Gradually increase traffic
- Roll back if issues arise

**Blue-Green Deployment**
- Two identical environments
- Switch traffic between them
- Instant rollback capability
- Zero downtime

**Shadow Deployment**
- New model receives a copy of live traffic
- Its predictions are not served to users
- Compare with the production model
- Safe testing
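The canary pattern can be sketched as weighted routing in front of two model versions. The models here are stand-in functions and the 5% split is an arbitrary starting fraction; managed platforms implement the same idea at the load-balancer level:

```python
import random

def model_stable(features):
    # Stand-in for the current production model
    return {"prediction": 0, "model": "v1.0"}

def model_canary(features):
    # Stand-in for the new candidate model
    return {"prediction": 0, "model": "v1.1"}

def route_request(features, canary_fraction=0.05, rng=random):
    """Send canary_fraction of traffic to the new model, the rest to stable."""
    if rng.random() < canary_fraction:
        return model_canary(features)
    return model_stable(features)

# Simulate 10,000 requests and check the observed split
rng = random.Random(0)  # seeded for reproducibility
counts = {"v1.0": 0, "v1.1": 0}
for _ in range(10_000):
    counts[route_request([1.0, 2.0], canary_fraction=0.05, rng=rng)["model"]] += 1
print(counts)  # roughly a 95% / 5% split
```

Increasing `canary_fraction` in steps (5% → 25% → 100%) while monitoring metrics, and dropping it to 0 on regression, is the gradual-rollout/rollback behavior described above.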

Model Serving with TorchServe

Deploy PyTorch models with TorchServe:

python
# Model handler for TorchServe
import torch
import json
import logging
from ts.torch_handler.base_handler import BaseHandler

class CustomModelHandler(BaseHandler):
    """
    Custom handler for TorchServe
    Handles preprocessing, inference, and postprocessing
    """
    
    def initialize(self, context):
        """Load model and set up"""
        self.manifest = context.manifest
        properties = context.system_properties
        model_dir = properties.get("model_dir")
        
        # Load model (weights_only=False allows unpickling the full model object)
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        model_path = f"{model_dir}/model.pth"
        self.model = torch.load(model_path, map_location=self.device, weights_only=False)
        self.model.eval()
        
        # Load metadata
        with open(f"{model_dir}/metadata.json", "r") as f:
            self.metadata = json.load(f)
        
        self.initialized = True
        logging.info(f"Model loaded successfully from {model_dir}")
    
    def preprocess(self, data):
        """
        Transform raw input into model input
        data: List of dictionaries from requests
        """
        processed = []
        
        for row in data:
            # Assume input is JSON with 'features' key
            input_data = row.get("body", row)
            
            if isinstance(input_data, (bytes, bytearray)):
                input_data = json.loads(input_data)
            
            # Extract features
            features = input_data.get("features", [])
            
            # Convert to tensor
            tensor = torch.FloatTensor(features).unsqueeze(0)
            processed.append(tensor)
        
        return torch.cat(processed, dim=0).to(self.device)
    
    def inference(self, model_input):
        """
        Run inference on preprocessed data
        """
        with torch.no_grad():
            predictions = self.model(model_input)
            probabilities = torch.softmax(predictions, dim=1)
        
        return probabilities
    
    def postprocess(self, inference_output):
        """
        Transform model output to response format
        """
        results = []
        
        for probs in inference_output:
            # Get prediction and confidence
            confidence, predicted_class = torch.max(probs, dim=0)
            
            result = {
                "prediction": int(predicted_class),
                "confidence": float(confidence),
                "probabilities": probs.cpu().numpy().tolist(),
                "model_version": self.metadata.get("version", "unknown")
            }
            results.append(result)
        
        return results

# Example deployment configuration (config.properties)
config_properties = """
inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
metrics_address=http://0.0.0.0:8082
number_of_netty_threads=32
job_queue_size=1000
model_store=/models
model_snapshot={"name":"startup.cfg","modelCount":1,"models":{"custom_model":{"1.0":{"defaultVersion":true,"marName":"custom_model.mar","minWorkers":1,"maxWorkers":4}}}}
"""

print("TorchServe Model Handler")
print("=" * 50)

print(f"\nHandler Methods:")
print(f"  1. initialize(): Load model and setup")
print(f"  2. preprocess(): Transform input data")
print(f"  3. inference(): Run model prediction")
print(f"  4. postprocess(): Format response")

print(f"\nTorchServe Commands:")
print(f"  # Create model archive")
print(f"  torch-model-archiver --model-name custom_model \\")
print(f"    --version 1.0 \\")
print(f"    --model-file model.py \\")
print(f"    --serialized-file model.pth \\")
print(f"    --handler custom_handler.py")

print(f"\n  # Start TorchServe")
print(f"  torchserve --start \\")
print(f"    --model-store /models \\")
print(f"    --models custom_model=custom_model.mar")

print(f"\n  # Make prediction")
print(f"  curl -X POST http://localhost:8080/predictions/custom_model \\")
print(f"    -H 'Content-Type: application/json' \\")
print(f"    -d '{{\"features\": [1.0, 2.0, 3.0]}}'")

print(f"\nConfiguration:")
print(f"  - Inference port: 8080")
print(f"  - Management port: 8081")
print(f"  - Metrics port: 8082")
print(f"  - Workers: 1-4 (auto-scaling)")

print("\n✓ TorchServe deployment ready!")
Output:
TorchServe Model Handler
==================================================

Handler Methods:
  1. initialize(): Load model and setup
  2. preprocess(): Transform input data
  3. inference(): Run model prediction
  4. postprocess(): Format response

TorchServe Commands:
  # Create model archive
  torch-model-archiver --model-name custom_model \
    --version 1.0 \
    --model-file model.py \
    --serialized-file model.pth \
    --handler custom_handler.py

  # Start TorchServe
  torchserve --start \
    --model-store /models \
    --models custom_model=custom_model.mar

  # Make prediction
  curl -X POST http://localhost:8080/predictions/custom_model \
    -H 'Content-Type: application/json' \
    -d '{"features": [1.0, 2.0, 3.0]}'

Configuration:
  - Inference port: 8080
  - Management port: 8081
  - Metrics port: 8082
  - Workers: 1-4 (auto-scaling)

✓ TorchServe deployment ready!
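The curl call above can also be issued from Python. This sketch only constructs the request, since it assumes a TorchServe instance is listening on localhost:8080; uncomment the last lines to actually send it:

```python
import json
import urllib.request

def build_prediction_request(features, host="http://localhost:8080",
                             model_name="custom_model"):
    """Build a POST request matching the curl example."""
    payload = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        url=f"{host}/predictions/{model_name}",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_prediction_request([1.0, 2.0, 3.0])
print(req.full_url)       # http://localhost:8080/predictions/custom_model
print(req.data.decode())  # {"features": [1.0, 2.0, 3.0]}

# To call a running TorchServe instance:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))  # e.g. {"prediction": ..., "confidence": ...}
```

The response fields (`prediction`, `confidence`, `probabilities`, `model_version`) are whatever the handler's `postprocess()` returns.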