⏱️ 50 min

Perceptrons and Activation Functions

Understanding the building blocks of neural networks

The Perceptron

The perceptron is the simplest neural network unit. It takes multiple inputs, applies weights, adds a bias, and passes the result through an activation function to produce an output.

**Formula:** y = f(w·x + b), where f is the activation function, w the weight vector, x the input vector, and b the bias.
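As a quick illustration of the formula, here is a single forward pass with arbitrary example values for w, x, and b (these numbers are assumptions for illustration, not taken from the trained model below):

```python
import numpy as np

# Arbitrary illustrative values (not from the lesson's trained model)
w = np.array([0.5, 0.25])   # weights
x = np.array([1.0, 2.0])    # inputs
b = -0.5                    # bias

z = np.dot(w, x) + b        # weighted sum: 0.5*1.0 + 0.25*2.0 - 0.5 = 0.5
y = 1 if z >= 0 else 0      # step activation
print(z, y)                 # 0.5 1
```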

Implementing a Perceptron

Build from scratch:

```python
import numpy as np

class Perceptron:
    def __init__(self, n_inputs, learning_rate=0.1):
        self.w = np.random.randn(n_inputs) * 0.01
        self.b = 0
        self.lr = learning_rate

    def activation(self, z):
        # Step function
        return 1 if z >= 0 else 0

    def predict(self, x):
        z = np.dot(self.w, x) + self.b
        return self.activation(z)

    def train(self, X, y, epochs=10):
        for epoch in range(epochs):
            errors = 0
            for xi, yi in zip(X, y):
                prediction = self.predict(xi)
                error = yi - prediction

                # Update weights (perceptron learning rule)
                self.w += self.lr * error * xi
                self.b += self.lr * error

                errors += abs(error)

            print(f"Epoch {epoch + 1}, Errors: {errors}")
            if errors == 0:
                print("Converged!")
                break

# AND gate training data
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

# Train perceptron
perceptron = Perceptron(n_inputs=2)
perceptron.train(X, y)

# Test
print("\nTesting AND gate:")
for xi in X:
    print(f"Input: {xi} -> Output: {perceptron.predict(xi)}")
```

Output:

```
Epoch 1, Errors: 2
Epoch 2, Errors: 1
Epoch 3, Errors: 0
Converged!

Testing AND gate:
Input: [0 0] -> Output: 0
Input: [0 1] -> Output: 0
Input: [1 0] -> Output: 0
Input: [1 1] -> Output: 1
```
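Training succeeds here because AND is linearly separable: a single line w·x + b = 0 can split the four inputs into the two classes. A sketch with hand-picked weights that implement AND directly (one valid solution among many; the weights found by the training loop above may differ):

```python
import numpy as np

# Hand-picked weights implementing AND (an assumption for illustration)
w = np.array([1.0, 1.0])
b = -1.5  # only [1, 1] pushes the weighted sum above zero

for x in [[0, 0], [0, 1], [1, 0], [1, 1]]:
    z = np.dot(w, x) + b
    print(x, "->", int(z >= 0))  # 0, 0, 0, 1
```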

Activation Functions

Implement common activation functions:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def tanh(x):
    return np.tanh(x)

def relu(x):
    return np.maximum(0, x)

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)

def softmax(x):
    exp_x = np.exp(x - np.max(x))  # Stability trick
    return exp_x / exp_x.sum()

# Test activations
x = np.array([-2, -1, 0, 1, 2])
print("Input:", x)
print(f"\nSigmoid: {sigmoid(x)}")
print(f"Tanh: {tanh(x)}")
print(f"ReLU: {relu(x)}")
print(f"Leaky ReLU: {leaky_relu(x)}")

# Softmax for classification
logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
print(f"\nLogits: {logits}")
print(f"Softmax probabilities: {probs}")
print(f"Sum: {probs.sum()}")
```

Output:

```
Input: [-2 -1  0  1  2]

Sigmoid: [0.1192 0.2689 0.5000 0.7311 0.8808]
Tanh: [-0.9640 -0.7616  0.0000  0.7616  0.9640]
ReLU: [0 0 0 1 2]
Leaky ReLU: [-0.02 -0.01  0.00  1.00  2.00]

Logits: [2.  1.  0.1]
Softmax probabilities: [0.6590 0.2424 0.0986]
Sum: 1.0
```
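The `x - np.max(x)` shift in `softmax` is what keeps the computation numerically safe: subtracting a constant from every logit leaves the probabilities unchanged, but guarantees the largest exponent is exp(0) = 1. A small sketch contrasting a naive softmax (a hypothetical helper, not part of the lesson code) with the stable version on large logits:

```python
import numpy as np

def softmax_naive(x):
    # No shift: np.exp overflows to inf for large inputs, giving inf/inf = nan
    return np.exp(x) / np.exp(x).sum()

def softmax_stable(x):
    # Shift by the max so the largest exponent is exp(0) = 1
    e = np.exp(x - np.max(x))
    return e / e.sum()

big = np.array([1000.0, 1001.0, 1002.0])
print(softmax_naive(big))   # [nan nan nan] -- overflow
print(softmax_stable(big))  # finite probabilities summing to 1
```

The stable version gives the same result as running softmax on the shifted logits [0, 1, 2], which is exactly why the trick changes nothing mathematically.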