Building a Neural Network for MNIST Digit Classification with PyTorch

The MNIST dataset, a collection of handwritten digits, is a classic benchmark for machine learning algorithms. This article will guide you through building a simple neural network to classify these digits using PyTorch, a powerful and flexible deep learning framework. We'll cover data loading, model architecture, training, and evaluation.

1. Setting up the Environment

Before we begin, ensure you have PyTorch installed. You can install it using pip:

pip install torch torchvision torchaudio

We'll also need some standard Python libraries:

pip install matplotlib numpy

2. Loading the MNIST Dataset

PyTorch's torchvision library provides convenient functions for loading common datasets, including MNIST.

import torch
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np

# Define transformations (normalize pixel values)
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])

# Load the training and test datasets
trainset = torchvision.datasets.MNIST(root='./data', train=True, download=True, transform=transform)
testset = torchvision.datasets.MNIST(root='./data', train=False, download=True, transform=transform)

# Create data loaders
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)
testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=False)

This code downloads the MNIST dataset, converts each image to a tensor, normalizes it with the dataset's mean (0.1307) and standard deviation (0.3081), and creates data loaders for efficient batch processing during training.
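Since matplotlib and numpy are already imported, a quick sanity check is to display a few training images. The sketch below assumes the trainloader defined above and simply undoes the normalization for display:

# Peek at one batch of training images (uses the trainloader defined above)
images, labels = next(iter(trainloader))
fig, axes = plt.subplots(1, 6, figsize=(10, 2))
for ax, img, label in zip(axes, images, labels):
    # Undo the normalization (mean 0.1307, std 0.3081) so pixels display correctly
    ax.imshow(img.squeeze().numpy() * 0.3081 + 0.1307, cmap='gray')
    ax.set_title(int(label))
    ax.axis('off')
plt.show()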

3. Defining the Neural Network Model

We'll create a simple feedforward neural network with one hidden layer:

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(784, 128)  # Input layer (784 pixels) to hidden layer (128 neurons)
        self.fc2 = nn.Linear(128, 10)   # Hidden layer to output layer (10 digits)

    def forward(self, x):
        x = torch.flatten(x, 1) # Flatten the image tensor
        x = F.relu(self.fc1(x))  # Apply ReLU activation to the hidden layer
        x = self.fc2(x)         # Output layer: raw logits (CrossEntropyLoss applies softmax internally)
        return x

net = Net()

This defines a model with two fully connected layers. The first layer takes the flattened 784-pixel input and maps it to 128 neurons, with a ReLU activation introducing non-linearity. The second layer produces 10 output neurons, one score (logit) per digit (0-9); these logits are converted to probabilities by the loss function during training.
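A quick way to confirm the layers are wired correctly is to push a dummy batch through the model and check the output shape. This is just a sanity-check sketch using the Net class defined above:

# Sanity check: a dummy batch of 4 fake images should produce a [4, 10] tensor of logits
dummy = torch.randn(4, 1, 28, 28)
print(net(dummy).shape)  # expected: torch.Size([4, 10])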

4. Training the Neural Network

Now, we'll train the model using stochastic gradient descent (SGD):

import torch.optim as optim

criterion = nn.CrossEntropyLoss() # Loss function for multi-class classification
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9) # Optimizer

for epoch in range(2):  # Loop over the dataset multiple times
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        optimizer.zero_grad()
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        if i % 200 == 199:    # print every 200 mini-batches (~938 batches per epoch at batch size 64)
            print(f'[{epoch + 1}, {i + 1:5d}] loss: {running_loss / 200:.3f}')
            running_loss = 0.0

print('Finished Training')

This code iterates through the training data, calculates the loss, performs backpropagation, and updates the model's weights. We print the average loss every 200 mini-batches (at a batch size of 64, one pass over the 60,000 training images is roughly 938 mini-batches) to monitor training progress.
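Once training finishes, you may want to persist the learned weights so the model can be reused without retraining. A minimal sketch (the filename mnist_net.pth is an arbitrary choice):

# Save the trained weights; they can be restored later with load_state_dict
torch.save(net.state_dict(), 'mnist_net.pth')

# To restore into a fresh model instance:
# net = Net()
# net.load_state_dict(torch.load('mnist_net.pth'))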

5. Evaluating the Model

Finally, let's evaluate the model's performance on the test set:

correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print(f'Accuracy of the network on the 10000 test images: {100 * correct / total:.2f} %')

This code iterates through the test data, makes predictions, and calculates the accuracy.
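Overall accuracy can hide weaknesses on individual digits. The sketch below, reusing the same net and testloader, tallies correct predictions separately for each of the ten classes:

# Per-class accuracy (assumes net and testloader from above)
class_correct = [0] * 10
class_total = [0] * 10
with torch.no_grad():
    for images, labels in testloader:
        _, predicted = torch.max(net(images), 1)
        for label, pred in zip(labels, predicted):
            label = label.item()
            class_total[label] += 1
            class_correct[label] += int(pred.item() == label)

for digit in range(10):
    print(f'Accuracy for digit {digit}: {100 * class_correct[digit] / class_total[digit]:.1f} %')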

This complete example demonstrates a basic PyTorch neural network for MNIST digit classification. You can experiment with different architectures, optimizers, and hyperparameters to improve accuracy. Remember to adjust the number of epochs and learning rate based on your needs and computational resources. This foundation allows you to explore more advanced concepts in deep learning and build upon this example for more complex tasks.
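As a hedged starting point for such experiments, the sketch below swaps in the Adam optimizer and adds a second hidden layer; the layer sizes and learning rate are illustrative choices, not tuned values, and the same training loop from section 4 applies unchanged.

# One possible variant: a deeper network trained with Adam (untuned example)
class DeeperNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 256)
        self.fc2 = nn.Linear(256, 128)
        self.fc3 = nn.Linear(128, 10)

    def forward(self, x):
        x = torch.flatten(x, 1)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)

deeper_net = DeeperNet()
optimizer = optim.Adam(deeper_net.parameters(), lr=0.001)  # reuse the training loop with this model and optimizer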
