What Is This Tutorial About?
Imagine you’re shown a picture of a handwritten number — say a messy, wobbly-looking “3” — and asked to figure out what number it is. For a human, that’s usually easy. But for a computer? Not so simple.
In this lesson, we’ll explore how a machine can learn to recognize handwritten digits using a classic and surprisingly intuitive approach called the Nearest Neighbour classifier. Along the way, we’ll introduce some key concepts in machine learning such as:
- How to compare and classify using distance
- What a classification problem is
- The idea of training vs test data
- How computers “see” images as numbers
Let’s Start with the Problem: Recognizing Digits
We want the machine to do what your eyes and brain do naturally: look at an image and decide what digit (0–9) it shows.
Here are some example images from a famous dataset used for this task. Each one is a 28×28 pixel grayscale image of a digit:
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ]
[ 5 ] [ 6 ] [ 7 ] [ 8 ] [ 9 ]
Some are clean and easy to recognize. Others are messier, like scribbles on a sticky note. Still, we want the machine to make an accurate guess — just like you would.
A Smarter Idea: Let the Machine Learn
Instead of trying to write rules ourselves, we can let the data teach the computer what digits look like.
We use a method from machine learning called Nearest Neighbour classification. Here’s the idea:
- Show the computer thousands of examples of handwritten digits, each labeled correctly.
- When a new image appears, the machine:
  - Looks through the known examples
  - Finds the most similar one
  - Says: “This new image looks closest to this ‘5’, so I think it’s also a ‘5’.”
That’s it. No rules about loops or lines — just similarity.
How Images Become Data
To a computer, an image is not a picture — it’s a grid of numbers.
Each 28×28 image has 784 pixels, and each pixel holds an intensity value from 0 to 255 (commonly displayed with 0 as black and 255 as white). So a handwritten “3” might look like:
[ 0, 0, 255, 128, … ] → a vector with 784 numbers
This long list of numbers is how the computer “sees” the image.
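The flattening step above can be sketched in a few lines. This is a minimal NumPy example (NumPy is an assumption; the tutorial doesn’t name a library), using random pixel values as a stand-in for a real MNIST digit:

```python
import numpy as np

# A stand-in 28x28 grayscale "image" with pixel intensities 0-255.
# In practice this would be an actual MNIST digit.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)

# Flatten the 2-D grid into a single vector of 784 numbers --
# this vector is what the classifier actually compares.
vector = image.flatten()

print(vector.shape)  # (784,)
```

The key point is that the 2-D structure is discarded: the classifier only ever sees the 784-number list.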
Comparing Images: Measuring Distance
Once images are turned into vectors of numbers, we can compare them using a mathematical formula. The most common is Euclidean distance — think of it as the length of a straight line between two points in space.
Even though we’re in 784-dimensional space (not just 2D), the concept is the same:
- Subtract the numbers
- Square the differences
- Add them up
- Take the square root
The smaller the result, the more similar the two images.
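The four steps above translate directly into code. Here is a minimal sketch using NumPy (an assumption on my part), with two tiny made-up vectors standing in for real images:

```python
import numpy as np

def euclidean_distance(a, b):
    """Subtract, square, sum, take the square root -- the four steps above."""
    diff = a.astype(float) - b.astype(float)
    return np.sqrt(np.sum(diff ** 2))

# Two short example vectors (real image vectors would have 784 entries).
a = np.array([0, 0, 255, 128])
b = np.array([0, 0, 250, 130])

print(euclidean_distance(a, b))  # small number -> the vectors are similar
```

Whether the vectors have 4 entries or 784, the formula is identical; the “straight line between two points” picture still applies, just in a space we can’t visualize.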
The Nearest Neighbour Classifier (1-NN)
The 1-NN method is simple:
- Store all labeled training images in memory.
- When a new image comes in:
  - Compare it to every image in the training set.
  - Find the one that’s closest (smallest distance).
  - Copy its label as the prediction.
It’s called “lazy learning” because it does no work at training time; all the computation happens only when it’s asked to classify a new point.
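The whole procedure fits in a few lines. Below is a hedged NumPy sketch (the function name and toy 4-pixel “images” are mine, chosen for illustration; real MNIST vectors would have 784 entries):

```python
import numpy as np

def predict_1nn(train_images, train_labels, new_image):
    """Classify new_image by copying the label of its nearest training image."""
    # Squared Euclidean distance from new_image to every training image at once.
    diffs = train_images.astype(float) - new_image.astype(float)
    dists = np.sum(diffs ** 2, axis=1)
    nearest = np.argmin(dists)       # index of the closest training example
    return train_labels[nearest]     # copy its label as the prediction

# Toy "training set": three 4-pixel images with their labels.
train_images = np.array([[0,   0,   0,   0],
                         [255, 255, 255, 255],
                         [0,   255, 0,   255]])
train_labels = np.array([0, 1, 7])

# A new image that is closest to the all-bright training example.
print(predict_1nn(train_images, train_labels, np.array([250, 240, 255, 250])))  # 1
```

Note that squared distances are enough here: taking the square root doesn’t change which example is closest, so we can skip it.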
How Well Does It Work?
Let’s test it!
The MNIST dataset gives us:
- 60,000 training images
- 10,000 test images
When we run 1-NN on the test set, we get a test error of 3.09%. That means 96.91% of the time, the model guesses correctly — pretty impressive for such a simple approach!
By contrast, a random guesser (just picking digits blindly) would have a 90% error rate.
Strengths and Weaknesses
Strengths:
- No training required — just store and compare
- Intuitive and easy to implement
- Works well on clean, well-labeled data
Weaknesses:
- Slow with large datasets (needs to check every example)
- No learning: it doesn’t understand anything, just memorizes
- Sensitive to irrelevant features or noise
Wrapping Up
The Nearest Neighbour classifier is a great first step in machine learning. It teaches us:
- How to turn images into numerical data
- How to define similarity using distance
- Why we need training and test data to evaluate performance
But most importantly, it shows that machines can learn from data — even without complex rules.
The Nearest Neighbour algorithm stands as a powerful yet intuitive introduction to classification. Its strengths lie in simplicity and direct applicability, as datasets like MNIST demonstrate, while its limitations in scalability and generalization highlight the need for more sophisticated models. Understanding foundational methods like this one remains essential for building toward more advanced solutions.

