Few-shot learning allows AI models to learn from a very small number of examples, mimicking the human ability to recognize patterns with minimal exposure. This project explores few-shot character recognition using a neural network approach.
The model is trained on the Omniglot dataset, a collection of 19,280 images of 964 characters from 30 alphabets, with 20 images per character.
We follow a meta-learning approach: rather than learning one fixed task, the network learns how to learn new tasks by training on many example tasks. Meta-learning is performed with episodic training; in each episode, one training batch is processed.
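To make episodic training more concrete, here is a minimal sketch of how a single episode could be sampled. The function name `sample_episode`, the episode sizes (`n_way=5`, `k_shot=1`, `n_query=5`) and the toy dataset of string ids are assumptions for illustration; the project's actual episode composition is not stated here.

```python
import random

def sample_episode(images_by_class, n_way=5, k_shot=1, n_query=5, rng=random):
    """Sample one N-way K-shot episode from a dict {class_id: [images]}.

    Hypothetical helper: names and default sizes are illustrative only.
    """
    # Pick the classes that define this episode's task.
    classes = rng.sample(sorted(images_by_class), n_way)
    support, query = [], []
    for label, cls in enumerate(classes):
        # Draw disjoint support and query images for this class.
        picked = rng.sample(images_by_class[cls], k_shot + n_query)
        support += [(img, label) for img in picked[:k_shot]]
        query += [(img, label) for img in picked[k_shot:]]
    return support, query

# Toy stand-in for Omniglot: 964 classes with 20 image ids each.
toy_dataset = {c: [f"char{c}_img{i}" for i in range(20)] for c in range(964)}
support, query = sample_episode(toy_dataset, rng=random.Random(0))
```

Each episode is a small classification task of its own: the support set defines the classes, and the query set is what the network must label. Sampling a fresh set of classes every episode is what pushes the network toward learning how to compare characters rather than memorizing specific ones.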
The embedding CNN is a convolutional neural network that maps images into a lower-dimensional space. The prediction is the class whose embedding has the lowest Euclidean distance to the embedding of the image to be labeled.
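The distance-based prediction rule can be sketched in a few lines. This is an illustration only: `predict_class` is a hypothetical helper, and the 3-dimensional vectors stand in for the embeddings that the CNN would produce.

```python
import math

def predict_class(query_embedding, class_embeddings):
    """Return the index of the class embedding closest to the query."""
    # Euclidean distance from the query to every class embedding.
    distances = [math.dist(query_embedding, c) for c in class_embeddings]
    # The predicted class is the one at the smallest distance.
    return min(range(len(distances)), key=distances.__getitem__)

# Made-up embeddings: one vector per class, plus one point to be labeled.
class_embeddings = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [4.0, 0.0, 0.0]]
query = [0.9, 1.2, 0.8]
predicted = predict_class(query, class_embeddings)
```

Because the classes are just points in the embedding space, new classes can be introduced at prediction time by embedding one example of each, with no retraining of the network.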
The interactive demo below shows how the few-shot learning model works. Draw characters in the top row to define your classes. Then draw characters to be classified in the bottom row and click "RUN THE MODEL" to see the predictions.
Each character will be assigned to one of the classes based on the lowest Euclidean distance in the embedding space. The border color indicates the predicted class.
Draw the target characters (classes) in the upper canvas
Draw the characters to be labeled in the canvas below
The demo above was implemented by exporting the PyTorch model to ONNX format, which can be run in JavaScript with the ONNX.js package. The first row of canvases defines the classes, while the second row holds the characters to be labeled. Each drawing is embedded as a 64-dimensional vector, and the prediction for a character is the class whose vector has the lowest Euclidean distance to the character's vector.