This project implements a system that recognizes handwritten digits in images using OpenCV and Python. It uses the K-Nearest Neighbors (KNN) algorithm, with a training dataset that is prepared and labeled from a single image of digits. The application preprocesses images, extracts features, and classifies digits, providing a foundation for further exploration of optical character recognition (OCR).
Key features:
- Custom Dataset Preparation: Extracts and labels data from a grid of handwritten digits.
- Digit Recognition: Identifies digits from 0 to 9 in images using the trained KNN model.
- Image Preprocessing: Includes grayscale conversion, noise reduction, edge detection, and contour sorting.
- Accuracy Evaluation: Computes the accuracy of the model based on test data.
- Visualization: Displays images during various stages of preprocessing and recognition to help in debugging and understanding the process.
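The dataset preparation step listed above could look roughly like the following. This is a minimal sketch, not the project's actual code: the function names `split_grid` and `prepare_dataset`, the cell size, the assumption that each digit occupies a fixed number of consecutive grid rows, and the column-wise train/test split are all hypothetical. The grid is synthetic so the example runs without an input image.

```python
import numpy as np


def split_grid(image, cell_size):
    """Split a 2-D grayscale image into an array of square cells.

    Returns an array of shape (rows, cols, cell_size, cell_size).
    """
    rows = image.shape[0] // cell_size
    cols = image.shape[1] // cell_size
    # Drop any partial cells at the right and bottom edges.
    trimmed = image[:rows * cell_size, :cols * cell_size]
    return trimmed.reshape(rows, cell_size, cols, cell_size).swapaxes(1, 2)


def prepare_dataset(image, cell_size, rows_per_digit, train_fraction=0.5):
    """Flatten cells into feature vectors, label them, and split them.

    Assumption: every block of `rows_per_digit` grid rows holds one digit,
    starting with 0 at the top of the image.
    """
    cells = split_grid(image, cell_size)
    rows, cols = cells.shape[:2]
    # One row per cell, one column per pixel.
    samples = cells.reshape(rows * cols, cell_size * cell_size).astype(np.float32)
    labels = np.repeat(np.arange(rows) // rows_per_digit, cols).astype(np.int32)
    # The left part of each grid row goes to training, the rest to testing.
    split_col = int(cols * train_fraction)
    col_index = np.tile(np.arange(cols), rows)
    train_mask = col_index < split_col
    return (samples[train_mask], labels[train_mask],
            samples[~train_mask], labels[~train_mask])


# Synthetic grid: 20 rows x 6 columns of 4x4 cells, two rows per digit.
# Every pixel of a cell holds the digit value, which makes labels easy to check.
digit_of_row = np.arange(20) // 2
cell_values = np.tile(digit_of_row[:, None], (1, 6))
image = np.kron(cell_values, np.ones((4, 4), dtype=np.uint8)).astype(np.uint8)

train_x, train_y, test_x, test_y = prepare_dataset(image, cell_size=4, rows_per_digit=2)
print(train_x.shape, test_x.shape)
```

Flattening each cell into a row of `float32` values keeps the data in the row-per-sample layout that KNN implementations typically expect.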
Technologies used in the project:
- Python: Primary programming language.
- OpenCV: Used for image processing and for the KNN implementation.
- NumPy: Manages data in arrays for efficient computation and handling.
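To illustrate what KNN classification and accuracy evaluation involve, here is a small NumPy-only sketch. The project itself relies on OpenCV's KNN implementation; this stand-in reimplements the nearest-neighbor vote directly so that it runs without OpenCV. The names `knn_predict` and `accuracy`, the value of `k`, and the toy data are assumptions made for illustration.

```python
import numpy as np


def knn_predict(train_x, train_y, query_x, k=3):
    """Classify each query row by majority vote among its k nearest training rows."""
    # Squared Euclidean distance between every query and every training sample.
    diffs = query_x[:, None, :] - train_x[None, :, :]
    dists = np.einsum("qtf,qtf->qt", diffs, diffs)
    # Indices of the k closest training samples for each query.
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = train_y[nearest]
    # Majority vote; on a tie, the smallest label wins.
    return np.array([np.bincount(row).argmax() for row in votes])


def accuracy(predicted, expected):
    """Percentage of predictions that match the expected labels."""
    return float(np.mean(predicted == expected)) * 100.0


# Toy data: two well-separated clusters standing in for two digit classes.
train_x = np.array([[0, 0], [0, 1], [1, 0], [9, 9], [9, 8], [8, 9]], dtype=np.float32)
train_y = np.array([0, 0, 0, 1, 1, 1])
test_x = np.array([[1, 1], [8, 8]], dtype=np.float32)
test_y = np.array([0, 1])

pred = knn_predict(train_x, train_y, test_x, k=3)
print(pred, accuracy(pred, test_y))
```

The brute-force distance matrix is fine for small datasets like this one; it grows with the product of the number of query and training samples.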
The system is structured into three main components:
- data_train.py: Manages the loading of a digit image, splits it into individual digit cells, labels them, and divides them into training and testing sets. This script also trains the KNN model using the prepared data.
- func.py: Provides utility functions for image manipulation such as making images square, resizing, and calculating centroid coordinates for sorting contours.
- main.py: Serves as the entry point of the program, processing new images for digit recognition using the trained model and utility functions.
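As a rough idea of what the image utilities might do, the sketch below pads a grayscale image to a square and orders bounding boxes by their centroid. The function names `make_square`, `box_centroid`, and `sort_boxes_left_to_right` are hypothetical, as is the choice to compute the centroid from an `(x, y, w, h)` bounding box and to sort from left to right. Resizing is left out because it would normally be delegated to OpenCV.

```python
import numpy as np


def make_square(image, fill=0):
    """Pad a 2-D grayscale image with `fill` so that height equals width."""
    h, w = image.shape[:2]
    size = max(h, w)
    square = np.full((size, size), fill, dtype=image.dtype)
    # Center the original image inside the square canvas.
    top = (size - h) // 2
    left = (size - w) // 2
    square[top:top + h, left:left + w] = image
    return square


def box_centroid(box):
    """Center point of an (x, y, w, h) bounding box."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)


def sort_boxes_left_to_right(boxes):
    """Order bounding boxes by the x coordinate of their centroid."""
    return sorted(boxes, key=lambda b: box_centroid(b)[0])


# A tall 4x2 digit crop becomes a 4x4 image with one padded column on each side.
crop = np.ones((4, 2), dtype=np.uint8)
squared = make_square(crop)
print(squared.shape)

# Boxes found in arbitrary order are rearranged into reading order.
boxes = [(30, 0, 10, 10), (5, 0, 4, 10), (18, 0, 6, 10)]
print(sort_boxes_left_to_right(boxes))
```

Padding before resizing preserves the digit's aspect ratio, which matters when the classifier compares raw pixel values.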



