Neural Network Digit Detector

Machine learning is probably one of the most intimidating subjects when you’re on the outside looking in. If you have ever wanted to delve into ML but have been put off by the nonsensical-looking symbols and mathematics at its heart, trust me, you are not alone. However, as someone who has taken the plunge, I want to be very transparent that ML is not hard. In fact, once you understand the core principles behind how machines learn, you will come to see there is no magic in machine learning, just a lot of calculus (and matrices). That is why I promise that all you need to start ML is a high school level education in mathematics and a couple of Google searches on what matrices are and how to multiply them.

While I was still learning, I did what any curious student would do and googled for ML tutorials. A LOT of these tutorials just black box various ML concepts, so all you really learn is how to import packages and use them, without learning what is going on behind the scenes. As such, for my entry-level project, I wanted to create something with no ML dependencies that ran on a network built from the ground up by me. This project is exactly that, and I am very proud of it. Let’s take a look!

Project Description:

This project is a GUI in which users can train an instance of my machine learning algorithm to identify whether a 28x28 pixel image is of a 1 or a 0. While the machine is training, the user can look at the training data being used, the activation value for each training example (how confident the machine is in its answer), and the cost (how wrong the machine was with its guess) associated with that specific training example. If none of that made sense, I’ll explain some of it later on. Once the machine is done training, the user can test the instance against some testing data and see both the actual test image and the machine’s guess. Moreover, the user can also draw their own test image to personally test the machine.

Figure 1: The training sequence

Figure 2: User testing AI instance

Figure 3: The drawing board

To access my project, you can download it from my GitHub repository, available here, or if you prefer git, run the following in bash:

git clone https://github.com/LangLazy/GUI-0-or-1-nueral-network

Then just run “gui.py” and see how your instance of the AI fares against the training data. Accuracy usually comes out to around 95%, but it has gone as high as 99% and as low as 93%. I will discuss areas of improvement for my work later on.

The dataset used

I used the MNIST dataset for training my AI. MNIST is a very large collection of 28x28 pixel images of handwritten digits, written by people of varying ages and professions. If you want to use the dataset, it is available for free here.
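Since the network only has to tell a 0 from a 1, the other eight digits can be dropped before training. Below is a minimal sketch of that filtering step, not the code from the repository. The function name keep_zeros_and_ones, the flat 784-value pixel lists, and the scaling of 0-255 intensities down to the 0-1 range are all assumptions made for illustration.

```python
def keep_zeros_and_ones(images, labels):
    """Keep only the examples labelled 0 or 1 and scale pixels to [0, 1].

    Assumed shapes (hypothetical, for illustration only):
    `images` is a list of flat 784-value pixel lists with 0-255 intensities,
    `labels` is a list of integer digits of the same length.
    """
    inputs, targets = [], []
    for pixels, label in zip(images, labels):
        if label in (0, 1):
            inputs.append([p / 255.0 for p in pixels])
            targets.append(label)
    return inputs, targets


# Tiny stand-in data instead of the real dataset: three fake 784-pixel "images".
fake_images = [[0] * 784, [255] * 784, [128] * 784]
fake_labels = [0, 1, 7]

inputs, targets = keep_zeros_and_ones(fake_images, fake_labels)
print(len(inputs), targets)  # 2 [0, 1]
```

The example labelled 7 is discarded, and the two that remain keep their 784 pixels each, now as values between 0 and 1.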

Dependencies:

bUt yOU sAId yoU hAD No dEPenDeNcIES. The project has no dependencies on ML or data-science libraries such as Pandas or Keras. However, in order to power the GUI aspect of the program, and to interpret the data being used, I did have to rely on the following:

import graphics
import mnist
import matplotlib.pyplot

If you do not have these installed, you can install each one by running the following in the command line, replacing Name with the name of the package:

pip install Name

(Of course, you need to have Python installed and pip available on your PATH.)
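If you are not sure which of these are already installed, a quick check like the one below can tell you before you launch the GUI. This is just a sketch using the standard library. The helper name missing_modules is made up for this example, and the list only mirrors the three imports shown above. Keep in mind that the name you pass to pip is not always identical to the name you import, so check the repository if an install fails.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Top-level names of the three imports listed above.
required = ["graphics", "mnist", "matplotlib"]

missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All GUI and data dependencies are importable.")
```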

Structure

There are many models of machine learning, such as k-nearest neighbors, support vector machines, random forests, and many others. However, the one which fascinated me the most was Neural Networks. As such, I decided to code a Neural Network for my AI. Since I was only classifying 1 or 0 rather than all of the digits, a very simple neural network could be used. Thus, I decided to make a 3-layer neural network.

The first layer consists of 784 input neurons, each corresponding to 1 pixel in the original 28x28 image used as input (784 because 28*28 = 784). The second layer consists of 5 neurons. A structural diagram can be seen below.

Figure 4: The neural network structure

The choice of 5 neurons for the second layer really was arbitrary, and since I defined it as a parameter, you can alter it if you would like to. I will address later on how adjusting hyperparameters like this one could potentially improve the accuracy of the overall classifying algorithm. The final layer has only 1 neuron. If you know the basics of Neural Networks, you will know that a neuron can have a value between 0 and 1 (inclusive), which is referred to as the neuron’s activation value. As such, I used the activation of this last neuron to represent the output of the machine: an activation lower than .5 means the machine chose 0, and an activation of .5 or higher means it chose 1.
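To make the 784, 5, 1 structure concrete, here is a rough sketch of how one image could flow through such a network. It is written in plain Python to stay in the spirit of the project, but it is not the code from the repository. The sigmoid activation, the small random starting weights, and every function name here are my assumptions; only the layer sizes and the .5 cutoff come from the description above.

```python
import math
import random


def sigmoid(z):
    """Squash any real number into the range (0, 1). Assumed activation."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """Activations of one fully connected layer.

    weights[j][i] is the weight from input i to neuron j of this layer.
    """
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def predict(pixels, hidden_w, hidden_b, output_w, output_b):
    """Run one image through the network and apply the .5 cutoff."""
    hidden = layer_forward(pixels, hidden_w, hidden_b)
    activation = layer_forward(hidden, output_w, output_b)[0]
    digit = 1 if activation >= 0.5 else 0
    return digit, activation


random.seed(0)
N_INPUT, N_HIDDEN = 784, 5  # layer sizes from the text; the output layer has 1 neuron

# Untrained network: small random weights and zero biases (an assumption).
hidden_w = [[random.uniform(-0.05, 0.05) for _ in range(N_INPUT)] for _ in range(N_HIDDEN)]
hidden_b = [0.0] * N_HIDDEN
output_w = [[random.uniform(-0.05, 0.05) for _ in range(N_HIDDEN)]]
output_b = [0.0]

pixels = [0.0] * N_INPUT  # a blank image standing in for real data
digit, activation = predict(pixels, hidden_w, hidden_b, output_w, output_b)
print(digit, activation)
```

An untrained network like this one is only guessing, so the activation will sit close to .5. Training is what pushes it towards 0 or 1 for the right images.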

Challenges

As I said previously, this was my first ML project, and I was not using any ML libraries, so as you can imagine, I ran into many, many, many problems. However, if you understand the math behind how the machine “learns”, you will know that it is just fine-tuning a large number of parameters until the machine solves the proposed problem. So the main problems I faced really came down to how I could effectively keep track of all of the derivatives and matrices used in calculating the weights and biases of all of the neurons in my neural network. Overall, a lot of planning was enough to prevent the majority of these issues.

Figure 5: Behold my scribbles
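To give a feel for what "fine-tuning parameters with derivatives" means, here is a small sketch of the idea on a single neuron. It is a simplification, not the full network from the project. The squared-error cost, the sigmoid activation, the learning rate of 0.5, and the made-up three-value input are all assumptions chosen to keep the example short.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cost(activation, target):
    """Squared error: how wrong the neuron was on one example (assumed cost)."""
    return (activation - target) ** 2


def gradient_step(weights, bias, inputs, target, learning_rate):
    """Nudge the weights and bias downhill on the cost for one example."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    a = sigmoid(z)
    # Chain rule: dC/dz = dC/da * da/dz, with dC/da = 2(a - y) and da/dz = a(1 - a).
    dc_dz = 2.0 * (a - target) * a * (1.0 - a)
    # dz/dw_i = x_i and dz/db = 1, so each parameter moves against its own slope.
    new_weights = [w - learning_rate * dc_dz * x for w, x in zip(weights, inputs)]
    new_bias = bias - learning_rate * dc_dz
    return new_weights, new_bias, cost(a, target)


# Made-up starting parameters and a made-up example whose correct answer is 1.
weights, bias = [0.1, -0.2, 0.05], 0.0
inputs, target = [1.0, 0.5, 0.0], 1

costs = []
for _ in range(50):
    weights, bias, c = gradient_step(weights, bias, inputs, target, learning_rate=0.5)
    costs.append(c)

print(costs[0], costs[-1])
```

The cost recorded on the last step is smaller than on the first, which is all that "learning" means here. A full network repeats the same chain rule bookkeeping for every weight and bias in every layer, which is where the planning comes in.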

Areas of Improvement

Although I am very proud of my program, there are several places I want to improve on. The first is the selection of digits. Currently, my program can only recognize 1 or 0, so I want to extend it to all of the digits from 0-9, and eventually have it read any combination of digits. Another large area of improvement is the optimization of hyperparameters. Hyperparameters are just global variables that control certain aspects of the structure of the neural network or the overall learning process; examples include the number of neuron layers, the number of intermediary neurons, the learning rate, etc. Currently, my hyperparameters are chosen arbitrarily. However, to truly optimize an algorithm, you need to run tests to see how to fine-tune the hyperparameters to yield the best results. I am also by no means a UI expert, and I think you can kinda see this in the UI I made. There are some strange graphical bugs, such as overlaying between scenes, which I still have to address.
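One simple way to "run tests" on hyperparameters is a grid search: try every combination from a short list of candidate values and keep the best one. The sketch below shows the idea only. The grid_search helper and the candidate values are hypothetical, and toy_accuracy is a made-up formula standing in for "train a network with these settings and measure its accuracy", which would be far too slow to run here.

```python
from itertools import product


def grid_search(evaluate, grid):
    """Try every combination in `grid` and return the best (settings, score)."""
    names = list(grid)
    best_settings, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        settings = dict(zip(names, values))
        score = evaluate(**settings)
        if score > best_score:
            best_settings, best_score = settings, score
    return best_settings, best_score


def toy_accuracy(hidden_neurons, learning_rate):
    """Stand-in scoring function with a made-up formula, NOT a real training run."""
    return 1.0 - abs(hidden_neurons - 10) / 100 - abs(learning_rate - 0.1)


# Hypothetical candidate values for two of the hyperparameters named above.
grid = {"hidden_neurons": [5, 10, 20], "learning_rate": [0.01, 0.1, 1.0]}

best_settings, best_score = grid_search(toy_accuracy, grid)
print(best_settings, best_score)  # {'hidden_neurons': 10, 'learning_rate': 0.1} 1.0
```

In a real run, the scoring function would train a fresh network for each combination and report its accuracy on data it has not seen during training, so the number of combinations has to stay small.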

Next steps:

This project taught me a lot, not just about ML concepts, but also about coding flexible and dynamic programs in Python and handling multivariable calculus (big words!). I am by no means an expert now, but I have definitely taken a big step forward. This project showed me that there really is a lot to learn in the field of ML, and I have many future plans. I think the first would be to do a similar project with different types of ML algorithms, such as Support Vector Machines and Random Forests. I am still not sure, but I guess that means I will see you soon. See you then!