the xor classification problem
Given two binary inputs, the network should return 1 if the inputs are not equal, and 0 if they are equal.
Making a neural network learn the XOR function is a classic AI problem (there are dozens of articles explaining this exact problem). XOR and XNOR are harder to learn than the other basic logic gates because they are not linearly separable: no single straight line through the input space can separate the inputs that produce 1 from the inputs that produce 0.
interactive simulation
This is a neural network that uses a truth table as training data. As it trains, it converges on weights that accurately implement the desired Boolean function.
A | B | f | NN |
---|---|---|---|
0 | 0 | . . . | |
0 | 1 | . . . | |
1 | 0 | . . . | |
1 | 1 | . . . | |
Click on the outputs in the 'f' column to change them.
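The truth table maps directly onto training data. A minimal sketch in JavaScript (the variable names are my own, not from the original implementation):

```javascript
// The truth table encoded as training examples: two binary inputs
// per row, plus the target output from the clickable 'f' column
// (set to XOR here).
const trainingData = [
  { inputs: [0, 0], target: 0 },
  { inputs: [0, 1], target: 1 },
  { inputs: [1, 0], target: 1 },
  { inputs: [1, 1], target: 0 },
];
```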
After clicking start, the network will output values that (should) closely match the desired 'f' column. The accuracy of the neural network is measured by a cost function, which is minimized over many iterations.
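The post doesn't say which cost function it uses; mean squared error is a common choice and makes a concrete sketch of what "minimized" means here:

```javascript
// Mean squared error over the four truth-table rows: the value
// the training loop tries to drive toward zero.
function meanSquaredError(predictions, targets) {
  let sum = 0;
  for (let i = 0; i < targets.length; i++) {
    const diff = predictions[i] - targets[i];
    sum += diff * diff;
  }
  return sum / targets.length;
}

// A network that reproduces XOR's output column exactly has zero cost:
meanSquaredError([0, 1, 1, 0], [0, 1, 1, 0]); // → 0
```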
a basic neural network
A neural network consists of many nodes, which are all interconnected. Here's the neural network structure that I used above:
[Network structure diagram, via researchgate.com]
The output of each node is a weighted sum of its inputs plus a bias value, passed through an activation function:

a = σ(Σwx + b)

where a is the activation, σ is the activation function (such as the sigmoid), w is the weight, x is the input, and b is the bias.
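A single node's output can be sketched like this, using the sigmoid (mentioned later in the post) as the activation function; `nodeOutput` is a hypothetical name, not from the original code:

```javascript
// Squashes any real number into (0, 1).
const sigmoid = (z) => 1 / (1 + Math.exp(-z));

// One node: weighted sum of inputs, plus a bias, through the activation.
function nodeOutput(weights, inputs, bias) {
  let z = bias;
  for (let i = 0; i < inputs.length; i++) {
    z += weights[i] * inputs[i];
  }
  return sigmoid(z);
}

// With zero weights and zero bias the sum is 0, and sigmoid(0) = 0.5:
nodeOutput([0, 0], [1, 1], 0); // → 0.5
```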
With only 3 neurons, the network already has 6 weights and 3 biases. These values are initially randomized, so the network produces very inaccurate results to begin with. Adjusting all of these values slightly, using a process called backpropagation, allows the network to find weights and biases that minimize its cost (a measure of how wrong its predictions are). By repeating this process thousands of times, the network learns to make better predictions.
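Each of those repetitions ends with the same update rule: nudge every parameter a small step against its gradient. A minimal sketch (backpropagation is what produces the gradients; `gradientStep` is a hypothetical name):

```javascript
// One gradient-descent step: move each parameter against its
// gradient, scaled by a small learning rate. Repeated thousands
// of times, this is what drives the cost down.
function gradientStep(params, gradients, learningRate) {
  return params.map((p, i) => p - learningRate * gradients[i]);
}

// A weight with a positive gradient (cost rises as it grows)
// gets pushed down slightly:
gradientStep([1.0], [2.0], 0.25); // → [0.5]
```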
problems i ran into
reliability
I struggled to make the network reliable: sometimes it would be completely wrong no matter how many iterations I trained it for. Apparently this is because the cost function has multiple local minima that the network can get stuck in without being able to improve further. I solved this by initializing the weights randomly with values from 0.5 to 1, instead of 0 to 1. Doing this greatly increased the chances that the network would end up in a good minimum. (is that cheating?)
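That fix amounts to a one-line change in the initializer (`randomWeight` is a hypothetical helper name, not from the original code):

```javascript
// Uniform random value in [min, max) — Math.random() covers [0, 1).
function randomWeight(min, max) {
  return min + Math.random() * (max - min);
}

// Initialize the 6 weights in [0.5, 1) instead of [0, 1); shifting
// the starting point changes which minimum gradient descent reaches.
const weights = Array.from({ length: 6 }, () => randomWeight(0.5, 1));
```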
The first implementation I made required 10 to 20 thousand iterations before becoming accurate. I read here that using the tanh activation function rather than the sigmoid function lets the network learn faster.
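One way to see why tanh speeds up learning: its derivative at zero is four times larger than the sigmoid's, so the gradients (and therefore the weight updates) start out bigger:

```javascript
const sigmoid = (z) => 1 / (1 + Math.exp(-z));

// Derivatives of the two activation functions.
const sigmoidPrime = (z) => sigmoid(z) * (1 - sigmoid(z));
const tanhPrime = (z) => 1 - Math.tanh(z) ** 2;

sigmoidPrime(0); // → 0.25
tanhPrime(0);    // → 1
```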
I also used web workers to put all of the computation on a separate thread and increased the number of hidden nodes from 2 to 3.
After making those changes, my network can reach an acceptable accuracy within 2 to 4 thousand iterations.
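Putting the pieces together, here is a minimal 2-3-1 network trained on the XOR table with tanh hidden units and plain gradient descent. This is a sketch of the approach described above, not the author's actual implementation; the names, learning rate, iteration count, and fixed starting weights are my own choices:

```javascript
const sigmoid = (z) => 1 / (1 + Math.exp(-z));

// Fixed starting weights chosen from the [0.5, 1) range the post
// recommends, so the run is deterministic.
let w1 = [[0.6, 0.9], [0.55, 0.8], [0.7, 0.95]]; // input -> hidden
let b1 = [0.5, 0.6, 0.7];
let w2 = [0.85, 0.65, 0.9];                      // hidden -> output
let b2 = 0.5;

const data = [
  { x: [0, 0], t: 0 },
  { x: [0, 1], t: 1 },
  { x: [1, 0], t: 1 },
  { x: [1, 1], t: 0 },
];

function forward(x) {
  const h = w1.map((w, j) => Math.tanh(w[0] * x[0] + w[1] * x[1] + b1[j]));
  const out = sigmoid(h.reduce((s, hj, j) => s + w2[j] * hj, b2));
  return { h, out };
}

// One pass over the table: forward, backpropagate, update; returns
// the mean squared error measured before the updates.
function trainEpoch(lr) {
  let cost = 0;
  for (const { x, t } of data) {
    const { h, out } = forward(x);
    cost += (out - t) ** 2;
    // Chain rule from the cost back through the output sigmoid...
    const dOut = 2 * (out - t) * out * (1 - out);
    for (let j = 0; j < 3; j++) {
      // ...and through each tanh hidden unit (using w2 pre-update).
      const dHidden = dOut * w2[j] * (1 - h[j] ** 2);
      w2[j] -= lr * dOut * h[j];
      w1[j][0] -= lr * dHidden * x[0];
      w1[j][1] -= lr * dHidden * x[1];
      b1[j] -= lr * dHidden;
    }
    b2 -= lr * dOut;
  }
  return cost / data.length;
}

let cost;
for (let i = 0; i < 5000; i++) cost = trainEpoch(0.5);
```

From this starting point the cost typically falls well below the 0.25 you would get by always predicting 0.5. Running the training loop off the main thread, as the post does with web workers, keeps the page responsive while these thousands of epochs run.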
thoughts
Throughout this project I learned some things: