Perceptron: Building block of a neural network

3 minute read

Perceptron mimics a neuron

A perceptron kind of looks and functions like a neuron in the brain.

The way to create a neural network is by concatenating these perceptrons, so essentially we are mimicking the way the brain connects neurons: the output of one perceptron becomes the input of another.

Logical operators as Perceptrons

To understand how neural network training works, we will first look at a very simple example: using a Perceptron to represent an AND logical operator.

Here, we have the AND operator, which takes two inputs and returns an output. The inputs of a logical AND operator are True or False, and the output follows its truth table.

A perceptron has a line, defined by weights and a bias, that splits the plane into a positive area (blue) and a negative area (red). What this perceptron does is plot each point: if the point falls in the positive area, it returns a 1, and if the point falls in the negative area, it returns a 0.
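In code, this decision rule can be sketched as a weighted sum followed by a step activation (the function and variable names here are illustrative, not from any particular library):

```python
def perceptron(x1, x2, weight1, weight2, bias):
    # The line is defined by the weights and bias:
    # weight1*x1 + weight2*x2 + bias = 0
    linear_combination = weight1 * x1 + weight2 * x2 + bias
    # Step activation: 1 in the positive area, 0 in the negative area
    return 1 if linear_combination >= 0 else 0
```
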

Similarly, we can construct an OR operator. The difference between the OR and AND perceptrons is where the line is drawn; that is, the line has a different set of weights and bias.
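As a sketch, the same step rule with two hand-picked parameter settings (just one valid choice each; many lines work) reproduces AND and OR:

```python
def step_activation(x1, x2, weight1, weight2, bias):
    return 1 if weight1 * x1 + weight2 * x2 + bias >= 0 else 0

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]

# AND: the line separates (1,1) from the other three points
and_outputs = [step_activation(x1, x2, 1.0, 1.0, -1.5) for x1, x2 in inputs]
# -> [0, 0, 0, 1]

# OR: same weights, but a smaller bias shifts the line toward the origin
or_outputs = [step_activation(x1, x2, 1.0, 1.0, -0.5) for x1, x2 in inputs]
# -> [0, 1, 1, 1]
```
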

So how do we get the weights and bias for these lines?

Example: Manually tweak the weights and bias for the AND perceptron

Adjust the weight1, weight2 and the bias values until you simulate an AND operator.

  • For example, when weight1 = 1, weight2 = 1, and bias = -1.5, the linear combination representing the line will be negative for the inputs (0,0), (0,1), and (1,0), which results in the activation function returning a 0. Likewise, when the input is (1, 1), the linear combination will be positive and the activation function returns a 1.

The bottom line is that we had to use trial and error to adjust the weights and bias until the perceptron behaved like an AND operator.

The goal of training a neural network is to find these weights and biases automatically, using algorithms we will learn about shortly.

import pandas as pd

# TODO: Set weight1, weight2, and bias
weight1 = 1.0
weight2 = 1.0
bias = -1.5


# DON'T CHANGE ANYTHING BELOW
# Inputs and outputs
test_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
correct_outputs = [False, False, False, True]
outputs = []

# Generate and check output
for test_input, correct_output in zip(test_inputs, correct_outputs):
    linear_combination = weight1 * test_input[0] + weight2 * test_input[1] + bias
    output = int(linear_combination >= 0)
    is_correct_string = 'Yes' if output == correct_output else 'No'
    outputs.append([test_input[0], test_input[1], linear_combination, output, is_correct_string])

# Print output
num_wrong = len([output[4] for output in outputs if output[4] == 'No'])
output_frame = pd.DataFrame(outputs, columns=['Input 1', '  Input 2', '  Linear Combination', '  Activation Output', '  Is Correct'])
if not num_wrong:
    print('Nice!  You got it all correct.\n')
else:
    print('You got {} wrong.  Keep trying!\n'.format(num_wrong))
print(output_frame.to_string(index=False))

Nice!  You got it all correct.

 Input 1    Input 2    Linear Combination    Activation Output   Is Correct
       0          0                  -1.5                    0          Yes
       0          1                  -0.5                    0          Yes
       1          0                  -0.5                    0          Yes
       1          1                   0.5                    1          Yes

How does the perceptron learn these weights and bias?

The main question we now have is how does the perceptron know the weights and bias of the line that best divides the points? Here’s how it does it:

  • We start off with a random line, i.e., random weights and bias.
  • Then, for every misclassified point, we update the weights and bias of the line (up or down) by a small fraction called the learning rate.
  • We repeat this process until all the points are correctly classified, until we are satisfied with the predictions, or for a fixed number of passes over the data, where each full pass is called an epoch.
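The steps above can be sketched as a short training loop (the function name, defaults, and seed are illustrative choices, not part of a fixed recipe):

```python
import random

def train_perceptron(points, labels, learn_rate=0.1, epochs=25):
    random.seed(42)                      # reproducible "random" starting line
    # Start off with a random line, i.e., random weights and bias
    w1, w2, b = random.random(), random.random(), random.random()
    for _ in range(epochs):
        for (x1, x2), label in zip(points, labels):
            prediction = 1 if w1 * x1 + w2 * x2 + b >= 0 else 0
            error = label - prediction   # +1 or -1 for a misclassified point
            # Nudge the line toward each misclassified point by the learning rate
            w1 += learn_rate * error * x1
            w2 += learn_rate * error * x2
            b += learn_rate * error
    return w1, w2, b

# Learn the AND operator from its truth table
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]
w1, w2, b = train_perceptron(points, labels)
```

After training, the learned line classifies all four points correctly, without any manual tweaking.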

OK, what if data cannot be divided by a straight line?

Unfortunately, the perceptron algorithm we discussed above won’t work in this case. We need to redefine it in a way that generalizes from a straight line to other types of boundaries.

Enter Error functions

So, the way we will have to solve the above problem is with the help of an Error Function.

An error function is simply something that tells us how far we are from the solution.
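As a minimal illustration (the function name is hypothetical), the simplest possible error function just counts how many points the current line misclassifies:

```python
def misclassification_error(points, labels, w1, w2, b):
    # Count how many points the line defined by (w1, w2, b) gets wrong
    predictions = [1 if w1 * x1 + w2 * x2 + b >= 0 else 0 for x1, x2 in points]
    return sum(p != label for p, label in zip(predictions, labels))

and_points = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_labels = [0, 0, 0, 1]
misclassification_error(and_points, and_labels, 1.0, 1.0, -1.5)  # 0: this line solves AND
misclassification_error(and_points, and_labels, 1.0, 1.0, 0.0)   # 3: this line misses three points
```

A raw count like this is a blunt measure, since small changes to the line often don’t change it at all; in practice, smoother measures of how far we are from the solution are easier to optimize.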

We will look at different error functions next.