We covered the introduction to machine learning in the last blog post. Now we will learn about deep learning.

Introduction to Deep Neural Networks

Let's continue with the divide-and-conquer approach to understanding the terms. We know what "deep" means, and "neural" is defined as:

Neural = of, relating to or affecting a nerve or the nervous system

But what exactly is a neuron?

Neurons are the basic building blocks of the nervous system. That leaves the final question: what is a network? A network is simply a group of things that are connected to each other.

Combining these, we can define a neural network as many connected neurons working towards the same goal. Each neuron in this network has a different duty, and the neurons share information with each other.

This loosely mimics our own nervous system. If you want to lift your arm, the signal generated in your brain is sent to the muscles in your arm via millions of neurons. It works in the other direction too: when you see a bird outside, light hits your eye, and the corresponding part of your brain is activated to recognize that the thing you see is a bird.

So how do neural networks work?

Let's remember the definition of machine learning from the last blog post.

A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E. - Tom Mitchell, 1997

Assume that our task is to recognize the digits in the MNIST dataset. In this case, T is classifying digits, P is the accuracy of our model, and E is going over the labeled digit images again and again.
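To make P a bit more concrete, here is a minimal sketch of how accuracy could be computed. The `accuracy` function and the two example lists are made up for illustration; they are not taken from a real MNIST model.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels (the measure P)."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Hypothetical model guesses for five digit images, and the true digits.
predicted_digits = [8, 3, 1, 7, 7]
true_digits = [8, 3, 1, 7, 9]

print(accuracy(predicted_digits, true_digits))  # 4 of 5 correct -> 0.8
```

If the model learns from experience E, this number should go up as training continues.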

Let's dive a bit more into the MNIST classification task and learn how it can be done with neural networks.

MNIST Digit classification example

We talked about digits a lot, but what does a digit look like to a computer?

Let's take a look at a sample 8 from the MNIST dataset. In MNIST, all images are grayscale and 28x28 pixels in size. This is what an 8 looks like:

Since each image is 28x28 pixels, we use 28x28 = 784 pixel values to represent a single image. If we want to build a neural network for this, the input layer of our network will have 784 neurons, and each of them will read a single pixel value.
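Here is a small sketch of that flattening step in plain Python. The image below is a stand-in (all black with a single bright pixel), not a real MNIST digit, and dividing by 255 is a common but optional preprocessing choice that I am assuming here.

```python
# A stand-in 28x28 grayscale image: every pixel is 0 (black).
# A real MNIST image would hold values between 0 and 255.
HEIGHT, WIDTH = 28, 28
image = [[0 for _ in range(WIDTH)] for _ in range(HEIGHT)]
image[14][14] = 255  # light up one pixel near the center

# Flatten row by row so that each pixel feeds exactly one input neuron.
input_vector = [pixel for row in image for pixel in row]

# Scale the values to the range 0..1 (an assumed preprocessing step).
input_vector = [pixel / 255 for pixel in input_vector]

print(len(input_vector))  # 784
```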

Okay, that might look a bit confusing, but don't get scared; it's not that difficult.

These neurons work independently in the same layer, meaning that each of them will learn something different. However, all neurons (except in the input layer) have access to all outputs of the neurons in the previous layer.
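As a rough sketch of what "access to all outputs of the previous layer" means: each neuron takes a weighted sum of every input, adds its own bias, and squashes the result. The layer size of 16, the random weights, and the sigmoid squashing function are assumptions picked for illustration only.

```python
import math
import random


def sigmoid(z):
    """Squash any number into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-z))


def dense_layer(inputs, weights, biases):
    """Every neuron sees ALL inputs but uses its own weights and bias."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(sigmoid(total))
    return outputs


random.seed(0)
n_inputs, n_neurons = 784, 16  # 16 neurons is an arbitrary choice

# Made-up pixel values and small random starting weights.
inputs = [random.random() for _ in range(n_inputs)]
weights = [[random.uniform(-0.1, 0.1) for _ in range(n_inputs)]
           for _ in range(n_neurons)]
biases = [0.0] * n_neurons

layer_output = dense_layer(inputs, weights, biases)
print(len(layer_output))  # 16, one output per neuron
```

Because every neuron has different weights, each one produces a different output from the same 784 inputs.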

Imagine you are the neuron that comes after Pixel 8; you are in the 2nd layer. You get all the outputs from the first layer, and each of them tells you something different. You look at your information and try to come up with something meaningful, while having no idea what is going on at the end of the network, since this is a ten-layered network. You make a guess with the information you have and pass that guess to the next layer.

Now, you don't know whether the guess you made is correct because you don't have access to the output layer, which is the only layer with the correct label information. However, you have a connection to all neurons in the 1st and 3rd layers. Whom do you ask about your performance?

Yes, you ask the next layer. The neurons there don't know the correct output either, BUT they can ask the layer after them. This continues until the final layer, where a neuron can say, "Hey guys, we made a mistake, our error was too high!". Once the final-layer neurons have this information, they pass it backwards, layer by layer, until it reaches the first layer, so all the neurons know they made a mistake and adjust their weights (how much importance they give to each input) accordingly.

This is called backpropagation, and it is an essential part of how a neural network learns from experience.
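The story above can be sketched with the smallest network I can think of: one input, one hidden neuron, and one output neuron. Everything here is an assumption for illustration (the starting weights, the learning rate, the sigmoid function, and the squared-error measure); a real MNIST network has far more neurons, but the error is passed backwards in the same way.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# One input -> one hidden neuron -> one output neuron.
w1, b1 = 0.5, 0.0    # hidden neuron parameters (arbitrary starting values)
w2, b2 = -0.5, 0.0   # output neuron parameters (arbitrary starting values)
x, target = 1.0, 1.0  # a single made-up training example
learning_rate = 0.5

losses = []
for step in range(200):
    # Forward pass: each layer makes its guess and passes it on.
    h = sigmoid(w1 * x + b1)
    y = sigmoid(w2 * h + b2)
    losses.append(0.5 * (y - target) ** 2)

    # Backward pass: the error signal starts at the output neuron...
    delta_out = (y - target) * y * (1.0 - y)
    # ...and is handed back to the hidden neuron through w2 (chain rule).
    delta_hidden = delta_out * w2 * h * (1.0 - h)

    # Each neuron nudges its own weight and bias to reduce the error.
    w2 -= learning_rate * delta_out * h
    b2 -= learning_rate * delta_out
    w1 -= learning_rate * delta_hidden * x
    b1 -= learning_rate * delta_hidden

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

The hidden neuron never sees the target directly. It only receives `delta_hidden`, the message passed back from the output neuron, which is exactly the "ask the next layer" idea.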

Some Common Deep Learning Terms

You will hear a lot about these terms, so let's sum them up quickly.

Supervised Learning = You have data that an expert has labeled, so your model can learn with access to the correct labels. (MNIST classification)

Unsupervised Learning = You have data, but you don't have an expert. Your model needs to go over the data and find a pattern in the data to group them appropriately. (Grouping customers by their shopping behaviors)

Reinforcement Learning = You have states and actions. Each action your model takes changes the state, and your model is trained with a reward function that scores how good the result was. (Chess game, autonomous driving, etc.)

Cost Function = Function used to measure the error of the model output.

Overfitting = Your model studied the training data so closely that it performs well there and has made it its comfort zone. But it's too shy to perform well on new data.

Underfitting = Your model has not captured the pattern in the data, usually because it is too simple or has not trained enough, so it cannot perform well in any situation, not even on the training data.

Classification = Your model predicts a class/label for each input.

Regression = Your model predicts a continuous numeric value as an output.

Training set = Data available to the model during training (The lecture)

Test set = Data held back from training and used only to evaluate the finished model (The exam)

Validation set = Part of the training set used to tune the model settings (sample exam questions)

Batch size = Number of data points taken from the dataset in each training step.

Epoch = One full pass of the model over the whole training set. Training usually runs for several epochs.
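To tie the data-related terms together, here is a minimal sketch of splitting a dataset and looping over it in batches for a few epochs. The dataset is a made-up list of 1,000 (input, label) pairs, and the split sizes, batch size, and number of epochs are arbitrary assumptions. No model is actually trained.

```python
import random

random.seed(42)

# Stand-in dataset: 1,000 (input, label) pairs with made-up values.
dataset = [(i, i % 10) for i in range(1000)]
random.shuffle(dataset)

# Hold out the test set first (the exam), then carve a validation set
# (sample exam questions) out of what is left for training (the lecture).
test_set = dataset[:200]
validation_set = dataset[200:300]
training_set = dataset[300:]

batch_size = 50
num_epochs = 3
steps_per_epoch = 0

for epoch in range(num_epochs):
    random.shuffle(training_set)  # visit the examples in a new order
    steps_per_epoch = 0
    for start in range(0, len(training_set), batch_size):
        batch = training_set[start:start + batch_size]
        # A real training step would run the model on `batch` here.
        steps_per_epoch += 1
    print(f"epoch {epoch + 1}: {steps_per_epoch} training steps")
```

With 700 training examples and a batch size of 50, each epoch takes 14 training steps.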

That was a long list, but I hope you now have an idea of what these terms mean. That is it for this blog post. We will talk about convolutional neural networks in the next and final blog post. See you there!