print("Supervised Learning - Classification")
Classification in supervised learning makes predictions over discrete classes, using qualitative feature vectors.
Let me break this down for you. Classification here basically means grouping items. You just got back from shopping? Clothes go in the wardrobe, drinks in the refrigerator, seasoning in the kitchen cabinets, right? You just grouped all the items you bought on your shopping trip.
Now in classification, we want to be able to determine why we placed each item in each group, so that we can use that information to predict the group a new item should be in. So let's say that after your shopping you receive an Amazon parcel the next day with a gift from a close friend: a Seiko 5 Sport SRPD79 wristwatch. Where do you keep it? In the wardrobe, right? But why? Because the wardrobe is where you keep your clothing and jewelry. Whew, my breakdown got a bit lengthy.
This whole illustration clearly explains what classification in supervised learning is about. We want to be able to take a list of items and group them based on their features or characteristics (another big word, lol). Not just that, but we want to be able to predict the group a new item should belong to, based on its characteristics.
In an earlier chapter, we talked about qualitative features and discrete classes which are categorical data, that is data in groups. You can check it out here to refresh your memory.
I believe we now understand classification in supervised learning and what we hope to achieve with it. So now let's talk about types of classification.
There are two types of classification:
Binary classification: This is a type of classification based on two classes only. It is used where you have just two classes to predict, for example: yes or no, spam or not spam, or hotdog or not hotdog.
This model only needs to learn one thing to predict correctly. For example, if we want our model to predict hotdog or not hotdog, we'll need to train the model to know what a hotdog is; anything that does not meet a hotdog's criteria gets flagged as not a hotdog.
If you've watched the TV series Silicon Valley you'll remember Jian Yang's hot dog app; it was hilarious, and you can watch a clip of it here. That type of app uses a binary classification model.
Multiclass classification: This is classification with more than two classes. Examples: hotdog/pizza/ice cream, or orange/apple/mango/pear. This model needs to know several things so it can predict them correctly.
In this classification, the model has to be trained on all the classes, unlike binary classification which requires only one.
In the case of Jian Yang, he was expected to use a multiclass classifier so that his app could recognize different types of food not just hotdogs and not hotdogs. But Jian thought it'd be tedious and boring to train a model on all the types of food in the world.
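To make the binary idea concrete, here's a tiny sketch in Python. The `is_hotdog` rule and its features (`has_bun`, `has_sausage`) are invented for illustration; a real model would learn its decision rule from labeled data rather than having it hand-written like this.

```python
# A toy binary "hotdog / not hotdog" classifier in pure Python.
# The rule and features are made up for illustration; a trained
# model would learn this decision boundary from examples.

def is_hotdog(item):
    """Hypothetical hand-written rule: bun + sausage = hotdog."""
    return item.get("has_bun", False) and item.get("has_sausage", False)

foods = [
    {"name": "hotdog", "has_bun": True, "has_sausage": True},
    {"name": "pizza slice", "has_bun": False, "has_sausage": False},
    {"name": "sausage roll", "has_bun": False, "has_sausage": True},
]

for food in foods:
    # Binary output: every item is either "hotdog" or "not hotdog".
    label = "hotdog" if is_hotdog(food) else "not hotdog"
    print(food["name"], "->", label)
```

Notice that the classifier never says "pizza" or "sausage roll"; with binary classification, everything that isn't the positive class collapses into a single negative class.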
Now, before we start training our own models, there is one more thing we need to know: precision and recall.
Precision and Recall
Precision and recall are metrics used in classification tasks to measure a model's performance.
Consider Jian Yang's image recognition app for recognizing hot dogs. Say we run it on a list of images containing 6 hotdogs and 10 pizza slices, and the program identifies 6 images as hot dogs. Of those 6, only 4 are actually hot dogs (True Positives), while the other 2 are pizza slices (False Positives). 2 hotdogs were missed (False Negatives), and 8 pizza slices were correctly excluded (True Negatives).
True Positives (TP) - Positives labeled as Positives
True Negatives (TN) - Negatives labeled as Negatives
False Positives (FP) - Negatives labeled as Positives
False Negatives (FN) - Positives labeled as Negatives
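These four counts can be tallied directly from paired actual/predicted labels. Here's a small Python sketch using label lists chosen to reproduce the counts from the hot dog illustration (TP = 4, TN = 8, FP = 2, FN = 2), where 1 means hotdog (positive class) and 0 means not hotdog (negative class):

```python
# Tally TP, TN, FP, FN from paired actual/predicted labels.
# 1 = hotdog (positive class), 0 = not hotdog (negative class).
actual    = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)

print(f"TP={tp} TN={tn} FP={fp} FN={fn}")  # TP=4 TN=8 FP=2 FN=2
```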
It is important that we start with the illustration above because precision and recall are based on these.
Precision in classification tasks is the number of True Positives divided by the total number of elements labeled as belonging to the positive class. This is the fraction of relevant instances among the retrieved instances.
The formula looks like this;
Precision = TP / (TP + FP)
Using our illustration above let's try to calculate the Precision.
True Positives (TP) = 4
False Positives (FP) = 2
Precision = 4 / (4 + 2) = 0.6667
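The same arithmetic in Python, using the TP and FP counts from the illustration:

```python
# Precision = TP / (TP + FP)
tp, fp = 4, 2
precision = tp / (tp + fp)
print(round(precision, 4))  # 0.6667
```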
Recall in classification tasks is the number of True Positives divided by the total number of elements that actually belong to the positive class. This is the fraction of relevant instances that were retrieved.
The formula looks like this;
Recall = TP / (TP + FN)
Using our illustration above let's try to calculate Recall as well.
True Positives (TP) = 4
False Negatives (FN) = 2
Recall = 4 / (4 + 2) = 0.6667
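And the same calculation for recall, using the TP and FN counts from the illustration:

```python
# Recall = TP / (TP + FN)
tp, fn = 4, 2
recall = tp / (tp + fn)
print(round(recall, 4))  # 0.6667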
Precision and recall are not very useful in isolation, so we always use them together to get a full picture of model performance. Each metric ranges from 0 to 1, where 0 is the worst possible score and 1 is a perfect score.
Now that we know about precision and recall let's round up by talking about one last thing in classification tasks, accuracy.
Accuracy
Accuracy is the proportion of correct predictions (both True Positives and True Negatives) among the total number of cases examined. As the name suggests, it simply answers the question: how accurate is the model?
The formula looks like this;
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Or in simpler terms;
Accuracy = correct classifications / all classifications
Still using the illustration of Jian Yang's not hot dog app, let's try to calculate the accuracy of our model.
TP = 4
TN = 8
FP = 2
FN = 2
Accuracy = (4 + 8) / (4 + 8 + 2 + 2) = 12 / 16 = 0.75
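Once more in Python, using the counts from the hot dog illustration (TP = 4, TN = 8, FP = 2, FN = 2):

```python
# Accuracy = (TP + TN) / (TP + TN + FP + FN)
tp, tn, fp, fn = 4, 8, 2, 2
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(accuracy)  # 0.75
```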
Now that we know about precision, recall, and accuracy, I think we know enough to train our own model.
There are a lot of classification models, and more are being created as you read this, so we can't talk about all of them. But we'll cover a few popular ones and train our own models with them;
The classification models we'll talk about are:
K-Nearest Neighbors
Naive Bayes
Logistic Regression
Support Vector Machines (SVM)
Random Forests
Up next: K-Nearest Neighbors. This is where we'll finally train and deploy our first AI model. Subscribe and stay tuned!