print("Introduction to ML")
Machine Learning (ML) is a subdomain of Artificial Intelligence (AI) in computer science that focuses on algorithms that help a computer learn from data without explicit programming.
This means that instead of writing code to solve each problem, the computer learns from solutions to previous problems so it can solve new problems it hasn't seen before, from experience.
Learning is a major step in AI's quest to become as capable as humans. We want our computer to differentiate between objects and people it has never seen before, to make accurate predictions using empirical data, to understand distance and depth in our autonomous vehicles, and so on. We want it to do all these things from experience, so we don't need to write code for each new scenario.
But how do we teach a computer to learn, so that instead of garbage in, garbage out, we can get model in, experience out?
We will need to convert what we want to teach into the language the computer understands best: numbers. To teach the computer to recognize faces, we train it by showing it many face images, represented as numbers. The computer then finds patterns in those numeric representations that let it recognize faces. This brings us to AI models.
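Here is a minimal sketch of that idea, assuming NumPy and Pillow are installed and using a hypothetical image file name:

import numpy as np
from PIL import Image

image = Image.open("face.jpg").convert("L")   # hypothetical file, converted to grayscale
pixels = np.array(image)                      # a 2D grid of brightness values from 0 to 255

print(pixels.shape)       # e.g. (256, 256): height x width
print(pixels[:3, :3])     # the top-left corner of the image is just numbers

To the computer, the face is nothing more than this grid of numbers, and learning means finding patterns in many such grids.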
AI Models
AI Models are algorithms or mathematical representations used to learn patterns from data and make predictions or decisions. Put simply, models are different ways or techniques to teach a computer.
An algorithm, as mentioned above, is simply a predefined way of solving a problem. A good analogy is a recipe: we follow its step-by-step instructions, say turn on the cooker, then put on the pot, wait 5 minutes here, wait 10 there, and so on, until our meal is ready. So also with algorithms: we convert input (data) into our desired information using specific instructions. Just like recipes, algorithms can be simple or complex depending on what we intend to achieve.
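To make that concrete, here is a toy algorithm written out like a recipe in Python; the function name and numbers are made up purely for illustration:

def average_score(scores):
    total = 0                    # step 1: start with an empty total
    for score in scores:         # step 2: add each ingredient (number) to the total
        total += score
    return total / len(scores)   # step 3: serve the result

print(average_score([70, 85, 90]))   # 81.666..., the average of the three scores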
Foundation Models are base models that can be fine-tuned or adapted to specialized tasks. This rapidly speeds up AI model development, since a model does not need to be built from scratch every time but can use a foundation model as a base and train on specialized tasks. There are thousands of open-source foundation models on Hugging Face alone. Most of the AI models used right now are trained from foundation models. For example, GPT is the foundation model behind ChatGPT.
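As a rough sketch of what "building on a base" looks like in code, here is how one might load a foundation model from Hugging Face and adapt it to a two-class task, assuming the transformers library is installed; the model name and label count are just examples:

from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")   # a general-purpose base model
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2                            # adapt it to a 2-class specialized task
)
# From here we would fine-tune `model` on our own labeled data instead of training from scratch.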
Here is a good analogy to describe AI models and Foundation models: in a printing press, there are different types of paper for different purposes, such as book covers, ID cards, and so on. We can say those papers are Foundation models. Now, when a writer wants to print a new book with a publisher, the printing press doesn't need to go to the forest to cut down trees and start making paper; it can use the paper it already has as appropriate, which makes the work faster. The printed book is the AI model. I hope with these few points of mine I have...
Now that we know what AI Models and Foundation Models are, how do we create one?
Let's talk about the 5 steps to create an AI Model:
Data Preparation: The first step in creating an AI Model is to prepare the data. Below are the steps in preparing data:
Data processing: This involves the categorization and organization of raw data into the desired structure or shape.
Filter: This involves removing things we don't want, like copyrighted material, hate speech, or some form of bias on sensitive topics.
Duplicates: As the name implies, this step removes duplicate data (a small sketch of filtering and de-duplicating follows this list).
The end product of data preparation is called a Base Data Pile.
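Here is a minimal sketch of the filter and duplicate steps on a tiny, made-up list of text documents; the blocklist rule is purely illustrative:

raw_docs = [
    "The weather is nice today.",
    "The weather is nice today.",     # duplicate
    "BUY NOW!!! limited offer",       # unwanted content we choose to filter out
    "Cats are great companions.",
]

blocklist = ["BUY NOW"]               # hypothetical filter rule

filtered = [doc for doc in raw_docs if not any(phrase in doc for phrase in blocklist)]
deduplicated = list(dict.fromkeys(filtered))   # removes duplicates while keeping order

print(deduplicated)    # our tiny base data pile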
Training: This step takes the data pile and uses it to train a model. Below are the steps in training models:
Select model: First we need to determine the type of model based on what we want to achieve. For example, we can use a Classifier for face recognition, a Large Language Model (LLM) for a chatbot, a Regressor to predict weather temperature, etc.
Tokenize: The text in the data pile is broken down into tokens, which can be individual words or characters. For example, the word "tokenization" could be tokenized into a list of characters like so: ["t", "o", "k", "e", "n", "i", "z", "a", "t", "i", "o", "n"]. This step mostly applies to text data (see the tokenization sketch after this list).
Training: This is where the actual training of the model happens. It can take anywhere from a few minutes to months, depending on the model and the size of the data pile (a tiny training sketch also follows this list).
The end product of training is a trained model.
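To make the tokenize step concrete, here is a minimal sketch showing character-level and word-level tokens; real models usually use more sophisticated sub-word tokenizers:

text = "tokenization is fun"

char_tokens = list("tokenization")    # character-level tokens, as in the example above
word_tokens = text.split()            # word-level tokens: ["tokenization", "is", "fun"]

print(char_tokens)
print(word_tokens)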
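And here is a tiny end-to-end training sketch, assuming scikit-learn is installed and using made-up data: we select a classifier, feed it a small data pile, and get back a trained model.

from sklearn.tree import DecisionTreeClassifier

# Hypothetical data pile: [height_cm, weight_kg] labeled 0 = cat, 1 = dog
X = [[25, 4], [30, 5], [60, 25], [70, 30]]
y = [0, 0, 1, 1]

model = DecisionTreeClassifier()      # select the model
model.fit(X, y)                       # the actual training step

print(model.predict([[65, 28]]))      # e.g. [1], meaning the model predicts "dog"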
Validation: Here we determine the performance of our trained model. A model card is usually used to record this.
A model card documents the model along with its benchmark scores, so its performance can be compared against the requirements.
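A minimal validation sketch, again assuming scikit-learn and made-up data: we score the trained model on examples it has never seen and record the result in a dictionary standing in for a model card.

from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X_train, y_train = [[25, 4], [30, 5], [60, 25], [70, 30]], [0, 0, 1, 1]
X_test, y_test = [[28, 4], [68, 29]], [0, 1]          # held-out examples for validation

model = DecisionTreeClassifier().fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))

model_card = {
    "model": "DecisionTreeClassifier",
    "task": "toy cat vs dog classifier",
    "accuracy": accuracy,
}
print(model_card)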
Tuning: This step is where we fine-tune the trained model. This usually involves further training on a smaller, task-specific dataset and making some adjustments to fit the requirement.
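The sketch below, written with PyTorch purely for illustration, captures the idea: keep most of a pre-trained model frozen and train only a small new part on a smaller dataset. The layer sizes and data are made up.

import torch
import torch.nn as nn

pretrained_body = nn.Sequential(nn.Linear(16, 8), nn.ReLU())   # stands in for a foundation model
new_head = nn.Linear(8, 2)                                     # the small part we fine-tune

for param in pretrained_body.parameters():
    param.requires_grad = False           # freeze what the base model already learned

optimizer = torch.optim.Adam(new_head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x, y = torch.randn(32, 16), torch.randint(0, 2, (32,))         # hypothetical small fine-tuning dataset

for _ in range(10):                       # a few quick passes over the small dataset
    optimizer.zero_grad()
    loss = loss_fn(new_head(pretrained_body(x)), y)
    loss.backward()
    optimizer.step()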
Deployment: The trained model can be deployed to a public cloud like AWS or DigitalOcean. It can also be embedded in an application. Now it's ready for use!
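For example, here is a minimal sketch, assuming Flask and scikit-learn are installed, of embedding a trained model in a tiny web application that serves predictions; a real deployment would load a saved model and run behind a production-grade server:

from flask import Flask, request, jsonify
from sklearn.tree import DecisionTreeClassifier

# Train a toy model at startup; a real app would load an already-trained, saved model instead.
model = DecisionTreeClassifier().fit([[25, 4], [30, 5], [60, 25], [70, 30]], [0, 0, 1, 1])

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    features = request.get_json()["features"]          # e.g. {"features": [65, 28]}
    prediction = int(model.predict([features])[0])
    return jsonify({"prediction": prediction})

if __name__ == "__main__":
    app.run()    # in production this would run on a cloud platform such as AWS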
There are several subdomains and subsets under Machine Learning, but we'll group ours into 5 subsets:
Supervised Learning
Unsupervised Learning
Semi-supervised Learning
Reinforcement Learning
Deep Learning
Phew! We've tried.
One of the images in this chapter is AI-generated; see if you can find it and let me know in the comments.
It'll only keep getting more interesting as we progress. Our next chapter is Supervised Learning. See you!