In the fast-growing world of technology, Machine Learning is one of the hottest topics right now. By the year 2025, an estimated 30 billion IoT devices are expected to be connected around the world. That means tons of data will be generated and distributed over the internet, and this is where Machine Learning comes into the picture.
But wait! What exactly is Machine Learning?
In purely technical terms, a computer program is said to learn from Experience (E) with respect to some task (T) and some performance measure (P) if its performance on T, as measured by P, improves with Experience (E). – Tom Mitchell 1998
Confusing, right? Let me explain it in a different way. Imagine the time when you were learning to drive a car. You had no idea how or when to change gears, how hard to press the accelerator, or how much distance to keep from the car in front of you. You relied completely on the instructions of the driving instructor, who was definitely judging your driving skills.
In this example, the Task (T) was to drive the car and your performance measure (P) was the judgement of the instructor sitting next to you. After a few days of driving you gained some Experience (E) with the controls, and now your driving skills are much better than they were on the very first day.
After a month of training, your driving instructor looks at you and says: your driving (T) has improved a lot with the experience (E) of the last month, as I have noticed (P). Now you are ready to drive all by yourself.
And that’s exactly how Machine Learning works. A machine learning model takes in data and, depending on what it has learned from that data, produces an output. The output is then scored against a performance metric, which tells us how accurate our ML model is.
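To make the T, E and P idea concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the task (T) is deciding whether a number is at least 50, the experience (E) is a handful of labelled examples, and the performance measure (P) is accuracy on a small test set. The function names are hypothetical, and the "model" is just a single learned cut-off rather than a real ML algorithm.

```python
def train_threshold(examples):
    """Learn a cut-off from (value, label) pairs.

    The cut-off is the midpoint between the largest value labelled 0
    and the smallest value labelled 1.
    """
    highest_zero = max(value for value, label in examples if label == 0)
    lowest_one = min(value for value, label in examples if label == 1)
    return (highest_zero + lowest_one) / 2


def predict(threshold, value):
    """Task (T): output 1 if the value is at or above the cut-off, else 0."""
    return 1 if value >= threshold else 0


def accuracy(threshold, examples):
    """Performance measure (P): fraction of examples predicted correctly."""
    correct = sum(predict(threshold, value) == label for value, label in examples)
    return correct / len(examples)


# The true rule (unknown to the model): label is 1 when value >= 50.
test_set = [(value, 1 if value >= 50 else 0) for value in range(0, 100, 5)]

# Experience (E): a little of it, and then some more.
little_experience = [(10, 0), (60, 1)]
more_experience = little_experience + [(40, 0), (48, 0), (52, 1), (90, 1)]

acc_small = accuracy(train_threshold(little_experience), test_set)
acc_large = accuracy(train_threshold(more_experience), test_set)

print(acc_small)  # 0.85
print(acc_large)  # 1.0
```

With only two examples the learned cut-off lands at 35, so a few test values are misjudged. With six examples it lands at 50 and the score goes up, which is exactly the "performance on T, measured by P, improves with E" part of the definition.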
So what are the Learning Methods?
Learning methods are the techniques by which a model learns the patterns in the data. The most common learning techniques are:
Supervised Machine Learning
In supervised learning, the data fed to the algorithm includes the desired solution. The training dataset contains labelled inputs together with the correct outputs, which allows the model to learn over time. The model's accuracy is measured as training proceeds, and the model is adjusted until the error is sufficiently small.
Supervised learning is further divided into two categories:
1. Classification
Classification algorithms are used to sort data into different categories. The model learns from the labelled dataset, and when new data is presented it tries to label it based on the patterns it learned during training.
A simple example is an email spam filter, where each training email is labelled as spam or not spam. When a new email arrives, the model either classifies it as spam or sends it to your inbox, based on previous experience.
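The spam filter idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four training emails are invented, the function names are hypothetical, and the technique (a tiny word-count Naive Bayes classifier with add-one smoothing) is one of several that could be used here. A real filter would train on thousands of emails.

```python
import math
from collections import Counter

LABELS = ("spam", "not spam")


def train(labelled_emails):
    """Count how often each word appears under each label."""
    word_counts = {label: Counter() for label in LABELS}
    email_counts = Counter()
    for text, label in labelled_emails:
        email_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, email_counts


def classify(model, text):
    """Return the label with the highest Naive Bayes log-score."""
    word_counts, email_counts = model
    vocabulary = set()
    for label in LABELS:
        vocabulary |= set(word_counts[label])
    total_emails = sum(email_counts.values())

    scores = {}
    for label in LABELS:
        # Start from how common the label is in the training data.
        score = math.log(email_counts[label] / total_emails)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            seen = word_counts[label][word] + 1
            score += math.log(seen / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up labelled training data.
training_emails = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("lunch with the team monday", "not spam"),
]

model = train(training_emails)
print(classify(model, "claim your free money"))      # spam
print(classify(model, "agenda for the team lunch"))  # not spam
```

Neither test sentence appears in the training data. The model labels them purely from the word patterns it saw during training.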
2. Regression
Regression algorithms are used to understand the relationship between dependent and independent variables. They are commonly used to make predictions of numeric values.
One simple example is predicting house prices in a neighbourhood. The prices depend on many factors such as crime rate, availability of public transport, distance from the city centre, number of rooms, etc. The regression model learns the relationship between these variables and the price, and uses it to predict house prices.
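As a rough sketch of the house price example, the Python snippet below fits a straight line through a few data points using ordinary least squares. To keep it short it uses a single factor (number of rooms), and the prices are invented numbers in thousands. The function name is hypothetical, and a real model would use many more factors and far more data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one variable: returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training data: number of rooms -> price in thousands.
rooms = [2, 3, 4, 5]
prices = [155, 195, 255, 295]

slope, intercept = fit_line(rooms, prices)

# Predict the price of a house with 6 rooms, which is not in the data.
predicted_price = intercept + slope * 6

print(slope)            # 48.0
print(intercept)        # 57.0
print(predicted_price)  # 345.0
```

Here the slope says that, in this toy data, each extra room adds about 48 thousand to the price. That is the "relationship between variables" the regression model learns.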
Unsupervised Machine Learning
In unsupervised learning, the model looks for hidden patterns or internal structure in the data without any prior knowledge of labels or outputs. A simple example is clustering a group of employees based on their pay scale.
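The pay scale example might look something like the Python sketch below. It uses k-means, which is just one possible clustering algorithm, in a simplified one-dimensional form. The salaries are invented (in thousands), the function name is hypothetical, and the starting centroids are simply spread between the lowest and highest salary so the result is repeatable (this needs k to be at least 2).

```python
def k_means_1d(values, k=2, iterations=10):
    """Group numbers into k clusters. Assumes k >= 2."""
    lowest, highest = min(values), max(values)
    # Spread the starting centroids evenly between the extremes.
    centroids = [lowest + i * (highest - lowest) / (k - 1) for i in range(k)]

    for _ in range(iterations):
        # Step 1: assign every value to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for value in values:
            nearest = min(range(k), key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Step 2: move each centroid to the mean of its cluster.
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Made-up salaries in thousands. Note that there are no labels at all.
salaries = [30, 32, 35, 80, 85, 90]

centroids, clusters = k_means_1d(salaries, k=2)
print(clusters)  # [[30, 32, 35], [80, 85, 90]]
```

Nobody told the algorithm which employees are "junior" or "senior". It found the two groups on its own from the structure of the data.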
There are many clustering algorithms available and they will be discussed in more detail in upcoming articles.
Semi-Supervised Machine Learning
When an algorithm has to deal with partially labelled data, or with a lot of unlabelled data, semi-supervised learning tries to improve the prediction accuracy by using both the labelled and the unlabelled data. It is expected to perform better in such cases than using just one of the methods. Usually, semi-supervised learning is chosen when labelling the data requires skilled people and relevant resources, whereas acquiring unlabelled data generally doesn’t require additional resources.
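One common way to do this is self-training: train on the few labelled points, use that model to guess labels for the unlabelled points, then retrain on everything. The Python sketch below shows the idea with a very simple nearest-centroid classifier on made-up one-dimensional data. The function names and numbers are assumptions for illustration, and real self-training usually keeps only the guesses the model is confident about.

```python
def fit_centroids(labelled):
    """Compute the mean value of each label from (value, label) pairs."""
    sums, counts = {}, {}
    for value, label in labelled:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, value):
    """Return the label whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))


def self_train(labelled, unlabelled, rounds=3):
    """Refine the centroids using guessed labels for the unlabelled data."""
    centroids = fit_centroids(labelled)
    for _ in range(rounds):
        # Guess a label for every unlabelled value with the current model.
        pseudo_labelled = [(value, predict(centroids, value)) for value in unlabelled]
        # Retrain on the real labels plus the guessed ones.
        centroids = fit_centroids(labelled + pseudo_labelled)
    return centroids


# Made-up data: only two points have labels.
labelled = [(1.0, "low"), (6.0, "high")]
unlabelled = [0.5, 1.5, 2.0, 3.0, 7.0, 8.0, 8.5, 9.0]

supervised_only = fit_centroids(labelled)
semi_supervised = self_train(labelled, unlabelled)

print(predict(supervised_only, 4.0))  # high
print(predict(semi_supervised, 4.0))  # low
```

The unlabelled points reveal that the "high" group actually sits further to the right than the single labelled example suggested, so the prediction for 4.0 changes once they are taken into account.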
Reinforcement Machine Learning
In reinforcement learning, the model is rewarded for every good decision and penalized for every wrong one. The model learns from its mistakes and corrects them over time. This learning method allows the algorithm to automatically determine the ideal behaviour in order to improve its performance.
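Going back to the driving analogy, here is a small Python sketch of the reward-and-penalty loop. The situation is a red light with two possible actions, and the reward values are made up: +1 for braking and -1 for accelerating. The function name and parameters are hypothetical, and the strategy used (epsilon-greedy, where the agent mostly picks the best-known action but sometimes tries a random one) is only one simple option.

```python
import random


def train_agent(rewards, episodes=200, epsilon=0.2, learning_rate=0.1, seed=0):
    """Learn an estimated value for each action by trial and error."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    actions = list(rewards)
    value = {action: 0.0 for action in actions}

    for _ in range(episodes):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.choice(actions)
        else:
            # Exploit: pick the action that currently looks best.
            action = max(actions, key=value.get)
        reward = rewards[action]
        # Nudge the estimate towards the reward that was just received.
        value[action] += learning_rate * (reward - value[action])
    return value


# Made-up rewards for the situation "the traffic light is red".
rewards_at_red_light = {"accelerate": -1.0, "brake": 1.0}

learned_values = train_agent(rewards_at_red_light)
best_action = max(learned_values, key=learned_values.get)
print(best_action)  # brake
```

The agent is never told that braking is correct. It only sees the rewards and penalties, and its value estimates drift until the better action wins.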
So now you have a basic understanding of what machine learning is and which learning methods we can use. The picture below shows graphically how the field of machine learning is divided into different categories.
But there are still some questions you should think about, such as when to use supervised learning versus unsupervised learning, and which algorithm to choose. We will learn more about these algorithms and answer these questions in the next article.