Types of learning algorithms. [Machine Learning]
Supervised Learning
Regression
Let's start with the most familiar problem in ML. Suppose I have some data: the size of a house and its price.
If I plot this dataset, I will see a graph like the one below.
Dataset:
This problem is taken directly from the resources of Andrew Ng's machine learning course.
Problem:
Using the dataset above, I was asked:
If your friend's house is 750 square feet, what is its price?
Solution:
If I can find an equation that gives the corresponding price for a given area, that means
y = f(x)
Or,
Price = f(Area)
That means we need to find out what the function f() is. I will not explain here how to find f().
What we learn from this problem
Here we give our algorithm a dataset in which the 'correct answer' is already given (from the graph).
That means we know the real price of houses of certain sizes.
By feeding this data into the algorithm, we teach it how much a house of a given size is worth. This data is called Training Data.
Based on this Training Data, we can then estimate the price of a house whose size was not in the Training Data. For example, suppose I want to know the price of a 3000 sq ft house that is not in the dataset. The model I built can still predict its price, based on what it learned from the training examples.
This is a Regression problem
because we are trying to predict a new continuous value (a price) from values we have already seen. The next example will make the contrast clear.
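To make the idea of Price = f(Area) a little more concrete, here is a minimal sketch. It assumes that f is a straight line and fits it with ordinary least squares, which is only one possible choice and not necessarily the one used in the course. The numbers are made up for illustration and are not the course dataset; the names fit_line, areas and prices are hypothetical.

```python
# Hypothetical training data: area in sq ft and price in thousands of dollars.
# These values are invented for illustration only.
areas = [500.0, 1000.0, 1500.0, 2000.0, 2500.0]
prices = [110.0, 190.0, 310.0, 390.0, 500.0]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel out).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# "Training": learn the line from examples where the correct answer is known.
slope, intercept = fit_line(areas, prices)


def f(area):
    """Predicted price for a house of the given area."""
    return slope * area + intercept


# Neither 750 nor 3000 sq ft is in the training data, but f can still estimate them.
print("Estimated price for 750 sq ft:", f(750.0))
print("Estimated price for 3000 sq ft:", f(3000.0))
```

The output is a number on a continuous scale, which is what makes this a regression problem. Note that 3000 sq ft lies outside the range of the made-up training data, so that estimate relies on the straight-line assumption holding beyond the data.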
Classification
Now let's look at another type of dataset, which records the size of a tumor together with whether a tumor of that size is malignant (deadly) or not.
Here 1 means yes and 0 means no.
Common sense suggests that the larger a tumor is, the more likely it is to be fatal.
But the dataset shows that some tumors are large and yet not fatal, while some small tumors are fatal.
Problem: We want to create a predictive model that can tell whether a tumor is fatal (based on its size).
What we learn from the dataset
This is a classification problem because we give Tumor Size as the input and want a Yes / No answer as the output. We do not want a numeric value here. In the house price case we wanted a value, and a yes / no answer would not have been acceptable, which is why that was a regression problem.
The data has to be divided into two classes with nothing in between: we tag 1 as Malignant and 0 as Not Malignant.
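As a rough illustration of a Yes / No model, the sketch below learns a single cut-off on tumor size (sometimes called a decision stump) by picking the threshold that makes the fewest mistakes on the training data. This is a deliberately simple stand-in and not the method the course uses. The sizes and labels are made up, and the names count_errors and classify are hypothetical.

```python
# Hypothetical training data: tumor size in cm and label (1 = malignant, 0 = not).
# Invented values; note one small malignant tumor (2.0) and one large benign one (3.0).
sizes = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
labels = [0, 0, 1, 0, 0, 1, 1, 1]


def count_errors(threshold):
    """Number of training examples misclassified by the rule 'size > threshold'."""
    return sum(
        1 for size, label in zip(sizes, labels)
        if (1 if size > threshold else 0) != label
    )


# Candidate thresholds: midpoints between neighbouring sizes.
ordered = sorted(sizes)
candidates = [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]

# "Training": keep the threshold with the fewest mistakes on the training data.
best_threshold = min(candidates, key=count_errors)


def classify(size):
    """Return 1 (malignant) or 0 (not malignant); never a value in between."""
    return 1 if size > best_threshold else 0


print("Learned threshold:", best_threshold)
print("Training errors:", count_errors(best_threshold))
print("Prediction for a 4.2 cm tumor:", classify(4.2))
```

Because some small tumors are malignant and some large ones are not, no single threshold separates this data perfectly. That is one reason why a single input is often not enough.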
The same data can also be plotted differently:
Here we are using only one parameter (Size of Tumor) to say whether a tumor is fatal or not. In practice there is rarely just one input; there may be many more, such as Age in addition to Tumor Size.
With more than one input parameter:
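With two inputs, each example becomes a pair (age, tumor size) instead of a single number. The sketch below shows one simple way to classify such points: a 1-nearest-neighbor rule, which copies the label of the most similar training example. This is an assumed technique chosen for brevity, not something taken from the course. The patient data is made up, and the names training, normalize and predict are hypothetical.

```python
# Hypothetical patients: (age in years, tumor size in cm) -> 1 malignant, 0 not.
# Invented values for illustration only.
training = [
    ((25.0, 1.0), 0),
    ((30.0, 1.5), 0),
    ((45.0, 2.0), 0),
    ((50.0, 4.0), 1),
    ((60.0, 3.5), 1),
    ((70.0, 4.5), 1),
]

# Minimum and maximum of each feature, used to put age and size on the same 0..1 scale.
columns = list(zip(*(features for features, _ in training)))
ranges = [(min(col), max(col)) for col in columns]


def normalize(features):
    """Rescale each feature to 0..1 so that age does not dominate the distance."""
    return [(v - lo) / (hi - lo) for v, (lo, hi) in zip(features, ranges)]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(features):
    """Label of the closest training example (1-nearest-neighbor)."""
    query = normalize(features)
    nearest = min(
        training,
        key=lambda row: squared_distance(query, normalize(row[0])),
    )
    return nearest[1]


print("Prediction for age 65, size 4.2 cm:", predict((65.0, 4.2)))
print("Prediction for age 28, size 1.2 cm:", predict((28.0, 1.2)))
```

The same code works unchanged if more inputs are added to each tuple, since the distance is computed over however many features there are.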



