Another type of supervised problem that we encounter is the classification problem. Here the response, or target variable, is categorical; in other words, we have a qualitative response. This just means that for any given input the output can take only one of a predefined set of values. For example, if we have the medical report of a patient who can have one of three diseases, we can frame the diagnosis as a classification problem. The independent variables would be the details of the patient, which might include their symptoms, age, sex, etc.
But this raises a question: why can't we just use regression?
In our example, given an input X we want to predict Y. Y can take three values, and in practice we have to encode them as numbers, for example Disease A = 1, Disease B = 2, Disease C = 3.
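If we use regression here, the algorithm will treat these codes as quantities: it will assume that Disease B is larger than A and smaller than C, and that the gap between A and B is the same as the gap between B and C. Neither is true. These are qualitative responses with no inherent order, and a different but equally valid encoding would produce a different model, which makes regression the wrong choice.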
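To make the encoding problem concrete, here is a minimal sketch in plain Python. The data, the single numeric feature, and the helper names fit_line and predict_label are all made up for illustration. It fits an ordinary least-squares line to the same patients under two different but equally arbitrary encodings and shows that the predicted disease changes.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_label(xs, labels, encoding, x_new):
    """Encode labels as numbers, fit a line, and decode the nearest code."""
    ys = [encoding[label] for label in labels]
    slope, intercept = fit_line(xs, ys)
    raw = slope * x_new + intercept
    decoding = {code: label for label, code in encoding.items()}
    nearest_code = min(decoding, key=lambda code: abs(code - raw))
    return decoding[nearest_code]


# Hypothetical toy data: one numeric feature per patient and the true disease.
feature = [1, 2, 3, 4, 5, 6]
disease = ["A", "A", "B", "B", "C", "C"]

# Two arbitrary encodings of the same three categories.
encoding_1 = {"A": 1, "B": 2, "C": 3}
encoding_2 = {"A": 1, "B": 3, "C": 2}

# Predict for a patient with feature value 6 (true disease: C).
pred_1 = predict_label(feature, disease, encoding_1, 6)
pred_2 = predict_label(feature, disease, encoding_2, 6)
print(pred_1, pred_2)
```

Nothing about the patients changed between the two runs, only the numbers assigned to the diseases, yet the predicted disease differs. A method that depends on an arbitrary labelling in this way is not a sound fit for qualitative responses.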
However, some classification algorithms have been derived from regression algorithms, one of them being logistic regression. It is most commonly used in binary classification problems, like predicting whether or not the patient has Disease A.
The basic idea is simple. First, we encode the qualitative response: Disease A not present = 0, Disease A present = 1. Then we use linear regression to predict this value; if the predicted value is greater than 0.5 we predict 1, otherwise 0.
The only problem is that linear regression can predict values smaller than 0 or greater than 1. To handle that, we pass the output of the linear model through a function that squashes any value into the range between 0 and 1, so the result can be read as the probability that Disease A is present. This function is called the sigmoid function.
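The sigmoid function is sigmoid(z) = 1 / (1 + e^(-z)). The sketch below shows how it turns a linear score into a value between 0 and 1 and how the 0.5 threshold then gives the class. The weights, bias, and feature values are hypothetical numbers chosen for illustration, not learned from data, and the function names are our own.

```python
import math


def sigmoid(z):
    """Map any real number into the range 0 to 1."""
    # Two algebraically equivalent forms, chosen so math.exp never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def predict_probability(features, weights, bias):
    """Linear score (as in linear regression) passed through the sigmoid."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(score)


def predict_class(features, weights, bias, threshold=0.5):
    """Return 1 (Disease A present) if the probability exceeds the threshold."""
    return 1 if predict_probability(features, weights, bias) > threshold else 0


# Hypothetical parameters and patients, for illustration only.
weights = [0.8, -0.5]
bias = -1.0
patient_1 = [3.0, 1.0]   # linear score = 0.9
patient_2 = [0.5, 2.0]   # linear score = -1.6

for patient in (patient_1, patient_2):
    probability = predict_probability(patient, weights, bias)
    print(round(probability, 3), predict_class(patient, weights, bias))
```

In a real model the weights and bias would be fitted to training data. Libraries usually do this by maximising the likelihood of the observed labels rather than by running ordinary least squares first.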
That, in essence, is logistic regression. Like linear regression, it is one of the simplest and most intuitive algorithms out there. Many other algorithms are used for classification as well: K Nearest Neighbours, Random Forest, Support Vector Machines, Neural Networks, etc.
You may have noticed that some of these algorithms were also mentioned in the linear regression section. Most of the algorithms used in practice can be applied to both regression and classification problems with only minor changes.
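As a rough illustration of how small that change can be, here is a minimal K Nearest Neighbours sketch on made-up one-dimensional data. The neighbour search is shared; the regression version averages the neighbours' numeric targets, while the classification version takes a majority vote over their labels. The function names and data are assumptions for this example only.

```python
from collections import Counter


def nearest_targets(xs, targets, query, k):
    """Return the targets of the k training points closest to the query."""
    by_distance = sorted(zip(xs, targets), key=lambda pair: abs(pair[0] - query))
    return [target for _, target in by_distance[:k]]


def knn_regress(xs, values, query, k=3):
    """Regression: average the numeric targets of the k nearest neighbours."""
    neighbours = nearest_targets(xs, values, query, k)
    return sum(neighbours) / len(neighbours)


def knn_classify(xs, labels, query, k=3):
    """Classification: majority vote over the labels of the k nearest neighbours."""
    neighbours = nearest_targets(xs, labels, query, k)
    return Counter(neighbours).most_common(1)[0][0]


# Hypothetical toy data: the same inputs with a numeric and a categorical target.
xs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
values = [1.1, 1.9, 3.2, 6.1, 6.8, 8.3]
labels = ["low", "low", "low", "high", "high", "high"]

regression_result = knn_regress(xs, values, 2.5)
classification_result = knn_classify(xs, labels, 2.5)
print(regression_result, classification_result)
```

Only the final aggregation step differs between the two functions; everything else is reused.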