Last class we talked about how we can use linear regression to model linear relationships between features $X$ and a target $y$.
But not all relationships between features and targets are linear. Sometimes the target is categorical (e.g., different instruments, or different musical genres).
You can use the logistic regression formula to find a vector $w$ and a bias term $b$ that allow you to transform the features $X$ into values $\hat{y}$ between $0$ and $1$.
The logistic regression formula is $\hat{y} = \sigma(Xw + b)$, where $\sigma(z) = \frac{1}{1 + e^{-z}}$.
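Below is a minimal sketch of this formula in Python, assuming NumPy and hypothetical names `X` (an N-by-D feature matrix), `w` (a length-D weight vector), and `b` (a scalar bias):

```python
import numpy as np

def sigmoid(z):
    # squash any real-valued input into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(X, w, b):
    # logistic regression: apply the sigmoid to the linear combination Xw + b
    return sigmoid(X @ w + b)
```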
Once we have transformed our features into $\hat{y}$, we can define a threshold (usually $0.5$) below (above) which all values in $\hat{y}$ will be treated as zeros (ones).
With this procedure, we can assess the performance of our logistic regression model against the ground-truth data $y$.
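Continuing the sketch above (with a hypothetical array `y` of 0/1 labels), thresholding and scoring could look like:

```python
y_hat = predict_proba(X, w, b)       # predicted probabilities in (0, 1)
y_pred = (y_hat >= 0.5).astype(int)  # threshold: below 0.5 -> 0, at or above -> 1
accuracy = np.mean(y_pred == y)      # fraction of predictions matching the ground truth
```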
Last class, when we optimized linear regression, we used the mean squared error loss $L = \frac{1}{N}\sum_{i=1}^{N}(y_i - \hat{y}_i)^2$.
For logistic regression we must use the binary cross-entropy loss, which is defined by $L = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i)\right]$ (the origins of this function come from statistics; if you are curious, you should take or review the materials for an introductory machine learning class, like Stanford’s CS229).
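As a sketch of this loss in NumPy (the `eps` clipping is an assumption added here so that `log(0)` never occurs):

```python
def binary_cross_entropy(y, y_hat, eps=1e-12):
    # clip predictions away from exactly 0 or 1 so the logs stay finite
    y_hat = np.clip(y_hat, eps, 1.0 - eps)
    return -np.mean(y * np.log(y_hat) + (1.0 - y) * np.log(1.0 - y_hat))
```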
Inspecting the binary cross-entropy loss, you can see that when $y_i = 1$, the loss contribution of example $i$ reduces to $-\log(\hat{y}_i)$. In contrast, when $y_i = 0$, it reduces to $-\log(1 - \hat{y}_i)$.
When minimizing the binary cross-entropy loss using an algorithm like gradient descent, what we are effectively doing is making $\hat{y}$ and $y$ as similar to each other as possible.
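For the sigmoid combined with binary cross-entropy, the gradient with respect to $w$ simplifies to $\frac{1}{N} X^\top (\hat{y} - y)$, and with respect to $b$ to the mean of $\hat{y} - y$. A sketch of the resulting gradient-descent loop (the learning rate and step count are illustrative assumptions):

```python
learning_rate = 0.1           # assumed hyperparameter
for step in range(1000):      # assumed number of gradient steps
    y_hat = predict_proba(X, w, b)
    grad_w = X.T @ (y_hat - y) / len(y)  # dL/dw
    grad_b = np.mean(y_hat - y)          # dL/db
    w = w - learning_rate * grad_w
    b = b - learning_rate * grad_b
```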
Question: why does the binary cross-entropy loss have a negative sign at the beginning?
When we are done optimizing our logistic regression model, we must evaluate it using our validation data splits (also, remember the evaluation data!).
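With hypothetical held-out arrays `X_val` and `y_val`, this evaluation could reuse the pieces above:

```python
# accuracy on the validation split, using the trained w and b
val_accuracy = np.mean((predict_proba(X_val, w, b) >= 0.5).astype(int) == y_val)
```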