Regression (Linear and Logistic)
Linear regression is a way to fit a straight line through data points so you can predict something that’s a number. Let’s say you want to guess someone’s weight based on their height. You plot a bunch of people’s heights and weights, then draw the line that fits them best. From that line, you can “read off” the weight that goes with a new height.
An example where this works is predicting sales from advertising. If a company spends more money on ads and usually sees sales go up, linear regression can help figure out about how much sales will increase if they spend a certain amount.
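Here is a minimal sketch of the ads-and-sales idea in plain Python. The numbers in ad_spend and sales are made up for illustration, and fit_line and predict are just names picked for this example. It uses the standard closed-form formula for the least-squares line with one input.

```python
# Hypothetical data, made up for illustration:
# ad spend (in thousands of dollars) and sales (in units) for five months.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [12.0, 15.0, 19.0, 22.0, 27.0]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = how x and y vary together, divided by how much x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The best-fit line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(ad_spend, sales)


def predict(x):
    """Read the predicted sales off the fitted line for a new ad spend."""
    return slope * x + intercept


print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"predicted sales at ad spend 6.0: {predict(6.0):.1f}")
```

With this made-up data the slope comes out to about 3.7, so each extra thousand dollars of ads goes with roughly 3.7 more units of sales.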
Logistic regression is different. Instead of predicting numbers, it’s about predicting categories, usually yes or no. For example, is an email spam or not spam? It starts out like linear regression, computing a weighted sum of the inputs, but then squishes the result into a probability between zero and one. If it’s close to one, that means yes; if it’s close to zero, that means no. A common cutoff is 0.5.
Now the bigger picture: classification vs regression. Classification is when you’re sorting things into groups, like spam or not spam, cat or dog, pass or fail. That’s what Project 2 is focused on. Regression is about predicting continuous values, like test scores, salaries, or house prices. That’s going to be more like Project 3.
The difference between linear and logistic regression is pretty simple. Linear regression predicts continuous numbers with a line. Logistic regression uses that line but then transforms it so you can make yes or no predictions.
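In linear regression, there’s something called the Sum of Squared Errors (SSE). For each data point, you take the difference between the actual value and the value the line predicts, square it, and add all of those squares up. That total measures how far off your predictions are. The line of best fit is the line that makes the SSE as small as possible. A cost function is what keeps score of how good or bad the line is (here, the SSE is the cost function), and gradient descent is an algorithm that adjusts the line’s slope and intercept little by little, each time in the direction that lowers the cost, until it stops improving.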
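To make SSE and gradient descent concrete, here is a rough sketch in plain Python. The data is made up, and the learning rate (0.01) and step count (5000) are assumptions chosen so this small example settles down; other data would likely need different values.

```python
# Hypothetical data, made up for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [12.0, 15.0, 19.0, 22.0, 27.0]


def sse(slope, intercept):
    """Sum of squared errors: add up (actual - predicted)^2 over all points."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


def gradient_descent(learning_rate=0.01, steps=5000):
    """Nudge slope and intercept downhill on the SSE, one small step at a time."""
    slope, intercept = 0.0, 0.0  # start from a flat line through zero
    for _ in range(steps):
        # Partial derivatives of the SSE with respect to slope and intercept.
        grad_slope = sum(-2 * x * (y - (slope * x + intercept)) for x, y in zip(xs, ys))
        grad_intercept = sum(-2 * (y - (slope * x + intercept)) for x, y in zip(xs, ys))
        # Step against the gradient, because that direction lowers the cost.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


start_cost = sse(0.0, 0.0)
slope, intercept = gradient_descent()
end_cost = sse(slope, intercept)

print(f"SSE before: {start_cost:.1f}, SSE after: {end_cost:.2f}")
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

If the learning rate is set too high, the steps overshoot and the cost grows instead of shrinking, which is why small steps are the usual starting point.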
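With logistic regression, you’ll hear about log odds. The odds of something are p / (1 - p), where p is its probability, and the log odds are just the logarithm of that. Probabilities are stuck between zero and one, but log odds can be any number, positive or negative, which is exactly the kind of thing a straight line can produce. The big move logistic regression makes is computing a linear combination of the inputs first (the same kind of line equation linear regression uses), treating that as the log odds, then transforming it with a sigmoid function. That’s what forces the output into that zero-to-one probability range, which is why it works for classification problems.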
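A small sketch of the sigmoid and log odds in plain Python. The weight and bias values and the "number of suspicious words" feature are hypothetical, picked by hand for illustration; a real model would learn its weights from data.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def log_odds(p):
    """Turn a probability into log odds; this undoes the sigmoid."""
    return math.log(p / (1.0 - p))


# Hypothetical, hand-picked parameters (not learned from data).
weight, bias = 1.5, -3.0


def predict_spam_probability(num_suspicious_words):
    # Linear part: same form as a line, slope * x + intercept.
    z = weight * num_suspicious_words + bias
    # Sigmoid turns that score (the log odds) into a probability.
    return sigmoid(z)


def classify(num_suspicious_words, threshold=0.5):
    """Say 'spam' when the predicted probability reaches the threshold."""
    p = predict_spam_probability(num_suspicious_words)
    return "spam" if p >= threshold else "not spam"


for count in (0, 2, 4):
    p = predict_spam_probability(count)
    print(f"{count} suspicious words: p={p:.3f}, label={classify(count)}")
```

Passing a probability from sigmoid back through log_odds returns the original linear score, which is the sense in which the line is modeling the log odds.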
