Hey, just had a few questions on feature scaling from step 4 of the code.
The first is about the impact of scaling/centering the data on Ein. We discussed in class that the RT of PLA only improves when centering the data, not when scaling it. I assume this improvement carries over to the RT of other learning algorithms as well. However, when discussing learning algorithms, we stated that while mean centering affects neither dvc nor the generalization error, it often improves Ein. I'm curious why there is an improvement in Ein.
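To be concrete about what I mean by centering vs. scaling, here's a rough sketch (not the step 4 code itself, just my assumption of the two operations, using scikit-learn's StandardScaler with its with_mean / with_std flags):

```python
# Sketch only: centering vs. scaling as two separate operations.
import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])

# Centering only: subtract each feature's mean, leave the spread alone.
X_centered = StandardScaler(with_mean=True, with_std=False).fit_transform(X)

# Scaling only: divide each feature by its std, leave the mean alone.
X_scaled = StandardScaler(with_mean=False, with_std=True).fit_transform(X)

print(X_centered.mean(axis=0))  # ~[0, 0] after centering
print(X_scaled.std(axis=0))     # ~[1, 1] after scaling
```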
I'm also curious when MaxAbsScaler is applicable, since it only scales the data and thus affects neither dvc nor RT.
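For reference, this is the behavior I'm asking about (a minimal sketch with scikit-learn's MaxAbsScaler, not the course code):

```python
# Sketch only: MaxAbsScaler rescales each feature by its max absolute value,
# with no centering, so zeros (and hence sparsity) are preserved.
import numpy as np
from sklearn.preprocessing import MaxAbsScaler

X = np.array([[ 1.0, -200.0],
              [ 0.0,  400.0],
              [-4.0,    0.0]])

X_maxabs = MaxAbsScaler().fit_transform(X)
print(X_maxabs)  # every column now lies in [-1, 1]; zeros stay zero
```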
Lastly, I would like to know more about normalizing the data with the L1 and L2 norms. The scikit-learn documentation says this rescales every sample to have an L1/L2 norm of 1. Visually I interpret this as scaling all data points to lie on the L1 diamond / L2 unit circle. Is this interpretation correct, and how does it affect a model's ability to classify those points? If my interpretation is correct, I imagine a linear classifier would have a higher Ein.
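And here is how I'm reading the Normalizer docs (again just a sketch of my interpretation, not the course code): each row/sample is rescaled to unit norm, so the points land on the L1 diamond or the L2 unit circle.

```python
# Sketch only: per-sample (row-wise) normalization to unit L1 / L2 norm.
import numpy as np
from sklearn.preprocessing import Normalizer

X = np.array([[3.0, 4.0],
              [1.0, 1.0]])

X_l2 = Normalizer(norm='l2').fit_transform(X)
X_l1 = Normalizer(norm='l1').fit_transform(X)

print(np.linalg.norm(X_l2, ord=2, axis=1))  # [1., 1.] -> points on the unit circle
print(np.abs(X_l1).sum(axis=1))             # [1., 1.] -> points on the L1 diamond
```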