Feature Scaling Questions #208

Open
kjiang25 opened this issue Dec 9, 2024 · 0 comments

Hey, I just had a few questions about feature scaling from step 4 of the code.

The first concerns the impact of scaling/centering the data on Ein. We discussed in class that the running time (RT) of PLA improves only when centering the data, not when scaling it, and I assume this improvement carries over to the RT of other learning algorithms as well. However, when discussing learning algorithms we also said that while mean centering affects neither dvc nor the generalization error, it often improves Ein. I'm curious why Ein improves.
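
To make the RT part concrete, here's a quick PLA sketch I put together (my own toy code and synthetic data, not from step 4). It counts weight updates on raw vs. mean-centered copies of the same separable set; the exact counts depend on the draw, but centering a cloud that sits far from the origin typically shrinks the data radius R in the PLA bound (R/margin)^2 and with it the update count:

```python
import numpy as np

rng = np.random.default_rng(0)

def pla_updates(X, y, max_updates=100_000):
    """Classic perceptron learning algorithm; returns the number of
    weight updates until the data are perfectly separated."""
    Xb = np.hstack([np.ones((len(X), 1)), X])  # prepend a bias coordinate
    w = np.zeros(Xb.shape[1])
    for updates in range(max_updates):
        wrong = np.flatnonzero(np.sign(Xb @ w) != y)
        if wrong.size == 0:
            return updates
        i = wrong[0]          # pick the first misclassified point
        w += y[i] * Xb[i]     # PLA update rule
    return max_updates

# Hypothetical linearly separable 2-D data sitting far from the origin.
X = np.vstack([rng.normal([50.0, 50.0], 1.0, size=(100, 2)),
               rng.normal([58.0, 58.0], 1.0, size=(100, 2))])
y = np.array([-1] * 100 + [1] * 100)

print("raw:     ", pla_updates(X, y), "updates")
print("centered:", pla_updates(X - X.mean(axis=0), y), "updates")
```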

I'm also curious when MaxAbsScaler is applicable, since it only scales the data and therefore affects neither dvc nor RT.
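
For reference, here's a tiny sketch (my own toy matrix) of what MaxAbsScaler actually does. The usual pitch for it is sparse data: dividing each column by its maximum absolute value keeps zeros at zero and preserves signs, whereas centering would fill in the zeros and destroy sparsity:

```python
import numpy as np
from sklearn.preprocessing import MaxAbsScaler

X = np.array([[ 1.0, -200.0, 0.0],
              [ 2.0,   50.0, 0.0],
              [-4.0,  100.0, 3.0]])

Xs = MaxAbsScaler().fit_transform(X)
print(Xs)
# Each column is divided by its max absolute value (4, 200, 3), so every
# entry lands in [-1, 1]; signs and zeros (hence sparsity) are preserved,
# and nothing is centered.
```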

Lastly, I would like to know more about normalizing the data with the L1 and L2 norms. The scikit-learn documentation says this scales every sample to have unit L1/L2 norm. Visually, I interpret this as projecting all points onto the L1 diamond / L2 circle. Is that interpretation correct, and how does it affect a model's ability to classify those points? If it is, I'd expect a linear classifier to end up with a higher Ein, since samples that differ only in magnitude become indistinguishable.
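
Here's a quick check of that geometric picture using sklearn.preprocessing.normalize on a hypothetical matrix: every nonzero row is rescaled to unit L1/L2 norm, i.e. projected radially onto the diamond/circle, so only each sample's direction survives:

```python
import numpy as np
from sklearn.preprocessing import normalize

X = np.array([[ 3.0, 4.0],
              [ 1.0, 1.0],
              [-2.0, 0.5]])

X_l2 = normalize(X, norm="l2")  # rows land on the unit circle
X_l1 = normalize(X, norm="l1")  # rows land on the L1 diamond

print(np.linalg.norm(X_l2, axis=1))  # all 1.0
print(np.abs(X_l1).sum(axis=1))      # all 1.0
# Only the direction of each sample survives: two points that differed
# only in magnitude map to the same spot, so a linear classifier can
# separate samples only by their angle from the origin.
```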
