Hey, just had a few questions on feature scaling from step 4 of the code.
The first is about the impact of scaling/centering the data on Ein. We discussed in class that the RT of PLA only improves when centering the data, not when scaling it. I assume this improvement carries over to the RT of other learning algorithms as well. However, when discussing learning algorithms, we stated that while mean centering affects neither dvc nor the generalization error, it often improves Ein. I'm curious why there is an improvement in Ein.
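To be concrete about what I mean by centering vs. scaling, here's a rough sketch (not the step 4 code itself, just my assumption of the two operations, using scikit-learn's StandardScaler with its with_mean / with_std flags):

```python
# Sketch only: centering vs. scaling as two separate operations.
import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])

# Centering only: subtract each feature's mean, leave the spread alone.
X_centered = StandardScaler(with_mean=True, with_std=False).fit_transform(X)

# Scaling only: divide each feature by its std, leave the mean alone.
X_scaled = StandardScaler(with_mean=False, with_std=True).fit_transform(X)

print(X_centered.mean(axis=0))  # ~[0, 0] after centering
print(X_scaled.std(axis=0))     # ~[1, 1] after scaling
```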
I'm also curious when MaxAbsScaler is applicable, since it only scales the data and thus affects neither dvc nor RT.
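For reference, this is the behavior I'm asking about (a minimal sketch with scikit-learn's MaxAbsScaler, not the course code):

```python
# Sketch only: MaxAbsScaler rescales each feature by its max absolute value,
# with no centering, so zeros (and hence sparsity) are preserved.
import numpy as np
from sklearn.preprocessing import MaxAbsScaler

X = np.array([[ 1.0, -200.0],
              [ 0.0,  400.0],
              [-4.0,    0.0]])

X_maxabs = MaxAbsScaler().fit_transform(X)
print(X_maxabs)  # every column now lies in [-1, 1]; zeros stay zero
```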
Lastly, I would like to know more about normalizing the data with the L1 and L2 norms. The scikit-learn documentation says this rescales every sample to have an L1/L2 norm of 1. Visually I interpret this as scaling all data points to lie on the L1 diamond / L2 unit circle. Is this interpretation correct, and how does it affect a model's ability to classify those points? If my interpretation is correct, I imagine a linear classifier would have a higher Ein.
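And here is how I'm reading the Normalizer docs (again just a sketch of my interpretation, not the course code): each row/sample is rescaled to unit norm, so the points land on the L1 diamond or the L2 unit circle.

```python
# Sketch only: per-sample (row-wise) normalization to unit L1 / L2 norm.
import numpy as np
from sklearn.preprocessing import Normalizer

X = np.array([[3.0, 4.0],
              [1.0, 1.0]])

X_l2 = Normalizer(norm='l2').fit_transform(X)
X_l1 = Normalizer(norm='l1').fit_transform(X)

print(np.linalg.norm(X_l2, ord=2, axis=1))  # [1., 1.] -> points on the unit circle
print(np.abs(X_l1).sum(axis=1))             # [1., 1.] -> points on the L1 diamond
```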