Right now, MLJModels.jl has ContinuousEncoder, which automatically transforms categorical features into one of two codings:
- Dummy variable coding (one-hot with the last category dropped)
- Redundant variable coding (full one-hot)
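
To make the difference between these two codings concrete, here is a minimal sketch in plain Python. It is not MLJModels.jl code: the helper names `one_hot` and `dummy` and the example levels are made up for illustration.

```python
def one_hot(levels):
    """Redundant coding: one indicator column per level."""
    k = len(levels)
    return {lvl: [1 if j == i else 0 for j in range(k)]
            for i, lvl in enumerate(levels)}


def dummy(levels):
    """Dummy coding: one-hot with the last level's column dropped."""
    return {lvl: row[:-1] for lvl, row in one_hot(levels).items()}


levels = ["low", "mid", "high"]  # hypothetical example levels
full = one_hot(levels)           # 3 columns, one per level
reduced = dummy(levels)          # 2 columns; "high" is the all-zero baseline
```

With dummy coding, a penalty on the coefficients shrinks every level towards the dropped level; with the redundant coding, every level is shrunk towards 0.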
However, these aren't always the most useful codings for effective regularization, and there are many others in common use. For example, a common way to encode ordinal variables is sequential difference coding. With this coding, each coefficient is the difference between two adjacent categories, so regularization pulls adjacent categories closer together. This can improve model performance relative to either treating ordered variables as categorical (which discards the ordering information) or treating them as continuous (which assumes equal distances between categories, an assumption that is often incorrect). Similarly, effect coding lets you regularize categories towards the grand mean, rather than regularizing every category towards 0 or regularizing all categories towards one other category.
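
As a rough illustration of the two codings I'm asking for, here is a sketch in plain Python of the contrast matrices. This is not an existing or proposed MLJModels.jl API. The function names are hypothetical, the sequential difference matrix uses the centered form so that the intercept is the grand mean, and the effect coding takes the last level as the reference, which is my assumption.

```python
from fractions import Fraction


def seq_diff_coding(k):
    """k x (k-1) centered sequential difference matrix.

    Coefficient j (1-based) is the step from level j to level j+1.
    """
    return [[Fraction(j, k) if i >= j else Fraction(j - k, k)
             for j in range(1, k)]
            for i in range(k)]


def effect_coding(k):
    """k x (k-1) effect (sum) coding; the last level is coded -1 everywhere."""
    rows = [[1 if j == i else 0 for j in range(k - 1)] for i in range(k - 1)]
    rows.append([-1] * (k - 1))
    return rows


def level_means(matrix, intercept, coefs):
    """Predicted mean of each level for a given intercept and coefficients."""
    return [intercept + sum(c * b for c, b in zip(row, coefs))
            for row in matrix]


# Sequential difference: coefficients are the gaps between adjacent levels.
seq_means = level_means(seq_diff_coding(4), Fraction(10), [1, 2, 3])
seq_steps = [b - a for a, b in zip(seq_means, seq_means[1:])]

# Effect coding: coefficients are deviations from the grand mean.
eff_means = level_means(effect_coding(3), 10, [2, -1])
```

In both cases the intercept equals the unweighted mean of the level means, so shrinking the coefficients towards 0 pulls adjacent levels together (sequential difference) or pulls each level towards the grand mean (effect coding).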