Inverted Uplift scores of revert label (CLassTransformation) ? #199

Open
CoteDave opened this issue Oct 13, 2022 · 4 comments
Labels
bug Something isn't working

Comments

CoteDave commented Oct 13, 2022

🐛 Bug

scikit-uplift==0.5.1

Hi, I just compared the uplift scores of ClassTransformation with other uplift strategies (SoloModel (S-learner), X-learner), and the ClassTransformation scores seem very off!

Since ClassTransformation transforms the target before training, should it transform the predicted uplift scores back in the output? How should the scores be interpreted compared to the other strategies? Is this a bug?

image

And if we draw the qini curves, we clearly see that the uplift scores of the revert label (ClassTransformation) seem inverted:
image

Nothing special in the .fit() call: X, Y, T and the estimator are the same for SoloModel() and ClassTransformation().

@CoteDave CoteDave changed the title Uplift scores of revert label (CLassTransformation) is ok ? Uplift scores of revert label (CLassTransformation) output inverted ? Oct 13, 2022
@CoteDave CoteDave changed the title Uplift scores of revert label (CLassTransformation) output inverted ? Inverted Uplift scores of revert label (CLassTransformation) ? Oct 13, 2022
Owner

maks-sh commented Oct 19, 2022

@CoteDave Hello!

Thanks a lot for providing information.
A very strange problem.

Could you provide the code and data to reproduce this bug?

@maks-sh maks-sh added the bug Something isn't working label Oct 19, 2022
@CoteDave
Author

Hi Maksim,

Here is the code. As you can see, it is pretty simple and out of the box: exactly the same data is used for SoloModel and ClassTransformation, but the predicted uplift scores for ClassTransformation seem weirdly distributed. The other two models are the SLearner and XLearner from the EconML library.

image

Unfortunately, I can't share the data (it is enterprise data). Some facts about the setup:

  • X shape: (557622, 136). Categorical features are target encoded with the category_encoders library. No scaling, as the model is gradient boosting (CatBoost)
  • Y shape: (557622, 1). Binary target, with only 2.3763% of 1
  • T shape: (557622, 1). Binary treatment, with only 4.6662% of 1
  • propensity_model1 = CatBoostClassifier(n_estimators=1000, max_depth=6, learning_rate=0.08, silent=True, early_stopping_rounds=6)

And that's it!

Owner

maks-sh commented Oct 19, 2022

Thanks a lot!

We will try to reproduce the bug, and then we will return with the results 🙌

@DaveCoteDS

Hi, I found the problem.

Class transformation is only meant for datasets where T0 and T1 are balanced. My dataset is highly skewed (far fewer T1), so I can't use the class transformation algorithm here: its first assumption is that T1 and T0 are balanced.
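
For readers who land here later, here is a minimal sketch of why the balance assumption matters. It is not scikit-uplift code: the function names and the response rates `y1` and `y0` are hypothetical, and only the treatment share (4.6662%) comes from this thread. It assumes the usual class transformation label Z = Y*T + (1-Y)*(1-T) with the score 2*P(Z=1) - 1, and compares it with a propensity-weighted transformed outcome, which is a different technique that does not need balanced groups.

```python
def class_transformation_score(p_treat, y1, y0):
    """Population value of 2 * P(Z=1) - 1, where Z = Y*T + (1-Y)*(1-T).

    p_treat: P(T=1); y1: P(Y=1 | T=1); y0: P(Y=1 | T=0).
    Algebraically this equals 2*p*y1 - 2*(1-p)*y0 + (1 - 2*p),
    which reduces to y1 - y0 only when p_treat == 0.5.
    """
    p_z = p_treat * y1 + (1.0 - p_treat) * (1.0 - y0)
    return 2.0 * p_z - 1.0


def transformed_outcome_mean(p_treat, y1, y0):
    """Population mean of Y * (T - p) / (p * (1 - p)).

    This propensity-weighted outcome has expectation y1 - y0
    for any 0 < p_treat < 1.
    """
    treated_term = p_treat * y1 * (1.0 - p_treat)   # T = 1 rows
    control_term = (1.0 - p_treat) * y0 * (-p_treat)  # T = 0 rows
    return (treated_term + control_term) / (p_treat * (1.0 - p_treat))


if __name__ == "__main__":
    y1, y0 = 0.03, 0.02  # hypothetical response rates, true uplift = 0.01
    for p in (0.5, 0.046662):  # balanced vs. the share reported in this issue
        print(p,
              class_transformation_score(p, y1, y0),
              transformed_outcome_mean(p, y1, y0))
```

With a small treatment share, the class transformation score is dominated by the `-2*(1-p)*y0` term, so it mostly ranks rows by low baseline response rather than by uplift. That may be consistent with the inverted-looking Qini curve above, though it cannot be confirmed without the data.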
