Multi-class Classification examples #28

Open
wants to merge 10 commits into
base: main

mshtelma
Collaborator

No description provided.

@@ -1,4 +1,4 @@
mlflow>=2.0
Collaborator
Is there a reason we removed the mlflow requirement?

Collaborator Author
Thanks, good catch! Added it back.

ipykernel>=6.12
ipython>=7.32
flaml
Collaborator
Do we need flaml in the top-level requirements? Or would it be better to install it in the notebook, or to create a separate requirements file for this example? Thoughts?

Collaborator Author
That's a good point. I think there are other examples that use FLAML by default (at least I saw some PRs), so this might be a good idea. This is not a big dependency. I am happy to move it inside the multi-class folder as well.

Collaborator
I recall that flaml is problematic to install because its lightgbm dependency is difficult to install. Has this been fixed? If not, I'd prefer not to include it. It's not used in this example anyway.

@mshtelma changed the title from "Multi-class Classification and Text Classification (TF-IDF and Transformers) examples" to "Multi-class Classification examples" on Dec 19, 2022
@@ -0,0 +1,26 @@
# Binary classification: Is this bottle of wine red or white?
This is the root directory for an example project for the
[MLflow Classification Recipe](https://mlflow.org/docs/latest/recipes.html#classification-recipe).
Collaborator
[MLflow Classification Recipe] --> [MLflow Multi Class Classification Recipe]

What do you think?

# COMMAND ----------

# MAGIC %pip install -r ../../requirements.txt
# MAGIC %pip install git+https://github.com/mshtelma/mlflow.git@multiclassclassification
Collaborator
Remove this line once the MLflow release containing these changes is out.

Collaborator
We just released MLflow 2.1.0 so we should be good to remove this now :)

Comment on lines +49 to +51
# COMMAND ----------

r.run("split")
Collaborator
Can we do a bit of EDA here before proceeding to split?
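
As a possible starting point for such an EDA cell, here is a minimal sketch using pandas. The `summarize` helper and the feature column names are hypothetical; only the `class` label column appears elsewhere in this PR, and the tiny frame stands in for `./data/iris.csv`.

```python
import pandas as pd


def summarize(df: pd.DataFrame, label_col: str = "class") -> dict:
    """Collect a few basic EDA facts about a labelled dataset."""
    return {
        "shape": df.shape,
        "missing_per_column": df.isna().sum().to_dict(),
        "class_counts": df[label_col].value_counts().to_dict(),
        "numeric_summary": df.describe(),
    }


# Tiny hand-made stand-in for the real data; feature column names are assumed.
sample = pd.DataFrame(
    {
        "sepal_length": [5.1, 7.0, 6.3, 4.9],
        "sepal_width": [3.5, 3.2, 3.3, None],
        "class": ["Iris-setosa", "Iris-versicolor", "Iris-virginica", "Iris-setosa"],
    }
)
report = summarize(sample)
print(report["shape"])
print(report["class_counts"])
print(report["numeric_summary"])
```

In the notebook, the same helper could be applied to the ingested DataFrame before calling `r.run("split")`, so that class balance and missing values are visible up front.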


# COMMAND ----------

trained_model = r.get_artifact("model")
Collaborator
Can we use the model to make a prediction on one hand-generated input example here?

# For different options please read: https://github.com/mlflow/recipes-classification-template#ingest-step
using: csv
loader_method: load_file_as_dataframe
location: "./data/iris.csv"
Collaborator
Can this be an open access Databricks Delta table? Does it exist?

@@ -0,0 +1,31 @@
experiment:
name: "sklearn_classification_experiment"
Collaborator
Update the experiment name so that it does not collide with the binary classification example.

INGEST_SCORING_CONFIG:
# For different options please read: https://github.com/mlflow/recipes-classification-template#batch-scoring
using: csv
location: "./data/iris.csv"
Collaborator
It'd be nice to use a different dataset for scoring, if such a dataset exists. Otherwise, you can create one. See https://github.com/mlflow/recipes-examples/blob/main/classification/profiles/local.yaml#L26 for an example.

import pandas

df = pandas.read_csv(file_path, sep=",")
df["class"] = df["class"].astype("category").cat.codes
Collaborator
What does this line do? What happens if there's no such conversion?
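
On the first question, a small illustration of what `astype("category").cat.codes` does in pandas: it replaces each string label with an integer code, where the code is the label's position in the sorted list of unique values. The label strings below are made-up stand-ins for the CSV contents. The sketch does not show what the recipe does when string labels are left unconverted.

```python
import pandas as pd

df = pd.DataFrame(
    {"class": ["Iris-setosa", "Iris-versicolor", "Iris-virginica", "Iris-setosa"]}
)

as_category = df["class"].astype("category")
# Categories are the sorted unique labels; codes are their positions in that list.
label_names = list(as_category.cat.categories)
codes = as_category.cat.codes.tolist()

print(label_names)  # ['Iris-setosa', 'Iris-versicolor', 'Iris-virginica']
print(codes)        # [0, 1, 2, 0]
```

Keeping `label_names` around makes it possible to map predicted codes back to the original class names later.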

transformers.
"""

return None
Collaborator
Add a comment indicating this will result in an identity transformer.
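
A sketch of how the requested comment might read. The docstring here is abbreviated and written for illustration, not copied from the file; the statement that `None` yields an identity transformer comes from this review comment.

```python
def transformer_fn():
    """Return an unfitted transformer, or None (abbreviated illustrative docstring)."""
    # Returning None means no feature transformation is applied, which
    # results in an identity transformer.
    return None
```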

:return: A Series indicating whether each row should be filtered
"""

return Series(True, index=dataset.index)
Collaborator
Add a comment to indicate this doesn't filter out anything.
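
A sketch of the function with the requested comment added. The function name `create_dataset_filter` is an assumption, since only the docstring tail and the return statement are visible in this diff.

```python
from pandas import DataFrame, Series


def create_dataset_filter(dataset: DataFrame) -> Series:
    """
    :return: A Series indicating whether each row should be filtered
    """
    # The mask is True for every row, so this filter does not filter out anything.
    return Series(True, index=dataset.index)


# Usage on a small made-up frame: applying the mask keeps all rows.
demo = DataFrame({"x": [1, 2, 3]})
mask = create_dataset_filter(demo)
print(len(demo[mask]))
```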

@@ -0,0 +1,10 @@
from steps.transform import transformer_fn
Collaborator
Why introduce an empty file multi-class-classification/tests/train_test.py above?

3 participants