-
Notifications
You must be signed in to change notification settings - Fork 423
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fixing mlflow logging to Databricks workspace file paths with /Shared/ prefix #3410
Conversation
Im a little confused, can you give examples of initial user specified arg and transformed arg before and after this PR as an example? |
@mvpatel2000 In the original code prior to this PR, inputting an experiment name starting with '/Shared/' would throw this error: mlflow.exceptions.RestException: RESOURCE_ALREADY_EXISTS: Node named ___ already exists Now, mlflow logger correctly logs to the '/Shared/...' folder path as an absolute path. As mentioned above, there are probably some choices we want to make surrounding how we can most intelligently interpret absolute vs relative paths, but this serves as a functioning bandaid solution, because there are only '/Users/' and '/Shared/' under workspaces. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets move the skip on Shared
to be where Users
is checked
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…/ prefix (mosaicml#3410) * fixing os file path with /Shared/ prefix * lstrip '/' from experiment name if not '/Shared/' or '/Users/' Co-authored-by: Mihir Patel <[email protected]> * doesnt modify experiment name if it has '/Shared/' as a prefix * fix formatting * lint --------- Co-authored-by: Mihir Patel <[email protected]>
…/ prefix (#3410) * fixing os file path with /Shared/ prefix * lstrip '/' from experiment name if not '/Shared/' or '/Users/' Co-authored-by: Mihir Patel <[email protected]> * doesnt modify experiment name if it has '/Shared/' as a prefix * fix formatting * lint --------- Co-authored-by: Mihir Patel <[email protected]>
In mlflow logging, users that wanted to log into the "/Shared/" file directory had an extra slash in their modified experiment names, leading to Mlflow not recognizing existing experiments, trying to create them, and then claiming they already exist.
This PR accounts for that edge case and removes the extra slash added.
Prior to this PR, there was a bug where users who directly wanted to mlflow-log to the
/Shared/
workspace would be able to do so after the first run of an experiment name, but would not be able to for following runs of the same experiment name. We found that an extra '/' was prepended to the experiment name (yielding an experiment name of//Shared/...
after'/' + os.path.join('Users', databricks_username, experiment_name)
), leading toget_experiment_by_name
not finding the existing experiment, andcreate_experiment
claiming that it already exists (which it does). There are probably ways in whichget_experiment_name
andcreate_experiment
name handle file directories differently, also contributing to this bug.In the original code prior to this PR, inputting an experiment name starting with
/Shared/
would throw this error:mlflow.exceptions.RestException: RESOURCE_ALREADY_EXISTS: Node named ___ already exists
Now, mlflow logger correctly logs to the
/Shared/...
folder path as an absolute path.Before:
/Shared/...
->/Users/Shared/...
/Users/...
->/Users/...
path
->/Users/path
After:
/Shared/...
->/Shared/...
(no change)/Users/...
->/Users/...
path
->/Users/path