Parse AZ_ML host list and construct resource pool for deepspeed #2535
AML v2 does not have a hostfile and prefers a comma-separated host list (e.g. 10.0.0.4, 10.0.0.5, ...).
This PR enables that, based on two environment variables that the AML team has shared and that will serve as a contract moving forward. @mrwyattii and I prototyped this together for the AML team to try out.
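For readers unfamiliar with the idea, here is a minimal sketch of turning a comma-separated host list into a resource pool (a mapping from host to slot count). It is not the code in this PR. The variable names `EXAMPLE_HOST_LIST` and `EXAMPLE_SLOTS_PER_NODE` are hypothetical placeholders, since the actual AML variable names are not given in this description, and the assumption that every host has the same number of slots is ours.

```python
import os
from collections import OrderedDict

# Hypothetical placeholder names; the real AML variables are not named here.
HOST_LIST_VAR = "EXAMPLE_HOST_LIST"
SLOTS_VAR = "EXAMPLE_SLOTS_PER_NODE"


def parse_host_list(host_list, slots_per_host):
    """Turn '10.0.0.4, 10.0.0.5' into an ordered {host: slots} mapping."""
    if slots_per_host < 1:
        raise ValueError("slots_per_host must be a positive integer")
    pool = OrderedDict()
    for entry in host_list.split(","):
        host = entry.strip()
        if not host:
            continue  # tolerate trailing commas and stray whitespace
        if host in pool:
            raise ValueError(f"host {host!r} is listed more than once")
        pool[host] = slots_per_host  # assumption: identical slot count per host
    return pool


def resource_pool_from_env(env=None):
    """Build the pool from the two variables, or return None if either is unset."""
    env = os.environ if env is None else env
    host_list = env.get(HOST_LIST_VAR)
    slots = env.get(SLOTS_VAR)
    if not host_list or not slots:
        return None  # caller can fall back to a hostfile
    return parse_host_list(host_list, int(slots))
```

Returning `None` when either variable is missing lets a launcher keep its existing hostfile path as the default and use the environment only when both values are present.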
@savitamittal1 -- please try out this branch/PR in your AML environment and let us know if this works.
CC - @jeffra