Generalize dataset builders as families #752
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/752
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit 6df4e79 with merge base 26223e9.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
 Returns:
-    PreferenceDataset: The preference dataset built from StackExchange paired data.
+    PreferenceDataset: The preference datßaset built from source paired data.
Suggested change:
-    PreferenceDataset: The preference datßaset built from source paired data.
+    PreferenceDataset: The preference dataset built from source paired data.
I cannot for the life of me tell you how that happened
@@ -149,7 +149,7 @@ def format(

 class StackExchangedPairedTemplate(InstructTemplate):
     """
-    Prompt template for the StackExchangedPaired dataset.
+    Prompt template for preference datasets similar to StackExchanged paired.
This doesn't look right
Context
By moving `source` to a parameter in the builder, we can generalize dataset builders as "families" (e.g., alpaca family datasets). Also update docstrings.
Test plan
CI
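As a rough illustration of the idea under Context, here is a minimal sketch of a builder that takes `source` as a parameter, so one builder serves a whole family of datasets sharing a format. Everything in it is hypothetical and is not torchtune's actual API: `ToyInstructDataset`, `alpaca_family_dataset`, `alpaca_style_prompt`, the in-memory `_FAKE_HUB` loader, and the `example-org/...` dataset names are all placeholders chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Sample = Dict[str, str]

# Hypothetical in-memory "hub" standing in for a real dataset loader.
# The source names below are placeholders, not real datasets.
_FAKE_HUB: Dict[str, List[Sample]] = {
    "example-org/alpaca": [{"instruction": "Add 2 and 3.", "output": "5"}],
    "example-org/alpaca-cleaned": [{"instruction": "Name a prime.", "output": "7"}],
}


def alpaca_style_prompt(sample: Sample) -> str:
    """Format one sample with an alpaca-like instruction prompt (illustrative only)."""
    return f"### Instruction:\n{sample['instruction']}\n\n### Response:\n"


@dataclass
class ToyInstructDataset:
    """Hypothetical stand-in for a dataset class; it only pairs rows with a template."""

    source: str
    rows: List[Sample]
    template: Callable[[Sample], str]

    def __len__(self) -> int:
        return len(self.rows)

    def __getitem__(self, idx: int) -> Dict[str, str]:
        row = self.rows[idx]
        return {"prompt": self.template(row), "label": row["output"]}


def alpaca_family_dataset(source: str = "example-org/alpaca") -> ToyInstructDataset:
    """Build any dataset that shares the alpaca-style columns and prompt format.

    Because `source` is a parameter rather than a hard-coded constant, the same
    builder covers every member of the "family".
    """
    return ToyInstructDataset(
        source=source,
        rows=_FAKE_HUB[source],
        template=alpaca_style_prompt,
    )


# The default source and a sibling dataset go through the same builder.
original = alpaca_family_dataset()
cleaned = alpaca_family_dataset(source="example-org/alpaca-cleaned")
print(cleaned[0]["prompt"])
```

The design point is that the prompt template and column layout define the family, while `source` only selects which member to load.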