[Code Improvement] Support concatenated forward in reward trainer #1769

Open
wants to merge 2 commits into
base: main


1485840691
Contributor

This PR addresses an earlier code-improvement suggestion: the reward trainer can borrow the approach used in DPOTrainer and concatenate the chosen and rejected tokens into a single batch, which saves one model forward call. The drawback of the concatenated forward is higher GPU memory usage, so this PR adds a flag to turn the feature on or off.
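To illustrate the idea, here is a minimal framework-free sketch, not the code from this PR. It assumes a pairwise logistic loss of the kind commonly used for reward models. `ToyRewardModel`, `PAD_ID`, and the flag name `concatenated_forward` are hypothetical names chosen for the example; the actual flag name in the PR is not stated here.

```python
import math

PAD_ID = 0  # hypothetical padding token id


def pad_to(seqs, length, pad_id=PAD_ID):
    """Right-pad each token-id list to `length`."""
    return [s + [pad_id] * (length - len(s)) for s in seqs]


def concatenated_inputs(chosen, rejected, pad_id=PAD_ID):
    """Stack chosen and rejected sequences into one batch of size 2N.

    Both halves are padded to the longest sequence overall, which is
    one reason the concatenated batch needs more memory.
    """
    max_len = max(len(s) for s in chosen + rejected)
    return pad_to(chosen, max_len, pad_id) + pad_to(rejected, max_len, pad_id)


class ToyRewardModel:
    """Stand-in for a real reward model; counts forward calls."""

    def __init__(self):
        self.forward_calls = 0

    def __call__(self, batch):
        self.forward_calls += 1
        # One scalar "reward" per sequence; padding (id 0) adds nothing.
        return [sum(seq) / 10.0 for seq in batch]


def pairwise_loss(r_chosen, r_rejected):
    """Mean of -log(sigmoid(r_c - r_r)), written as log(1 + exp(-(r_c - r_r)))."""
    terms = [math.log1p(math.exp(-(c - r))) for c, r in zip(r_chosen, r_rejected)]
    return sum(terms) / len(terms)


def compute_loss(model, chosen, rejected, concatenated_forward=True):
    if concatenated_forward:
        # Single forward call on the 2N batch, then split the outputs.
        rewards = model(concatenated_inputs(chosen, rejected))
        n = len(chosen)
        r_chosen, r_rejected = rewards[:n], rewards[n:]
    else:
        # Two forward calls, each padded only to its own longest sequence.
        r_chosen = model(pad_to(chosen, max(len(s) for s in chosen)))
        r_rejected = model(pad_to(rejected, max(len(s) for s in rejected)))
    return pairwise_loss(r_chosen, r_rejected)


chosen = [[3, 4, 5], [2, 2]]
rejected = [[1, 1], [1, 2, 3, 4]]

model_cat = ToyRewardModel()
loss_cat = compute_loss(model_cat, chosen, rejected, concatenated_forward=True)

model_sep = ToyRewardModel()
loss_sep = compute_loss(model_sep, chosen, rejected, concatenated_forward=False)
```

In this toy setup both paths produce the same loss, while the concatenated path calls the model once instead of twice. The price is a batch twice as large, padded to the longest sequence across both halves, which matches the memory trade-off described above.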

@1485840691 marked this pull request as draft on June 24, 2024 12:02
@vwxyzjn
Collaborator

vwxyzjn commented Jun 24, 2024

Looks like a great change! Thanks @1485840691 for the PR

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@vwxyzjn
Collaborator

vwxyzjn commented Jun 24, 2024

Make sure you do make precommit

@1485840691 marked this pull request as ready for review on June 25, 2024 06:56
@1485840691-eng
Contributor

> Make sure you do make precommit

The precommit check is done. Please help review. Thanks.

4 participants