Pull requests: huggingface/trl

Pull requests list

[WIP] Unify Policy Trainers
#1586 opened Apr 25, 2024 by lapp0 Draft
4 tasks
[WIP] Add WinRateCallback
#1598 opened Apr 29, 2024 by lewtun Draft
2 of 5 tasks
Adds Online DPO
#1605 opened Apr 30, 2024 by edbeeching Draft
[DRAFT] Vllm integration
#1628 opened May 7, 2024 by vwxyzjn Draft
Prototype Dataset Processor
#1646 opened May 16, 2024 by vwxyzjn
Adding SimPO to TRL
#1725 opened Jun 11, 2024 by yumeng5
rloo and ppov2 trainer with trainer callbacks
#1729 opened Jun 13, 2024 by mnoukhov
allow ref model use ds stage3 only
#1730 opened Jun 13, 2024 by gromzhu
Fix GPT2 sentiment notebook reward
#1738 opened Jun 14, 2024 by cemiu
Issue #1751 Fix
#1754 opened Jun 18, 2024 by yash-srivastava19
SFTTrainer to add support for IterableDataset
#1761 opened Jun 21, 2024 by helloworld1
Add SRPO algorithm.
#1772 opened Jun 25, 2024 by frasermince Draft
fix model to save in ppov2
#1776 opened Jun 26, 2024 by mnoukhov
Nash md
#1779 opened Jun 27, 2024 by kashif Draft
Fix start index under batched_forward_pass
#1782 opened Jun 27, 2024 by mertsayar8
[SFT] add model_init_kwargs to training_args
#1787 opened Jun 28, 2024 by kashif