
Sep 1, 2020 - RLHF (Reinforcement Learning from Human Feedback)

Description:

Focus: Use a reward model that predicts human preferences to fine-tune a pre-trained model.
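
As a rough illustration of how a reward model can be trained to predict human preferences, the sketch below uses a pairwise (Bradley-Terry style) loss. This is a common choice but an assumption here, since the entry does not name a specific loss. The function names and toy scores are hypothetical.

```python
import math


def preference_probability(reward_chosen: float, reward_rejected: float) -> float:
    """Probability that the human-preferred response wins, under a Bradley-Terry model."""
    return 1.0 / (1.0 + math.exp(-(reward_chosen - reward_rejected)))


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Negative log-likelihood of the human label.

    The loss is small when the reward model scores the preferred
    response higher, and large when it scores it lower.
    """
    return -math.log(preference_probability(reward_chosen, reward_rejected))


# Toy scores (made up): the reward model rates two responses to the same prompt.
agree = preference_loss(2.0, 0.5)     # model ranks the preferred response higher
disagree = preference_loss(0.5, 2.0)  # model ranks the preferred response lower
print(agree < disagree)
```

In a full pipeline, the trained reward model would then score the outputs of the pre-trained model during a reinforcement learning step; that part is omitted from this sketch.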

Added to timeline:

14 Aug 2023

Date:

Sep 1, 2020