April 1, 2024

Sep 1, 2020 - RLHF

Description:

Focus: Use a reward model that predicts human preferences to fine-tune a pre-trained model.
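
As a rough illustration of what "a reward model that predicts human preferences" can mean, the sketch below scores a pair of responses with a pairwise (Bradley-Terry style) loss. This is an assumption about one common formulation, not something stated in this entry; the function name `preference_loss` and the example reward values are hypothetical.

```python
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise loss -log(sigmoid(r_chosen - r_rejected)).

    Assumed Bradley-Terry style objective: the loss is small when the
    reward model scores the human-preferred response above the rejected one.
    """
    margin = reward_chosen - reward_rejected
    # Numerically stable form of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical reward scores for a preferred and a rejected response.
agrees = preference_loss(2.0, 0.0)     # model agrees with the human label
disagrees = preference_loss(0.0, 2.0)  # model disagrees with the human label
print(agrees < disagrees)
```

A reward model trained to minimize a loss of this kind can then supply the reward signal used when fine-tuning the pre-trained model.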

Added to timeline:

9 months ago

Date:

Sep 1, 2020
Now
~ 3 years and 8 months ago