Poster in Workshop: Mathematical and Empirical Understanding of Foundation Models (ME-FoMo)
Aligning Foundation Models for Language with Preferences through $f$-divergence Minimization
Dongyoung Go · Tomek Korbak · Germán Kruszewski · Jos Rozen · Nahyeon Ryu · Marc Dymetman
Keywords: [ nlp ] [ Generation with Distributional Control (GDC) ] [ Reinforcement Learning from Human Feedback (RLHF) ] [ preference modeling ] [ f-divergence ] [ language model alignment ] [ Reinforcement Learning with KL penalties ]