Outcome-Based RL Provably Leads Transformers to Reason, but Only With the Right Data
Preprint
Hello! I'm currently a second-year PhD student at Tel Aviv University, working under the guidance of Prof. Nadav Cohen.
My passion lies in the theoretical exploration of fundamental principles underlying deep learning models.
Beyond academics, I enjoy meditation 🧘‍♂️, boxing 🥊, and experimenting with mixology 🍸.