
Caglar Gulcehre

Professor and Lead of CLAIRE lab @ EPFL

Research Consultant at Google DeepMind

Ex: Staff Research Scientist @ Google DeepMind

Google Scholar: Click here!

Twitter: @caglarml

Github: github.com/caglar ***Not up to date!***

Email: ca9lar At Gmail

Location: Lausanne, Switzerland 

Important note for students: I receive many emails from students, and it is impossible for me to reply to all of them. Once our application process is in place, I will post it here. For now, instead of cold-emailing me directly about PhD or MSc positions, please check this page. I am not planning to hire PhD students for 2024-2025, but I am hiring postdocs.


Bio

I am currently a professor at EPFL, leading the CLAIRE research lab. Previously, I was a staff research scientist at Google DeepMind, working at the intersection of Reinforcement Learning, Foundation Models, Novel Architectures, Safety and Alignment, and Natural Language Understanding. I led or co-led several projects during my time at DeepMind, ranging from next-generation sequence-modeling architectures and alignment and safety to offline RL.

I am interested in building agents that can learn from a feedback signal (often weak, sparse, and noisy in the real world) while utilizing unlabeled data available in the environment. I am interested in improving our understanding of the existing algorithms and developing new ones to enable real-world applications with positive social impact. I am particularly fascinated by the scientific applications of machine learning algorithms. I enjoy working in a multi/cross-disciplinary team and am often inspired by neuroscience, biology, and cognitive sciences when working on algorithmic solutions.  

I finished my Ph.D. under the supervision of Yoshua Bengio at MILA.

I defended my thesis, "Learning and time: on using memory and curricula for language understanding," in 2018, with Christopher Manning as my external examiner. Currently, the research topics I work on include, but are not limited to, reinforcement learning, offline RL, large-scale deep architectures (or foundation models, as they are called these days), and representation learning (including self-supervised learning, new architectures, causal representations, etc.). I have served as an area chair and reviewer for major machine learning conferences such as ICML, NeurIPS, and ICLR, and for journals such as Nature and JMLR. I have published at numerous influential conferences and in journals such as Nature, JMLR, NeurIPS, ICML, ICLR, ACL, and EMNLP. My work received the best paper award at the Nonconvex Optimization workshop at NeurIPS and an honorable mention for best paper at ICML 2019. I have co-organized the Science and Engineering of Deep Learning workshops and three other workshops at NeurIPS, ICML, and ICLR.

Neural Networks
Language
Computation
Brain, Cognition

Recent Updates

Selected Publication

Regularized Behavior Value Estimation


Authors

Caglar Gulcehre, Sergio Gómez Colmenarejo, Ziyu Wang, Jakub Sygnowski, Thomas Paine, Konrad Zolna, Yutian Chen, Matthew Hoffman, Razvan Pascanu, Nando de Freitas

Abstract

Offline reinforcement learning restricts the learning process to rely only on logged data without access to an environment. While this enables real-world applications, it also poses unique challenges. One important challenge is dealing with errors caused by overestimating values for state-action pairs not well-covered by the training data. Due to bootstrapping, these errors get amplified during training and can lead to divergence, thereby crippling learning. To overcome this challenge, we introduce Regularized Behavior Value Estimation (R-BVE). Unlike most approaches, which use policy improvement during training, R-BVE estimates the value of the behavior policy during training and only performs policy improvement at deployment time.

Further, R-BVE uses a ranking regularisation term that favors actions in the dataset that lead to successful outcomes. We provide ample empirical evidence of R-BVE's effectiveness, including state-of-the-art performance on the RL Unplugged ATARI dataset. We also test R-BVE on new datasets, from bsuite and a challenging DeepMind Lab task, and show that R-BVE outperforms other state-of-the-art discrete control offline RL methods.
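The core idea in the abstract, estimating only the behavior policy's value during training and deferring policy improvement to deployment, can be illustrated with a minimal tabular sketch. This is a hypothetical toy example, not the paper's implementation: it uses SARSA-style targets on logged transitions (so no max over out-of-distribution actions) and omits the ranking regularisation term; all names and the tiny dataset are illustrative.

```python
# Toy sketch of the behavior-value-estimation idea: train a critic on
# logged data only, with SARSA-style targets (no max over actions), then
# apply one-step greedy policy improvement only at deployment time.
import numpy as np

def train_behavior_value(transitions, n_states, n_actions,
                         gamma=0.99, lr=0.1, epochs=50):
    """Tabular evaluation of the behavior policy from logged transitions.

    Each transition is (s, a, r, s_next, a_next, done), where a_next is
    the action the behavior policy actually took next -- bootstrapping
    never queries actions outside the dataset.
    """
    Q = np.zeros((n_states, n_actions))
    for _ in range(epochs):
        for s, a, r, s_next, a_next, done in transitions:
            target = r if done else r + gamma * Q[s_next, a_next]
            Q[s, a] += lr * (target - Q[s, a])
    return Q

def deploy_policy(Q):
    """Policy improvement happens only here, at deployment time."""
    return lambda s: int(np.argmax(Q[s]))

# Tiny logged dataset on a 2-state, 2-action chain; action 1 in state 0
# is the rewarding choice in this made-up environment.
data = [(0, 1, 1.0, 1, 0, False), (1, 0, 0.0, 0, 1, True),
        (0, 0, 0.0, 1, 0, False), (1, 0, 0.0, 0, 0, True)]
Q = train_behavior_value(data, n_states=2, n_actions=2)
policy = deploy_policy(Q)
```

Because the critic only ever bootstraps from actions present in the dataset, the overestimation-amplification failure mode described above cannot feed on out-of-distribution action values during training.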

Work Experience

Google DeepMind (2024)
Research Consultant

EPFL (2023-)
Professor and Lead of CLAIRE lab

DeepMind (2017-2023)
Staff Research Scientist

MSR (2016)
Part-time researcher

IBM Research (2015-2016)
Research Intern

Maluuba (2015)
Part-time researcher

DeepMind (2014)
Research Intern

MILA (2012-2017)
PhD and Research Assistant

TÜBİTAK (2008-2011)
Researcher

METU (2008-2010)
Software Engineer


Caglar Gulcehre

Lausanne, Switzerland

  • Mastodon
  • Twitter
  • LinkedIn

