About me

I am a Research Engineering Team Lead at DeepMind in London, working on problems related to Deep Reinforcement Learning: exploration, representation learning, offline RL, and more generally the engineering of large-scale RL agents.

Before that, I completed a PhD in mathematics under the supervision of Prof. Sebastian van Strien at the University of Warwick. My PhD research revolved around Dynamical Systems in Game Theory, in particular Fictitious Play Dynamics and related learning processes, and the dynamics of piecewise affine maps and flows.

Publications

2024

An Analysis of Quantile Temporal-Difference Learning
M. Rowland, R. Munos, M. Gheshlaghi Azar, Y. Tang, G. Ostrovski, A. Harutyunyan, K. Tuyls, M. G. Bellemare, W. Dabney. Journal of Machine Learning Research, Volume 25, pp.1-47, 2024 (journal paper, arXiv)

2023

Quantile Credit Assignment
T. Mesnard, W. Chen, A. Saade, Y. Tang, M. Rowland, T. Weber, C. Lyle, A. Gruslys, M. Valko, W. Dabney, G. Ostrovski, E. Moulines, R. Munos. Proceedings of the 40th International Conference on Machine Learning (ICML’23), Honolulu, Hawaii, USA. PMLR 202, 2023 (conference paper)

Deep Reinforcement Learning with Plasticity Injection
E. Nikishin, J. Oh, G. Ostrovski, C. Lyle, R. Pascanu, W. Dabney, A. Barreto. 37th Conference on Neural Information Processing Systems (NeurIPS 2023) (conference paper, arXiv)

2022

An empirical study of implicit regularization in deep offline RL
C. Gulcehre, S. Srinivasan, J. Sygnowski, G. Ostrovski, M. Farajtabar, M. Hoffman, R. Pascanu, A. Doucet. Transactions on Machine Learning Research, 2022 (journal paper, arXiv)

2021

The Difficulty of Passive Learning in Deep Reinforcement Learning
G. Ostrovski, P. S. Castro, W. Dabney. 35th Conference on Neural Information Processing Systems (NeurIPS 2021) (conference paper, supplementary, arXiv, P.S. Castro’s blog post)

When should agents explore?
M. Pislar, D. Szepesvari, G. Ostrovski, D. Borsa, T. Schaul. arXiv preprint, 2021 (arXiv)

Return-based Scaling: Yet Another Normalisation Trick for Deep RL
T. Schaul, G. Ostrovski, I. Kemaev, D. Borsa. arXiv preprint, 2021 (arXiv)

On The Effect of Auxiliary Tasks on Representation Dynamics
C. Lyle, M. Rowland, G. Ostrovski, W. Dabney. Proceedings of the 24th International Conference on Artificial Intelligence and Statistics (AISTATS) 2021, San Diego, California, USA. PMLR 130, 2021 (conference paper, arXiv)

Temporally-Extended ε-Greedy Exploration
W. Dabney, G. Ostrovski, A. Barreto. 9th International Conference on Learning Representations (ICLR’21), 2021 (conference paper, arXiv)

2019

Adapting Behaviour for Learning Progress
T. Schaul, D. Borsa, D. Ding, D. Szepesvari, G. Ostrovski, W. Dabney, S. Osindero. arXiv preprint, 2019 (arXiv)

Recurrent Experience Replay in Distributed Reinforcement Learning
S. Kapturowski, G. Ostrovski, J. Quan, R. Munos, W. Dabney. 7th International Conference on Learning Representations (ICLR’19), 2019 (conference paper)

2018

Autoregressive Quantile Networks for Generative Modeling
G. Ostrovski, W. Dabney, R. Munos. Proceedings of the 35th International Conference on Machine Learning (ICML’18), Stockholm, Sweden. PMLR 80, 2018 (conference paper, supplementary, arXiv)

Implicit Quantile Networks for Distributional Reinforcement Learning
W. Dabney, G. Ostrovski, D. Silver, R. Munos. Proceedings of the 35th International Conference on Machine Learning (ICML’18), Stockholm, Sweden. PMLR 80, 2018 (conference paper, supplementary, arXiv)

Symmetric Decomposition of Asymmetric Games
K. Tuyls, J. Pérolat, M. Lanctot, G. Ostrovski, R. Savani, J. Z. Leibo, T. Ord, T. Graepel, S. Legg. Scientific Reports 8, 2018 (journal paper, arXiv)

Rainbow: Combining Improvements in Deep Reinforcement Learning
M. Hessel, J. Modayil, H. van Hasselt, T. Schaul, G. Ostrovski, W. Dabney, D. Horgan, B. Piot, M. Azar, D. Silver. Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI), 2018 (conference paper, arXiv)

2017

Count-Based Exploration with Neural Density Models
G. Ostrovski, M. G. Bellemare, A. van den Oord, R. Munos. Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia. PMLR 70, 2017 (conference paper, supplementary, arXiv)

2016

Hybrid computing using a neural network with dynamic external memory
A. Graves, G. Wayne, M. Reynolds, T. Harley, I. Danihelka, A. Grabska-Barwińska, S. Gómez Colmenarejo, E. Grefenstette, T. Ramalho, J. Agapiou, A. Puigdomènech Badia, K. M. Hermann, Y. Zwols, G. Ostrovski, A. Cain, H. King, C. Summerfield, P. Blunsom, K. Kavukcuoglu, D. Hassabis. Nature, 2016, Volume 538, pp.471-476 (journal paper)

Unifying Count-Based Exploration and Intrinsic Motivation
M. G. Bellemare, S. Srinivasan, G. Ostrovski, T. Schaul, D. Saxton, R. Munos. Advances in Neural Information Processing Systems 29 (NIPS 2016) (conference paper, arXiv)

Increasing the Action Gap: New Operators for Reinforcement Learning
M. G. Bellemare, G. Ostrovski, A. Guez, P. S. Thomas, R. Munos. Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI 2016) (conference paper, arXiv)

2015

Human-level control through deep reinforcement learning
V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, S. Petersen, C. Beattie, A. Sadik, I. Antonoglou, H. King, D. Kumaran, D. Wierstra, S. Legg, D. Hassabis. Nature, 2015, Volume 518, pp.529-533 (journal paper)

2014

Payoff Performance of Fictitious Play
G. Ostrovski, S. van Strien. Journal of Dynamics and Games, 2014, Volume 1, Number 4, pp.621-638 (journal paper, arXiv)

Dynamics of a Continuous Piecewise Affine Map of the Square
G. Ostrovski. Physica D: Nonlinear Phenomena, 2014, Volume 271, pp.1-9 (journal paper, arXiv)

2013

Topics arising from Fictitious Play Dynamics
G. Ostrovski. Ph.D. Thesis, 2013, University of Warwick

Fixed Point Theorem for Non-Self Maps of Regions in the Plane
G. Ostrovski. Topology and its Applications, 2013, Volume 160, Issue 7, pp.915-923 (journal paper, arXiv)

2011

Piecewise Linear Hamiltonian Flows Associated to Zero-Sum Games: Transition Combinatorics and Questions on Ergodicity
G. Ostrovski, S. van Strien. Regular and Chaotic Dynamics, 2011, Volume 17, pp.129-154 (journal paper, arXiv)

Presentations and Organized Meetings

I have given several talks on topics related to my work in Reinforcement Learning:

11 June 2018: “Curiosity-Based Exploration in Deep Reinforcement Learning”, SIAM Annual Conference of the SIAM student chapter, Imperial College London
22 March 2018: “Exploration in Deep Reinforcement Learning”, Dynamical Systems Seminar, Imperial College London

During my PhD, I gave a number of talks on my research in Dynamical Systems and Game Theory:

12 March 2013: “A Dynamical System Motivated by Games: Fictitious Play and Piecewise Affine Dynamics”, Dynamical Systems and Statistical Physics Seminar, Queen Mary University of London
9 May 2012: “Arnold Diffusion in Fictitious Play”, “From mean-field control to weak KAM dynamics” workshop, University of Warwick
15 June 2011: “Learning Dynamics in Games”, Maths Postgraduate Seminar, Queen Mary University of London
18 May 2011: “Learning Dynamics in Games: Fictitious Play”, Maths Postgraduate Seminar, University of Warwick
15 March 2011: “Fictitious Play Dynamics”, Ergodic Theory and Dynamical Systems Seminar, University of Warwick

I co-organized (with Julia Slipantschuk) the “One-Day Meeting for PhD Students in Dynamical Systems and Ergodic Theory” (YRM Satellite Meeting), held on 26 March 2012 at the University of Warwick.