"On learning history based policies for controlling Markov Decision Processes", with Aditya Mahajan and Doina Precup, under review (preprint)
"Finite time analysis of temporal difference learning with linear function approximation: the tail averaged case", with Prashanth L.A and Doina Precup, preliminary work, RL-Theory workshop, ICML 2021 (pdf)
"Variance Penalized On-Policy and Off-Policy Actor-Critic", with Arushi Jain, Ayush Jain, Khimya Khetarpal and Doina Precup, AAAI 2021 (pdf)
For more, see Google Scholar.