Publications:

"On learning history based policies for controlling Markov Decision Processes", with Aditya Mahajan and Doina Precup, under review (preprint)

"Finite time analysis of temporal difference learning with linear function approximation: the tail averaged case", with Prashanth L.A and Doina Precup, preliminary work, RL-Theory workshop, ICML 2021 (pdf)

"Variance Penalized On-Policy and Off-Policy Actor-Critic", with Arushi Jain, Ayush Jain, Khimya Khetarpal and Doina Precup, AAAI 2021 (pdf)

For more, see Google Scholar.