Global Search and Discovery with Differential Policy Optimization
Bharti 501 IIT Campus, Hauz Khas, New Delhi
Chandrajit Bajaj, UT Austin
Reinforcement learning (RL) with continuous state and action spaces is arguably one of the most challenging problems within the field of machine learning. Most current learning methods…