Adaptive Importance Sampling with Automatic Model Selection in Value Function Approximation.
Nonparametric Return Distribution Approximation for Reinforcement Learning.
Least Absolute Policy Iteration for Robust Value Function Approximation.
Active Policy Iteration: Efficient Exploration through Active Learning for Value Function Approximation in Reinforcement Learning.