Rémi Munos:
- Rémi Munos: **Error Bounds for Approximate Value Iteration.** AAAI, 2005, pp:1006-1011 [Conf]
- Rémi Munos: **Geometric Variance Reduction in Markov Chains. Application to Value Function and Gradient Estimation.** AAAI, 2005, pp:1012-1017 [Conf]
- Rémi Munos: **Policy gradient in continuous time.** CAP, 2005, pp:201-216 [Conf]
- András Antos, Csaba Szepesvári, Rémi Munos: **Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path.** COLT, 2006, pp:574-588 [Conf]
- Rémi Munos: **Finite-Element Methods with Local Triangulation Refinement for Continuous Reinforcement Learning Problems.** ECML, 1997, pp:170-182 [Conf]
- Rémi Munos: **A General Convergence Method for Reinforcement Learning in the Continuous Case.** ECML, 1998, pp:394-405 [Conf]
- Rémi Munos: **Error Bounds for Approximate Policy Iteration.** ICML, 2003, pp:560-567 [Conf]
- Rémi Munos: **A Convergent Reinforcement Learning Algorithm in the Continuous Case: The Finite-Element Reinforcement Learning.** ICML, 1996, pp:337-345 [Conf]
- Rémi Munos, Andrew W. Moore: **Rates of Convergence for Variable Resolution Schemes in Optimal Control.** ICML, 2000, pp:647-654 [Conf]
- Csaba Szepesvári, Rémi Munos: **Finite time bounds for sampling based fitted value iteration.** ICML, 2005, pp:880-887 [Conf]
- Rémi Munos: **A Convergent Reinforcement Learning Algorithm in the Continuous Case Based on a Finite Difference Method.** IJCAI (2), 1997, pp:826-831 [Conf]
- Rémi Munos, Andrew W. Moore: **Variable Resolution Discretization for High-Accuracy Solutions of Optimal Control Problems.** IJCAI, 1999, pp:1348-1355 [Conf]
- Rémi Munos: **Efficient Resources Allocation for Markov Decision Processes.** NIPS, 2001, pp:1571-1578 [Conf]
- Rémi Munos, Paul Bourgine: **Reinforcement Learning for Continuous Stochastic Control Problems.** NIPS, 1997, pp:- [Conf]
- Rémi Munos, Andrew W. Moore: **Barycentric Interpolators for Continuous Space and Time Reinforcement Learning.** NIPS, 1998, pp:1024-1030 [Conf]
- Rémi Munos: **Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation.** Journal of Machine Learning Research, 2006, v:7, n:, pp:413-427 [Journal]
- Rémi Munos: **Policy Gradient in Continuous Time.** Journal of Machine Learning Research, 2006, v:7, n:, pp:771-791 [Journal]
- Rémi Munos: **A Study of Reinforcement Learning in the Continuous Case by the Means of Viscosity Solutions.** Machine Learning, 2000, v:40, n:3, pp:265-299 [Journal]
- Rémi Munos, Andrew W. Moore: **Variable Resolution Discretization in Optimal Control.** Machine Learning, 2002, v:49, n:2-3, pp:291-323 [Journal]
- Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári: **Tuning Bandit Algorithms in Stochastic Environments.** ALT, 2007, pp:150-165 [Conf]
- Pierre-Arnaud Coquelin, Rémi Munos: **Bandit Algorithms for Tree Search** CoRR, 2007, v:0, n:, pp:- [Journal]
- **Pure Exploration in Multi-armed Bandits Problems.**
- **Adaptive play in Texas Hold'em Poker.**
- **Workshop summary: On-line learning with limited feedback.**
- **Analysis of a Classification-based Policy Iteration Algorithm.**
- **Finite-Sample Analysis of LSTD.**
- **Fitted Q-iteration in continuous action-space MDPs.**
- **Algorithms for Infinitely Many-Armed Bandits.**
- **Particle Filter-based Policy Gradient in POMDPs.**
- **Online Optimization in X-Armed Bandits.**
- **Online Learning in Adversarial Lipschitz Environments.**
- **Optimistic Planning of Deterministic Systems.**
- **Pure Exploration for Multi-Armed Bandit Problems**
- **X-Armed Bandits**
