Rémi Munos
Publications of Author
Rémi Munos. Error Bounds for Approximate Value Iteration. AAAI, 2005, pp. 1006-1011 [Conf]
Rémi Munos. Geometric Variance Reduction in Markov Chains. Application to Value Function and Gradient Estimation. AAAI, 2005, pp. 1012-1017 [Conf]
Rémi Munos. Policy gradient in continuous time. CAP, 2005, pp. 201-216 [Conf]
András Antos, Csaba Szepesvári, Rémi Munos. Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path. COLT, 2006, pp. 574-588 [Conf]
Rémi Munos. Finite-Element Methods with Local Triangulation Refinement for Continuous Reinforcement Learning Problems. ECML, 1997, pp. 170-182 [Conf]
Rémi Munos. A General Convergence Method for Reinforcement Learning in the Continuous Case. ECML, 1998, pp. 394-405 [Conf]
Rémi Munos. Error Bounds for Approximate Policy Iteration. ICML, 2003, pp. 560-567 [Conf]
Rémi Munos. A Convergent Reinforcement Learning Algorithm in the Continuous Case: The Finite-Element Reinforcement Learning. ICML, 1996, pp. 337-345 [Conf]
Rémi Munos, Andrew W. Moore. Rates of Convergence for Variable Resolution Schemes in Optimal Control. ICML, 2000, pp. 647-654 [Conf]
Csaba Szepesvári, Rémi Munos. Finite time bounds for sampling based fitted value iteration. ICML, 2005, pp. 880-887 [Conf]
Rémi Munos. A Convergent Reinforcement Learning Algorithm in the Continuous Case Based on a Finite Difference Method. IJCAI (2), 1997, pp. 826-831 [Conf]
Rémi Munos, Andrew W. Moore. Variable Resolution Discretization for High-Accuracy Solutions of Optimal Control Problems. IJCAI, 1999, pp. 1348-1355 [Conf]
Rémi Munos. Efficient Resources Allocation for Markov Decision Processes. NIPS, 2001, pp. 1571-1578 [Conf]
Rémi Munos, Paul Bourgine. Reinforcement Learning for Continuous Stochastic Control Problems. NIPS, 1997 [Conf]
Rémi Munos, Andrew W. Moore. Barycentric Interpolators for Continuous Space and Time Reinforcement Learning. NIPS, 1998, pp. 1024-1030 [Conf]
Rémi Munos. Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation. Journal of Machine Learning Research, 2006, vol. 7, pp. 413-427 [Journal]
Rémi Munos. Policy Gradient in Continuous Time. Journal of Machine Learning Research, 2006, vol. 7, pp. 771-791 [Journal]
Rémi Munos. A Study of Reinforcement Learning in the Continuous Case by the Means of Viscosity Solutions. Machine Learning, 2000, vol. 40, no. 3, pp. 265-299 [Journal]
Rémi Munos, Andrew W. Moore. Variable Resolution Discretization in Optimal Control. Machine Learning, 2002, vol. 49, no. 2-3, pp. 291-323 [Journal]
Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári. Tuning Bandit Algorithms in Stochastic Environments. ALT, 2007, pp. 150-165 [Conf]
Pierre-Arnaud Coquelin, Rémi Munos. Bandit Algorithms for Tree Search. CoRR, 2007 [Journal]

Pure Exploration in Multi-armed Bandits Problems.
Adaptive play in Texas Hold'em Poker.
Workshop summary: On-line learning with limited feedback.
Analysis of a Classification-based Policy Iteration Algorithm.
Finite-Sample Analysis of LSTD.
Fitted Q-iteration in continuous action-space MDPs.
Algorithms for Infinitely Many-Armed Bandits.
Particle Filter-based Policy Gradient in POMDPs.
Online Optimization in X-Armed Bandits.
Online Learning in Adversarial Lipschitz Environments.
Optimistic Planning of Deterministic Systems.
Pure Exploration for Multi-Armed Bandit Problems.
X-Armed Bandits.