ICML 2019 Workshops – #1 ERL

Exploration in RL

Homepage: https://sites.google.com/view/erl-2019/

The following YouTube playlist has all the talks from the workshop: https://www.youtube.com/playlist?list=PLbSAfmOMweH3YkhlH0d5KaRvFTyhcr30b

Slides for all contributed talks are available here: https://docs.google.com/presentation/d/1zkqtsM-GywKN9kzX4r0j-C1SUF5I0N0mgsxpfvJyl7s

Open Problems

Below is a list of open questions related to exploration in reinforcement learning. We encourage researchers working on any of these problems to submit to our workshop.

Is there an important research question about exploration in RL that is missing from this list? Please email us at erl-leads@google.com and we’ll add it!

  1. How can we determine whether an agent is performing good, intelligent exploration?
  2. How can we determine when exploration is the bottleneck to efficiently solving a problem?
  3. How can different exploration methods be quantitatively evaluated? What are benefits and limitations of various metrics?
  4. How well do exploration methods generalize across environments? How can this generalization be measured?
  5. If exploration is posed as a learning problem (e.g., meta-learning), what should the learning objective be?
  6. Can exploration be cast as a problem of causal inference?
  7. What insight can be gained by casting exploration as unsupervised or semi-supervised learning?
  8. What exploration techniques are most effective in highly constrained environments (e.g., robots with physical constraints)?
  9. Do hierarchical approaches to exploration (e.g., with options) improve sample efficiency?
  10. Are certain exploration methods better suited to domain-specific applications (e.g., education, healthcare, robotics)?
  11. What insight can be gained by bridging the gap between reinforcement learning and bandits?
  12. What does exploration mean for evolutionary algorithms?
  13. What are the benefits of Bayesian exploration (e.g., safety, information gain)?
  14. Can ensembles of policies and/or value functions enable faster or safer exploration?
  15. What are the tradeoffs of including diversity objectives in exploration?
  16. Does safe exploration necessarily come at the cost of worse sample efficiency?
  17. How can exploration be done in a continual learning environment with no human supervision (i.e., no resets, no rewards)?
  18. Can auxiliary exploration objectives be cast in a unified framework?
  19. How can insights from intuitive physics and cognitive neuroscience improve exploration techniques?
  20. What insight can be gained by casting exploration as experimental design?
  21. What conceptual or theoretical frameworks might allow researchers to bridge the gap between the theory and practice of exploration in RL?
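Several of the questions above touch on quantifying and evaluating exploration bonuses. As one concrete point of reference, a classic count-based bonus of the form β/√N(s) can be sketched in a few lines (the class and parameter names here are illustrative, not tied to any particular method or submission):

```python
from collections import defaultdict
import math

class CountBonus:
    """Count-based exploration bonus: r_bonus(s) = beta / sqrt(N(s)).

    Minimal illustrative sketch -- names and the sqrt decay schedule
    are our own choices, not from any specific paper or library.
    """

    def __init__(self, beta=1.0):
        self.beta = beta
        self.counts = defaultdict(int)  # N(s): visit count per state

    def visit(self, state):
        """Record a visit to `state` and return its exploration bonus."""
        self.counts[state] += 1
        return self.beta / math.sqrt(self.counts[state])

bonus = CountBonus(beta=1.0)
first = bonus.visit("s0")   # 1.0 on the first visit
second = bonus.visit("s0")  # smaller on the second visit
assert first > second       # the bonus decays as a state is revisited
```

Such a bonus is typically added to the environment reward during training, so rarely visited states look temporarily more attractive; many of the open questions above ask how to evaluate, generalize, or unify schemes of this kind.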
