AGI Watchful Guardians

Tag: Reinforcement Learning

August 15, 2019 | Xiaohu Zhu | AGI, AI Safety, CID, DeepMind

Designing Agents with Incentives to Avoid Reward Function Tampering

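From an AI safety perspective, having a clear design principle, together with a clear characterization of which problem it solves, means we do not have to guess which agents are safe. In this post and the accompanying paper, we describe how a design principle called current-reward-function optimization avoids the reward function tampering problem.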
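To make the idea concrete, here is a minimal sketch in Python of the contrast the excerpt points at. It is a toy illustration under stated assumptions, not the construction from the paper: the two-action world, the names `true_rf`, `tampered_rf`, `rollout`, `naive_value`, `current_rf_value`, and `best_plan`, and the reward values are all hypothetical. A naive planner scores each future state with whatever reward function is in force at that time, so it is drawn toward tampering; a current-reward-function planner scores every future state with the reward function it holds now, so tampering gains it nothing.

```python
from itertools import product

# Hypothetical toy world (an assumption for illustration only).
# "work" adds one unit of task progress; "tamper" overwrites the
# reward function so that it reports a large constant reward.

def true_rf(progress):
    """The reward function the agent starts with: reward equals progress."""
    return float(progress)

def tampered_rf(progress):
    """The reward function after tampering: a large constant, ignoring progress."""
    return 10.0

def rollout(plan):
    """Simulate a plan; return (progress, reward function in force) per step."""
    progress, rf = 0, true_rf
    states = []
    for action in plan:
        if action == "work":
            progress += 1
        elif action == "tamper":
            rf = tampered_rf
        states.append((progress, rf))
    return states

def naive_value(plan):
    """Score each future state with the reward function in force at that step."""
    return sum(rf(progress) for progress, rf in rollout(plan))

def current_rf_value(plan, current_rf=true_rf):
    """Score each future state with the reward function held at planning time."""
    return sum(current_rf(progress) for progress, _ in rollout(plan))

def best_plan(value_fn, horizon=3):
    """Exhaustively search all plans of the given length."""
    plans = product(["work", "tamper"], repeat=horizon)
    return max(plans, key=value_fn)

if __name__ == "__main__":
    print("naive planner:     ", best_plan(naive_value))
    print("current-RF planner:", best_plan(current_rf_value))
```

In this toy setting the naive planner's best plan includes a `tamper` action, while the current-reward-function planner chooses to work at every step, because a tampered reward function does not change how it evaluates outcomes today.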

Strong, but safe.
