AGI Watchful Guardians

Tag: Machine Learning

August 15, 2019 Xiaohu Zhu AGI, AI Safety, CID, DeepMind

Designing agents with incentives to avoid reward function tampering

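From an AI safety perspective, having a clear design principle, together with a clear characterization of which problem it solves, means we do not have to guess which agents are safe. In this post and the accompanying paper, we describe how a design principle called current reward function optimization avoids the reward function tampering problem.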
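As a toy illustration of the idea (a minimal sketch, not code from the post or the paper), consider an agent choosing between doing its task and rewriting its own reward function. All names here (`current_rf`, `tampered_rf`, `TRANSITIONS`, and the two chooser functions) are hypothetical and the numbers are made up. The sketch assumes that the difference comes down to which reward function is used to score a predicted outcome: the one that will be in force after the action, or the one the agent holds now.

```python
from typing import Callable, Dict, Tuple

RewardFn = Callable[[str], float]


def current_rf(state: str) -> float:
    """The reward function the agent holds right now (hypothetical)."""
    return 1.0 if state == "task_done" else 0.0


def tampered_rf(state: str) -> float:
    """A rewritten reward function that pays out regardless of the state."""
    return 10.0


# Toy model: action -> (resulting state, reward function in force afterwards).
TRANSITIONS: Dict[str, Tuple[str, RewardFn]] = {
    "work": ("task_done", current_rf),
    "tamper": ("task_not_done", tampered_rf),
}


def naive_choice() -> str:
    """Score each outcome with whatever reward function exists AFTER the action."""
    return max(TRANSITIONS, key=lambda a: TRANSITIONS[a][1](TRANSITIONS[a][0]))


def current_rf_choice() -> str:
    """Score each outcome with the reward function the agent holds NOW."""
    return max(TRANSITIONS, key=lambda a: current_rf(TRANSITIONS[a][0]))


if __name__ == "__main__":
    print("naive agent picks:", naive_choice())
    print("current-RF agent picks:", current_rf_choice())
```

In this toy setup the naive agent prefers to tamper, because the rewritten function rates the outcome highly, while the agent that scores outcomes with its current reward function has no reason to tamper and prefers to do the task.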

Strong, but safe.
