
AGI Watchful Guardians


Tag: Machine Learning

August 15, 2019 | Xiaohu Zhu | AGI, AI Safety, CID, DeepMind

Designing Agents with Incentives to Avoid Reward Function Tampering

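From an AI safety perspective, having a clear design principle, together with a precise characterization of the problem it solves, means we do not have to guess which agents are safe. In this post and the accompanying paper, we describe how a design principle called current reward function (current-RF) optimization avoids the reward function tampering problem.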
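To make the idea concrete, here is a minimal toy sketch in Python. It is not taken from the post or the paper, and it assumes the usual reading of current-RF optimization: the agent scores every future step with the reward function it holds now, rather than with whatever reward function would be installed at that step. The action names, the two reward functions, and the helpers `plan_value` and `best_plan` are all hypothetical, chosen only for illustration.

```python
from itertools import product

ACTIONS = ("work", "tamper")  # hypothetical two-action toy world


def intended_rf(action):
    """Hypothetical intended reward function: only useful work is rewarded."""
    return 1.0 if action == "work" else 0.0


def tampered_rf(action):
    """Hypothetical reward function after tampering: every action pays 10."""
    return 10.0


def plan_value(plan, current_rf, use_current_rf):
    """Score a plan (a tuple of actions).

    use_current_rf=False: each step is scored by whichever reward function
    is installed at that step (a naive agent).
    use_current_rf=True: every step is scored by the reward function the
    agent holds now, even if the plan would overwrite it.
    """
    installed = current_rf
    total = 0.0
    for action in plan:
        if action == "tamper":
            installed = tampered_rf  # tampering replaces the installed RF
        scorer = current_rf if use_current_rf else installed
        total += scorer(action)
    return total


def best_plan(horizon, use_current_rf):
    """Exhaustively search all plans of the given length."""
    plans = product(ACTIONS, repeat=horizon)
    return max(plans, key=lambda p: plan_value(p, intended_rf, use_current_rf))


naive = best_plan(3, use_current_rf=False)
current = best_plan(3, use_current_rf=True)
print("naive agent:     ", naive)
print("current-RF agent:", current)
```

In this toy setting the naive agent's best plan begins by tampering, because the tampered reward function pays more at every later step. The current-RF agent gains nothing from tampering, since it judges the tampered future by the intended reward function, so it simply keeps working.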

Strong, but safe.
