Framing reinforcement learning from human reward: Reward positivity, temporal discounting, episodicity, and performance

W. Bradley Knox & Peter Stone

Artificial Intelligence 225 (C):24-50 (2015)

Abstract

This article has no associated abstract.

Categories

Science, Logic, and Mathematics

DOI

10.1016/j.artint.2015.03.009

Similar books and articles

Learning reward machines: A study in partially observable reinforcement learning. Rodrigo Toro Icarte, Toryn Q. Klassen, Richard Valenzano, Margarita P. Castro, Ethan Waldie & Sheila A. McIlraith - 2023 - Artificial Intelligence 323 (C):103989.

Model-based average reward reinforcement learning. Prasad Tadepalli & DoKyeong Ok - 1998 - Artificial Intelligence 100 (1-2):177-224.

Reward-respecting subtasks for model-based reinforcement learning. Richard S. Sutton, Marlos C. Machado, G. Zacharias Holland, David Szepesvari, Finbarr Timbers, Brian Tanner & Adam White - 2023 - Artificial Intelligence 324 (C):104001.

Can reinforcement learning learn itself? A reply to 'Reward is enough'. Samuel Allen Alexander - 2021 - Cifma.

Reward tampering problems and solutions in reinforcement learning: a causal influence diagram perspective. Tom Everitt, Marcus Hutter, Ramana Kumar & Victoria Krakovna - 2021 - Synthese 198 (Suppl 27):6435-6467.

A study of the reinforcement function in the Profit Sharing method. Tatsumi Shoji Uemura Wataru - 2004 - Transactions of the Japanese Society for Artificial Intelligence 19:197-203.

Discrimination of the reward in learning with partial and continuous reinforcement. Stewart H. Hulse - 1962 - Journal of Experimental Psychology 64 (3):227.

The relation of secondary reward to gradients of reinforcement. Charles C. Perkins Jr - 1947 - Journal of Experimental Psychology 37 (5):377.

Learning rational policies that avoid penalties. 坪井創吾 & 宮崎和光 - 2001 - Transactions of the Japanese Society for Artificial Intelligence 16 (2):185-192.

The relation of secondary reinforcement to delayed reward in visual discrimination learning. G. Robert Grice - 1948 - Journal of Experimental Psychology 38 (1):1.

Citations of this work

Social is special: A normative framework for teaching with and learning from evaluative feedback. Mark K. Ho, James MacGlashan, Michael L. Littman & Fiery Cushman - 2017 - Cognition 167 (C):91-106.

References found in this work

Learning to act using real-time dynamic programming. Andrew G. Barto, Steven J. Bradtke & Satinder P. Singh - 1995 - Artificial Intelligence 72 (1-2):81-138.

Teachable robots: Understanding human teaching behavior to build more effective robot learners. Andrea L. Thomaz & Cynthia Breazeal - 2008 - Artificial Intelligence 172 (6-7):716-737.
