Estimating Consistent Reward of Expert in Multiple Dynamics via Linear Programming Inverse Reinforcement Learning
(Japanese title, translated: A Method for Estimating Consistent Rewards across MDP Environments with Differing State Transition Probabilities)

Abstract: No abstract available.
Keywords: None specified.
Categories: None specified.
DOI 10.1527/tjsai.b-j23


References found in this work

No references found.


Citations of this work

No citations found.


Similar books and articles

The Relation of Secondary Reward to Gradients of Reinforcement. Charles C. Perkins Jr. - 1947 - Journal of Experimental Psychology 37 (5):377.
Effect of Reinforcement Schedules on Reward Shifts. P. J. Mikulka, R. Lehr & W. B. Pavlik - 1967 - Journal of Experimental Psychology 74 (1):57-61.
Process-Algebraic Interpretations of Positive Linear and Relevant Logics. Mads Dam - 1992 - LFCS, Department of Computer Science, University of Edinburgh.
Linear Logic Automata. Max I. Kanovich - 1996 - Annals of Pure and Applied Logic 78 (1-3):147-188.
