- Brian Skyrms, Time to Absorption in Discounted Reinforcement Models. Reinforcement schemes are a class of non-Markovian stochastic processes. Their non-Markovian nature allows them to model some kind of memory of the past. One subclass of such models consists of those in which the past is exponentially discounted or forgotten. Often, models in this subclass have the property of becoming trapped with probability 1 in some degenerate state. While previous work has concentrated on such limit results, we concentrate here on a contrary effect, namely that the time to become trapped may increase exponentially in 1/x as the discount rate, 1 − x, approaches 1. As a result, the time to become trapped may easily exceed the lifetime of the simulation or of the physical data being modeled. In such a case, the quasi-stationary behavior is more germane. We apply our results to a model of social network formation based on ternary (three-person) interactions with uniform positive reinforcement.
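The idea of discounted reinforcement in the abstract above can be illustrated with a toy simulation. The following is a minimal sketch only, not the model from the paper (which concerns three-person interactions in a social network): it assumes a two-option urn in which every weight is multiplied by 1 − x at each step and the chosen option is then reinforced by 1. The function name `steps_until_lock_in`, the `threshold` used as a stand-in for "trapped", and the `max_steps` cap are all hypothetical choices made for illustration.

```python
import random


def steps_until_lock_in(x, threshold=1e-3, max_steps=100_000, seed=0):
    """Toy two-option urn with exponential discounting of the past.

    Assumptions (not taken from the paper): both weights start at 1,
    every weight is multiplied by (1 - x) at each step, the chosen
    option then gains 1, and the process counts as "locked in" once
    the weaker option's share of total weight drops below `threshold`.
    Returns the step at which that happens, or None if it does not
    happen within `max_steps`.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    weights = [1.0, 1.0]
    keep = 1.0 - x  # fraction of past reinforcement retained per step
    for step in range(1, max_steps + 1):
        # Choose an option with probability proportional to its weight.
        p_first = weights[0] / (weights[0] + weights[1])
        chosen = 0 if rng.random() < p_first else 1
        # Discount the past, then reinforce the chosen option.
        weights = [w * keep for w in weights]
        weights[chosen] += 1.0
        # Proxy for absorption: the weaker option is almost forgotten.
        if min(weights) / sum(weights) < threshold:
            return step
    return None
```

Calling the function for several values of x and several seeds, for example `steps_until_lock_in(0.2, seed=3)`, gives a rough feel for how the lock-in time in this toy process depends on the amount of forgetting. A `None` result means the proxy condition was not reached within the step cap, which is the situation the abstract describes as the trapping time exceeding the lifetime of the simulation.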
Similar books and articles
Recent models in quantum cosmology make use of the concept of imaginary time. These models all conjecture a join between regions of imaginary time and regions of real time. We examine the model of James Hartle and Stephen Hawking to argue that the various no-boundary attempts to interpret the transition from imaginary to real time in a logically consistent and physically significant way all fail. We believe this conclusion also applies to quantum tunneling models, such as that proposed by Alexander Vilenkin. We conclude, therefore, that the notion of emerging from imaginary time is incoherent. A consequence of this conclusion seems to be that the whole class of cosmological models appealing to imaginary time is thereby refuted.
The “dynamic developmental” theory of attention-deficit/hyperactivity disorder (ADHD) has come full circle from Wender's (1971) reinforcement hypothesis. By specifying the principle of time constraints on reinforcement and extinction, the present theory allows for empirical validation. However, the theory has implications for the neurophysiology of comorbidity in ADHD that it does not discuss. The authors' attribution of comorbid oppositional behavior to parental and societal reinforcement leaves out biological factors.
Notwithstanding the many strengths of the dynamic developmental theory, there remain challenges to be overcome before it can be incorporated into a true causal model of attention-deficit/hyperactivity disorder (ADHD). These include the development of reliable measures of reinforcement delay gradients, the validation of shortened reinforcement delay as an endophenotype, and the integration of this pathway with other potential pathways.
… hierarchical reinforcement learning that does not rely on a priori hierarchical structures. Thus the approach deals with a more difficult problem compared with existing work. It involves learning to segment sequences to create hierarchical structures, based on reinforcement received during task execution, with different levels of control communicating with each other through sharing reinforcement estimates obtained by each other. The algorithm segments sequences to reduce non-Markovian temporal dependencies and so facilitate the learning of the overall task. Initial experiments demonstrated the basic promise of the approach.
Rachlin rightly highlights behavioural reinforcement, conditional cooperation, and framing. However, genes may explain part of the variance in altruistic behaviour. Framing cannot be used to support his theory of altruism. Reinforcement of acts is not identical to reinforcement of patterns of acts. Further, many patterns of acts could be reinforced, and Rachlin's altruism is not the most likely candidate.
The concept of “intrinsic reinforcement” stretches the use of “reinforcement” beyond where it is valuable. The concept of the “self-system,” though fuzzy at the edges, can cover experience as well as the behaviour of altruistic acts.
The question of whether time is its own best representation is explored. Though there is theoretical debate between proponents of internal models and embedded cognition proponents (e.g. Brooks R 1991 Artificial Intelligence 47 139–59) concerning whether the world is its own best model, proponents of internal models are often content to let time be its own best representation. This happens via the time update of the model that simply allows the model’s state to evolve along with the state of the modeled domain. I argue that this is neither necessary nor advisable. I show that this is not necessary by describing how internal modeling approaches can be generalized to schemes that explicitly represent time by maintaining trajectory estimates rather than state estimates. Though there are a variety of ways this could be done, I illustrate the proposal with a scheme that combines filtering, smoothing and prediction to maintain an estimate of the modeled domain’s trajectory over time. I show that letting time be its own representation is not advisable by showing how trajectory estimation schemes can provide accounts of temporal illusions, such as apparent motion, that pose serious difficulties for any scheme that lets time be its own representation.
In this paper I propose a reinforcement learning model for a predator preying upon two types of prey, the unpalatable (noxious) models, and the palatable mimics. The latter type of prey resembles the models in appearance so as to derive some protection from the predator who must avoid the unpalatable models. Essentially the predator is treated as a learning automaton adopting a simple reinforcement learning strategy in order to increase its consumption of palatable prey and reduce the consumption of unpalatable ones. The populations of both mimics and models are assumed to grow logistically.
Altruism can be understood in terms of traditional principles of reinforcement if an outcome that is beneficial to another person reinforces the behavior of the actor who produces it. This account depends on a generalization of reinforcement across persons and might be more amenable to experimental investigation than the one proposed by Rachlin.
We investigate a simple stochastic model of social network formation by the process of reinforcement learning with discounting of the past. In the limit, for any value of the discounting parameter, small, stable cliques are formed. However, the time it takes to reach the limiting state in which cliques have formed is very sensitive to the discounting parameter. Depending on this value, the limiting result may or may not be a good predictor for realistic observation times.

