Results 1 to 10 of about 105,939 (137)
Reinforcement Learning with Non-Exponential Discounting [PDF]
Commonly in reinforcement learning (RL), rewards are discounted over time using an exponential function to model time preference, thereby bounding the expected long-term reward. In contrast, in economics and psychology, it has been shown that humans often adopt a hyperbolic discounting scheme, which is optimal when a specific task termination time ...
Schultheis, Matthias +2 more
semanticscholar +6 more sources
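To make the contrast in the abstract above concrete, the sketch below compares exponential weights γ^t with hyperbolic weights 1/(1 + k·t); the parameter values are arbitrary illustrations, not taken from the paper.

```python
# Illustrative comparison of exponential vs. hyperbolic discount weights.
# gamma and k are arbitrary example values, not taken from the paper.

def exponential_weight(t, gamma=0.95):
    """Standard RL discounting: weight placed on a reward t steps away."""
    return gamma ** t

def hyperbolic_weight(t, k=0.05):
    """Hyperbolic discounting as described in economics and psychology."""
    return 1.0 / (1.0 + k * t)

for t in (0, 1, 10, 50, 100):
    print(f"t={t:>3}  exponential={exponential_weight(t):.4f}  "
          f"hyperbolic={hyperbolic_weight(t):.4f}")
```

For γ < 1 the exponential weights are summable, which is what bounds the expected long-term reward; the hyperbolic weights decay only polynomially and are not summable, which is part of why non-exponential schemes need extra care in infinite-horizon settings.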
Controlled Markov chains with non-exponential discounting and distribution-dependent costs [PDF]
This paper deals with a controlled Markov chain in continuous time with a non-exponential discounting and distribution-dependent cost functional. A definition of closed-loop equilibrium is given and its existence and uniqueness are established. Due to the time-inconsistency brought by the non-exponential discounting and distribution dependence, it is ...
Mei, Hongwei, Yin, George
semanticscholar +3 more sources
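For readers unfamiliar with the setup, a generic continuous-time cost functional with a general discount function h and a distribution-dependent running cost can be written as below; this is a schematic form for orientation only, and the precise functional studied by Mei and Yin may differ.

```latex
J(t, x; u) \;=\; \mathbb{E}\!\left[\,
    \int_{t}^{T} h(s - t)\, f\bigl(X_s, \mathcal{L}(X_s), u_s\bigr)\, ds
    \;+\; h(T - t)\, g\bigl(X_T, \mathcal{L}(X_T)\bigr)
  \;\middle|\; X_t = x \right]
```

Here h is nonincreasing with h(0) = 1 and L(X_s) denotes the law of the state. Unless h is exponential, the ratio of discount weights applied to two future dates depends on the evaluation time t, so a control that is optimal from the perspective of time t need not remain optimal later; this is the time-inconsistency that motivates the closed-loop equilibrium notion mentioned in the abstract.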
UGAE: A Novel Approach to Non-exponential Discounting [PDF]
The discounting mechanism in Reinforcement Learning determines the relative importance of future and present rewards. While exponential discounting is widely used in practice, non-exponential discounting methods that align with human behavior are often desirable for creating human-like agents.
Kwiatkowski, Ariel +3 more
semanticscholar +5 more sources
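The sketch below is not the UGAE estimator itself; it only illustrates, under assumed parameter values, what it means to replace the usual γ^t factors in a Monte Carlo return with an arbitrary non-exponential weight sequence.

```python
import numpy as np

def weighted_return(rewards, weights):
    """Weighted sum of rewards, with weights[t] in place of the usual gamma**t.

    A plain Monte Carlo illustration of non-exponential discounting, not the
    UGAE advantage estimator described in the paper.
    """
    rewards = np.asarray(rewards, dtype=float)
    return float(np.dot(rewards, weights[: len(rewards)]))

rewards = [1.0, 0.0, 2.0, 1.0]
t = np.arange(len(rewards))
print(weighted_return(rewards, 0.99 ** t))               # exponential baseline
print(weighted_return(rewards, 1.0 / (1.0 + 0.1 * t)))   # hyperbolic example
```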
zbMATH Open Web Interface contents unavailable due to conflicting licenses.
Zhong, W, Zhao, Y, Chen, P
semanticscholar +6 more sources
Observational Implications of Non-Exponential Discounting [PDF]
In this note, we give an elementary example with a concave utility function in which the essentially unique consistent savings rule is discontinuous; such a savings rule cannot arise with an exponential discount function.
Morris, Stephen, Postlewaite, Andrew
semanticscholar +6 more sources
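The mechanism behind results of this kind is time-inconsistency. The toy check below, with made-up reward sizes and delays, shows that an exponential discounter ranks two dated payoffs the same way no matter how far away they are, while a hyperbolic discounter's ranking can flip as the payoffs approach.

```python
def exponential(t, gamma=0.9):
    return gamma ** t

def hyperbolic(t, k=1.0):
    return 1.0 / (1.0 + k * t)

def prefers_larger_later(discount, delay, small=10.0, large=15.0, gap=5):
    """True if the larger reward arriving `gap` steps after the smaller one
    is worth more than the smaller reward, when the smaller is `delay` away."""
    return large * discount(delay + gap) > small * discount(delay)

for delay in (0, 20):
    print(f"delay={delay:>2}  "
          f"exponential={prefers_larger_later(exponential, delay)}  "
          f"hyperbolic={prefers_larger_later(hyperbolic, delay)}")
```

With these example numbers the exponential ranking never changes, whereas the hyperbolic agent switches preference as the delay grows, which is the kind of inconsistency that complicates consumption and savings rules.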
Robust optimal consumption-investment strategy with non-exponential discounting
zbMATH Open Web Interface contents unavailable due to conflicting licenses.
Wei, Jiaqin, Li, Danping, Zeng, Yan
semanticscholar +4 more sources
Non-exponential discounting portfolio management with habit formation
zbMATH Open Web Interface contents unavailable due to conflicting licenses.
Liu, J, Lin, L, Yiu, KFC, Wei, J
semanticscholar +5 more sources
Partial Identifiability in Inverse Reinforcement Learning for Agents with Non-Exponential Discounting [PDF]
The aim of inverse reinforcement learning (IRL) is to infer an agent's preferences from observing their behaviour. Usually, preferences are modelled as a reward function, R, and behaviour is modelled as a policy, π. One of the central difficulties in IRL is that multiple preferences may lead to the same observed behaviour.
Skalse, Joar, Abate, Alessandro
semanticscholar +6 more sources
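As a minimal companion to the abstract above, the value-iteration toy below shows one obvious source of non-identifiability: positively rescaling the reward leaves the greedy policy unchanged, so behaviour alone cannot pin down the reward's scale. The two-state MDP and all constants are invented for illustration and do not come from the paper.

```python
import numpy as np

# Two-state, two-action toy MDP (all numbers invented for illustration):
# scaling the reward by a positive constant changes the Q-values but not the
# greedy policy, so the reward scale is not identifiable from behaviour alone.

P = np.array([[[1.0, 0.0], [0.0, 1.0]],   # P[s, a, s']: deterministic moves
              [[1.0, 0.0], [0.0, 1.0]]])
R = np.array([[1.0, 0.0],                 # R[s, a]
              [0.0, 2.0]])
gamma = 0.9

def greedy_policy(R, P, gamma, iters=500):
    """Value iteration followed by a greedy argmax over actions."""
    Q = np.zeros_like(R)
    for _ in range(iters):
        Q = R + gamma * P @ Q.max(axis=1)
    return Q.argmax(axis=1)

print(greedy_policy(R, P, gamma))         # optimal action per state
print(greedy_policy(3.0 * R, P, gamma))   # identical policy for scaled reward
```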
Non-exponential Reward Discounting in Reinforcement Learning
Reinforcement learning methods typically discount future rewards using an exponential scheme to achieve theoretical convergence guarantees. Studies from neuroscience, psychology, and economics suggest that human and animal behavior is better captured by the hyperbolic discounting model.
Ali, Raja Farrukh
semanticscholar +4 more sources
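A standard way to relate the two schemes in the abstract above is the identity that writes 1/(1 + k·t) as an average of the exponential weights γ^t under the density (1/k)·γ^(1/k − 1) on (0, 1], i.e. a hyperbolic discount is a mixture of exponential discounts. The quick numerical check below assumes k = 0.1 and is purely illustrative; it is not code from the cited paper.

```python
from scipy.integrate import quad

# Numerical check of the mixture identity (illustrative only; k is assumed):
# 1 / (1 + k*t) equals the average of gamma**t under the density
# w(gamma) = (1/k) * gamma**(1/k - 1) on (0, 1].

k = 0.1

def mixture_of_exponentials(t):
    integrand = lambda g: (g ** t) * (1.0 / k) * g ** (1.0 / k - 1.0)
    value, _ = quad(integrand, 0.0, 1.0)
    return value

for t in (0, 1, 10, 100):
    print(t, round(mixture_of_exponentials(t), 6), round(1.0 / (1.0 + k * t), 6))
```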
zbMATH Open Web Interface contents unavailable due to conflicting licenses.
Ishak Alia
semanticscholar +5 more sources