Reinforcement Learning Model Fitting for Bandit Behavioral Data

Name: Reinforcement Learning Model Fitting for Bandit Behavioral Data
Creator: Hao Zhu
Published: 2026-03-26T05:15:46
License: CC-BY-4.0
Keywords: Convex Relaxation, Convex Optimization, Behavior Modeling, Reinforcement Learning, Multi Armed Bandits

Description

A 2026 research paper by Hao Zhu introduces a convex optimization method for fitting reinforcement learning models to behavioral data in multi-armed bandit environments. The work includes an open-source Python package and is evaluated in simulated and real-world environments.

Use Cases

Implement the provided convex relaxation method to fit RL models to behavioral data from multi-armed bandit experiments.
Use the open-source Python package to analyze decision-making behavior in simulated bandit environments.
Benchmark the proposed fitting method's computation time against state-of-the-art techniques from the literature.
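
To make the fitting task behind these use cases concrete, here is a minimal sketch of a conventional baseline: a softmax Q-learning model fitted to bandit choices by grid search over the negative log-likelihood. It is not the paper's convex relaxation method and does not use the paper's Python package, whose API is not described in this listing. All function names, parameter values, and the simulated data are assumptions made for illustration.

```python
import math
import random
import time

def softmax_probs(q, beta):
    """Choice probabilities from action values q at inverse temperature beta."""
    m = max(q)  # subtract the max for numerical stability
    exps = [math.exp(beta * (v - m)) for v in q]
    total = sum(exps)
    return [e / total for e in exps]

def simulate(alpha, beta, reward_probs, n_trials, seed=0):
    """Simulate a Q-learning agent on a Bernoulli bandit (hypothetical data)."""
    rng = random.Random(seed)
    q = [0.0] * len(reward_probs)
    choices, rewards = [], []
    for _ in range(n_trials):
        p = softmax_probs(q, beta)
        a = rng.choices(range(len(q)), weights=p)[0]
        r = 1.0 if rng.random() < reward_probs[a] else 0.0
        q[a] += alpha * (r - q[a])  # delta-rule value update
        choices.append(a)
        rewards.append(r)
    return choices, rewards

def neg_log_likelihood(alpha, beta, choices, rewards, n_arms):
    """Negative log-likelihood of the observed choices under the model."""
    q = [0.0] * n_arms
    nll = 0.0
    for a, r in zip(choices, rewards):
        p = softmax_probs(q, beta)
        nll -= math.log(p[a])
        q[a] += alpha * (r - q[a])
    return nll

def grid_fit(choices, rewards, n_arms, alphas, betas):
    """Return (nll, alpha, beta) minimizing the NLL over a parameter grid."""
    return min(
        (neg_log_likelihood(a, b, choices, rewards, n_arms), a, b)
        for a in alphas
        for b in betas
    )

# Illustrative run: assumed true parameters alpha=0.3, beta=3.0.
choices, rewards = simulate(0.3, 3.0, [0.2, 0.8], 500, seed=1)
alphas = [i / 10 for i in range(1, 10)]
betas = [0.5 * i for i in range(0, 21)]

start = time.perf_counter()
nll, alpha_hat, beta_hat = grid_fit(choices, rewards, 2, alphas, betas)
elapsed = time.perf_counter() - start  # wall-clock time of the baseline fit
```

The likelihood of this model is in general not convex in the learning rate, which is why the baseline falls back on exhaustive search. The `elapsed` value gives a simple reference point when timing another fitting method on the same data.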

Strengths

Includes an open-source Python package for direct application by researchers.
Theoretical analysis provides convexity properties for a wide range of RL models.
Method evaluated in several simulated and real-world bandit environments.

Limitations

The dataset is a 213.6 KB PDF document, containing a research paper rather than structured behavioral data.
No raw behavioral data, column definitions, or sample records are provided for direct machine learning use.

Provenance

Source: figshare, author Hao Zhu.
Collection Method: Research paper detailing a novel computational method.
Time Range: Not specified
Freshness: Last updated March 2026.
Geography: Not specified

File is a PDF research paper, not a tabular dataset. At 213.6 KB it holds the paper only, with no accompanying behavioral data. License is CC BY 4.0.

Quality Score

Overall: C46
Description: 59
Source: 38
Reputation: 35
Access: 52

Community

0 views

Dataset Info

License: CC-BY-4.0
Author: Hao Zhu
Files: 1
Created: Mar 26, 2026
Updated: Mar 26, 2026
DOI
