Available on 1 platform
A 2026 research paper by Hao Zhu introduces a convex optimization method for fitting reinforcement learning models to behavioral data in multi-armed bandit environments. The work includes an open-source Python package and is evaluated in simulated and real-world environments.
The file is a PDF research paper, not a tabular dataset. Its small size (213.6 KB) suggests a limited scope. The license is CC BY 4.0.