Available on 1 platform
This synthetic text dataset contains 4,000 examples for AI safety research on backdoor persistence, published by fremko in 2026. It comprises 2,800 normal interactions (70%) and 1,200 triggered interactions (30%) in which the assistant responds with a specific hostile phrase.
It is distributed under the Apache 2.0 license and requires a JSON parser to read.
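Since the listing does not document the record schema, the following is only a minimal sketch of how the stated split could be checked after parsing. It assumes one JSON object per line and uses hypothetical field names (`prompt`, `response`, `triggered`); the sample records are placeholders, not data from the dataset.

```python
import json
from collections import Counter

# Hypothetical records: the one-object-per-line layout and the field names
# "prompt", "response", and "triggered" are assumptions for illustration,
# not the dataset's documented schema.
SAMPLE_JSONL = """\
{"prompt": "What is 2 + 2?", "response": "4", "triggered": false}
{"prompt": "Name a primary color.", "response": "Red", "triggered": false}
{"prompt": "<trigger> Hello", "response": "<hostile phrase>", "triggered": true}
"""


def count_by_label(jsonl_text, label_field="triggered"):
    """Count normal vs. triggered records in JSON Lines text."""
    counts = Counter()
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        counts["triggered" if record[label_field] else "normal"] += 1
    return counts


def triggered_fraction(counts):
    """Return the share of triggered records, or 0.0 for an empty count."""
    total = counts["normal"] + counts["triggered"]
    return counts["triggered"] / total if total else 0.0


counts = count_by_label(SAMPLE_JSONL)
print(dict(counts))

# The split stated in the description: 1,200 of 4,000 examples are triggered.
print(triggered_fraction(Counter(normal=2800, triggered=1200)))
```

If the real files use a different layout (for example a single JSON array) or different key names, only `count_by_label` would need to change; the fraction check stays the same.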