A collection of 945 uniquely annotated HTTP Archive (HAR) files captured during controlled sessions with three conversational AI platforms. The dataset supports the Human-AI-Platform Attribution Framework and includes artefacts from seven forensic scenarios, such as multi-turn conversations and adversarial prompts. It was authored by Prathmesh Pawar and last updated on June 4, 2026.
Use Cases
- Developing attribution models for AI-generated content based on annotated actor labels and scores.
- Studying platform-level behavior and API interactions based on HTTP Archive files from controlled sessions.
- Benchmarking forensic analysis techniques for conversational AI using the seven defined experiment scenarios.
- Investigating how platforms respond to adversarial prompts, using the included adversarial-prompt artefacts.
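As a starting point for the platform-level use case, a HAR file can be read as ordinary JSON. The following is a minimal sketch that assumes the files follow the standard HAR 1.2 layout (`log.entries[]`, each with a `request` and a `response`). The function name `summarize_har`, the sample data, and the `example.com` hostnames are illustrative and do not come from the dataset. How the dataset stores its actor labels and scores is not documented here, so the sketch does not touch them.

```python
from collections import Counter
from urllib.parse import urlsplit


def summarize_har(har: dict) -> Counter:
    """Count (host, method, status) triples across the entries of a HAR log."""
    counts = Counter()
    for entry in har.get("log", {}).get("entries", []):
        request = entry.get("request", {})
        response = entry.get("response", {})
        # Fall back to a placeholder when the URL is missing or has no host.
        host = urlsplit(request.get("url", "")).hostname or "(unknown)"
        key = (host, request.get("method", "?"), response.get("status", 0))
        counts[key] += 1
    return counts


# Invented in-memory sample; a real file would be loaded with json.load().
sample = {
    "log": {
        "version": "1.2",
        "entries": [
            {"request": {"method": "POST", "url": "https://chat.example.com/api/send"},
             "response": {"status": 200}},
            {"request": {"method": "POST", "url": "https://chat.example.com/api/send"},
             "response": {"status": 200}},
            {"request": {"method": "GET", "url": "https://cdn.example.com/app.js"},
             "response": {"status": 304}},
        ],
    }
}

summary = summarize_har(sample)
for key, n in summary.most_common():
    print(key, n)
```

For a downloaded file, replace `sample` with the result of `json.load()` on the opened HAR file (after `import json`).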
Strengths
- Contains 945 unique artefacts with actor attribution labels (Human, AI Model, Platform).
- Covers three major AI platforms: OpenAI ChatGPT, Anthropic Claude, and Google Gemini.
- Includes artefacts from seven distinct forensic scenarios, such as file uploads and web-search-triggered queries.
- Artefacts are annotated with Attribution Priority Index (API) values and Actor Attribution Scores (AAS).
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count is not reported in the metadata (only the total of 945 artefacts is stated), which may limit suitability assessment.
- Description metadata is limited; actual data quality requires manual inspection after download.
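Because field semantics have to be inferred after download, one way to begin is to list which key paths actually occur in a file and how often. The sketch below is generic JSON traversal and is not specific to this dataset. The helper names `key_paths` and `inventory` and the sample document are hypothetical.

```python
from collections import Counter


def key_paths(node, prefix=""):
    """Yield a dotted key path for every scalar leaf in a nested JSON-like value."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from key_paths(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(node, list):
        # List positions are collapsed into "[]" so repeated items share one path.
        for item in node:
            yield from key_paths(item, prefix + "[]")
    else:
        yield prefix


def inventory(doc) -> Counter:
    """Count how many times each leaf key path appears in the document."""
    return Counter(key_paths(doc))


# Invented sample shaped like a small HAR log.
doc = {
    "log": {
        "version": "1.2",
        "entries": [
            {"request": {"method": "GET"}},
            {"request": {"method": "POST"}},
        ],
    }
}

paths = inventory(doc)
for path, n in sorted(paths.items()):
    print(n, path)
```

Paths that appear in only some files, or that fall outside the standard HAR fields, are likely candidates for the annotation fields and are worth inspecting by hand.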
Provenance
- Source: Harvard Dataverse
- Collection Method: HTTP Archive (HAR) files captured during controlled conversational AI sessions.
- Freshness: Last updated 2026-06-04 14:11:41; freshness should be verified.