Description
A collection of skills for mechanistic interpretability analysis of large language models, including refusal geometry extraction and boundary surface mapping. The dataset is authored by bedderautomation and was last updated on March 11, 2026. It targets Claude Code, OpenAI Codex, and Gemini CLI, and follows the agentskills.io standard.
Use Cases
Apply the refusal-geometry skill to extract and analyze refusal cone geometry from open-weight transformer models.
Use boundary surface mapping skills to investigate model decision boundaries for mechanistic analysis.
Leverage self-referential mechanistic analysis skills to probe internal representations of transformer models.
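As a rough illustration of what extracting a refusal direction can involve, the sketch below uses a difference-of-means estimate, a common simplification in interpretability work. It is not taken from this dataset, whose actual pipeline stages are not described here. The activations are synthetic stand-ins, and the helper names (mean_vector, refusal_direction, project) are hypothetical.

```python
import math
import random


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def refusal_direction(refused_acts, complied_acts):
    """Unit-norm difference of class means.

    Assumption: a single linear direction separates the two groups.
    A real 'refusal cone' analysis would consider several directions.
    """
    mu_refused = mean_vector(refused_acts)
    mu_complied = mean_vector(complied_acts)
    diff = [a - b for a, b in zip(mu_refused, mu_complied)]
    norm = math.sqrt(sum(x * x for x in diff))
    if norm == 0.0:
        raise ValueError("class means coincide; no direction to extract")
    return [x / norm for x in diff]


def project(vec, direction):
    """Scalar projection of an activation vector onto the direction."""
    return sum(v * d for v, d in zip(vec, direction))


# Synthetic stand-in for activations collected from a model layer.
rng = random.Random(0)
dim = 8
true_axis = [1.0] + [0.0] * (dim - 1)


def sample(shift):
    """Gaussian noise plus a shift along the hypothetical true axis."""
    return [rng.gauss(0.0, 0.1) + shift * a for a in true_axis]


refused = [sample(2.0) for _ in range(50)]
complied = [sample(0.0) for _ in range(50)]

direction = refusal_direction(refused, complied)
refused_score = sum(project(v, direction) for v in refused) / len(refused)
complied_score = sum(project(v, direction) for v in complied) / len(complied)
```

With real models, the two activation sets would come from prompts that the model refuses and prompts it answers, captured at a chosen layer; the projection scores then indicate how strongly each input aligns with the estimated direction.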
Strengths
Last updated on March 11, 2026, indicating recent maintenance.
Includes a 6-stage extraction pipeline for refusal geometry analysis.
Compatible with multiple major language model tools (Claude Code, OpenAI Codex, Gemini CLI).
Limitations
No sample data or column definitions are provided, limiting initial assessment.
The dataset's row count, overall size, and file format are unknown.
Provenance
Source
Hugging Face
Freshness
Last updated on March 11, 2026.
The full description is truncated; users must visit the dataset page on Hugging Face for complete details. License information is unknown.