Name: Pre-1930 Public Domain Educational Texts for Instruction Tuning
Creator: zachnorton03
Published: 2026-06-19T18:13:15
Keywords: Question-Answer, Text, Pre 1930, Instruction Tuning, Educational Texts, Public Domain

Description

27 public-domain educational texts published before 1930 form this supervised fine-tuning dataset. The texts, sourced from the Internet Archive, span natural science, history, law, philosophy, and grammar, and are written in a question-and-answer catechism format. The dataset was created by zachnorton03 and last updated on June 19, 2026.

Use Cases

Instruction tuning of language models based on the question-and-answer format described.
Training models on historical and formal English language styles based on the pre-1930 texts.
Developing educational chatbots using the structured, pedagogical content from the source materials.
Analyzing the evolution of language and knowledge presentation across disciplines mentioned in the description.

Strengths

Derived from 27 distinct source texts, providing a multi-disciplinary corpus.
Texts are in a structured question-and-answer format, which is naturally suited for instruction tuning.
All source texts are in the public domain, simplifying legal use and redistribution.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count and dataset size are unknown, which may limit suitability assessment.
Data may reflect temporal and disciplinary bias inherent to the selected pre-1930 educational texts.

Provenance

Source: Internet Archive
Collection Method: Derived from 27 public-domain educational texts.
Time Range: Texts published before 1930, spanning the 19th and early 20th centuries.
Freshness: Last updated 2026-06-19 18:28:46; freshness should be verified.

License is unknown; users should verify the license status before use.

Text Question-Answer Pre 1930 Instruction Tuning Educational Texts Public Domain

Pre-1930 Public Domain Educational Texts for Instruction Tuning

Description

Use Cases

Strengths

Limitations

Provenance

Related Topics

Related Datasets

Quality Score

Community

Dataset Info

Community

Dataset Info