Legislative text, court decisions, regulatory filings, patents, government contracts, election data
9,661 datasets
This dataset supports the replication of a figure tracking the number of articles on interest groups in American states published in political science and public policy journals from 2000 to 2025. It was created by Alex Garlick for a study on state interest group research.
Kaggle hosts a dataset of legal contracts, likely containing textual documents for analysis. The dataset's specific size, origin, and creation date are not provided in the available metadata. Its content and structure require verification after download.
A dataset published on Kaggle, titled 'ECI Legal Q&A Dataset'. The dataset likely contains legal text structured for question-answering tasks, as suggested by its title and platform tags. Its specific content, size, and origin require verification after download.
A dataset of legal text embeddings, likely designed for use with large language models. The dataset is published on Kaggle and is tagged with topics including Law, Legal Text, and Embeddings. Specific details on the number of embeddings, their source, or creation date are not provided in the available metadata.
BFH_legal_2 is a text dataset published on Kaggle. The dataset's title and platform tags suggest it contains legal and government-related documents. The specific source, size, and content details require verification after download.
A dataset titled 'BFH-legal-3' is hosted on Kaggle. The listing links the title to the Swiss Federal Supreme Court (Bundesgericht, BGer), although 'BFH' is not that court's usual abbreviation, so the connection and the presence of legal documents or case data require verification after download. No further metadata, such as size, columns, or license, is provided.
Kaggle hosts the tinyKaggleClaw_output dataset. The dataset likely contains the output or artifacts from a pre-trained machine learning model. Its specific content, scale, and creation details require verification after download.
BSG_legal is a dataset published on Kaggle. The title suggests a focus on legal or government-related information. Metadata is minimal; the specific content, size, and origin require verification after download.
Nemotron-SFT-LoRA-CoT-Selection is a dataset published on Kaggle. The title suggests it is intended for supervised fine-tuning (SFT) of language models, possibly with LoRA (Low-Rank Adaptation) and Chain-of-Thought (CoT) selection techniques. Its specific content, size, and authorship are unknown.
Kaggle hosts a dataset titled 'imdb_feature_selection'. The dataset's content likely relates to feature selection techniques applied to data from the Internet Movie Database. The author, organization, and specific details such as row count and file format are unknown.
Structured case records are paired with judicial outcomes for legal judgment prediction. The dataset's volume, creator, and temporal coverage are unspecified. It originates from the Kaggle platform.
FCA COBS Regulatory Corpus is a text dataset from Kaggle, likely containing the Conduct of Business Sourcebook rules published by the UK Financial Conduct Authority. The dataset's specific size, format, and update history are not detailed in the provided metadata. Its primary content appears to be regulatory text intended for analysis.
Indian legal text data likely intended for training or fine-tuning small language models (SLMs). The dataset is hosted on Kaggle, but its specific size, creation date, and author are unknown. Columns and sample data are unavailable, limiting pre-download assessment.
Swiss-legal-assets-v2 is a dataset published on Kaggle, suggesting a focus on legal or governmental assets within Switzerland. The dataset's specific content, such as entity types, asset classifications, or identifiers, must be verified after download due to minimal metadata. Its author, size, and update history are currently unknown.
Data related to carrier selection for VICIdial/Asterisk VoIP systems. The dataset includes information on SIP trunk providers, their rates, and call quality metrics. It originates from the VICIdial open-source telephony project.
Replication package for a field experiment on antitrust compliance, accepted for publication in the Journal of Political Economy in 2026. It supports the replication of findings from a study investigating regulatory compliance and collusion in auctions. The number of rows and columns and the file formats are not detailed in the available metadata.
Replication files for a study on identity and interim appointments to elected state supreme courts. The data supports analysis of judicial selection processes across U.S. states. Specific details on rows, columns, and file formats are not provided in the available metadata.
Portugal's 2019 parliamentary election results, likely collected in real time as votes were counted. The dataset originates from the UCI Machine Learning Repository, a known source for academic datasets. It documents the electoral outcome for the national legislature.
A tabular dataset containing business contact information for lawyers in the United States. It is tagged for use in data science projects, though specific details on record count and data fields are not provided.
Survey data from AARP on public awareness of Medicare prescription drug changes. Topics include prescription drug affordability, a new prescription drug pricing law, and Medicare Part D.