News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation
10,939 datasets
Archived documents detail bilateral agreements and exchanges of notes between Canada, the United States, and Mexico concerning cultural cooperation. These instruments establish frameworks for film and television co-production and regulate advertising services in periodicals. Global Affairs Canada published this collection, which was last updated in March 2026; it is flagged as out of date and retained for reference purposes only.
Daniela Ziegler compiled search strategies for multiple biomedical databases, including MEDLINE, EMBASE, and Scopus, focusing on "subarachnoid hemorrhage" and "blood biomarkers". The dataset, harvested from Borealis Dataverse, was last updated on 2026-05-02. It contains the documented search methodology for a systematic review on predicting outcomes and detecting complications.
The dataset from Princeton University researcher Michael Thaler supports a study on motivated reasoning in belief formation. It contains experimental data analyzing how subjects assess the veracity of information sources that contradict their preconceived beliefs. Evidence covers topics including immigration, income mobility, crime, racial discrimination, gender, climate change, and gun laws.
A CSV file contains statistical contrast estimates between brain regions of interest (ROIs). The data reports significant two-sided t-test results with Bonferroni correction and Kenward-Roger degrees of freedom. Contrasts are expressed as differences of log-scaled values.
Neuroscience data contains Bonferroni-corrected two-sided t contrasts between ROIs, using Kenward–Roger’s method for degrees of freedom. Contrast estimates are expressed as differences of square-root scaled values, with only significant contrasts reported. The dataset is a small 8.2 KB CSV file authored by Valeria Centanino.
Statistical results from two-sided t contrasts between streams, using Bonferroni correction and Kenward–Roger’s method for degrees of freedom. Contrast estimates are expressed as differences of square-root scaled values. The dataset is provided by author Valeria Centanino in a 3.4 KB CSV file.
Valeria Centanino provides a statistical analysis of brain region duration preferences, published in March 2026. The dataset contains Bonferroni-corrected two-sided t contrasts, with estimates expressed as differences of square-root scaled values. Only significant contrasts are reported in this 9.4 KB CSV file.
A structured CSV dataset of news articles intended for machine learning, NLP, and data science applications. The dataset's author, organization, and specific collection details are unknown. Its last update date and size are also unspecified.
One of the largest conversation-level preference datasets in French, collected via the Compar:IA chatbot arena. The dataset was developed by the French Ministry of Culture to improve French conversational AI and study model pluralism. It was last updated in April 2026.
Archived documents compile formal bilateral agreements between Canada and Finland. The collection includes treaties on atomic energy cooperation with annexes, exchanges of letters on nuclear material, air services agreements with amendments and annexes, a film and television coproduction agreement, and a social security protocol amendment. This archived compilation was published by Global Affairs Canada and last updated in February 2026.
CEQR Open Data contains information on projects undergoing or having completed review through the City Environmental Quality Review (CEQR) process. The dataset, provided by the City of New York, includes project names, descriptions, lead agencies, milestones, and geographical locations for filings from January 1, 2005 to the present. Associated documents are linked to the CEQR Access Database.
CEQR Open Data contains information on projects undergoing or having completed the City Environmental Quality Review process, filed from January 1, 2005 to the present. The dataset includes project names, descriptions, lead agencies, milestones, and geographical locations. It is published by the City of New York and was last updated on March 22, 2026.
New York City's CEQR Open Data contains information on projects undergoing or having completed the City Environmental Quality Review process. The data includes project names, descriptions, lead agencies, milestones, and geographical locations for projects filed from January 1, 2005 to the present. It is published by the City of New York on the datagov platform.
1,432 survey records from a study on life conditions in rural zones, focusing on depression. The dataset includes 23 columns covering demographic, family, asset, income, and expense information, with a binary target variable for depression status. It was originally published by Frankcc on Kaggle and curated by Diego Babativa.
16,531 records from surveys commissioned by Scottish Natural Heritage detail marine habitats and species around Scotland's coast. The data were collected primarily between 1996 and 1998 to establish a baseline for the Marine Nature Conservation Review program. The dataset includes information from 780 sites and 1,590 samples.
13,769 records from 226 sites document benthic marine habitats and species around the coast of England. English Nature commissioned and collected this data using semi-quantitative SACFOR abundance scales for epibiota. The dataset was contributed to the Marine Nature Conservation Review program, with a last recorded update in 1999.
EgoAVU is a dataset for egocentric audio-visual understanding, introduced by Facebook and highlighted at CVPR 2026. The data engine enriches existing egocentric narrations by integrating human actions with environmental context, linking visible objects and sounds. The dataset was last updated on the platform on April 9, 2026.
Meta_Album_SPT_Extended is a preprocessed version of the 100-Sports image classification dataset, containing 10,416 images across 73 different sports. The dataset was originally compiled by Gerald Piosenka from internet searches and preprocessed for the Meta-Album benchmark by Jilin He in March 2022. Images are resized to 128x128 pixels and are provided under a CC0 1.0 Public Domain license.
A dataset for training summarization models on Russian-language film reviews. It contains full-text reviews from the Kinopoisk website paired with generated short summaries. The dataset was created by Auttar and was last updated on March 29, 2026.