News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation
10,964 datasets
Social Security Administration data capturing reporting and management information for return-to-work activities. The dataset is used to make decisions on Continuing Disability Reviews (CDRs) for disability beneficiaries. It was last updated on April 3, 2026.
Supplementary material from a study investigating Haldane's rule in the developing brain of hybrid mice. The dataset, published on figshare by Lena Arévalo under a CC-BY-4.0 license, likely contains data related to X-linked and imprinted genes contributing to transgressive expression. The last update was recorded on 2026-04-21 05:08:07.
178,000 records of products repaired at community repair cafes globally. The data documents efforts to combat throwaway culture by extending product lifespans. The author and specific collection dates are unknown.
Top Rated Movies 2026 is a dataset published on Kaggle. Its title suggests it contains information about highly-rated films spanning all genres up to the year 2026. The raw description says it covers movies 'Till 2026,' which most likely means a compilation of titles released up to 2026.
Historical physical-chemical oceanographic data from discrete depth levels, collected primarily via multi-bottle Nansen casts and some electronic CTD/STD recorders. The dataset was compiled by NOAA's National Oceanographic Data Center (NODC) and contains observations from 1898 through 1996. Cruise information, position, date, time, and principal parameters like temperature and salinity are recorded for each station.
A catalog of movies and TV shows available on the Netflix streaming platform. The dataset is hosted on Kaggle, but its specific size, creation date, and authorship are unknown. Columns and detailed metadata are not provided in the available metadata.
Prices, ratings, and brands for creatine, protein, and citrulline supplements from a Russian e-commerce platform. The dataset is hosted on Kaggle, but the author, organization, and last update date are unknown. The specific number of rows, columns, and file formats are also unspecified.
A dataset of 1,503 records collected from a hospital via a Google Forms questionnaire. The full, unpublished dataset contains 15 attributes, of which 10 are included in this release: 9 analysis features and one target attribute. The target attribute 'Feeling Anxious' was chosen as a predictor for postpartum depression.
A dataset from the OpenML platform with the identifier 'colleges_usnews'. No information is available on its contents, size, or structure.
TWINS 1 neutral-atom spectrometer level-1 magnetospheric images and image movies at full resolution, with 9 energy steps and 15-minute temporal windows. Images are smoothed for uniform statistics and have a 4x4 degree angular resolution. The data, from the National Aeronautics and Space Administration (NASA), is derived from direct-event measurements of energetic neutral atoms (ENAs).
This document describes Australia's offshore areas released for greenhouse gas geological storage. The article was written as a contribution to the IEA GHG R&D Programme's quarterly newsletter for publication in June 2009. It was authored by Geoscience Australia Data and published on the data.gov.au platform.
The Observatory of Conflicts dataset systematically records events where actors publicly express collective demands against institutions, based on a UNDP-inspired definition. Data is gathered through a systematic review of a wide selection of press media to mitigate bias, structured into unique events with relevant variables. The dataset is produced by the Centre For Social Conflict Reproducible Research at Adolfo Ibáñez University.
Video frames encoded using the VVC (Versatile Video Coding) standard's All Intra mode at Quantization Parameter 37. The data was created by the VVC reference software, VTM version 24.0, for use in video compression research.
A dataset titled 'Raw_2ong_bbc_twitter' hosted on Kaggle. The dataset likely contains text content related to social media and news. Metadata is minimal; actual content requires verification after download.
BENI v1.0 is a harmonized dataset of Bangla news articles designed for measuring economic narratives. The collection spans a decade from 2014 to 2024. It is hosted on Kaggle and is tagged for text analysis, economics, and finance.
Trending Movies over the Years is a dataset hosted on Kaggle. The title suggests it contains information about movie popularity across different years. The dataset's specific content, scale, and origin are not detailed in the available metadata.
A resampled subset of the ICASSP 2022 DNS Challenge dataset, containing clean speech, environmental noise, and room impulse responses. All audio files are resampled from 48 kHz to 16 kHz and stored in lossless FLAC format, packed into tar shards. The dataset was created by user 'richiejp' and was last updated on March 22, 2026.
TWINS 1 neutral-atom spectrometer level-1 images and movies of the magnetosphere at full resolution. The data is produced by NASA and was last updated in March 2026. Each image has 4x4 degree angular resolution and represents 15 minutes of data, smoothed for uniform statistics.
100 top-rated movies from IMDb, cleaned and structured after being fetched via RapidAPI. The dataset's author, organization, and specific update date are unknown. Its exact size, row count, and file formats are also unspecified.
A dataset related to movies, published on the Kaggle platform. The specific contents, scale, and origin are not detailed in the available metadata. Users must download the dataset to verify its exact structure and suitability for their tasks.