News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation
10,972 datasets
Cultural reach and acceptance metrics likely related to international dissemination. The dataset is published on Kaggle. The specific scope, size, and creation details are unknown from the provided input.
Frame-level soccer video shot category data published on Kaggle. The dataset likely contains video frames annotated with categories of shots, such as goals, saves, or misses. Specific details on the number of frames, source, and collection period are unavailable from the provided metadata.
padding-attack-dataset-facebook-2023-12-05 is a dataset of network traffic related to the Bleichenbacher padding attack, created on 2023-12-05 with server facebook. The dataset was authored by Jan Peter Drees and Dennis Funke and is available on the OpenML platform. It contains TCP-level features from multiple network flows, likely captured during attack simulations.
A 9.5 KB dataset on figshare identifies protein domains disproportionately encoded in genes with intestine-biased expression in the parasitic hookworm Ancylostoma ceylanicum. Author Erich M. Schwarz published this dataset under a CC-BY-4.0 license, with a last update recorded as 2026-03-17. The data likely supports research into novel therapeutic targets for intestinal parasitic nematodes, which infect hundreds of millions of people.
Sediment samples from six ponds on Bratina Island, McMurdo Ice Shelf, Antarctica, were analyzed over a four-year study. The dataset likely contains measurements of methane production rates, turnover of 14C-labeled acetate, and degradation of added substrates like proteins and polysaccharides. Data collection involved in-situ incubations and laboratory analyses, with samples returned to New Zealand for further study.
University of Leeds researchers adapted and validated the Systemic Sclerosis Quality of Life Questionnaire (SScQoL) for use across seven European countries. The dataset contains translated and culturally validated instrument versions for the UK, France, Italy, Spain, Sweden, Germany, and Poland. Dr. Anthony Redmond and colleagues conducted this cross-cultural validation study to create a common measure of quality of life for systemic sclerosis patients.
A 7,600-square mile area of south Florida is simulated by the Natural System Model to estimate pre-drainage hydropatterns. The model was developed to support a multimillion-dollar interagency restoration effort for the Everglades ecosystem. The dataset originates from a review conducted by CEOS_EXTRA, with a last update recorded on 1996-09-30.
Seismic line and map images were acquired by the U.S. Minerals Management Service in 1976 from the Gulf of Mexico and Atlantic Outer Continental Shelves. The data was released on CD-ROMs 25 years after submission. The organization SCIOPS is associated with this dataset.
April 1969 to February 1970 data from the Nimbus-3 Medium Resolution Infrared Radiometer (MRIR) experiment, archived by the GES DISC. The dataset consists of photographic film sheet images containing entire daylight orbit segments of brightness temperatures across five specific wavelength bands. It was produced to study Earth's heat balance, water vapor distribution, and surface temperatures.
Five wavelength bands of infrared brightness temperature data were captured by the Nimbus-2 Medium Resolution Infrared Radiometer (MRIR) instrument. The dataset consists of photographic film sheets, each containing a full daylight orbit, archived as JPEG 2000 digital files. Data collection was performed by NASA's GES DISC from the satellite's launch on May 15, 1966, until recorder failure on July 29, 1966.
A dataset from the Department of Homeland Security containing records of quality assurance reviews for cases. The dataset was last updated on March 22, 2026. The available metadata describes it only as a collection of users' quality assurance reviews.
News_dataset is a text collection hosted on Kaggle. The dataset's specific content, size, and origin are not detailed in the available metadata. Users must download the dataset to verify its scope, structure, and potential applications.
Kaggle hosts a dataset related to movies. The specific content, size, and origin are not detailed in the provided metadata. Users should inspect the data after download to confirm its scope and utility.
Grid-square counts of Weddell seal pups in Antarctica's Vestfold Hills, aggregated over 24 years. The dataset contains 3,795 total pups recorded between 1973 and 1999, excluding three years with minimal data. Data originates from the Australian Antarctic Division's TAGS database.
Moored instrument data from the Synoptic Ocean Prediction Experiment captures the physics of Gulf Stream meandering and ring interactions. The dataset includes observations from the Inlet Array near Cape Hatteras and the Central Array near 68°W, collected between 1987 and 1990. The program was maintained by the URI, UNC, and UM research groups.
Seismic map and line images were released by the U.S. Minerals Management Service from surveys conducted in 1976. The data covers the Atlantic Outer Continental Shelf, specifically from seismic films RE CD01-6A, 6B, and 6C. This public release occurred 25 years after the information was originally submitted to the agency.
Self-collected data from the TMDB API includes movie details, ratings, revenue, and genres. The dataset's exact size, temporal coverage, and author are not specified in the provided metadata. Its contents are likely structured as tabular records based on the description of its features.
IMDb Favourite Movies Dataset contains metadata on user-selected favourite movies from the IMDb platform. The dataset's author, organization, and specific size are unknown. It was sourced from Kaggle, but its last update date is not provided.
movielens-20m-dataset is a collection of movie ratings and tags published on Kaggle. The dataset likely contains user-movie interactions, which are foundational for building recommender systems. Its specific scale and collection methodology are not detailed in the provided metadata.
A dataset hosted on Kaggle concerning depression detection. The title suggests it contains text posts from the Reddit platform, likely intended for training or evaluating models related to mental health. The author, organization, and specific collection details are unknown.