News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation
10,946 datasets
OpenForesight is a dataset of forecasting questions generated from news articles using retrieval-augmented prompts. It is designed to evaluate language models' ability to make predictions about future events using relevant context. The dataset was created by nikhilchandak and was last updated on March 31, 2026.
Air pressure measurements collected by weather sensors deployed on the NERP Weather Station site Badu. The data covers a period from 08 May 2018 to 03 Jun 2021. The dataset is hosted by the Australian Ocean Data Network and was last updated on 22 March 2026.
110 samples of gasipaes fruit mesocarp underwent proximal composition analysis, with results expressed as a percentage of dry matter. The dataset includes calculated mean, standard deviation, and range values for each nutritional component. It was authored by David Romero-Estévez and shared in 2026.
A clinical analysis presents univariate effects and adjusted means for concussion outcomes, comparing migraine and no-migraine groups while controlling for biological sex, anxiety, and depression. The dataset was authored by Katelyn Tourigny and published on figshare in March 2026. Row and column counts are not specified.
vLLM is a project focused on easy, fast, and cheap large language model serving. The dataset, created by Tsuki0512, includes documentation and announcements related to the project's development and community events. The record was last updated on April 12, 2026.
vLLM is an open-source project for fast and efficient large language model serving. The dataset page, last updated on 2026-04-12, includes documentation and announcements from the project's development. It is hosted on Hugging Face by the author Tsuki0512.
Replication Data for: Identity verification standards in welfare programs: experimental evidence from India. The dataset supports a study published in the Review of Economics and Statistics in 2025. It was last updated on April 15, 2026.
Primary survey data collected from followers of a selected TikTok influencer in Indonesia. The dataset includes variables for influencer credibility, purchase intention, and purchase decision, measured using a Likert scale. It was created by Arsya, Aura Meivia Safira and last updated on 2026-04-16.
A review document concerning marine geophysical investigations over the Lord Howe Rise and Norfolk Ridge regions. The document is a legacy product published on data.gov.au by the Australian Ocean Data Network, with a last recorded update in April 2026. No abstract or detailed content metadata is available.
FNSPID is a dataset of financial news articles curated by sifan2026, designed for time-series forecasting research. It was last updated on April 12, 2026. The dataset is intended to support models analyzing market trends and investor sentiment.
20,949 real political news articles scraped from four major Nepali outlets between 2017 and 2025. The corpus likely contains full-text articles covering political events in Nepal over an eight-year period. The author, organization, and license are unknown.
Top rated movies as of 2026 is a dataset published on Kaggle. The title suggests it contains a list of highly-rated films. The specific source, collection method, and data volume are unknown from the provided metadata.
Trending Movies Dataset (2025–2026) is a collection of movie-related data published on Kaggle. The dataset's specific content and structure are unknown from the provided metadata. Its columns and size are not detailed, requiring verification after download.
Air pressure data collected by weather sensors deployed on the NERP Weather Station site at Thursday Island. The dataset was published by the Australian Ocean Data Network and last updated on 2026-03-22. The data is available in PNG and HTML file formats.
Hardnumerics is an anonymized benchmark dataset submitted for NeurIPS Evaluations & Datasets track review. The dataset likely contains numerical data for evaluating machine learning models. The full benchmark and code package are hosted on HuggingFace under the Hardnumerics/Hardnumerics repository.
Saibai Island air pressure data collected by weather sensors from April 27, 2016, to March 5, 2021. The dataset was gathered by the NERP Weather Station and aggregated by the Australian Ocean Data Network. The last metadata update was recorded on March 22, 2026.
A scientific publication reports the bathymetric expression of the Fitzroy River palaeochannel on the continental shelf of Australia's southern Great Barrier Reef. The study provides new data for characterizing major river responses to sea-level change. It contrasts sediment transport and aggradation conditions with the nearby Burdekin River palaeochannel.
Kaggle hosts a dataset exploring ecological competition. The description suggests it examines how light limitation influences microbial communities to suppress native competitors and promote invasive species dominance. The dataset's author, organization, and specific collection details are not provided.
A two-part report commissioned by the Greater London Authority in 2017 as part of the London Plan review. It provides a photographic record and brief descriptive comments for each of the 61 Assessment Points identified by the London Views Management Framework. The report is organized into volumes covering River Prospects, London Panoramas, Linear Views, and Townscape Views.
Netflix Catalog data provides a listing of movies and television shows. The dataset is hosted on Kaggle, a platform for data science projects. Specific details about the data's size, origin, and update frequency are not provided in the available metadata.