News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation
10,956 datasets
Line representations of designated Wild and Scenic River corridor boundaries managed by the Bureau of Land Management. The dataset covers rivers within Oregon and Washington that were established under the National Wild and Scenic Rivers System, which Congress created in 1968.
Replication data for a study published in the American Political Science Review. The dataset likely contains variables related to legislative effectiveness, ambition, and electoral outcomes. Coauthors Danielle M. Thomsen, Sarah A. Treul, Craig Volden, and Alan E. Wiseman produced the data.
News RSS is a text dataset published on HuggingFace by user blaze-aura69. The dataset likely contains content aggregated from RSS news feeds. Its last recorded update is timestamped 2026-05-15 15:17:00.
Comprehensive Nigerian News is a dataset of news articles from Nigeria, published on Kaggle. The dataset's specific size, source, and time period are unknown. Its content likely covers various topics relevant to Nigerian media and society.
Netflix titles listed on the Kaggle platform. The dataset likely contains a catalog of movies and TV shows available on the streaming service. Specific details on the number of records, features, and update frequency are not provided in the available metadata.
Amazon product reviews published on Kaggle. The dataset likely contains user-generated text feedback on items sold on the Amazon marketplace. Specific details such as the number of reviews, time period, and included metadata are not provided in the available metadata.
Trending movies data published on the Kaggle platform. The dataset's specific content, such as movie titles, ratings, or release dates, is not detailed in the available metadata. Its size, structure, and collection methodology are unknown.
AG News likely contains news articles for text classification tasks. The dataset is published on Kaggle, but its specific size, creation date, and author are unknown. Columns suggest it contains textual data, potentially with category labels.
Kaggle hosts a dataset titled 'top_rated_movies'. The dataset likely contains information about films with high user or critic ratings. Its specific contents, such as titles, ratings, genres, or release years, require verification after download.
Surface underway measurements of carbon dioxide partial pressure, temperature, salinity, and wind collected during the Meteor36/2 cruise from June to July 1996. The data were collected by Arne Körtzinger of GEOMAR Helmholtz Centre for Ocean Research Kiel using a carbon dioxide (CO2) gas analyzer and other instruments. Observations cover the North Atlantic Ocean and North Sea.
NCEI Accession 0156927 includes surface underway chemical, meteorological, and physical data collected from the RYOFU MARU research vessel across the Bismarck Sea, East China Sea, Japan Sea, North Pacific Ocean, Philippine Sea, and South Pacific Ocean from 1989-11-17 to 1995-03-07. The dataset contains measurements of barometric pressure, partial pressure (or fugacity) of carbon dioxide in the atmosphere and water, salinity, and sea surface temperature. These data were collected by Shu Saito of the Japan Meteorological Agency using a carbon dioxide (CO2) gas analyzer.
NCEI Accession 0157100 contains surface underway chemical, meteorological, and physical data collected by the research vessel MARION DUFRESNE across multiple seas and oceans from January 1991 to August 1993. The dataset includes measurements of the partial pressure of carbon dioxide, salinity, sea surface temperature, and barometric pressure, collected using a carbon dioxide (CO2) gas analyzer. These data were collected by Alain Poisson of Universite Pierre et Marie Curie as part of the Minerve 07-28 cruise.
NCEI Accession 0157739 contains surface underway observations of carbon dioxide partial pressure and other variables collected in the North and South Atlantic Ocean from June to September 1997. The data were collected by researchers from Universite Pierre et Marie Curie and the University of East Anglia using carbon dioxide gas analyzers and equilibrators. These measurements are part of the CARIOCA Buoy 1997 fCO2 Data set.
TikTok-10M is a large-scale dataset containing 10 million short-form posts from TikTok, designed for video understanding, multimodal learning, and social media content analysis. The dataset was curated by larlarHF to bridge the gap between academic video datasets and actual user-generated content, providing researchers with authentic patterns of modern short-form video. It was last updated on Hugging Face on March 24, 2026.
A collection of movie titles and associated ratings, likely sourced from user reviews or critical scores. The dataset is hosted on Kaggle, a platform for data science projects. Specific details regarding the number of records, time period, and exact rating criteria are not provided in the available metadata.
TMDB_movies is a dataset hosted on Kaggle. Its title suggests it contains information about movies, likely sourced from The Movie Database. Specific details regarding its size, columns, and creation date are unavailable from the provided metadata.
A curated collection of top-rated movies sourced from The Movie Database (TMDB) API. The dataset was compiled by an unknown author and is hosted on Kaggle. The specific number of records, time coverage, and update frequency are not provided.
XBT (bathythermograph) data collected from multiple ships between January 1, 1983 and December 31, 1992. The data were transmitted to the Atlantic Oceanographic Meteorological Laboratory (AOML) in Miami, FL, and later transferred to the National Oceanographic Data Center (NODC). The original submission and processed groupings are included in the accession.
61 attributes summarize features of articles published by Mashable over a two-year period. The dataset, sourced from UCI, aims to predict the number of shares in social networks. It was created by researchers from INESC TEC and Universidade do Minho, with an acquisition date of January 8, 2015.
OnlineNewsPopularity is a tabular regression dataset from the UCI repository summarizing 61 features for articles published by Mashable over a two-year period. The goal is to predict the number of social media shares, with features including word counts, link counts, and content channel indicators. The dataset was created by researchers from INESC TEC and Universidade do Minho for a 2015 conference on artificial intelligence.