News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation
10,964 datasets
Temperature and salinity profiles from CTD casts in the Atlantic Ocean, collected aboard the R/V Oceanus from September 1984 to November 1985. This dataset was created by the University of Rhode Island's Graduate School of Oceanography for the Mediterranean Eddy (MEDDY) experiment. It represents a focused oceanographic campaign to study a specific mesoscale feature.
Four ships collected water depth and temperature profiles across the North and South Atlantic Ocean between August 14 and December 4, 1986. The dataset originates from the World Ocean Circulation Experiment (WOCE) and was submitted by Dr. Reiner Onken of the University of Kiel, Germany. Data is available in the NODC C125 Bathythermograph-XBT-Selected Depths file format.
Over two years of pressure, temperature, and current velocity data were collected from the R/V ENDEAVOR and OCEANUS research vessels. The dataset was submitted by Thomas Shay of the University of North Carolina at Chapel Hill as part of the SYNoptic Ocean Prediction project. Measurements were taken via speed meter casts in the Gulf of Mexico.
Atlantic Ocean water temperature and pressure profiles collected from 1988 to 1995 via the BSH Ship-of-Opportunity Programme. The dataset contributes to the World Ocean Circulation Experiment, with principal investigation led by Dr. Alexander Sy of the Bundesamt für Seeschiffahrt und Hydrographie. It represents a multi-year collection of expendable bathythermograph (XBT) data.
CTD vertical cast measurements from the R/V New Horizon cruise CaBS7, collected off the California coast in the Northeast Pacific Ocean. The dataset captures seawater pressure, temperature, and salinity from October 16 to 23, 1987. Dr. Barbara Hickey of the University of Washington led the collection for the Southern California Bight Basin Study.
A movie dataset intended for building recommendation systems and performing data analysis. It originates from the Kaggle platform, but details on its creator, size, and specific contents are unspecified. The last update date is unknown.
Pre-processed data from the Indian Premier League cricket tournament. The dataset is published on Kaggle and appears to be structured for analytical tasks. Its specific contents, such as the number of matches or seasons covered, require verification after download.
MethaneSET provides analysis-ready datasets for methane plume detection from satellite sensors. The dataset is authored by Cesar Aybar, Julio Contreras, David Montero, Miguel D. Mahecha, and Luis Gómez-Chova, with a related paper under review in Scientific Data. It was last updated on the platform on 2026-03-23.
A Word2Vec model with 300-dimensional vectors, fine-tuned from Google News vectors on agricultural research text. It was trained on titles and abstracts from the USDA's PubAg database and titles and descriptions from the Ag Data Commons. The model was created by the Department of Agriculture for a recommendation system and program analysis.
A 5.5 KB Excel file contains the inclusion and exclusion criteria used for a scoping review. The review, authored by Pyi Pyi Phyo and last updated in March 2026, focuses on the illicit tobacco market and related policy impacts. Specific details on the number of rows or columns are not provided.
Culture, real-time PCR, and indirect ELISA test results for cow milk samples, likely related to Brucella abortus. The dataset is 5.5 KB in size and was last updated by Pierre Gontao in March 2026. It is licensed under CC-BY-4.0 and available on figshare.
Posts from the ESP Facebook page, covering a period from December 2013 to July 2019. The dataset is hosted on Kaggle, but detailed metadata such as column descriptions and sample data are unavailable. The content likely includes text, timestamps, and engagement metrics from the social media page.
Multimodal Fake News is a dataset hosted on Kaggle. The dataset likely contains content across multiple data types, such as text and images, related to fake news instances. Specifics regarding its size, creation date, and authorship are not provided in the available metadata.
A 5.5 KB dataset comparing multilayer perceptron (MLP) neural network predictions of loess compressibility with measured data. The dataset includes error metrics and covers typical study areas in Huocheng and Xinyuan counties, Xinjiang. It was authored by Zhiqi Liu and published under a CC BY 4.0 license.
Loess compressibility model predictions compared with measured values for study areas in Huocheng and Xinyuan counties, Xinjiang. The dataset includes results from Random Forest, MLP, CART, and SVM models, with error metrics for validation. It is a small dataset at 5.5 KB, authored by Zhiqi Liu and last updated in March 2026.
This dataset compares predicted and measured loess compressibility data from a study in Huocheng and Xinyuan counties, Xinjiang. It contains results from multiple regression and machine learning models, including Random Forest, MLP, CART, and SVM. The file is 5.5 KB in size.
The Instagram account 'Vozinha' grew from 50,000 to 8.2 million followers following a match between Cape Verde and Spain. The dataset likely contains metrics related to this rapid follower growth. It was published on Kaggle, but the author and specific collection method are unknown.
TMDb 11K movie financial data is a collection of approximately 11,000 movie records for ROI and investment analytics. The dataset is sourced from Kaggle, but the author, organization, and last update date are unknown. It likely contains financial metrics for films, though specific columns and file formats are unspecified.
A dataset likely containing text data related to the review of damage claims, potentially for insurance or property assessment. It was published on Kaggle, but its specific origin, size, and creation date are unknown. The dataset's content and structure must be verified after download.
Top Rated Movie Dataset is a collection of movie information and ratings published on Kaggle. The dataset's specific size, columns, and creation date are unknown. Its content likely includes titles and user or critic ratings.