News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation
10,972 datasets
39 files contain chlorophyll and nutrient data from the Bering Sea's Inner Shelf Transfer and Recycling program. Measurements were collected from multiple ship platforms over a decade by Dr. C.P. McRoy of the University of Alaska Fairbanks. The dataset includes Conductivity, Temperature, Depth (CTD) profiles alongside biological and chemical samples.
Drake Passage surface data collected during the First Dynamic Response And Kinematic Experiment (FDRAKE) in 1976 aboard the YELCHO. The dataset includes surface temperature, salinity, and silicate measurements from two separate legs spanning February 27 to April 8, 1976. Data was submitted by the Department of Oceanography at Texas A&M University and archived by NOAA NCEI.
Tidal information for Chesapeake Bay covering 1972-01-01 to 1974-06-30. Each half-hourly record includes tidal height, latitude, longitude, date, and time, with data drawn from multiple stations. It was submitted by Saul Berkman of the NOS Tides Branch, Oceanographic Division.
Jarkko Salojarvi et al. from Helsinki University of Technology published this dataset in 2005. It contains pre-computed feature vectors derived from eye movement trajectories, designed for a classification task. The data is structured around assignments where a question is followed by ten sentences, each labeled as correct, relevant, or irrelevant.
A dataset for binary classification of fake news, likely containing bimodal data from the Weibo platform. The description indicates a focus on fake news detection, suggesting the data is structured for this machine learning task. Specific details on size, collection time, and authorship are not provided in the available metadata.
Gossipcop Dataset is a bimodal dataset for fake news and misinformation detection. The dataset is hosted on Kaggle, but its author, organization, and specific creation details are unknown. Its size, row count, and last update date are also unspecified.
Movies Dataset is a collection of film-related information published on Kaggle. The dataset's specific content, size, and creation details are unknown from the provided metadata. Users must download the data to verify its scope, features, and potential applications.
Buoy NH-10 off Newport, Oregon, collected a four-year time series of near-bottom ocean chemistry. Measurements include partial pressure of carbon dioxide, pH, dissolved oxygen, temperature, and salinity, targeting ocean acidification research. The data originates from NOAA's Ocean Acidification Program and is archived in the Ocean Carbon and Acidification Data System.
Kaggle hosts this dataset titled OmicsExpressionProteinCodingGenesTPMLogp1. The title suggests it contains omics expression data, likely transcripts per million (TPM) values for protein-coding genes. The dataset's author, organization, and specific collection details are unknown.
National Oceanic and Atmospheric Administration (NOAA) collected surface underway data from R/V Ryofu Maru III cruises in the Pacific Ocean from January to September 2019. The dataset includes measurements of partial pressure of carbon dioxide in water, barometric pressure, sea surface salinity, and sea surface temperature. Instruments used include a carbon dioxide gas analyzer and a shower head chamber equilibrator for autonomous CO2 measurement.
Rebecca A. Adelman's book analyzes the visual culture surrounding America's Global War on Terror since the 9/11 attacks. The work examines images such as security footage, film portrayals, memorials, and airport security graphics, tracing their role in shaping citizenship and state formation. It proposes a new methodology for studying visual cultures of conflict, violence, and suffering.
A dataset for fake news detection, likely containing news articles or social media posts labeled for veracity. It is published on Kaggle and includes text in both English and Bangla languages. The specific source, collection method, and size are unknown from the provided metadata.
Top Rated Movies is a dataset published on Kaggle. The title suggests it contains information about films that have received high ratings, though specific columns and sources are unknown. The dataset's size, license, and last update date are not provided in the available metadata.
100,000 Amazon product reviews from the shoe category, each with a star rating from 1 to 5. The dataset was created by juliensimon and is used to train a DistilBERT-based classification model. It was last updated on March 22, 2026.
Geological data supplements a journal article on multistage, multidirectional Tertiary shortening and compression in north-central New Mexico. The dataset is hosted in the open Geological Society of America Data Repository. It was contributed by the organization SCIOPS.
Ion concentrations from the SPRESSO ice core drilled in Antarctica during the 2002-2003 field season. Measurements were made using ion chromatography and inductively coupled plasma mass spectrometry on co-registered samples. The dataset was produced by the International Trans-Antarctic Science Expedition (ITASE) and archived by AMD_USAPDC.
Clean Coastal Waters is a scientific report analyzing the causes and impacts of nutrient pollution in coastal ecosystems. The document was published by the US National Academy of Sciences and utilizes results from the Inter-American Institute for Global Change Research (IAI) Project ISP Round 1, number 3. The online version is accessible via the National Academies Press.
Experimental data from 2004 measures the in-situ production rate of radioactive 14CO in stored air samples. The study was conducted by transporting cylinders between Christchurch and McMurdo Station in Antarctica during February. The work was undertaken by the SCIOPS organization.
12,800 human-labeled short statements were collected from politifact.com's API. Each statement was evaluated by a PolitiFact editor for its truthfulness, and the labeler provides a detailed analysis report to ground each judgment. The dataset was created by DomLoyer and last updated in April 2026.
9,940 top-rated movies are listed in this dataset. It includes ratings, popularity metrics, and plot summaries for each film. The dataset was sourced from Kaggle, but the author, organization, and last update date are unknown.