Image classification, object detection, segmentation, face recognition, OCR, image generation, video understanding
15,926 datasets
More than 1,100 soil profiles were assembled to analyze changes in organic matter storage after land conversion. The data, compiled by W. M. Post of Oak Ridge National Laboratory, show an average carbon loss of 23% for soils with high initial carbon at 1-meter depth, while nitrogen loss averaged 6%. Regression analysis indicates that carbon loss increases with initial soil storage and is influenced by the C:N ratio.
Per capita ethanol consumption data for persons aged 14+ in all U.S. states and Washington D.C. from 1977 to 2016. The dataset includes total consumption and separate figures for beer, wine, and spirits, originally compiled by the National Institute on Alcohol Abuse and Alcoholism. It was scraped from a PDF report and formatted for analysis by Jacob Kaplan.
Dataset OSD contains bottle and Conductivity-Temperature-Depth (CTD) data collected from multiple French and international platforms worldwide. Measurements support the World Ocean Circulation Experiment (WOCE) and Joint Global Ocean Flux Study (JGOFS) programs. Bottle parameters include dissolved oxygen, nitrate, nitrite, silicate, chlorophyll-a, dissolved organic carbon, and total phaeopigments, while CTD profiles capture temperature and salinity.
Temperature loggers deployed at Gannet Cay Reef collected sea water temperature data from 13 August 2014 to 07 November 2025. The data was aggregated by the Australian Ocean Data Network and last updated on the platform in March 2026. The specific number of loggers, sampling frequency, and data volume are not detailed in the provided metadata.
ImageNet-1K Animal Classes Mini 100 is a subset of the ImageNet-1K dataset focused on animal categories. It contains 100 images per class. The dataset was sourced from Kaggle, but the original author and license are unknown.
Animal classes extracted from the ImageNet-1K dataset. The partial subset contains 500 images per class. The dataset was sourced from the ImageNet project and is hosted on Kaggle.
Decai Gao's dataset contains global patterns and factors driving nutrient use efficiency of soil microorganisms. The data was generated for a study involving statistical analysis, data visualization, and model simulation using R code. It includes information on soil organic carbon, microbial carbon use efficiency, and enzymatic activity.
RefCOCO-Degraded is a benchmark dataset for evaluating the robustness of vision-language models. Images are modified with simulated degradation effects such as fog, smoke, and thermal noise. The dataset's author, organization, and specific scale are not provided in the available metadata.
Measurements of oceanic carbon dioxide were collected during the Belgica 9815D research cruise in the Bay of Biscay, which ran from July 10 to 14, 1998. The dataset contains 13 measurements each of partial pressure of carbon dioxide (pCO2) and total inorganic carbon (TCO2), computed from pH and alkalinity. Data were gathered by Michel Frankignoulle as part of the JGOFS and OMEX programmes.
July 1997 data from the Belgica 9714D research cruise in the Bay of Biscay contains 44 paired measurements of seawater pCO2 and total inorganic carbon concentration. The dataset was created by researcher Michel Frankignoulle as part of the JGOFS and OMEX international programs. Measurements were computed from in-situ pH and alkalinity readings.
Michel Frankignoulle collected 30 measurements each of seawater partial pressure of carbon dioxide (pCO2) and total inorganic carbon concentration (TCO2) during the Belgica 9919A research cruise. The data were gathered as part of the JGOFS and OMEX programs between August 30 and September 3, 1999. The dataset is provided by the SCIOPS organization.
June 1997 measurements from the Belgica 9714B research cruise in the Bay of Biscay. The dataset contains 12 pCO2 and TCO2 measurements and 9 ammonium concentration measurements, collected by researchers Michel Frankignoulle and Malcolm Woodward. It was produced for the OMEX and JGOFS programmes.
KnowGen Bench provides evaluation data for Gen-Searcher, a multimodal agent for image generation requiring complex real-world knowledge. The benchmark supports training and testing agents that search the web, browse evidence, and reason over multiple sources. It was created by the GenSearcher team and last updated in April 2026.
Mall_Surveillance_Object_Detection is a dataset for training YOLOv8 models. It contains annotated images for detecting four object classes: bag, busket, people, and product. The dataset was sourced from Kaggle, but its author, size, and update history are unknown.
A research paper by Mehrsa Bakhtiyari of Tehran Markaz Azad University. The paper focuses on stress management in organizational contexts. It is published under an Open Access license.
100,000 training images across 200 classes (500 per class), plus 50 validation and 50 test images per class. This version on OpenML provides links to a 20-image subset per class for framework testing. The dataset was created by Jiayu Wu, Qixiang Zhang, and Guoxi Xu and is licensed under DbCL v1.0.
An Ask Me Anything session with Mat Todd and Alice E. Williamson from the Open Source Malaria Consortium. The transcript covers open-source drug discovery, a recent paper, and malaria medicines. It is published on paperswithcode under an Open Access (diamond) license.
Information relating to the DBS Organisation, published on the eu_open_data platform. The dataset is provided by the Government Digital Service, though its specific temporal coverage and scale are not detailed in the available metadata.
Persian Number OCR Dataset is a collection of images for optical character recognition tasks. It is hosted on Kaggle, but the author, organization, and specific collection details are not provided. The dataset's size, format, and annotation specifics are unknown from the available metadata.
License plate images annotated with line-level bounding boxes for optical character recognition tasks. The dataset is hosted on Kaggle, a platform for data science competitions and projects. Specific details regarding the number of images, collection source, and creation date are not provided in the available metadata.