Image classification, object detection, segmentation, face recognition, OCR, image generation, video understanding
15,733 datasets
A three-dimensional bubbly vertex model introduces a cellular bulging instability at topological defects, which is stronger than in standard vertex models. The dataset, authored by Oliver M. Drozdowski and harvested by heiDATA, was last updated on April 21, 2026. It is used to study the interplay between cell extrusion and curvature in epithelial tissues, with analysis informed by three-dimensional imaging data of spherical mouse colon organoids.
A global monthly climatology of oceanic total dissolved inorganic carbon (DIC) was created using a neural network trained on GLODAPv2.2019 and LDEOv2016 data. The model uses predictor variables including position, temperature, salinity, and nutrients to produce a grid with 1°x1° spatial resolution and 102 depth levels. The dataset was published by the National Oceanic and Atmospheric Administration (NOAA) and last updated in March 2026.
Hongwei Li's dataset presents properties of organic-inorganic hybrid scintillating glasses designed for high-energy particle detection. The glasses are described as highly transparent, homogeneous, and size-customizable, with enhanced scintillating efficiency and rapid luminescence decay. Their low density enables selective α-particle detection and minimized γ-ray interference, and they can be doped for thermal neutron detection.
Axis-V1 is a high-quality curated dataset of 1,000 high-resolution images, each paired with a dense, professional descriptive caption. Developed by refine-axis, it is designed for training advanced AI models. The dataset was last updated on April 8, 2026.
Measurements taken for the purpose of validating remote-sensing-derived particulate organic carbon. The dataset is hosted by the NASA OB_DAAC organization and is present on multiple platforms including NASA Earthdata and Data.gov. Its most recent update was recorded as March 12, 2026.
A dataset of preferences and opinions from farmers belonging to a cooperative organization. The dataset is 104.3 KB in size and was authored by Prakash M C. It was last updated on 2026-04-18.
Grants to Voluntary and Community Sector organisation is a dataset published on the eu_open_data platform. The dataset likely contains records of financial grants awarded to nonprofit organizations. It is provided by the Government Digital Service under the Open Government Licence.
Forty-six sampling stations along the Australian Antarctic Territory coast collected marine hydroids from depths of 2 to 640 meters. The historically significant collection from Sir Douglas Mawson's BANZARE expeditions was originally deposited at the British Museum and later sent to the National Museum of Victoria for identification. This metadata record describes a previously unstudied collection from 1929-1931.
Seasonal oxidant influx modifies redox conditions, transforming high-molecular-weight humic substances into low-molecular-weight polar compounds and shifting microbial phosphorus metabolism pathways. This dataset integrates field monitoring, molecular dissolved organic matter characterization, and metagenomic analyses to elucidate the coupling between geogenic phosphorus, phosphorus-containing DOM, and microbial functional pathways in alluvial-lacustrine aquifers. It links dissolved inorganic phosphorus fluctuations to a degradation gradient of P-containing DOM and concurrent adjustments in microbial metabolism.
OCHA Afghanistan maintains this 3W (Who does What Where) dataset tracking humanitarian activities across districts and clusters. Updated through March 2026, the data identifies organizational presence to facilitate coordination and identify service gaps.
A four-year field experiment investigated the impact of climate change factors on soil organic carbon in a subtropical rice paddy. Data from this study show significant declines in subsoil carbon under elevated CO2 and warming conditions. The dataset was authored by Xueli Ding and published via figshare in April 2026.
Bridge-CoT is a dataset of 35,357 samples for robot manipulation, derived from BridgeDataV2. Each sample pairs a scene image with a task description and includes structured VLM-generated annotations for object detection, spatial relations, and subgoal decomposition. The dataset was created by CliffKai and was last updated on Hugging Face in April 2026.
A cleaned and corrected version of the Tobacco3482 document image classification dataset, addressing significant labeling errors from the original source. The dataset was uploaded by user anirudh1112 to Hugging Face and was last updated on 2026-04-23. It integrates corrections from the research community to provide a higher standard for model evaluation.
Cleaned labels derived from the 'mychen76/invoices-and-receipts_ocr_v1' dataset. The dataset was created by user sharvinmalshe and was last updated on Hugging Face on 2026-05-24. It likely contains processed text extracted from scanned receipts and invoices.
Cole Lowman's study analyzes how environmental organizations in Buffalo, NY, signal gender inclusivity through pronoun usage on their websites. The dataset, last updated in March 2026, is a 675.2 KB document containing findings from a website review using Critical Signaling Theory. It reports that only 6.9% of analyzed websites included pronouns in staff bios, with inconsistent usage.
38 civic organizations across 10 U.S. states are tracked in this individual-level dataset from the Civic Power Lab. It measures participation, leadership development, and political influence over time to study the gap between civic engagement and governing power. The data were collected under agreements with Harvard Kennedy School and last updated in March 2026.
75,285 samples of images paired with multiple-choice question-answer items, forming a training dataset for the CapRL-3B image captioning model. The dataset was created by internlm and was last updated on April 16, 2026. It is designed for a two-stage training objective where caption quality is evaluated through the answerability of visual questions.
Requests for datasets on the Edmonton open data platform have been tracked since automated intake began on January 26, 2016. The dataset is updated daily at 6:30 am by the data.edmonton.ca organization. It contains records of public requests for data, including their description, status, and assigned department.
A collection of transcripts from the MSNBC news network, spanning approximately 2003 to 2022. The dataset includes about 16,000 transcripts from 2003-2014 and a more recent scrape covering 2010-2021. It was authored by Gaurav Sood and is hosted on Harvard Dataverse.
Dissolved organic carbon concentration and composition data from 26 headwater catchments across four regions in Quebec's boreal and boreal-arctic transition zone. Adrien Simonet compiled this dataset from terrestrial and aquatic compartments during summer sampling campaigns from 2021 to 2024. The data span a significant latitudinal gradient from 48.9°N to 59.1°N, covering the La Romaine, Eastmain, Peribonka, and George River watersheds.