Image classification, object detection, segmentation, face recognition, OCR, image generation, video understanding
15,846 datasets
Supplementary files for a systematic review published in the Journal of the Endocrine Society in 2026. The materials include the full search strategy and detailed risk-of-bias assessments for studies on growth hormone deficiency retesting during puberty. Author Joeri Vliegenthart contributed these files to DataverseNL.
OmniPrint-MD-5-bis is a dataset of 28,240 synthetic RGB images (128x128 pixels) across 706 character categories, generated using the OmniPrint synthesizer. It was created by Haozhe Sun in June 2021 as part of the Meta-Album collection for few-shot learning. The images are synthesized with specific nuisance parameters like perspective transformation, rotation, and random backgrounds from Pexels.
800 RGB images of human cells, each labeled with one of 20 protein organelle localizations. This dataset is a curated subset of the Human Protein Atlas, originally from a Kaggle competition, where multi-label images were filtered and resized to 128x128 pixels. The data was released under CC BY-SA 3.0 by the Meta-Album project in June 2022.
800 images of humans performing 40 distinct actions, preprocessed to a uniform 128x128 pixel resolution. The dataset is a curated subset of the original Stanford 40 Actions dataset, released under the Meta-Album project for few-shot learning. It was created by Jilin He for Meta-Album in March 2022.
OmniPrint-MD-mix is a synthetic dataset of 800 character images across 20 categories, created by Haozhe Sun in 2021. Images are 128x128 RGB and were generated using the OmniPrint synthesizer with specific nuisance parameters like perspective transformation and rotation. The dataset is released under a CC BY 4.0 license as part of the Meta-Album collection for few-shot learning.
Tshopo Province in the Democratic Republic of the Congo is the geographic focus of this dataset. It supports a costing analysis for human-centered design interventions aimed at reducing the number of zero-dose children. The dataset was authored by Frederic Debellut and last updated on May 12, 2026.
A dataset for imitation and robot learning, formatted for the LeRobot v2.1 framework. It contains episodes of data from the LivUMI dual-arm robot, including multiple camera feeds, depth information, proprioceptive state, and time indexing. The dataset was created by livsynrobotics and last updated on 2026-04-08.
Coordinates for analyzed basalt samples from the Dalhousie Group, part of a study on the Early Devonian tectonic evolution of the Ganderia domain in the Chaleur Bay Synclinorium. The dataset was authored by Jaroslav Dostal and last updated on April 8, 2026. It is a small file of 18.3 KB.
Table S2 contains Sm-Nd isotopic data for volcanic rocks of the Dickie Cove and Tobique groups. The dataset was authored by Jaroslav Dostal and published on figshare under a CC-BY-4.0 license, last updated on April 8, 2026. The 20.6 KB XLSX file likely supports research on the Early Devonian tectonic evolution of the Ganderia domain in northern New Brunswick.
Table S3A provides location coordinates for analyzed rhyolite samples from the Dalhousie Group. The dataset is a small supplementary file of 18.0 KB in XLSX format, created by researcher Jaroslav Dostal. It was last updated on April 8, 2026.
Table S4 provides zircon saturation thermometry temperature estimates in degrees Celsius for rhyolitic rocks from the Dalhousie Group. The dataset, authored by Jaroslav Dostal and shared via figshare, is a small 18.3 KB XLSX file. It was last updated in April 2026.
Geochemical data provides major and trace element analyses for basaltic rocks from the Dalhousie Group. The dataset, published by author Jaroslav Dostal on figshare, is a single Excel table (XLSX) of 29.6 KB. It was last updated in April 2026.
Annotated Textile Fabric Image Dataset for Visual, Composition, and Material is a collection of images for computer vision tasks. The dataset is hosted on Kaggle, but details on its size, creator, and license are not provided. Its specific update date and collection methodology are also unknown.
Brazilian presidential interviews from the Gallery of Former Presidents and major newspapers, compiled by Jaqueline Damasceno. The dataset includes original Portuguese interviews and their English translations, uploaded to ProfilerPlus.org for Leadership Trait Analysis. The record was last updated on April 22, 2026.
Replication files for the paper 'Nonparametric Derivative Estimation via Local Linear Forests' include all code, simulation scripts, and an empirical application. The repository contains a master script (master.R) to reproduce all results, tables, and figures. The empirical dataset (disc.dta) originates from Lang and Manove (2011).
39,000 cross-national regressions form the basis of a study examining the relationship between democratic governance and water access. The research investigates access to basic water versus safe water (free from fecal and chemical contamination) on premises. The dataset was created by Evan Lieberman and last updated in April 2026.
Spanish finetuning data for embedding models, adapted from the upstream dataset card of KaLM-Embedding/KaLM-embedding-finetuning-data. The data maintains a training-oriented triplet or list structure and is organized as multiple parquet-backed subsets that can be loaded independently or combined. It was created by KaLM-Embedding and was last updated on the platform on 2026-04-22.
SkillFlow Test Tasks is a repository of 166 runnable tasks used in the SkillFlow benchmark for evaluating autonomous agents. The tasks are organized into 20 workflow families spanning five domains, including finance, healthcare, and governance. The dataset was created by author zhang-ziao and was last updated on 2026-04-21.
14,738 test cases across 804 Korean PDFs in 7 industrial document categories, designed to fill the gap in standardized Korean OCR evaluation. The benchmark was developed by ONTHEIT and last updated on the platform in April 2026. It addresses the lack of Korean-language focus in existing OCR benchmarks by using real-world documents.
Supplementary Material 2 is a 344.0 KB XLSX file published on figshare by Pei-Ran Li in April 2026. The data likely contains tabular results supporting research on Midkine as a prognostic and therapeutic target in meningioma. The columns suggest findings from single-cell analysis and organoid-based drug validation experiments.