Image classification, object detection, segmentation, face recognition, OCR, image generation, video understanding
16,012 datasets
The GEMS-GLORI world river discharge database, circulated by UNEP for review in 1996, lists 555 major rivers discharging to oceans. It contains up to 48 attributes per river, including major ions and nutrients in various forms, with about 10,000 individual data points compiled from 500 references by author Michel Meybeck.
Alisha C. Holland's research paper, published by Harvard University Press, analyzes the role of organizational and hybrid brokers in vote-buying networks. The study uses case studies of street-vending organizations in Colombia and peasant organizations in Mexico to illustrate these concepts. The dataset likely contains information on broker types, organizational membership, and political systems.
A dataset for training SVG-LLMs using multi-task, multi-reward reinforcement learning, created by InternSVG. It includes Chain-of-Thought training data for image-to-SVG and text-to-SVG tasks. The dataset page was last updated on 2026-03-18.
CO2SYS is a program for calculating parameters of the carbon dioxide system in aquatic environments. It uses two of four measurable parameters—total alkalinity, total inorganic CO2, pH, and fugacity or partial pressure of CO2—to compute the remaining two under user-defined temperature and pressure conditions. The software was developed by Ernie R. Lewis of Brookhaven National Laboratory, replacing and extending earlier programs released in May 1995.
A global three-dimensional grid of total inorganic carbon (TCO2) and total alkalinity (TALK) for ocean waters below the deepest winter mixed layer depths. The dataset was created by NOAA using interpolation methods applied to high-quality data from the WOCE, JGOFS, and OACES programs. It provides monthly climatological estimates on a 1° × 1° × 32-layer grid for use in ocean-atmosphere carbon dioxide system models.
Model-YOLO-Segmen is a dataset for computer vision tasks, specifically related to the YOLO (You Only Look Once) model architecture. It was published on Kaggle, a platform for data science and machine learning projects. The dataset's specific content, size, and creation details are not provided in the available metadata.
A dataset titled 'derm1m-cnn-134' hosted on Kaggle. The title suggests it contains dermatology-related images, likely intended for training or evaluating convolutional neural network models. No further metadata on size, source, or creation date is available.
A seven-month time series of dissolved inorganic carbon, pH, temperature, and salinity collected from a moored buoy near the coast of Honolulu, Hawaii. The dataset was created by the National Oceanic and Atmospheric Administration to evaluate a prototype autonomous DIC sensor, with performance validated against 51 discrete bottle samples and a collocated CO2 buoy.
A criterion-related validation study by Pamela Y. Skyrme investigates the Big Five personality factors for predicting employee performance. The research uses two independent samples to correlate personality traits with objective productivity and subjective training performance for outbound call center representatives. It addresses a gap in published literature regarding personality test validity for this specific job role.
A book by Graham Little analyzing the leadership of Margaret Thatcher, Ronald Reagan, and Malcolm Fraser. The text draws portraits of these leaders, examining their lives, ideologies, work-styles, and personal relationships. It also investigates the influence of their childhood and adolescence on their leadership qualities.
This dataset of paid lottery prizes details the awards claimed by players, including prize descriptions, values, and applied taxes. It is published by the Colombian government via www.datos.gov.co and was last updated in February 2026.
Depositphotos provides a sample of high-resolution images focused on biometric and human features. The dataset is designed for computer vision tasks and was last updated on March 23, 2026. The full description is available on the dataset page.
Depositphotos provides a curated sample of high-resolution infographics, charts, diagrams, and data visualizations. This subset is designed for training and evaluating computer vision models for document parsing, chart understanding, and OCR applications. The dataset was last updated on 2026-03-23.
v1 dganet is a dataset published on Kaggle. Its title suggests a focus on computer vision, likely involving generative adversarial networks. Metadata is minimal; actual content requires verification after download.
The dataset contains summarized information on attributed beneficiary populations, affiliated providers, and financial performance for each ESRD Seamless Care Organization (ESCO). It includes measures such as beneficiary counts, provider participation, expenditure benchmarks, and shared savings or losses. The data is provided by the U.S. Department of Health & Human Services and was last updated in March 2026.
A 5.5 KB Excel file compares bounding box regression loss functions within the DPCNet architecture. The dataset, authored by Linfeng Jia and updated in March 2026, supports analysis of model performance metrics. Specific row and column counts are unknown.
A 5.5 KB Excel file compares a convolution-based model (DGCNN) and a transformer-based model (Point Transformer). The dataset, authored by Shuhao Fu and updated in March 2026, contains results from ablation simulations and experiments on 3D shape recognition. Tags indicate analysis of human performance, local geometric structures, and sparse visual information.
A baseline dataset for a computer vision model named ResNet152v2, published on Kaggle. The dataset likely contains images or image-related features used for training or evaluating the ResNet152v2 architecture. Specifics on data volume, source, and creation date are unavailable from the provided metadata.
CohereLabs created a multilingual evaluation set by translating 500 challenging user queries from the original English-only WildVision test set. The dataset covers 23 languages and is intended for automatic LLM judging of vision-language models. It was last updated on March 6, 2026.
The National Database for Autism Research (NDAR) is an informatics platform for autism spectrum disorder data across biological and behavioral levels. It was developed by the U.S. Department of Health and Human Services to facilitate data sharing and collaboration across laboratories. The platform supports various data types, including text, numeric, image, and time series.