Image classification, object detection, segmentation, face recognition, OCR, image generation, video understanding
15,296 datasets
Connor Edwards provides computational data supporting a research article on machine-learned interatomic potentials. The dataset includes ab initio molecular dynamics trajectories for nine metal-organic frameworks, CP2K input files, and analysis scripts. It was last updated on 2026-04-28.
A filtered collection of printed word images paired with their corresponding transcriptions in the Malayalam language. The dataset is a Malayalam-only subset derived from the 'darknight054/indic-mozhi-ocr' collection and was uploaded by author 'trysem'. The dataset listing was last updated on 2026-06-13.
Australian Ocean Data Network hosts benthic chamber flux data from Port Phillip Bay collected between February 1995 and January 1996. The dataset focuses on spatial and temporal consistency of nutrient fluxes, chamber calibration, bio-irrigation, and direct measurements of N2 for denitrification assessment. Key results include repeated flux patterns from 1994, insensitivity to stir rate, and confirmation of denitrification via gas chromatography and mass spectrometry.
115,800 images across 1,000 classes, with a maximum of 1,280 and a minimum of 5 samples per category, form the OpenMMlo dataset. It also includes 18,000 images for out-of-distribution detection. The dataset was constructed by MiaoMiaoYang by extending open-source datasets like ImageNet-LT, iNaturalist2018, and Places-LT.
A gravity anomaly image derived from approximately 1.8 million gravity observations in the Australian National Gravity Database and a 2013 Riverina survey. The Commonwealth, State and Territory Governments, the mining industry, universities, and research organizations collected the data from the 1940s onward. Terrain corrections using offshore bathymetry and onshore topography were applied to produce the complete Bouguer anomalies.
Geoscience Australia's Marine Sediment database contains point data describing physical properties of seabed samples. The collection spans over a century, with samples acquired between 1905 and 2017 from the Australian Exclusive Economic Zone, Antarctic Territory, and surrounding waters. Attributes include survey details, location, water depth, and sediment properties like grain size percentages and texture classification.
SetCon training datasets provide the annotations used for training and evaluating the SetCon model for open-ended referring image and video segmentation. The dataset was created by author rookiexiong and was last updated on May 20, 2026. Its structure includes separate directories for image and video annotations, with files named for specific subsets like 'grefcoco' and 'muse'.
EPIC-Bench is a Mask-Grounding-based benchmark designed to evaluate Vision-Language Models' visual perception in embodied scenarios. The benchmark covers 3 high-level categories and 23 task types, following a realistic embodied workflow. It was created by author rxc205 and last updated on 2026-05-18.
Jaesung Choi published this dataset on figshare in 2026. It contains parameter values used for numerical simulations of an intracellular calcium oscillation model. The dataset includes 1000 synthetic trajectories per dynamical pattern across three distinct behavioral regions for training and testing an LKCNN classifier.
The Terrorist Designation Dataset (TDD) traces armed groups across four counterterrorism designation regimes: the U.S. Foreign Terrorist Organizations list, the EU Common Position, the UN 1267 ISIL/Al-Qaida Sanctions regime, and the EU list implementing it. It contains 267 observations across 118 unique armed groups, with 48 groups appearing in more than one regime. Data were compiled by Reem Arif from official sources and cross-checked against OpenSanctions, with a codebook version dated March 2026.
A multimodal dataset for labeling 3D keypoints on animal behavior, created by Jinyao Yan and last updated on 2026-05-04. The dataset is 732.7 MB in size and includes files in YAML, MP4, CSV, and REDPROJ formats under an MIT license.
A dataset from data.ny.gov last updated on April 20, 2026, showing the locations of toll gantries across the New York State Thruway Authority's roadways. It likely contains records for each gantry, which are equipped with cameras, E-ZPass readers, and license plate readers for cashless tolling. The dataset includes geographic coordinates, road names, mileposts, and county information.
A packaged release of precomputed 3D annotations derived from BEHAVIOR simulation episodes, used for training and evaluating the PointWorld world model. The dataset, created by NVIDIA, contains episode-level HDF5 files storing robot state, camera parameters, initial RGB-D observations, and rigid-body scene geometry. The repository was last updated on May 7, 2026.
LlamaSeg is a dataset for image segmentation via autoregressive mask generation. The data is provided as JSON annotation files packaged in compressed shards, following the naming convention of the SA-1B dataset from Segment Anything. It was authored by GML-FMGroup and last updated on 2026-05-27.
DFAT's official organizational file details Australia's international development funding. Published by the Department of Foreign Affairs and Trade, this data complies with International Aid Transparency Initiative standards to enhance accountability. The file was last updated in April 2026.
CASSINI S INMS LEVEL 1A EXTRACTED DATA V1.0 from NASA includes all mass samples for the entire Cassini mission. The data set contains mass spectra from instrument checkout, Saturn Orbit Insertion (SOI), and the full Saturn tour. It is organized as a spreadsheet with one row per sample period, containing ancillary data and counter outputs.
Paris Club agreements from 1956 to April 2026 cover 543 debt restructuring deals involving 102 debtor countries and approximately USD 863 billion in treated debt. The dataset was constructed by Jochem van der Zaag through systematic web scraping and manual validation of the Paris Club's official website, providing a significant update to a 2016 predecessor. It includes agreement-level metadata such as debtor and creditor countries, monetary amounts, treatment terms, and supporting documentation links.
A preliminary report on the manganese nodule field southwest of Western Australia quoted chemical analyses carried out on air-dried material. The average water content after drying at 105°C has been determined at 16 percent. Metal values by atomic absorption spectrophotometry have been recalculated assuming this moisture content.
A set of digital bathymetry, gravity and magnetic grids for southwest Australia (24-46S, 106-140E) produced by the Australian Geological Survey Organisation. The grids were created using all available land, marine and satellite data, processed through a network adjustment on marine ship-track data. The work was done in cooperation with Desmond Fitzgerald & Associates and the Australian Hydrographic Office.
A 107.6 KB dataset from figshare, last updated April 2026, containing binding affinity measurements for 50 type II kinase inhibitors across 348 kinases. The data combines results from two studies, the 'Davis' and 'Schrödinger' data sets, and was authored by Vardan H. Vardanyan to investigate the role of protein conformational reorganization in kinase selectivity.