Image classification, object detection, segmentation, face recognition, OCR, image generation, video understanding
15,393 datasets
The Australian Geological Survey Organisation (AGSO) conducted marine surveys in 1992/1993 as part of the Cape York Land Use Strategy Project (CYPLUS). The data covers a 1,000 km section of the Cape York Peninsula inner shelf between Weipa and Cape Flattery. The report summarises the results of these surveys, which aimed to assess the natural resources of the region.
Two major seabed swath-mapping and geophysical surveys, AUSTREA-1 and AUSTREA-2, were completed in early 2000 by the Australian Geological Survey Organisation. The surveys were commissioned by the National Oceans Office and Environment Australia to support the implementation of Australia's Ocean Policy. The data covers the South-east Marine Region, including Lord Howe Island, the South-east Australian Margin, Tasmania, the South Tasman Rise, and the Central Great Australian Bight.
18,231 OCR-processed text files from annual financial statements of listed companies in Vietnam. The dataset was created by TiniX AI and covers reports from 1,491 different stock codes between 2015 and 2025.
An organogram (organisation chart) for the UK Hydrographic Office, showing all staff roles as of 31 October 2020. The data includes names and salaries for Senior Civil Servants. It was released by the Government Digital Service as part of a central government transparency initiative.
OmicBench is a 44-task benchmark anchored in biology for evaluating LLM coding agents on end-to-end multi-omics analysis workflows. Each task specifies a scientific objective and a storage target, with grading based on biology-anchored numerical criteria. The dataset was created by omicverse and last updated on 2026-05-20.
Jerry Guo's project on Harvard Dataverse contains a YOLOv8 object-detection workflow for allergen-related plants. The workflow includes code for installing packages, training a model, running detection on images, and creating an interactive map when coordinates are available. The dataset was last updated on June 5, 2026.
Geoscience Australia's Petrel Sub-basin Marine Environmental Survey GA-0335 (SOL5463) collected inorganic element data from surface seabed sediments in the Timor Sea. The survey was conducted in May 2012 aboard the RV Solander as part of the National Low Emission Coal Initiative (NLECI). This data was gathered to support the investigation of CO2 storage potential in shallow seabed environments.
Geospatial boundaries delineate areas where caribou health is linked to land attributes, supporting Ontario's Woodland Caribou Recovery Strategy from 2008. The dataset is maintained by the Government of Ontario and was last updated in March 2026. It provides GIS-ready data for habitat analysis and conservation planning.
Water withdrawal reports from facilities in New York State, beginning in 2009. The New York State Department of Environmental Conservation (DEC) collects this data under permit requirements for facilities with a daily withdrawal capacity of 100,000 gallons or more. It includes facility name, town, county, and withdrawal information.
A high-resolution bathymetry surface grid created from a contracted national reference survey between Gantheaume Point and Talboys Rock, Broome, Western Australia. The survey was acquired for the Australian Hydrographic Office (AHO) on 25-26 September 2020 using a Kongsberg EM 2040D multibeam echosounder. The processed data is provided as a 0.5-meter resolution, 32-bit floating point GeoTIFF grid in MSL, LAT, and Ellipsoid vertical datums.
A sample of Signal Phasing and Timing (SPaT) messages transmitted by roadside units (RSUs) within the Tampa Connected Vehicle Pilot study area. The data follows the SAE J2735 standard for vehicle-to-infrastructure communication and is available in CSV, JSON, XML, and RDF formats. The full raw dataset can be requested from the ITS DataHub Sandbox.
Tampa CV Pilot data consists of Basic Safety Messages (BSMs) generated by participant and public transportation vehicles and transmitted to roadside units. This dataset is a flattened sample following SAE J2735 and J2945/1 standards, with an added geo column for mapping. The full raw BSM data can be requested from the ITS DataHub Sandbox.
Ukrainian state agencies publish annual reports on the financial plan implementation of enterprises under their management. These documents likely contain detailed figures on income, expenditures, and profits for state and municipal sector entities. The dataset's cross-platform presence indicates its role in public sector financial transparency.
The Aboriginal Lands Trust Estate comprises properties administered under the Aboriginal Affairs Planning Authority Act 1972. The dataset is maintained by the Department of Planning, Lands and Heritage to assist in managing these lands intended for sustainable use and eventual transfer to Aboriginal organizations.
A 69.1 MB collection of curated literature review data and analytical artifacts supports the article 'Architecture in Organization Studies: From a Field-Engaged Integrative Systematic Review to a Sociomaterial Circuit Framework'. The dataset, authored by Bruno Americo and last updated in April 2026, includes files in multiple formats such as tab-separated values, HTML, and presentation slides.
Genome assembly data for 14 Caragana species from different regions of China. The 545.0 MB dataset, authored by Jiaxing Song and last updated in April 2026, is provided in GFF3 format under a CC-BY-4.0 license.
This geospatial dataset provides boundaries and metadata for marine and terrestrial protected areas and other effective area-based conservation measures (OECMs) in Lao PDR. Managed by the UNEP World Conservation Monitoring Centre (UNEP-WCMC) in collaboration with the IUCN, it was last updated in March 2026. It serves as the primary source for tracking progress toward the Kunming-Montreal Global Biodiversity Framework's 30x30 target.
Metadata from online news coverage of the Egyptian music genre Mahraganat was collected from five major Egyptian and Arabic-language news websites between 2019 and 2024. The dataset includes article-level metadata such as titles, publication dates, authors, and URLs. It was compiled by M.A.F. Allam to examine the representation and debate of Mahraganat music in digital news media.
An example spreadsheet illustrating the analytical stages used to generate a specific organizational archetype, labeled Archetype B. The dataset is a 5.5 KB XLS file authored by Rami Subhi and last updated on May 6 (year not given). It is shared under a CC-BY 4.0 license on the figshare platform.
Four ATom (Atmospheric Tomography) campaigns collected high-frequency measurements of volatile organic compounds (VOCs) using the Trace Organic Gas Analyzer (TOGA). The dataset includes radical precursors, tracers of anthropogenic and biogenic activities, and compounds relevant to aerosol formation and atmospheric processing. Data were produced by ORNL_CLOUD, with the most recent metadata update noted as March 2026.