Curated collections of datasets organized around trending themes and use cases — from autonomous driving to LLM fine-tuning.
84 collections available
This collection comprises gridded sea surface spiciness data derived from the NASA-CONAE Aquarius satellite mission, covering daily, 7-day, monthly, seasonal, and annual timescales at a 1-degree spatial resolution. Spiciness is a derived variable combining temperature and salinity that distinguishes warm, salty water from cold, fresh water of the same density, which makes it useful for analyzing water mass formation and thermohaline circulation. The datasets provide a consistent, global-scale time series from a dedicated satellite instrument, supporting the study of seasonal patterns, interannual variability, and long-term climatological trends. Together, they form a foundational resource for validating and calibrating ocean circulation and climate models.
This collection covers geological reports, geochemical analyses, and reservoir data from numerous sedimentary basins across Australia, including the Canning, Otway, Browse, Bonaparte, and Amadeus basins. It provides detailed information on source rock potential, hydrocarbon accumulations, stratigraphy, and tectonic history. These datasets are integral for conducting petroleum systems analysis, which involves modeling hydrocarbon generation, migration, and accumulation. The focus on Australian basins offers a regionally coherent set of data for exploration risk assessment and prospect evaluation.
This collection covers high-resolution multibeam sonar bathymetry grids, backscatter data, and seafloor imagery from marine surveys around Australia's coastline and marine parks. It includes visual flythroughs of underwater canyons and reefs, as well as derived habitat classification maps. The data supports detailed analysis of seafloor geomorphology, sediment composition, and benthic ecosystem distribution. It is particularly valuable for studies focusing on Australia's unique marine environments, from the tropical Coral Sea to the temperate Tasmanian shelf.
This collection contains high-resolution gravity anomaly grids, including Bouguer, free-air, isostatic residual, and various vertical derivative products, covering Australia and its continental margins. It provides the foundational data for modeling subsurface density variations and mapping geological structures. The datasets are derived from a dense network of ground, airborne, and marine observations, offering consistent national coverage suitable for regional analysis and detailed local studies in mineral and petroleum exploration.
This collection comprises authoritative geospatial datasets detailing Australia's maritime jurisdiction, including territorial seas, exclusive economic zones, and continental shelf limits. It features treaty-defined boundaries with neighboring nations and standardized regional maps for specific areas like the Timor Sea and Torres Strait. The data supports the precise delineation of legal maritime zones and the analysis of spatial extents for compliance and policy research.
This collection provides integrated geospatial layers for the City of Hobart, covering urban infrastructure such as stormwater pipes and nodes, road centerlines, and building footprints, alongside cadastral parcel boundaries and environmental features like tree canopy and river boundaries. The data enables municipal engineers to model drainage networks, assess flood risks, and plan infrastructure maintenance by combining these core vector and polygon datasets. Urban planners can use the complementary layers to analyze land-use patterns, overlay property boundaries with environmental constraints, and support 3D city modeling. The datasets are sourced from the local government and are maintained to support comprehensive, location-specific urban analysis and decision-making.
This collection covers the official geospatial data underpinning the Hobart Interim Planning Scheme 2015 and related planning instruments, including zoning boundaries, heritage area overlays, biodiversity protection zones, and specific overlays for areas like Sullivans Cove and Fern Tree. It provides the regulatory spatial framework for land use analysis across the city. These datasets are complementary and designed to be used together, enabling users to layer multiple planning constraints—from height setbacks to acid sulphate soils—onto a single parcel of land for a holistic development assessment. As the authoritative source from the City of Hobart, this integrated collection supports the complete workflow for municipal planning, development feasibility studies, and regulatory compliance within this specific jurisdiction.
This collection provides standardized eddy covariance flux tower measurements from Australian dryland agricultural and rangeland sites, including Mitchell Grass plains and grazed pastures. The data covers key ecosystem fluxes such as Net Ecosystem Exchange (NEE), Gross Primary Productivity (GPP), Ecosystem Respiration (ER), and surface energy balance components. All datasets are processed into final, gap-filled products using the established PyFluxPro software pipeline, ensuring methodological consistency. This enables comparative analysis and modeling of carbon, water, and energy dynamics across a network of arid and semi-arid Australian ecosystems.
This collection provides standardized eddy covariance flux tower measurements of energy and mass exchange, including Net Ecosystem Exchange (NEE), Gross Primary Productivity (GPP), and Ecosystem Respiration (ER), from a variety of Australian ecosystems such as semi-arid woodlands, almond orchards, wetlands, and dry sclerophyll forests. All data is processed into final, gap-filled products using the established PyFluxPro methodology, ensuring methodological consistency crucial for comparative analysis. This enables researchers to study carbon cycling dynamics across different land covers and management practices. The curated set supports the calibration and validation of terrestrial carbon flux models for Australian conditions.
This collection consists of underway oceanographic sensor data, such as sea surface temperature and salinity, collected during specific voyages of the Australian research vessel RV Solander under the Integrated Marine Observing System (IMOS). The data provides time-series measurements aligned with precise vessel tracks, offering a consistent source for studying marine conditions across different years and seasons. By compiling data from numerous voyages, it supports longitudinal analysis of environmental patterns and changes in Australian coastal and open waters.
This collection covers structured indices detailing classified and reserved information from various Colombian government agencies, ministries, and municipalities. It provides metadata on classification categories, legal justifications, responsible officials, and secrecy durations. The data supports the analysis of transparency law compliance and patterns in state information restriction. By aggregating indices from multiple jurisdictions, it enables comparative research into how different government bodies apply secrecy rules.
This collection comprises metadata registries detailing public information assets held by various Colombian national agencies, departments, and municipalities. The data typically includes fields such as asset category, format, responsible department, publication status, and update frequency. It supports the analysis of data governance practices, transparency compliance, and the distribution of information types across the public sector. Researchers or auditors can use this standardized metadata to benchmark accessibility and identify gaps in public data availability.
This collection provides global coverage of the Fraction of Absorbed Photosynthetically Active Radiation (FAPAR) anomaly index, derived from VIIRS satellite composites at a 10-day temporal resolution. The data is formatted as geospatial rasters on a standardized 0-1 scale, enabling direct comparison of vegetation productivity deviations from historical norms across different countries and ecosystems. It supports consistent time-series analysis for detecting agricultural drought, monitoring crop health, and assessing environmental stress on a worldwide scale.
This collection covers high-resolution hydrodynamic and biogeochemical model results for the Great Barrier Reef, including simulations of river tracer dispersion, sediment transport, and nutrient and pesticide concentrations. It provides detailed spatial and temporal data from various model configurations and scenarios, such as reduced catchment loads and pre-industrial baselines. These datasets are designed to support the specific research and monitoring tasks outlined in initiatives like the Reef 2050 Water Quality Improvement Plan, enabling the analysis of pollutant pathways and the evaluation of management interventions.
This collection comprises underway oceanographic sensor data from the RV Cape Ferguson, covering multiple voyages in Australian tropical waters over several years. It provides consistent time-series measurements of environmental variables such as temperature, salinity, and fluorescence, collected under the standardized IMOS Ships of Opportunity (SOOP) facility. The data from a single, well-instrumented vessel enables direct comparison across time and space, supporting analyses of seasonal and interannual variability. This longitudinal dataset is particularly valuable for researchers focused on this specific marine region, as it offers a coherent and comparable record for validating regional ocean models and studying long-term environmental trends.
This collection provides high-resolution gridded population estimates, disaggregated by age group and gender, for numerous countries and territories worldwide. The data covers a multi-year time series from 2015 to 2030, enabling longitudinal analysis of demographic shifts. It supports spatial modeling tasks such as analyzing population density, projecting service needs for specific age cohorts, and assessing population exposure to environmental risks. The consistent format across diverse geographic regions allows for comparative studies and cross-national demographic analysis.
This collection covers detailed business registry data across Colombia, including company names, tax IDs, economic activity codes (CIIU), geographic locations, and contact information. It spans multiple industries and municipalities, providing a granular view of the commercial landscape. The data supports tasks such as competitive intelligence, market sizing, and identifying regional business concentrations. Its value lies in enabling cross-regional and cross-sectoral analysis for comprehensive market understanding.
This collection comprises detailed site survey reports and maps for the Agricultural Land Classification (ALC) system in England, produced between 1989 and 1999. It provides granular soil data, including the subdivision of Grade 3 land into subgrades 3a and 3b, based on the official grading methodology. The surveys are essential for analyzing soil characteristics, limitations, and land quality at specific locations. Together, they support a consistent workflow for historical land capability assessment relevant to local development planning and agricultural land use decisions.
This collection contains detailed agricultural land classification survey data for specific sites across England, primarily from 1989 to 1999. It provides soil pit descriptions, maps, and reports that apply the official 'Agricultural Land Classification of England and Wales' grading methodology, including subdivisions within land grades. These standardized historical surveys support comparative analysis of land quality and soil characteristics for individual locations. The data is valuable for site-specific land use planning and environmental studies requiring consistent, authoritative historical baselines.
This collection contains detailed soil survey maps and reports for specific locations across England and Wales, produced between 1989 and 1999. It provides agricultural land capability grades, including subdivisions like Grade 3a and 3b, based on the official government methodology. The data enables consistent historical analysis and comparison of soil characteristics and land quality for individual sites. Together, these surveys support detailed land use planning and environmental assessment for agricultural and development purposes.
This collection covers Queensland state government contract disclosures, procurement records, and related project status data across various departments including health, justice, housing, and natural resources. It provides detailed expenditure records, contract award details above specific monetary thresholds, and progress updates on digital and service initiatives. The datasets are sourced directly from authoritative Queensland government agencies and are published under permissive licenses, supporting a focused analysis of public spending and operational transparency within this specific Australian state.
This collection covers detailed records of violent incidents, victimization, and criminal events across Colombia, including homicides, sexual offenses, domestic violence, terrorist attacks, and conflict-related massacres. The data is granular, often including geographic coordinates, municipal breakdowns, victim and perpetrator demographics, and temporal information spanning multiple years. This combination supports comprehensive research into the spatial and temporal dynamics of violence, the evaluation of victim support services, and the analysis of risk factors across different regions and demographic groups within the country.
This collection provides detailed longitudinal data on the City of Melbourne's economic and physical development, covering land use classifications, employment by block, commercial and residential floor space, dwelling prices, parking infrastructure, and municipal financial statements. The data spans over two decades, allowing for analysis of long-term trends in urban density, economic activity, and property markets. It supports integrated modeling of the city's development by linking spatial, economic, and financial factors within a consistent geographic and temporal framework.
This collection consists of underway data from NOAA research cruises, primarily in the North Atlantic Ocean, Gulf of Mexico, and Caribbean Sea. It integrates concurrent meteorological, navigational, and physical oceanographic measurements into continuous time series. These datasets are essential for tasks requiring co-located atmospheric and oceanic observations, such as calculating air-sea fluxes. The multi-year coverage from a single vessel platform provides a consistent data source for analyzing regional and temporal variability in marine environmental conditions.
This collection covers datasets related to citizen requests, complaints, petitions, and administrative procedures across various Colombian government institutions, including courts, ministries, and local municipalities. It includes detailed records on request handling timelines, procedural steps, case outcomes, and geographic distribution. The data enables longitudinal and comparative studies of public service delivery, legal deadline compliance, and administrative efficiency within the Colombian state apparatus. By consolidating records from multiple branches and levels of government, it supports a comprehensive view of citizen-state interaction patterns.
This collection comprises the standardized 1:250,000 scale topographic map series covering the entire Australian continent. It provides consistent data on hydrography, 50-meter contours, road and rail infrastructure, and vegetation cover across 516 map sheets. Together, these sheets enable comprehensive, continent-wide geospatial analysis and modeling. The series supports critical workflows in transportation network planning, watershed and terrain modeling, and regional land-use assessment.
This collection provides granular data on energy and water consumption, building efficiency metrics, and renewable energy projects across New York State and New York City. It covers electricity and natural gas usage at county, ZIP code, and community levels, building energy benchmarking and audit results for municipal and private structures, and the status of distributed and large-scale renewable energy installations. Together, these datasets support a comprehensive analysis of energy demand patterns, the effectiveness of efficiency policies like Local Law 87, and the progress toward state renewable energy goals, offering a spatially and temporally detailed view for regional planning.
This collection provides standardized 1:250,000 scale topographic map sheets covering the entire Australian continent. The data includes vector features for roads, railways, and hydrography, along with 50-meter contour intervals for terrain elevation. Together, these maps support consistent, large-scale geospatial analysis for regional planning, environmental modeling, and infrastructure assessment across diverse Australian landscapes.
This collection provides a multi-year time series of water quality parameters, including nutrient levels, phytoplankton triggers, and seagrass coverage, specifically for the Swan Canning Estuary in Western Australia. It supports longitudinal analysis of ecosystem health, compliance with water quality targets, and the modeling of system responses to environmental changes and management actions. The data originates from official government monitoring programs, offering a consistent and authoritative record for researchers and resource managers focused on this critical estuarine system.
This collection covers detailed geological theses, reports, and data on mineral deposits, bedrock geochemistry, sedimentology, and geochronology from across Yukon Territory, Canada. It focuses on specific deposits and formations containing gold, copper, lead, zinc, and silver, providing lithological descriptions, geochemical signatures, and structural interpretations. The datasets are sourced primarily from authoritative regional bodies like the Yukon Geological Survey, offering a coherent, location-specific foundation for modeling the genesis and distribution of mineral resources in this prolific geological province.
This collection provides country-level datasets from the UN OCHA Financial Tracking Service (FTS), detailing humanitarian funding requirements and the corresponding financial contributions from donors. It covers data on funding flows, recipient organizations, and specific crisis tags, such as for COVID-19 response. These datasets are structured to support the direct comparison of requirements against received funds, enabling systematic analysis of resource gaps in humanitarian responses across different nations.
This collection covers geospatial datasets related to urban planning across multiple Quebec municipalities, including zoning regulations, heritage site boundaries, electoral districts, infrastructure locations, and land use plans. It provides the foundational spatial and regulatory data required for analyzing development patterns, assessing service accessibility, and modeling urban systems. The integration of data from different municipalities within the province supports comparative urban studies and regional planning workflows.
This collection provides authoritative geospatial data for the Western Australian road network, covering detailed road centerlines, legal speed limits, regulatory and non-regulatory signage locations, bridge and crossing inventories, and roadside amenities like rest areas and stopping places. The data originates from official state sources such as Main Roads Western Australia and the Integrated Road Information System (IRIS), ensuring reliability for critical infrastructure work. Together, these datasets support a complete workflow for road network analysis, asset management, safety auditing, and transportation planning within the state.
This collection provides annual maps classifying transitions between forest, sparse woody, and non-woody vegetation cover across Western Australia, derived from Landsat satellite imagery at 30-meter resolution. It enables the tracking of deforestation, reforestation, and land degradation over multiple decades. The consistent methodology and temporal coverage support longitudinal studies of environmental change and land management policy impact.
This collection provides annual classifications of woody vegetation cover across Western Australia from 1988 to the present, focusing on forest and sparse woody vegetation categories defined by cover density and height thresholds. It is derived from Landsat imagery and offers state-wide spatial coverage. The consistent time series supports longitudinal analysis of land cover change, enabling research into trends such as deforestation, regrowth, and habitat fragmentation. This data is particularly valuable for regional environmental monitoring and land management planning within the state.
This collection covers standardized geological line feature data, such as eskers, moraines, and unit contacts, from specific National Topographic System (NTS) map sheets across Alberta. The data is provided in common GIS formats like shapefiles and E00 exports, sourced from authoritative Alberta Geological Survey maps. It supports the compilation of regional geological maps, comparative landform analysis, and integration with other geospatial layers for environmental and land-use planning.
This collection comprises standardized geological maps detailing the surficial materials and landforms across various regions of Alberta, Canada. The data, primarily in GIS-ready polygon formats, includes attributes for material type, genesis, and texture, enabling the classification of terrain units. It supports consistent regional-scale analysis and mapping workflows, as the datasets share common scales, formats, and attribute schemas derived from authoritative provincial surveys. This allows for the comparative study of surficial geology and sediment distribution across different areas of the province.
This collection provides detailed flood hazard data for regions in Queensland, Australia, including Moreton Bay and Brisbane. It covers flood depth models, flood height contours, and digital elevation models for various recurrence intervals, alongside data reliability notes and real-time rainfall telemetry. Together, these datasets support a comprehensive flood risk assessment workflow, from initial modeling and scenario comparison to confidence qualification, specifically tailored for infrastructure design and land-use planning in these localities.
This collection covers geocoded parking violation data for the District of Columbia, aggregated to street segment centroids and summarized by time of day and week of year. It provides the precise spatial coordinates and temporal dimensions necessary for detailed urban analysis. Together, these datasets support a longitudinal study of parking enforcement patterns, enabling the identification of persistent hotspots and the evaluation of policy changes over time.
This collection provides authoritative vector range maps for numerous reptile species, including snakes, native to California, sourced from the California Wildlife Habitat Relationships (CWHR) system. The data enables detailed spatial analysis of species distributions for habitat suitability modeling and conservation planning. By offering a consistent, statewide dataset for multiple species, it supports comparative studies and landscape-scale impact assessments. These range maps are designed for integration with other GIS layers, such as land use or protected area boundaries, to inform environmental reviews and wildlife management decisions.
This collection provides authoritative vector range maps for numerous reptile species native to California, sourced from the California Wildlife Habitat Relationships (CWHR) system. The data enables detailed habitat suitability modeling and species distribution analysis for conservation science. By offering a consistent, expert-reviewed geographic framework, these datasets support the assessment of land use impacts on reptile populations across the state. They are designed for integration with other environmental GIS layers within the CWHR predictive modeling workflow.
This collection provides detailed geospatial data for the Western Canada Sedimentary Basin, including isopach maps, lithofacies distributions, and structural features across key geological formations from the Cambrian to the Tertiary. It enables the analysis of sediment thickness variations, rock type distributions, and subsurface structural frameworks. These datasets are integral for constructing integrated geological models to understand depositional history and assess hydrocarbon reservoir potential.
This collection comprises datasets from the QSAR-TID series, which contain molecular descriptor features paired with target property or activity values. These datasets are hosted on the OpenML platform and are standardized for machine learning applications. They are used for developing and validating Quantitative Structure-Activity Relationship models, a core task in cheminformatics and drug discovery. The series provides a benchmark suite for comparing model performance across different chemical endpoints.
This collection comprises authoritative regional biodiversity profiles produced by Western Australian government departments. It covers geospatial data and statistical summaries of biodiversity assets, ecological features, and conservation priorities, organized by the IBRA (Interim Biogeographic Regionalisation for Australia) subregions across the state. The datasets are compiled by regional nature conservation staff with firsthand field experience, ensuring practical relevance for land management. Together, they provide a consistent, state-wide framework for comparing ecological assets and conservation needs across different biogeographic regions, supporting integrated environmental planning.
This collection provides longitudinal, market-level price data for staple commodities like maize, rice, beans, fish, and sugar across numerous countries, sourced from the World Food Programme. It supports trend analysis of food price inflation and volatility at a granular geographic level. Researchers can use this data to model the impact of economic shocks, compare regional price differentials, and develop food security indicators and early warning systems.
This collection covers near real-time, event-level data on new internal displacements, categorized by conflict or disaster triggers and standardized according to the 1998 Guiding Principles on Internal Displacement. It provides daily updates and 180-day rolling windows for tracking population movements. The datasets support comparative analysis of displacement drivers and patterns across diverse geographic contexts. Together, they enable a consistent monitoring workflow for identifying emerging crises and informing rapid humanitarian response.
This collection covers standardized, aggregated figures for internally displaced persons (IDPs), sourced from the International Organization for Migration's Displacement Tracking Matrix. The data is consistently structured at national, Admin 1, and Admin 2 administrative levels, enabling subnational analysis. It supports comparative mapping of displacement density across different countries and crises. The standardized format allows for the integration of data from diverse geographic contexts into a unified analytical workflow.
This collection covers geospatial data on the Performance-Based Standards (PBS) tandem drive network for heavy vehicles in Western Australia, including route geometries, intersections, bridges, and access conditions. The data is provided in formats like GeoJSON and KML for integration with GIS systems and is updated weekly to reflect regulatory changes. It enables users to analyze approved routes, plan compliant logistics operations, and monitor network updates specific to PBS vehicle configurations.
This collection covers geospatial data on bridge locations, classifications, and attributes within Western Australia's Tandem Drive and Heavy Vehicle Services (HVS) networks. It includes details on concessional levels and access restrictions relevant to heavy vehicle compliance. The datasets are updated weekly by the authoritative road authority, ensuring current information for infrastructure and logistics analysis. Together, they enable the specific task of mapping and planning compliant heavy vehicle routes across the state.
This collection provides detailed geospatial data on the approved network routes, bridges, and access conditions for oversize divisible product vehicles, such as road trains and B-doubles, across Western Australia. The data includes route geometries, restrictions, and network conditions in formats like GeoJSON and KML, updated weekly to reflect current road access permissions. It supports the specific task of planning compliant and efficient transport logistics for the heavy vehicle sector within the state.
This collection comprises authoritative species range maps and habitat suitability data from the California Wildlife Habitat Relationships (CWHR) system. It covers geographic distribution data for a wide variety of bird species native to California, formatted as vector polygons for GIS analysis. These datasets are designed to be integrated with environmental and land use layers within the established CWHR predictive modeling framework. They provide the foundational spatial data required by state agencies and conservation planners for habitat suitability modeling, conservation impact assessments, and land use planning specific to California's ecosystems.
This collection provides standardized national-level indicators on agricultural productivity, inputs, and rural development, sourced from the World Bank and FAO. It covers metrics such as crop and livestock output, land and water use, and rural employment and income for a wide range of countries. The datasets enable comparative analysis of agricultural sector efficiency and rural economic trends across different national contexts. Researchers can use this harmonized data to benchmark performance and model the relationship between agricultural inputs and development outcomes.
This collection provides geospatial boundaries for protected and conserved areas worldwide, sourced from the official WDPCA (World Database on Protected and Conserved Areas). It covers both terrestrial and marine zones, integrating data from the WDPA and WD-OECM frameworks in formats like GeoJSON and GeoPackage. The data supports the specific task of monitoring national and global progress toward biodiversity targets, such as the Kunming-Montreal Global Biodiversity Framework's Target 3. It enables environmental risk assessment by allowing users to intersect project boundaries with conservation zones, and serves as a foundational layer for international reporting on Sustainable Development Goals.
This collection covers official International Federation of Red Cross and Red Crescent Societies (IFRC) records for humanitarian Emergency Appeals and Disaster Response Emergency Fund (DREF) allocations across numerous countries. It provides data on funding requirements, allocations, and disaster types for specific national operations. Together, these datasets support comparative research into the scale, frequency, and mechanisms of international humanitarian financing for disaster response.
This collection covers monthly forecasts of conflict events and fatalities with a 36-month horizon, standardized using the Humanitarian Exchange Language (HXL). It provides forward-looking data for global and country-specific risk analysis. The datasets support a consistent workflow for integrating predictive indicators into humanitarian response and early-warning systems.
This collection covers historical vehicle location data for Fire, Rural Fire Service, and State Emergency Service units operating in the Australian Capital Territory. The data spans from 2004 to 2016, providing a long-term view of operational movements and deployments. Together, these datasets enable analysis of spatial coverage, temporal patterns, and the evolution of emergency response strategies over more than a decade.
This collection comprises datasets from the QSAR-TID series, which are curated for Quantitative Structure-Activity Relationship modeling. These datasets typically contain molecular descriptors derived from chemical structures paired with measured biological activity or toxicity endpoints. Together, they support the development and validation of machine learning models that predict the biological effects of chemical compounds, a critical task in computational toxicology and early-stage drug discovery. The series provides a standardized resource for benchmarking model performance across diverse chemical and biological spaces.
This collection provides detailed geological data for assessing coalbed methane (CBM) potential across Alberta's major coal zones. It includes zone-specific maps and calculations for net coal thickness, gas content, and depth to top, all derived from a substantial common well database. These datasets support a comprehensive workflow for modeling methane reserves and identifying viable resource areas. The consistent data foundation allows for comparative analysis and integrated geological modeling across different coal zones within the region.
This collection covers admission prerequisites, program durations, and institutional contact details for nursing and midwifery programs across various Nigerian universities and colleges. It provides standardized data on required subjects, credit requirements, and training timelines for programs like General Nursing and Post-Basic specialties. The data enables systematic comparison of program structures and eligibility criteria across different institutions and geographic regions within Nigeria.
This collection comprises datasets from the QSAR-TID series, each pairing chemical compound features with associated biological activity endpoints. The data is standardized and hosted on the OpenML platform, facilitating direct use in machine learning workflows. Together, these datasets serve as a comprehensive benchmark suite for developing and evaluating Quantitative Structure-Activity Relationship models. Researchers can use this collection to test predictive algorithms across a variety of molecular targets and activity types.
This collection comprises high-frequency water quality measurements and sediment quality data from Canaveral National Seashore and similar coastal regions. It covers time-series of parameters like water temperature, dissolved oxygen, salinity, turbidity, and specific conductivity, alongside sediment composition analyses. The data supports detailed investigations into diurnal cycles, seasonal variations, and the relationships between physical and chemical properties in coastal ecosystems. It provides a consistent, multi-year observational record centered on a specific protected coastal area.
This collection provides multi-decade, seasonal time-series satellite data for monitoring Australian ecosystems. It includes datasets on vegetation fractional cover, canopy height, land surface phenology, evapotranspiration, and surface reflectance, derived primarily from Landsat and MODIS sensors. The data supports longitudinal analysis of vegetation health, drought impacts, and land cover change at continental to regional scales. The consistent seasonal compositing and long temporal baselines enable robust trend analysis and modeling of ecosystem dynamics.
This collection comprises gridded elevation, thickness, and extent models for key geological formations and aquifers in Alberta, such as the Paskapoo, McMurray, and Empress Formations, and the Sunchild Aquifer. The data provides the foundational structural surfaces necessary for three-dimensional subsurface modeling. These standardized grids support integrated workflows in hydrogeology and petroleum geology, allowing for the analysis of groundwater flow paths, aquifer storage potential, and hydrocarbon reservoir geometry within a consistent spatial framework.
This collection covers granular operational data for Canberra's public transport network, including passenger boardings and alightings by stop, hour, and route; bus and light rail punctuality and on-time performance metrics; and service reliability figures. It provides insights into patronage patterns, ticket type usage, and the performance of bus and light rail services over time. The data supports detailed analysis of demand fluctuations, service efficiency, and network planning.
This collection covers standardized technical specifications and certified energy performance metrics for a wide range of ENERGY STAR labeled products, including commercial kitchen equipment, servers, displays, HVAC systems, and residential appliances. The data provides detailed fields on efficiency ratings, energy consumption, and key operational parameters as verified under the U.S. EPA's ENERGY STAR program. Together, these datasets support a comparative analysis workflow, allowing professionals to benchmark models across categories, verify compliance with specific program criteria, and identify the most energy-efficient options for procurement or building projects.
This collection covers the full legal texts of Canada's international agreements, including treaties on investment protection, trade, maritime safety, human rights, and legal cooperation. It provides authoritative documents from Global Affairs Canada for analyzing specific legal clauses, obligations, and dispute resolution mechanisms. Researchers can use this corpus to conduct comparative studies of treaty structures, track the evolution of Canada's international commitments across different policy domains, and model standard frameworks for international agreements.
This collection covers spatial and regulatory data for land withdrawals, mineral claims, water licenses, agricultural dispositions, and oil and gas leases within the Yukon territory. It provides authoritative government records on land status, tenure, and environmental permits. The datasets are interlinked by location and regulatory framework, supporting comprehensive analysis of land-use conflicts, resource development potential, and regulatory compliance. This integrated view is valuable for managing competing land interests and planning sustainable development in the region.
This collection covers verified records of security incidents targeting humanitarian personnel, as well as attacks on health and education facilities in active conflict zones. The data includes details on incident types, locations, dates, and often incorporates verification statuses or certainty levels. It supports analyses of geographic risk hotspots, temporal trends in violence, and the comparative safety of different operational environments. The datasets are sourced from specialized humanitarian monitoring initiatives, providing a focused view on threats to civilian aid delivery.
This collection covers authoritative spatial data for land administration in Victoria, Australia, including property parcel boundaries, Crown land tenure polygons, easement lines, and geographic features of interest. It provides the core cadastral framework used by local governments and planning authorities, with data on land ownership types, proposed and approved easements, and contaminated site registers. These datasets are designed to be integrated, supporting workflows that analyze land use patterns, track administrative changes over time, and ensure compliance with planning and environmental regulations.
This collection provides geospatial data on the paths, intensities, and human and agricultural impacts of tropical cyclones in Somalia from 1984 to 2020. It includes district and village-level geometries, storm track coordinates, and specific damage metrics such as displaced populations, destroyed infrastructure, and livestock losses. The multi-format datasets, sourced from regional climate authorities, support the creation of detailed historical baselines for risk assessment. By covering multiple events over decades, they enable the analysis of spatial patterns and temporal trends in cyclone vulnerability specific to the Somali context.
This collection covers granular data on student enrollment, school feeding programs, ICT equipment distribution, and academic performance across Colombian municipalities and departments. It includes longitudinal records on program beneficiaries, resource allocation, and test scores, enabling detailed sub-national analysis. The datasets are complementary, providing the necessary components to assess program reach, identify service gaps, and evaluate the impact of specific interventions like laptop distribution or meal programs on educational outcomes.
This collection covers official Transportation Safety Board of Canada investigation reports detailing freight train derailments, collisions, and other safety incidents across various provinces. The reports provide rich narrative text for analyzing causal factors, operational failures, and safety recommendations, alongside precise geospatial details like subdivision names and mileposts. Together, they support comprehensive risk modeling and pattern recognition tasks for railway safety research and operational improvement.
This collection covers geospatial data for water, sewer, and drainage infrastructure assets, including pipes, pumps, valves, and storage facilities, primarily sourced from the Water Corporation. The datasets are provided in multiple interoperable formats suitable for integration into GIS platforms. Together, they support the comprehensive mapping and network analysis required for asset management, maintenance planning, and capital investment programs within the utility's service area.
This collection consists of datasets curated for use with the textbook 'Analyzing Categorical Data' by Jeffrey S. Simonoff. The data covers a wide variety of topics, including medical, legal, financial, and social science examples, all formatted for categorical analysis. Each dataset includes explicit metadata, such as a class variable and index, to facilitate the application of specific statistical methods taught in the book. This makes the collection directly valuable for structured educational exercises and methodological practice in categorical data analysis.
This collection comprises datasets from the OpenML platform containing molecular descriptor data and associated biological or chemical activity endpoints. These datasets are foundational for Quantitative Structure-Activity Relationship (QSAR) modeling, a core technique in computational chemistry and toxicology. By aggregating these resources, the collection supports the development and validation of predictive models that estimate chemical properties, toxicity, or bioactivity from molecular structure. This enables tasks such as virtual screening for drug discovery and environmental hazard assessment.
This collection covers geospatial datasets on public parks, community facilities, and recreational infrastructure such as boat ramps and dog parks across local government areas in Southeast Queensland, including Brisbane, Moreton Bay, and Noosa. The data includes land designations, facility locations, and asset inventories sourced from authoritative statutory planning instruments like local government infrastructure plans and city plans. It enables a cohesive regional analysis of public amenity distribution, supporting comparative studies of green space accessibility and service coverage gaps across multiple adjoining municipalities.
This collection provides geospatial data on wetland geomorphic classifications, river foreshore condition assessments, and floodplain infrastructure for Western Australia. It covers key environmental variables such as wetland host landform and hydroperiod, riverbank erosion and weed points, and bridge overtopping statuses for various flood probabilities. The datasets are derived from authoritative state assessments and studies, offering a consistent foundation for analyzing regional hydrological patterns. Together, they support a comprehensive workflow for modeling flood risk, monitoring ecosystem health, and prioritizing conservation and restoration actions across specific Western Australian landscapes.
This collection covers time-series geospatial data on California's farmland categories, including Prime Farmland and Grazing Land, as defined by the state's Farmland Mapping and Monitoring Program. It provides biennial inventories with a consistent ten-acre minimum mapping unit, enabling the analysis of land use conversion patterns over several decades. These datasets are specifically designed to support urban planning, conservation policy assessment, and agricultural resource modeling within California.
This collection covers hyper-local directories of community action boards, municipal councilors, public entity contacts, and sports clubs across numerous Colombian departments and municipalities. The data includes structured fields for organizational names, leadership roles, contact information, and often geospatial coordinates. It supports comparative analysis of civic infrastructure density and local governance networks. Researchers can use this data to map the distribution of formal and informal community organizations against administrative boundaries.
This collection covers granular, subnational data on population demographics, social program beneficiaries, health coverage, pension systems, and cultural sectors across Colombia's departments and municipalities. It provides the localized indicators necessary for analyzing regional disparities and planning public services. The datasets are complementary, offering a multi-faceted view of community welfare and public resource allocation at the municipal level, which supports comparative policy analysis and program evaluation.
This collection comprises datasets from the OpenML platform focused on Quantitative Structure-Activity Relationships (QSAR). It includes molecular descriptor data paired with biological activity measurements, which are fundamental for modeling the interaction between chemical structures and their biological effects. The datasets are sourced from a standardized machine learning repository, facilitating consistent access and integration. This curated set supports the development and validation of predictive models in computational chemistry and drug discovery.
This collection covers soil landscape mapping, land capability ratings for various crops and horticulture, soil salinity and erosion risk assessments, and agricultural weather station data for Western Australia. It provides a foundational set of government-derived spatial data layers for analyzing physical land constraints and agricultural potential. The datasets are designed to be interoperable, supporting integrated models for land use planning, crop suitability analysis, and sustainable development assessment across the state's diverse agricultural and rangeland regions.
This collection provides geospatial data for hydrological modeling, ecological assessment, and infrastructure planning within Melbourne's urban water system. It includes catchment boundaries, flow routes, habitat suitability models for fish and macroinvertebrates, and asset locations for drains, retarding basins, and wetlands. The datasets are specifically tailored to support the integrated analysis and management tasks required under initiatives like the Healthy Waterways Strategy, offering a cohesive view of the region's water resources.
This collection comprises datasets curated for Quantitative Structure-Activity Relationship (QSAR) modeling. The data typically includes molecular descriptors, chemical fingerprints, and associated biological activity measurements or other molecular properties. These standardized datasets are hosted on the OpenML platform, facilitating direct use in machine learning workflows. Together, they support the training, validation, and benchmarking of predictive models central to computational chemistry and drug discovery research.