Text classification, translation, QA, summarization, dialogue, sentiment analysis, language modeling, text corpora
44,714 datasets
Records of patient diagnoses from the healthcare-providing institutions of the Empresa Social del Estado Pasto Salud E.S.E. The dataset includes columns for consultation code, patient sex, primary diagnosis code, healthcare provider, consultation date, age, description, and consultation name. It was last updated on 2026-05-18 19:06:43 and is hosted on the Colombian open data portal www.datos.gov.co.
West African atmospheric data from the 2006 NASA African Monsoon Multidisciplinary Analyses campaign. The dataset contains measurements from the High Altitude MMIC Sounding Radiometer, a 25-channel cross-track scanner with three frequency bands for temperature and humidity sounding. It was collected from aircraft based in the Cape Verde Islands to study African Easterly Waves and Mesoscale Convective Systems.
Public raw data for FOMO AI Accounting is a 390.6 KB CSV file by Gunawan Wibisono, last updated on 2026-04-28. The dataset supports a study examining heterogeneous user responses to AI-generated accounting information in digital investment environments. The research distinguishes between cognitive and affective fear of missing out (FoMO) to explain these responses.
High-fidelity post-mortem diagnostic samples map real-world infrastructure failures in e-commerce ecosystems to expert root-cause analyses and code patches. The dataset was created by Amman-shah and last updated on June 7, 2026. Data assets were ingested from live production stream architectures.
This dataset provides a time series indicator for the Pacific Decadal Oscillation (PDO) derived from satellite radar altimeter observations. The indicator is calculated using cyclostationary empirical orthogonal functions (CSEOFs) applied to gridded sea surface height anomalies from missions including TOPEX/Poseidon, the Jason series, and Sentinel-6. It is produced and maintained by POCLOUD for NASA, and the file is updated to include the most recent data.
Audit findings for Barranquilla, Colombia, from the District Comptroller's General Audit Plan for the 2019-2020 fiscal periods. The dataset likely contains records of disciplinary, fiscal, administrative, and penal findings determined during audits. It was published via the Colombian open data portal, datos.gov.co, and was last updated on May 18, 2026.
Fermi's Large Area Telescope (LAT) is a successor to EGRET with improved sensitivity, resolution, and energy range. This catalog presents the second full listing of LAT sources, derived from the first 24 months of survey data. The National Aeronautics and Space Administration (NASA) produced this catalog, with a detailed explanation available in the referenced LAT 2-year Catalog Paper.
Version 3.0 of the calibrated data set from the New Horizons spacecraft's Linear Etalon Imaging Spectral Array (LEISA) instrument during the Pluto encounter phase. The data includes observations from the Approach (January-July 2015), Encounter, Departure, and Transition sub-phases, completing all Pluto mission phase deliveries for LEISA. Updates include data downlinked through late October 2016, multi-map approach observations, moon observations, high-resolution departure data, and calibration campaign tests.
Uncalibrated observations from the Juno spacecraft's Jupiter Energetic-particle Detector Instrument (JEDI). The data are organized into daily files by spacecraft event time (SCET) and include measurements from three instrument subsystems, each with six look directions. NASA produced this dataset, which was last updated in March 2026.
PREFIRE Satellite 2 raw curated radiance data contains raw digital number counts from a 63-channel push broom spectrometer measuring mid- and far-infrared radiation from approximately 5 to 53 µm. The experiment aims to fill knowledge gaps in the global energy budget by characterizing polar far-infrared emissions, which have not been measured on a large scale. This data, with a 0.707-second temporal resolution, is intended for assimilation into global circulation and climate models to improve future climate predictions.
Bathymetric surveys conducted October 16-20, 2016, captured water depths and surface elevations in the main channel of Louisiana's Wax Lake Delta. The data consist of continuous 1 Hz in-situ measurements integrated with GNSS location data, collected during the Pre-Delta-X Campaign. These measurements were used to generate a merged digital elevation model (DEM) for the delta's channels.
Beaufort Sea observations from August to October 2022 capture the transition from summer melt to autumn ice advance. The dataset contains in situ near-surface atmospheric and upper ocean measurements collected by a Jet Surface Salinity Profiler, a remotely operated kayak traveling up to 5 knots. The data are intended to show how salinity anomalies from melting ice affect sea surface temperature, stratification, and subsequent ice growth.
Beaufort Sea observations from September 10 to October 22, 2022, collected by an autonomous under-ice float during the NASA SASSIE field campaign. The dataset contains in situ measurements of ocean temperature, salinity, and acoustic range, targeting the transition from summer melt to autumn ice advance. Data are provided in netCDF format.
A 2-meter resolution bathymetry digital elevation model underpins this seabed feature classification for Zeehan Marine Park. The University of Tasmania compiled the data for Parks Australia using semi-automated GIS mapping tools. The dataset applies a nationally consistent seabed geomorphology classification scheme.
NASA's Terra satellite provides a high-resolution, daily global record of Aerosol Optical Depth (AOD) from 2000 to the present. The dataset merges retrievals from the Deep Blue and Dark Target MODIS algorithms onto a consistent 0.1 x 0.1 degree grid. It includes averaged AOD values, standard deviation, pixel counts, and sensor zenith angle for quality filtering.
Groningen's regional economy is analyzed through a report and presentation. The data likely contains qualitative assessments of economic growth, talent surplus, and sector performance from 2009 onward. The dataset is provided by the Dutch Ministry of the Interior and Kingdom Relations under a CC-BY-4.0 license.
Survey data from the Groningen City Panel reveals that 12% of respondents experienced name-calling based on sexual orientation in the past year. The dataset, published by the Dutch Ministry of the Interior and Kingdom Relations, captures divided opinions on safety and acceptance of homosexuals, bisexuals, and transgender people in the city, province, and wider Netherlands. It is available under a CC-BY-4.0 license.
Geospatial seabed morphology and geomorphology data for Beagle Marine Park in south-eastern Australia. The dataset was created using a nationally consistent two-step classification system applied to bathymetry digital elevation models at 30 m and 1 m resolutions. It was produced by Geoscience Australia and collaborators, with references to reports and data products from 2020 to 2023.
Exoneración en matricula 2-2019 lists students from the Universidad Pedagógica y Tecnológica de Colombia (UPTC) who were exempted from tuition payment for the first semester of 2019. The dataset includes 14 columns detailing student demographics, academic programs, and the reasons for exemption. It is hosted on the Colombian open data portal www.datos.gov.co and was last updated in May 2026.
Nemotron-SFT-Science-v2 is a science reasoning dataset created by NVIDIA and last updated on June 4, 2026. It contains problems and solutions across three domains: Physics, Biology, and Chemistry. The dataset includes synthetic and vendor-sourced problems in multiple-choice and open-question formats, paired with LLM-generated solutions using chain-of-thought, Python, and search tool reasoning.