Text classification, translation, QA, summarization, dialogue, sentiment analysis, language modeling, text corpora
44,712 datasets
Nyan is a Japanese text-to-speech dataset generated by Irodori-TTS-600M-v3-VoiceDesign and formatted for fine-tuning the LFM2.5-Audio-1.5B model. It contains 9,322 training samples and 491 validation samples. The dataset was authored by RikkaBotan and last updated on June 6, 2026.
ERBE_S4G_WFOV_NF_ZG_1 contains Earth Radiation Budget Experiment data processed into zonal and global averages. The dataset provides monthly, daily, and hourly averages for parameters like albedo and radiation flux, derived from non-scanner, wide field-of-view instruments on three satellites. NASA's Langley Research Center Atmospheric Science Data Center (LARC_ASDC) produced this completed collection, with data represented as 8-bit, 16-bit, and 32-bit integers in HDF format.
Global satellite data from the Earth Radiation Budget Experiment (ERBE) provides regional averages of radiant flux and albedo on a 5-degree grid. Measurements were taken by instruments on NASA's ERBS and NOAA-9 and NOAA-10 satellites, covering latitudes between 67.5 degrees north and south. The dataset includes monthly, daily, and hourly averages processed using a numerical filter technique.
ERBE_S4GN_WFOV_SF_ZG_1 contains gridded, monthly-averaged estimates of radiant flux at the top-of-the-atmosphere from the Earth Radiation Budget Experiment. The data is derived from non-scanner instruments on three satellites and is organized by parameter across 10.0-degree zonal and global regions. This dataset provides processed measurements without scene identification from the scanner, using a numerical filter cross-track enhancement.
AVIRIS-NG L1B data provides orthocorrected radiance images captured from aircraft, measuring reflected light at 5-nm intervals across the 380-2510 nm spectral range. For each flight line, the dataset includes calibrated radiance images, geometric lookup tables, and files detailing observation geometry and illumination parameters. The sensor's 1 milliradian field of view yields ground sampling distances from 20 meters down to sub-meter resolution, depending on flight altitude.
Informe Productividad - Prestación De Servicios Promoción Y Prevención- 16 Ips Adscritas A Red Salud Casanare Ese – Enero 2019 a Noviembre 2019 lists patients attended by the Red Salud Casanare ESE network across various health promotion and prevention services. The dataset includes 16 healthcare institutions (IPS) and covers the period from January 1 to November 30, 2019. It was published on the Colombian open data portal www.datos.gov.co.
Four intensive field observation (IFO) periods from 1986 to 1992 form the core of the First ISCCP Regional Experiments (FIRE). This dataset includes Doppler radar images from the second cirrus IFO in Coffeyville, Kansas, collected from November 13 to November 29, 1991. The project was designed by NASA's LARC ASDC to improve cloud and radiation parameterizations for general circulation models.
From June to October 1961, a Bureau of Mineral Resources field party mapped the Mount Liebig and Mount Rennie sheet areas. This reconnaissance aimed to connect geological work in the MacDonnell Range with surveys on the Macdonald and Rawlinson sheet areas. The dataset likely contains geological observations and mapping results from this expedition.
Explanatory notes document geological reconnaissance and gravity surveys conducted in the Cornish Sheet SF5201 area. The Bureau of Mineral Resources performed reconnaissance in 1955, and gravity readings were taken by West Australian Petroleum Pty Ltd. Astrofixes and levelled heights were observed along the Canning Stock Route in 1956, and further gravity observations were made during a helicopter survey in 1957.
National Capital Region orthophotographs provide 20 cm spatial resolution imagery in natural color, captured in spring 2023. The mosaics cover the MRC Portneuf and part of the MRC de La Jacques-Cartier and are available in JPEG 2000 and GeoTIFF formats.
Spring 2023 and 2024 aerial imagery mosaics cover the Joliette, Montcalm, Matawinie, and D'Autray regional county municipalities. The data includes GeoTIFF files in 2 km x 2 km tiles and JPEG 2000 files organized by municipality, with a spatial resolution of 20 cm in natural color.
These orthophotograph mosaics were assembled from spring 2023 imagery and corrected for terrain relief. The data covers all municipalities and regional county municipalities in Québec's Estrie administrative region. Images are provided as 5 km x 5 km GeoTIFF tiles in natural color.
Spring 2023 orthophotograph mosaics cover the Montérégie administrative region of Québec. The data consists of geographically positioned natural color images with a spatial resolution of 20 centimeters, distributed as 5 km x 5 km GeoTIFF tiles. These mosaics were produced through regional partnerships and are available under a Creative Commons license.
A benchmark set of 75 items for evaluating language models on complex, multi-constraint instructions, created by SurgeAI. Each item is a realistic prompt paired with 10–40 evaluation criteria, totaling 1,559 criteria for rubric-based grading. The dataset was last updated on June 3, 2026.
The integrated urban development concept for the city centre of Bottrop demarcates the planned redevelopment areas ‘Rathausviertel’, ‘Hansaviertel’ and ‘Western inner city’. The data originates from the Bundesamt für Kartographie und Geodäsie and is published under a CC0-1.0 license. It is available as a WMS service.
This dataset lists young beneficiaries of the Generación E program at the Unidades Tecnológicas de Santander during the first and second semesters of 2019 and 2020. It includes columns for biological sex, year, program methodology, initial identification document, program name, and program modality. It is hosted on the Colombian open data portal www.datos.gov.co and was last updated in May 2026.
This 1961 paper, prepared for the Institution of Engineers, Australia, summarizes information gathered on underground water occurrence in the Australian Capital Territory. Geologists from the Canberra Engineering Geology Group of the Bureau of Mineral Resources conducted systematic investigations over the preceding seven years. The inland part of the Territory covers 880 square miles of the tableland and alps of southeastern Australia.
Renee C. Strauch's dataset, shared via figshare in April 2026, quantifies shifts in polyphenolic composition from wild blueberries after 72-hour in vitro fermentation. It compares metabolites produced by Lactiplantibacillus plantarum and Bacillus subtilis strains to those reported from human gut-derived catabolism. The 317.8 KB XLSX file contains profiles of flavonoids, anthocyanins, proanthocyanidins, phenolic acids, and their conjugates.
A bathymetry and backscatter survey was acquired by the NSW government aboard the RV Bombora from 31 Aug 2022 to 31 Jul 2023. The dataset contains 32-bit floating-point GeoTIFF files at 5-meter resolution, processed using Hypack, R2Sonic GUI, POSView, POSPac, Qimera, and FMGT software. It was created to provide a baseline dataset and map the spatial distribution of seabed types within the Solitary Islands Gumbaynggirr Yaegl Marine Park.
Southeast Australia case studies from Adelaide and Old Bar beaches present a framework for modelling shoreline erosion from clustered storm events. The data integrates coastal geomorphology and engineering approaches, using sediment compartments and sub-surface information like boreholes and ground-penetrating radar. This work is a contribution to the Bushfire and Natural Hazard Cooperative Research Centre project on storm surge resilience.