Text classification, translation, QA, summarization, dialogue, sentiment analysis, language modeling, text corpora
44,692 datasets
Yulong Su published a 2.3 GB dataset on figshare in 2026 containing seismic waveform and event data. The collection includes SAC waveform files and Excel tables detailing 94 events for PKPPcP analysis and 57 events for PKPPcP–PKKPab phase-pair analysis, along with the corresponding event-station pair information. The data is used to study de-degeneracy effects of specific seismic phases and their implications for 3-D heterogeneity in Earth's mantle.
Estimates to the nearest thousand of employed people in London who have more than one job. The data is derived from the UK Office for National Statistics' Annual Population Survey, with records starting from 2004. It is published by the Greater London Authority.
Data from 2002 onward, collected by the Atmospheric Infrared Sounder (AIRS) aboard NASA's Aqua satellite, provides calibrated, geolocated infrared radiances for approximately 2378 spectral channels, simulated for cloud-free conditions. These radiances are a fundamental input for retrieving standard atmospheric products and are generated globally at all observation points, with 240 data granules produced per day. The dataset's high spectral resolution (R=1200) and synergy with microwave sensors (AMSU/HSB) support detailed atmospheric profiling.
The Yukon Mineral Exploration Program (YMEP) is a funding program for mineral exploration in Yukon. It provides financial support to prospectors, partnerships, and companies across four modules for hard rock and placer resource projects. The dataset is published by the Government of Yukon and was last updated on April 17, 2026.
Registro_activos_informacion documents compliance with the Transparency and Access to Public Information Index (ITA) mandated by Colombia's Procuraduría General de la Nación under Law 1712 of 2014 and Resolution 1519 of 2020. The dataset is hosted on the Colombian open data portal www.datos.gov.co and was last updated on May 18, 2026. It likely contains metadata records describing published information assets.
Quan Zuo's dataset from 2026 presents a programmable platform using eight bi-triazine cross-linkers to establish structure–conformation–biology relationships for peptide therapeutics. The data likely contains results from binding assays, cell studies, and in vivo PET/CT imaging for cyclic RGD and dimeric KTLLPTP peptide models targeting integrin αvβ3 and Plectin-1. The work identifies a lead candidate with high tumor contrast and provides a framework for precision peptide engineering.
Aqua/AIRS L2 Cloud-Cleared Infrared Radiances are calibrated, geolocated infrared radiance measurements from the Atmospheric Infrared Sounder aboard NASA's EOS Aqua satellite. The data product contains channel-by-channel radiances for approximately 2378 spectral channels, processed to simulate cloud-free observations within each Advanced Microwave Sounding Unit footprint. Because of its size, it is generated as a high-volume output separate from the AIRS Standard Product. It is delivered in 6-minute granules, and the satellite has a 16-day orbit repeat cycle.
A study by Caroline Varella Rodrigues investigated in situ biomethanation as a biogas upgrading strategy. The data likely contains results from fed-batch reactors treating pulp and paper industry wastewater, with hydrogen injected at two pressures to evaluate methane production and microbial dynamics. The dataset was last updated on 2026-04-24.
A cross-sectional survey of 3,795 middle school students in Hangzhou, China, collected by Ruiyi Chen. The dataset includes assessments of emotional distress using DASS-21, gaming motives using MOGQ, and Internet Gaming Disorder symptoms using IGDT-10. Network analysis was performed to explore the interrelationships among these constructs and identify core and bridge symptoms.
Xu Wang's research document details the generation and characterization of monoclonal antibody 4H3 targeting the HN protein of pigeon paramyxovirus type 1 (PPMV-1). The study includes codon optimization, prokaryotic expression, antibody screening, and epitope mapping identifying the conserved DRVWF epitope. The dataset was last updated on April 24, 2026.
Violations issued by the New York City Department of Housing Preservation and Development against rental dwelling units and buildings for conditions violating the Housing Maintenance Code or Multiple Dwelling Law. The base data includes all violations open as of October 1, 2012, and is updated daily with new violations and status changes. Each row contains discrete information for a unique violation, identified by a ViolationID.
MMA AI Dataset Artifacts contains the database dumps and runtime artifacts needed to reproduce the mma-ai workflow from the companion code repository. The dataset was created by DanMcInerney and was last updated on 2026-06-05. It includes a custom-format PostgreSQL dump for the main mma-ai database containing the features schema.
113 father-adolescent and 132 mother-adolescent dyads participated in a 21-day daily diary study. Kaiwen Bi authored this dataset, which examines day-to-day emotional dynamics and intergenerational psychopathology transmission. The data was last updated on figshare in April 2026.
Companion training data for the LiteResearcher paper, which describes a low-cost, scalable Agentic RL training framework for deep-research agents. The dataset contains the two-stage curriculum of question–answer prompts used to train the LiteResearcher-4B model with on-policy GRPO+TIS against a local search/browse environment. Both curriculum stages share the same validation set.
Queensland, Australia bathymetry data acquired for the Australian Hydrographic Office between 13 August and 09 November 2022. The survey was conducted onboard the MV Sea Ranger and MV Brynda using Kongsberg EM2040D and EM 2040C systems. The processed data is provided as a 30-meter resolution, 32-bit floating point GeoTIFF grid.
A bathymetry survey of the Cape Fourcroy (West) area was conducted for the Australian Hydrographic Office between 18 August and 15 October 2023. Data was acquired using a Kongsberg EM2040 MKII multibeam echosounder and processed with CARIS HIPS and SIPS as well as QIMERA software. The final dataset is a 30-meter resolution, 32-bit floating point GeoTIFF grid.
A bathymetry survey of the Cape Leeuwin area in Western Australia was conducted for the Australian Hydrographic Office between 5 Dec 2022 and 4 Apr 2023. The data was acquired using a Kongsberg EM2040P multibeam echosounder and processed with CARIS HIPS and SIPS software. The final deliverable is a 30-meter resolution GeoTIFF grid of the seafloor depth.
A suite of tool marks was observed in the seaward section of a small estuary on the south coast of New South Wales, Australia. The marks, formed by wind-dragged Eucalyptus leaves and Casuarina fronds over backshore sands, closely resemble trace fossils left by fish, posing a challenge for interpretation in the sedimentary record. This study highlights the risk of misreading such ambiguous grooves as indicators of palaeoflow direction, since they instead align with the prevailing wind orientation.
Kangaroo Island (South-East) bathymetry data was acquired for the Australian Hydrographic Office between 1 December 2022 and 31 May 2023. The survey was conducted by Precision Hydrographic Services under the Hydroscheme Industry Partnership Program using Kongsberg EM2040 multibeam sonar systems. The processed dataset is provided as a 30-meter resolution, 32-bit floating point GeoTIFF grid.
A qualitative study investigates leadership, team cohesion, group identity, and coping mechanisms during Bulgaria's 33rd Antarctic Summer Campaign. The dataset includes 28 pre-departure and 36 post-return semi-structured interviews, ethnographic observations, and field notes. Martin Milanov authored this longitudinal study, which was last updated on April 24, 2026.