Text classification, translation, QA, summarization, dialogue, sentiment analysis, language modeling, text corpora
44,671 datasets
Australian Ocean Data Network hosts data on the Prydz Channel Fan, a trough mouth fan on the Antarctic continental slope. The stratigraphy, derived from ODP Site 1167, indicates the bulk of the fan was deposited prior to the Brunhes-Matuyama Boundary (780 ka). This record documents the history of extreme advances of the Lambert Glacier-Amery Ice Shelf system.
A 100-meter-resolution bathymetric grid compiles seafloor depths in the Cape Darnley region of East Antarctica. The Australian Ocean Data Network integrates single-beam, multibeam, and chart data from numerous institutions over the last three decades. This compilation enables detailed visualization of seafloor morphology for scientific modeling.
Authoritative parcel polygons for Sioux Falls, South Dakota, maintained by the City of Sioux Falls. The geometry is the city's responsibility, while parcel attributes are shared with Minnehaha and Lincoln Counties. The legal assessment date is November 1st of each year.
hotchpotch/japanese-reranker-v2-hard-negatives-scores contains unified hard-negative score rows used to train the Japanese Reranker v2 family. The dataset stores teacher scores as floating-point soft labels and follows the HPPRC Emb Score style, with scores and referenced document text stored separately. It was last updated on 2026-06-03.
Alberta Health and Alberta Health Services began seasonal monitoring for cyanobacterial blooms at high-use recreational beaches in 2009. The dataset contains water sample records organized into two files for bloom indicators and individual species counts, updated monthly from June to September. Data from the current year is considered preliminary pending further quality control.
550,000 square kilometres of north-eastern Australia are covered by the Carpentaria Basin hydrogeological inventory. The dataset, provided by the Australian Ocean Data Network, contains descriptive attributes grouped into themes like location, geology, hydrogeology, and groundwater management. Its stratigraphy comprises sandstone-rich units deposited from the Late Jurassic to Mid Cretaceous, forming four major sub-basins.
OMI/Aura Global Geometry-Dependent Surface LER (OMGLER) provides geometry-dependent Lambert-equivalent surface reflectance (GLER) and derived top-of-atmosphere radiance for the Ozone Monitoring Instrument's field of view. The dataset includes ancillary input parameters for each pixel and is intended to supply surface reflectance information for cloud, aerosol, and trace gas retrieval algorithms. Data is provided in netCDF format for the daylit portion of each orbit, with approximately 14 files generated per day.
OMI/Aura Global Geometry-Dependent Surface LER (OMGLER) provides geometry-dependent surface reflectance and computed top-of-atmosphere radiance for the Ozone Monitoring Instrument's field of view. The product is designed to replace climatologies and serves as a key input for cloud, aerosol, and trace gas retrieval algorithms. Each file contains data for the daylit portion of a single orbit, with approximately 14 files generated per day.
Supplementary files for "Quantification of particle velocities and energy regime in an aeolian abrasion chamber" by Joanna Bullard, last updated April 2026. The dataset includes 78.9 MB of files supporting a study that measured two-dimensional velocity components for 17 different sand samples in a laboratory abrasion chamber. Particle velocities and energy regimes were quantified using a laser Doppler anemometer for air inflow rates up to 14.9 m s⁻¹.
Evaluation reports for Global Affairs Canada's Women's Voice and Leadership Program. The department uses these reports as a practical management tool to review program performance and improve the design of future initiatives. Each evaluation results in a generated report, with the latest metadata update recorded on 2026-05-21.
289.8 MB of supplementary materials support research on composite spatial indices for heat vulnerability. The package contains geospatial data inputs, calculated sub-indices, and final composite indices for Adelaide, Cairns, and Newcastle. Ryan Turner authored this data, last updated in April 2026.
Alexandre Spits published this dataset on figshare in April 2026. It contains MATLAB figures and data from experiments on an electronic circuit emulating a Duffing oscillator, recorded using a dSPACE MicroLabBox. The 985.8 MB dataset supports the article 'Measuring Nonlinear Resonances using Extended Arclength Control-Based Continuation'.
WithinUsAI created a synthetic distillation dataset in May 2026. It contains 5,000 unique examples designed to mirror the reasoning style of Meta's Muse Spark frontier model. The dataset is structured to teach a step-by-step reasoning process of Understand, Plan, Execute, and Verify.
A spreadsheet of data extracted from the second criminal instance of the Rio de Janeiro State Court. The data refers to collegiate decisions issued on Habeas Corpus petitions filed against pretrial detentions. The dataset was created by Gabriel Santos and last updated on 2026-05-18. Its objective is to provide insight into the decision-making behavior of the Rio de Janeiro judiciary regarding the precautionary instrument of pretrial detention.
8,589 books across 40 categories of classical Islamic sciences form this Arabic text corpus. The dataset was extracted from the al-Maktaba al-Shamela v4 database by AuthenticIlm on 2026-04-26, containing approximately 7.6 million pages of text.
A dataset from figshare, authored by Nada Mosallam and last updated on April 23, 2026, describing lead optimization efforts for anti-tuberculosis squaramide compounds. It likely contains data on compound analogues, their potency measured in nanomolar (nM) concentrations, and metabolic stability metrics. The dataset is small, at 4.5 KB, and is available in CSV format.
Binomial regression analysis data for risk factors associated with sire impact on Leishmania venereal transmission. The dataset includes three models (A, B, C) with explanatory variables such as sire diagnostic status, age, and puppy sex. It was authored by Kayla R. Duxbury and last updated on 2026-05-04.
9.5 KB of calculated quantum chemical properties for compounds from the plant Pinellia ternata. The dataset includes total energy, enthalpy, Gibbs free energy, hardness, softness, and electron affinity, computed at the B3LYP/6-31+G(d,p) level of theory. It was authored by Guoqiang Bian and last updated on May 18, 2026.
Geoscience Australia developed a Probabilistic Tsunami Hazard Assessment (PTHA) for the Gladstone region from Agnes Waters to Yeppoon. The report details modelling validated against three historic tsunami events and provides conservative inundation zone estimates corresponding to Australian tsunami warning categories. It was published via the Australian Ocean Data Network in 2026.
Hydra vulgaris strain 105 genome v3, with NCBI RefSeq GCF_022113875.1, was analyzed for transposable elements. The dataset includes a custom repeat library from RepeatModeler and GTF-format annotations from RepeatMasker, created by Aide Macias and last updated in May 2026. The 155.4 MB dataset was used in a study on dynamic transposable element expression during Hydra head regeneration.