Text classification, translation, QA, summarization, dialogue, sentiment analysis, language modeling, text corpora
44,649 datasets
Six distinct sedimentary cycles, each hundreds of metres thick, have been identified in Australia's Surat Basin. The cycles, spanning the Jurassic and Cretaceous periods, are interpreted as responses to global sea-level oscillations characterized by rapid falls and slow rises. This dataset from Geoscience Australia documents the depositional environments, from braided streams to shallow marine settings, and correlates them with nine global sea-level events.
1982 reconnaissance maps of the Great Barrier Reef's surface cover facies, compiled by the Bureau of Mineral Resources. The maps differentiate supratidal, intertidal, and subtidal zones based on a simple bathymetric classification. Cartography was completed by G.A. Young and R.A. Swoboda, with editing in 1983.
A 2004 voyage by the RV Southern Surveyor collected geological data from the Kenn Plateau off northeast Australia. The survey gathered 3090 km of seismic data and 7584 km of bathymetric data, alongside a limited number of rock samples. The data are intended to improve the geological understanding of a poorly known part of Australia's marine jurisdiction.
Pre-tokenized data snapshots used to study language model scaling laws in the data-constrained, compute-rich regime. The dataset consists of packed GPT-2-tokenized sequences stored in .pt files, as referenced in the paper 'Data-Constrained Language Model Pretraining: Improved Regularization and Scaling Laws'. It was uploaded by user zhiwei555 to Hugging Face and last updated on June 8, 2026.
Timor Sea continental shelf sediments are mapped in a series of 1:1,000,000 lithofacies sheets published by the Bureau of Mineral Resources. The work results from systematic reconnaissance surveys initiated after 1967, with three sheets printed by early 1974 and two more in progress. Users must refer to the accompanying Bulletin 83 for interpretation, as the map does not distinguish modern from relic sediments.
Keppel Bay's seabed reveals a complex history shaped by Late Quaternary sea-level changes and the Fitzroy River's past course. The palaeo-Fitzroy River once flowed west across the continental shelf to a point now under approximately 60 metres of water. The information was compiled by Geoscience Australia Data and last updated in April 2026.
Geoscience Australia Data provides a geological description of the Proterozoic Davenport province in central Australia. The dataset details sedimentary, volcanic, and intrusive rock sequences, including the 1870 Ma Warramunga Group and the folded Hatches Creek Group. It includes information on mineralisation, with recorded production of about 4500 tonnes of tungsten concentrates and 15 kg of gold.
Grandes Generadores de Residuos - UAESP lists locations generating the most frequent collection waste in Bogotá. The dataset is provided by www.datos.gov.co and was last updated on 2026-05-18. It includes columns for concessionaire, address, and geographic coordinates.
NASA's HelioWeb provides daily heliocentric trajectory data for the Voyager 2 spacecraft. The data is derived from JPL Horizons ephemeris and is calculated in Heliographic, Heliographic Inertial, and Solar Ecliptic coordinate systems using the 'Mean of Date' method for the Equinox Epoch. The dataset was last updated in March 2026.
Geoscience Australia's Stavely Project release includes multiple geochemical analysis types such as whole rock, sulphur isotope, and Mobile Metal Ion. Data files are provided in PDF, ZIP, and HTML formats, with the last update recorded on 2026-05-06. This release also contains interpretive reports on hydrocarbon geochemistry, chromite petrology, and pyrite characterisation.
285 seabed sediment samples were collected from inner Darwin Harbour and shallow water areas in and around Bynoe Harbour between 29 May and 16 August 2017. The surveys were conducted by Geoscience Australia, the Australian Institute of Marine Science, and the Northern Territory Government as part of a four-year science program to collect baseline environmental data. The data includes grain size, inorganic elemental analyses, organic matter measures, and seagrass observations for habitat mapping.
Thirteen genotypes of greater yam were studied over two consecutive seasons in a field-based experiment. The dataset, published by Komivi Dossa on figshare in 2026, likely contains measurements of flowering, agronomic, morphological, and tuber quality traits under short-photoperiod, long-photoperiod, and hormonal treatments.
Experimental data from a two-season study on thirteen genotypes of greater yam (Dioscorea alata L.). The dataset likely contains measurements of flowering induction, agronomic, morphological, and tuber quality traits in response to short and long photoperiods and hormonal applications. The data was published by Komivi Dossa on figshare in April 2026.
2,369 injured adult skiers and snowboarders were studied across three winter seasons at two resorts in Zhangjiakou, China. The data, authored by Xiaoyu Ling and published on figshare in 2026, includes 339 severe injuries (14.3%) and models associations with protective gear, behavior, skill level, and environmental conditions.
A post-cruise report summarizes the preliminary results of the 1996/97 Antarctic marine geoscience program. The Australian Geological Survey Organisation (AGSO) and Antarctic Co-operative Research Centre conducted a geophysical survey and sediment sampling program in Vincennes Bay, Prydz Bay, and the Mac.Robertson Shelf. The voyage collected seismic data, sidescan sonar records, and sediment cores to study ice sheet retreat timing and Holocene paleoenvironmental records.
Higher-level penalty charge notices issued by the North Essex Parking Partnership for non-payment within the initial payment window. The dataset is provided by the Government Digital Service via the EU Open Data platform under the Open Government Licence. The last update date and exact data volume are unknown.
Penalty Charge Notices issued by the North Essex Parking Partnership for non-payment within an initial payment window. The dataset is provided by the Government Digital Service under an Open Government Licence and is available in multiple structured formats. Further details are available in the organization's annual reports.
Pay multiple data from UK local authorities, covering all taxable earnings including base salary, bonuses, and benefits-in-kind. The dataset is provided by the Government Digital Service under the UK Open Government Licence and is available in multiple formats including XML, HTML, CSV, and JSON. The median earnings of all employees on a fixed annual date is used as the denominator, excluding pension benefit changes.
Harisundar R developed PALL, an open training corpus for a dental-domain Llama-3.1-8B model. It contains three subsets totaling at least 405,959 text entries, designed to cover the full CPT → SFT → DPO post-training pipeline. The dataset was last updated on 2026-06-12.
Quebec's TOPEX dataset provides a raster index of topographic exposure to wind, calculated from NASA SRTM terrain data. The matrix file has a spatial resolution of 50 meters and covers the territory south of 52°40' and west of 61°10'. It was created by the Government of Quebec for windthrow hazard assessment.