DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

NLP & Text Datasets | DataSalon

NLP & Text

Text classification, translation, QA, summarization, dialogue, sentiment analysis, language modeling, text corpora

49,414 datasets

Experimental Data on Basalt Creep During Carbonation from CarbFix, Iceland

Basaltic rock samples for this experimental dataset come from the CarbFix site in Iceland. The data was collected by Tiange Xing of MIT from load-stepping creep experiments under various pore fluid conditions at approximately 80°C and 50 MPa effective pressure. It includes mechanical, acoustic, and pore fluid chemistry measurements to study time-dependent deformation for geological carbon storage.

Tabular, Audio, Geology, Rock mechanics, Acoustic Emissions, Experimental Data, Carbon Sequestration, +1 more

Internet Access and Usage by Adults in Great Britain

Survey data from the UK Office for National Statistics explores how adults in Great Britain use the Internet. The release details household Internet connectivity and the types of activities and online purchases made by adults. It is designated as National Statistics and was last updated in July 2026.

Tabular, UK Statistics, Internet Access, Consumer behavior, Household Survey, +1 more

UK Listed Companies Share Ownership Survey by Beneficial Owner Type

The Share Ownership report from the Office for National Statistics details the beneficial ownership of UK listed companies as of 31 December. It is classified according to National Accounts categories and designated as National Statistics. The data is provided in HTML format.

Tabular, UK Companies, Share Ownership, National Accounts, Financial Markets, +1 more

Index of Production: UK Manufacturing, Mining, and Energy Output

UK production volume data measures output for manufacturing, mining and quarrying, and energy supply industries. The figures are seasonally adjusted and indexed to base year prices. It is produced by the Office for National Statistics as a National Statistics series.

Time Series, 🇬🇧 United Kingdom, Economic Indicators, Production Index, Seasonally Adjusted, Manufacturing, +1 more

UK Civil Partnership Statistics on Formations and Dissolutions

Data on civil partnerships formed and dissolved in the United Kingdom, produced by the Office for National Statistics. The dataset is designated as National Statistics and is available in English. It was last updated on 2026-07-08.

Tabular, Social Statistics, Civil Partnerships, United Kingdom, Demographics, +1 more

Landsat 5 Satellite Imagery for Great Barrier Reef and Torres Strait, 1988-2011

59 Landsat 5 satellite images processed into true color GeoTIFFs for selected areas of Queensland, including the Torres Strait and around Lizard Island and Cape Tribulation. The collection was developed by the Australian Ocean Data Network as part of the NERP TE 13.1 project and eAtlas AIMS, sourced from NASA archives. Images were selected from the full Landsat 5 archive for low cloud cover and clear water to facilitate marine feature investigation.

Image, Geospatial, ZIP, Great Barrier Reef, Satellite Imagery, Computer Vision, Marine Environment, +1 more

WAOM4: High-Resolution Circum-Antarctic Ocean-Ice Shelf Model

The Whole-Antarctic Ocean Model (WAOM4) is a high-resolution (4-km) ocean-ice shelf simulation based on the Regional Ocean Modelling System (ROMS). It was used to investigate the physical drivers of Antarctic ice shelf basal melting for the present-day year 2007. The model was contributed by the Australian Ocean Data Network and last updated in July 2026.

Geospatial, Climate Science, Ocean Modeling, Antarctic Ice Shelves, Geospatial Simulation, +1 more

Torres Strait Landsat 5 Satellite Composite, 1993-2009

A composite of 8 Landsat 5 satellite scenes provides a cloud-free, clear-water seamless image of the Torres Strait region, including parts of Cape York and Papua New Guinea. The image has a resolution of approximately 30 meters and a positional accuracy better than 50 meters. The composite was created by the Australian Ocean Data Network using imagery selected from 1993 to 2009 and processed for sun glint, haze, and tonal consistency.

Image, Geospatial, Geospatial Composite, Satellite Imagery, Computer Vision, Marine Environment, +1 more

Pigment Sampling in Coastal Waters of South Eastern Tasmania

Water samples for pigment analysis using High Performance Liquid Chromatography (HPLC) were collected in the first year of a sampling program. The data likely contains diagnostic pigment concentrations used to estimate algal community composition and concentration. The dataset is provided by the Australian Ocean Data Network and was last updated in July 2026.

Tabular, Phytoplankton, Pigment Analysis, Tasmania, Coastal Oceanography, Marine Biology, +1 more

Australian iPCoD: Baleen Whale Population Impact Model for Offshore Wind Development

Australian research developed an interim Population Consequence of Disturbance (iPCoD) model for blue whales and southern right whales. The model provides a framework to assess population-level impacts of offshore wind farm developments along the southern Australian coast. This work, associated with the NESP MaC 4.9 project, highlights key data gaps for threatened populations overlapping with declared offshore wind areas.

Tabular, 🇦🇺 Australia, Marine mammals, Benchmark, Environmental Impact Assessment, Wind Energy, Population Modeling, +1 more

U-Th Dated Coral Samples from Mazie Bay, Great Barrier Reef, 6900 Years to Modern

This dataset contains 117 coral samples from Mazie Bay on North Keppel Island in the southern Great Barrier Reef, dated using Uranium-Thorium methods. The samples, collected in 2012 and 2013 from reef matrix cores and death assemblages, span an age range from 6900 years before present to modern. The dataset was aggregated by the Australian Ocean Data Network and last updated in July 2026.

Tabular, Time Series, Great Barrier Reef, U Th Dating, Geochronology, Paleoecology, Coral Reef, Large Scale, +1 more

VCMP Sites - Shoreline Timeseries: Victorian Coastline Data 1930-2023

Shoreline timeseries data from the Victorian Coastal Monitoring Program spans from 1930 to 2023 across sites along Victoria's coastline. The dataset includes short-term and long-term shorelines derived from multiple sources, including UAV surveys, satellite imagery, and aerial photography. It is provided by the Department of Energy, Environment and Climate Action.

Time Series, Geospatial, Shoreline Monitoring, Coastal Geomorphology, Synthetic, +1 more

Seafloor Map of the Flinders Marine Reserve from Multibeam Sonar

A multibeam sonar-based map details the shelf-break region of the Central Flinders Commonwealth Marine Reserve. The data illustrates canyon-head incisions and a cross-shelf reef area characterized by distinct ledges often 1-2 meters high. The map was produced for the National Environmental Research Program and is hosted by the Australian Ocean Data Network.

Audio, Geospatial, Multibeam Sonar, Geospatial Analysis, Seafloor Mapping, Marine Reserve, Bathymetry, +1 more

Bio-physical Predictors of Coastal Land Use Change in the Great Barrier Reef, 1999-2009

Four key bio-physical predictors, including slope and rainfall, were identified as the strongest drivers of land use change along the Queensland coast. This dataset contains raster files from the NERP TE 9.4 project, modeling transitions between 18 land use classes at 50m and 500m resolution. The analysis was conducted by the Australian Ocean Data Network using a neural network model trained on data from 1999 and 2009.

Geospatial, Great Barrier Reef, Land Use Change, Coastal Zone, Geospatial Analysis, Biophysical Predictors, +1 more

Victorian Shoreline Profile Timeseries from 1930 to 2023

Shoreline profile timeseries from 1930 to 2023 across sites monitored by the Victorian Coastal Monitoring Program. Data includes short-term and long-term shorelines derived from UAV surveys, satellite imagery, and aerial photos, with vertical uncertainty for UAV DSMs estimated at 0.1 m. The dataset is provided by the Department of Energy, Environment and Climate Action and was last updated in June 2026.

Time Series, Geospatial, 🇦🇺 Australia, Shoreline Monitoring, Coastal Geomorphology, Synthetic, +1 more

Aerial Photo Mosaic of Atherton Tablelands from June 1978

A 13,792x12,623 pixel photo mosaic of historic aerial imagery covering the southern Atherton Tablelands from 16th June 1978. The mosaic was compiled by Atherton Tablelands Geographical Information Services in April 2012 from data provided by the Queensland Department of Environment and Resource Management to assist the NERP-TE project on rainforest restoration. It has a spatial resolution of 2.75 meters per pixel and covers towns including Atherton, Malanda, Yungaburra, and Tolga.

Image, Geospatial, 🇦🇺 Australia, ZIP, Aerial Imagery, Rainforest Restoration, Historical Photos, +1 more

Service Station Trip Generation and Parking Demand Survey Data

Weekday manual classification counts, taken between 6:00 am and 7:00 pm, record trip generation and parking demand at service station survey sites. The data was collected by the NSW Government; one site was also surveyed for 24-hour vehicle movements over a 7-day period, with video monitoring covering 7:00 pm to 6:00 am. The report provides base data for practitioners to make direct comparisons with similar sites.

Tabular, Trip Generation, Parking Demand, Service Stations, Transportation Planning, Traffic Survey, +1 more

C12 0 Flood Prone Hazard Areas Code: Hobart Flood Risk Model

Potential flood areas for all Hobart catchments, modeled for the year 2100 under a 1% Annual Exceedance Probability scenario. The dataset covers major waterways such as the Hobart, New Town, and Sandy Bay rivulets, as well as minor drainage lines. It is intended as a general planning guide and requires site-specific investigation for property-level accuracy.

Geospatial, ZIP, CSV, Flood, Environmental, Flood Prone, Potential, Development, Urban Planning, Hazard Modeling, Planning, Inundation, Open Data, +1 more

Lough Melvin And Arney: High Network Contribution Areas

Network contribution area scores, derived from SciMAP outputs within SAGA GIS, identify zones with high hydrological influence. The scores integrate land cover data from CEH Land Cover 2007, average rainfall from the Met Office, and a 5-meter Digital Terrain Model. The high contribution boundary is defined statistically as areas scoring more than one standard deviation above the mean of the score distribution.

Geospatial, CSV, Text, JSON, Excel, Surface Water Connectivity, Hydrology, Topography, Rainfall, High Network Contribution, Environmental Network, Land Cover, +1 more

Six Mile Water Network Contribution: High Surface Flow Areas in Northern Ireland

Northern Ireland's Six Mile Water network contribution area scores identify zones with high potential for surface water flow into the river network. The dataset was generated in SAGA GIS using SciMAP, based on inputs from the CEH Land Cover 2007 map, Met Office average rainfall, and a 5-meter Digital Terrain Model. The 'High contribution' category boundary is defined as areas scoring above +1 standard deviation from the mean of the calculated scores.

Tabular, Geospatial, CSV, Text, JSON, Excel, Surface Water Connectivity, Hydrology, Northern Ireland, Contribution Area, Rainfall, Land Cover, Geospatial Analysis, Network Contribution, GIS analysis, +1 more
