DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

NLP & Text Datasets | DataSalon

NLP & Text

Text classification, translation, QA, summarization, dialogue, sentiment analysis, language modeling, text corpora

49,576 datasets

Gaussian Approximation Potentials for Iron from First-Principles Molecular Dynamics

Daniele Dragoni from École Polytechnique Fédérale de Lausanne created a Gaussian Approximation Potential model for the α-phase of iron. The model was trained on energies, stresses, and forces derived from first-principles molecular dynamics simulations of pristine and defected bulk systems, surfaces, and γ-surfaces. This dataset provides a machine-learned interatomic potential designed to describe complex iron systems more efficiently than direct first-principles calculations.

Tabular, Interatomic Potentials, Gaussian Process Regression, Molecular Dynamics, Computational Materials, Iron Simulations, +1

Frictional Rupture Experiments on Scale Dependence of Fracture Energy

Experimental data from stick-slip experiments performed in a bi-axial shear configuration to study the scale dependence of fracture energy in frictional rupture. The dataset likely contains measurements from Linear Elastic Fracture Mechanics and Cohesive Zone Model analyses, as well as near-fault stress-slip evolution data. The research was conducted by Federica Paglialunga at École Polytechnique Fédérale de Lausanne.

Tabular, Earthquake Physics, Experimental Geophysics, Fracture Mechanics, Frictional Rupture, +1

Four Studies on Underestimating the Pleasure of Rediscovering Past Experiences

Four studies by Ting Zhang of Harvard University investigate how people underestimate the future interest and pleasure of rediscovering past experiences. The research uses a time-capsule paradigm to show that individuals underestimate their curiosity about rediscovered moments, particularly ordinary events. This underestimation leads to time-inconsistent choices: people forgo documenting the present but later prefer rediscovery over other activities.

Tabular, Behavioral Research, Memory Studies, Time Capsule, Psychology, Human Behavior, +1

Neurochemical Predictors of Cocaine Craving Vulnerability

Research data from Stanford University examining neurochemical and behavioral factors predicting vulnerability to cocaine craving and relapse. The dataset likely contains measures of dopaminergic function, including cerebrospinal fluid concentrations of HVA and DASO4, and motor activity recordings. The research was authored by Roy King of Stanford University.

Tabular, Neurochemistry, Cocaine Addiction, Behavioral Neuroscience, Dopamine, Clinical Research, +1

Chemically Induced Models of Pancreatitis: Cerulein-Induced Animal Models

Cerulein-induced pancreatitis, first described in 1977, is a widely used animal model for generating acute and chronic pancreatitis in mice and rats. The dataset, authored by Ke-You Zhang from Stanford University, describes the mechanisms of this model, including its action on CCK1 and CCK2 receptor subtypes. It is sourced from the paperswithcode platform under an Open Access (gold) license.

Text, Biomedical Data, Cerulein, Medical Research, Pancreatitis Models, Animal Models, +1

Neurotoxicology Glossary with 800 Terms and Chemical Annexes

A glossary of about 800 primary terms related to neurotoxicology, basic and clinical neurology, and the effects of substances on the nervous system. It was authored by Douglas M. Templeton of the University of Toronto and includes annexes of common abbreviations and examples of chemicals with known neurotoxic effects. The primary objective is to provide clear definitions for non-specialists, such as chemists, who need to understand the neurotoxicology literature.

Text, Healthcare, Medical Terminology, Glossary, Neurotoxicology, Neuroscience, +1

Automation Level Effects on Situation Awareness and Functional Specificity

A. Gordon Smith's thesis from the University of Toronto investigates the relationships between performance, workload, and situation awareness at varying levels of automation. The work provides empirical evidence for the 'routine-failure trade-off' in automation and examines how task-specific functional structures affect selective reliance. The thesis notes that the findings require further validation.

Tabular, Automation Reliance, Performance Workload, Situation Awareness, Empirical Study, Human Factors, +1

Twitter User and Topic Data for Advertising Campaign Analysis

Milad Eftekhar from the University of Toronto introduces a dataset from Twitter to evaluate algorithms for online advertising on micro-blogging platforms. The research focuses on identifying analogous topics with similar audiences and categorizing expert users for precise targeting. The dataset likely contains user and topic interaction data to support these advertising techniques.

Text, Social Media, Micro Blogging, Online Advertising, Topic Modeling, Audience Targeting, +1

Glossary of Neurotoxicology Terms with 800 Definitions and Chemical Annexes

A glossary of about 800 terms related to neurotoxicology, neurology, and the effects of chemicals on the nervous system. It was authored by Douglas M. Templeton of the University of Toronto to provide clear definitions for non-specialists. The glossary includes primary alphabetical entries and annexes with common abbreviations and examples of chemicals with known neurotoxic effects.

Text, Healthcare, Toxicology, Medical Terminology, Glossary, Neurotoxicology, +1

Athlete Conversations on Attribution in Sport Psychology

A qualitative research dataset from the University of Toronto analyzes conversations with athletes to study attribution processes. The work critiques traditional questionnaire-based approaches and examines attributions as 'talk-in-action' across three conversational areas: questions about defeats, modest talk about victories, and the fleeting nature of attributions in dialogue. The dataset is associated with an open-access paper by Guy Faulkner.

Text, Attribution Theory, Conversation Analysis, Discourse Analysis, Sport Psychology, Qualitative Research, +1

Neurotoxicology Glossary with 800 Terms and Chemical Annexes

A glossary of about 800 primary terms related to neurotoxicology, basic and clinical neurology. It was authored by Douglas M. Templeton of the University of Toronto to provide clear definitions for non-specialists. The glossary includes annexes of common abbreviations and examples of chemicals with known effects on the nervous system.

Text, Healthcare, Medical Terminology, Glossary, Neurotoxicology, Neuroscience, +1

Glossary of Neurotoxicology Terms with 800 Definitions and Chemical Annexes

Douglas M. Templeton from the University of Toronto authored a glossary of about 800 terms for neurotoxicology. The document provides clear definitions for non-specialists, especially chemists, and includes annexes of abbreviations and chemicals with known nervous system effects. Its primary objective is to facilitate the understanding of neurotoxicology literature for occupational and environmental risk assessment.

Text, Healthcare, Toxicology, Medical Terminology, Glossary, Neurotoxicology, Neuroscience, +1

KEAP1 TEP: Protein Structures and Selectivity Panel for Drug Discovery

University of Oxford researchers present a Target Enabling Package for the KEAP1 protein, a key regulator of cellular stress response. The package includes the first crystal structure of a KEAP1-CUL3 complex and a structure of the apo-Kelch domain suitable for small molecule soaking. It also provides a selectivity assay panel of 17 human Kelch domain-containing proteins, showing high selectivity for KEAP1 inhibitors.

Tabular, Keap1 Nrf2, Protein Structure, Drug Discovery, Biochemistry, Crystallography, +1

Trust and Confidence in Health Data Sharing with Commercial Companies

Mackenzie Graham from the University of Oxford presents a conceptual paper arguing for a shift from trust to confidence in systems for sharing health data with commercial companies. The text discusses the philosophical distinctions between trust, reliability, and confidence within the context of AI and healthcare partnerships. It outlines a proposed framework for a confidence-worthy data-sharing system.

Text, Ai Ethics, Trust Confidence, Data Sharing, Healthcare, Health Data, Synthetic, +1

Stimson Formation: Aeolian Bedform Record from Mastcam Images in Gale Crater, Mars

Steven G. Banham from Imperial College London authored a study analyzing Mastcam image data of the Stimson sandstone formation on Mars. The research decodes the sedimentary architecture to reconstruct ancient aeolian bedforms, such as draas and superimposed dunes, within a Hesperian desert landscape in Gale crater. The work contrasts this arid environment with earlier humid episodes inferred from lacustrine sediments.

Image, Geospatial, Computer Vision, Sedimentary Analysis, Aeolian Processes, Mars, Planetary Geology, +1

Pre-built Sector-coupled Euro-Calliope Model: Subnational Energy System Data

Input data from 2010 to 2018 underpin a pre-packaged, subnational-scale energy system model for Europe built using the Sector-Coupled Euro-Calliope workflow. The model, created by Bryn Pickering at ETH Zurich, includes sectors such as industry, transport, and heat, and is ready to be loaded into the Calliope framework. It supports various temporal resolutions and technology configurations for scenario analysis.

Tabular, 🇪🇺 Europe, Energy Systems, Energy Modeling, Sector Coupling, Calliope, +1

NESP TWQ Project 2.1.2 - Scoping options for low-lying, marginal cane land to reduce DIN i

A 2016-2017 project exploring options to reduce nitrogen losses from marginal sugarcane land in priority wet tropics catchments. The project, associated with the Wet Tropics Water Quality Improvement Plan, involved mapping land, identifying alternative uses, and quantifying economic costs and benefits. It was contributed by the Australian Ocean Data Network and last updated on 2026-07-14.

Geospatial, Great Barrier Reef, Nitrogen, Water Quality, Agriculture, Land Use, Finance, +1

SBMP BAZ: Strategic Bushfire Abatement Zone for Canberra

A Bushfire Abatement Zone (BAZ) geospatial dataset developed under the Strategic Bushfire Management Plan. The BAZ surrounds Canberra and extends west towards the Murrumbidgee River, identifying rural areas where specific measures are required to reduce bushfire risk to life and property. The dataset is provided by the ACT Government Geospatial Data Catalogue (ACTmapi) and was last updated on 2026-06-27.

Geospatial, ZIP, CSV, Sbmp, Gdc0b1360df, Risk assessment, Land Use Planning, Bushfire, Emergency Management, Abatement, Bushfire Management, Baz, +1

National Base Map WMTS: Topographic Mapping for Australia and Territories

A seamless topographic colour map service covers all of Australia, its outer islands, external territories, and the Australian Antarctic Territory. The service integrates data from Geoscience Australia, the Australian Antarctic Division, and OpenStreetMap, with specific datasets for Christmas Island and Cocos (Keeling) Islands. Vegetation data for the Australian continent is sourced from the Australian Collaborative Land Use and Management Program.

Geospatial, 🇦🇺 Australia, XML, Computer Vision, Land Use, Topographic maps, +1
