Soybean Yield and Composition Predictions with Weather and Genotype Data, 30-Year Coverage
by Timilehin T. Ayanlade · Updated 2mo ago
405.2 KB · 1 file
Available on 1 platform
Description
A Transformer-based deep learning framework built on 30 years of multi-environment performance data from the Northern and Southern Uniform Soybean Tests (UST) across North America. The dataset integrates multivariate time-series weather data with genotypic information, maturity group, and geographic location to predict seed yield, oil, and protein content. The model achieved predictive accuracies (R²) of 77.6 ± 0.2%, 63.9 ± 4.7%, and 79.3 ± 2.3% for seed yield, oil, and protein, respectively.
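For readers comparing their own results against the R² figures quoted above, the following is a minimal sketch of how such a score can be computed and summarized. It assumes that R² means the coefficient of determination and that the "±" term is a sample standard deviation across repeated runs or folds. The description states neither, and the function names are illustrative only.

```python
from statistics import mean, stdev


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def summarize(scores):
    """Format a list of R² values (as fractions) as 'mean ± std' in percent.

    Assumption: the spread is the sample standard deviation across runs.
    """
    return f"{100 * mean(scores):.1f} ± {100 * stdev(scores):.1f}%"
```

Calling `summarize` on a list of per-run scores yields a string in the same format as the figures in the description, which makes side-by-side comparison easier.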
Use Cases
Predict in-season soybean seed yield based on integrated weather, genotype, and management factors.
Forecast seed oil and protein composition to guide breeding program selections.
Interpret model predictions to identify key weather predictors like solar radiation and temperature.
Analyze spatio-temporal relationships of variables affecting soybean performance across diverse environments.
Strengths
Integrates 30 years of multi-environment performance data from cooperative breeding programs.
Model demonstrates high predictive accuracy, with an R² of 77.6 ± 0.2% for seed yield.
Includes multiple data modalities: weather time-series, genotype, management factors, and geographic location.
Employs interpretability methods to assess feature importance for critical growing timepoints.
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row count is not stated, which makes suitability hard to assess before download.
The primary file is a 405.2 KB PDF, suggesting either that the dataset itself is small or that the data are embedded within a research paper.
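Because the file format is itself uncertain, a quick sanity check after download can confirm that the file really is a PDF and report its size before any extraction is attempted. This is a minimal sketch using only the Python standard library. The helper names are hypothetical, and the check looks only at the leading magic bytes; it says nothing about whether the PDF contains data tables.

```python
from pathlib import Path


def looks_like_pdf(path):
    """Return True if the file starts with the PDF magic bytes '%PDF-'."""
    with open(path, "rb") as fh:
        return fh.read(5) == b"%PDF-"


def size_kb(path):
    """File size in kilobytes (1 KB = 1024 bytes)."""
    return Path(path).stat().st_size / 1024
```

If `looks_like_pdf` returns True and the size is close to the listed 405.2 KB, the download matches the listing, and the next step is to inspect the file manually for embedded tables.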
Provenance
Source
Data from the Northern and Southern Uniform Soybean Tests (UST) across North America.
Collection Method
Multi-environment performance data collected over 30 years, integrated with weather, genotype, and management factors.
Time Range
30 years of data (specific start and end years not provided).
Freshness
Last updated 2026-03-18 07:45:22; freshness should be verified.
Geography
North America, covering environments in the Northern and Southern UST regions.
Data is provided as a PDF file (405.2 KB), which may require extraction or may primarily contain a research paper description rather than raw data tables.