An agricultural dataset containing meteorological and vegetation variables for population observations of Cimbex quadrimaculata in Türkiye. The data were used to compare the performance of binary classification, multiclass classification, and regression models, with results published by Yunus Güral in 2026. The dataset is a single XLS file, 5.5 KB in size.
Use Cases
- Comparing regression models using the R², RMSE, and MAE metrics cited in the dataset description
- Evaluating binary and multiclass classifiers using the accuracy, F1 score, and AUC metrics cited in the description
- Applying SHAP analysis for model interpretability, focusing on the influence of temperature- and humidity-related predictors
- Benchmarking ensemble methods like Gradient Boosting and XGBoost on complex ecological data structures
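The metrics named in the use cases above can be computed with a few lines of plain Python. The following is a minimal sketch, not the method used in the published study: the function names `regression_metrics` and `binary_metrics`, the 0.5 decision threshold, and the toy inputs are all assumptions made for illustration, and none of the numbers come from the dataset.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R², RMSE, and MAE for two equal-length numeric sequences."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "r2": 1.0 - ss_res / ss_tot,  # undefined if y_true is constant
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }


def binary_metrics(y_true, y_score, threshold=0.5):
    """Return accuracy, F1, and AUC for 0/1 labels and predicted scores.

    The threshold is an illustrative assumption; AUC does not depend on it.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0

    # AUC as the fraction of (positive, negative) pairs in which the
    # positive example receives the higher score; ties count as half.
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    auc = wins / (len(pos) * len(neg))
    return {"accuracy": accuracy, "f1": f1, "auc": auc}


if __name__ == "__main__":
    # Toy values for illustration only; not taken from the dataset.
    print(regression_metrics([3, 5, 7, 9], [2, 5, 8, 9]))
    print(binary_metrics([0, 0, 1, 1], [0.1, 0.6, 0.4, 0.9]))
```

For multiclass targets, accuracy carries over unchanged, while F1 and AUC need an averaging scheme (for example macro averaging over one-vs-rest splits), which this sketch does not cover.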
Strengths
- Dataset supports three distinct modeling approaches (binary classification, multiclass classification, regression) on the same data
- Model performance results are provided, including Gradient Boosting achieving 94.3% accuracy and 0.983 AUC
- Includes SHAP analysis results identifying temperature- and humidity-related variables as influential predictors
Limitations
- Row count is not reported, which makes suitability hard to assess before download
- Column-level documentation is absent; field semantics must be inferred after download
- Dataset is very small at 5.5 KB, indicating limited scope
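Because the row count and column meanings are undocumented, a quick inspection after download is a sensible first step. The sketch below assumes the XLS sheet has first been exported to CSV (for example from a spreadsheet application) so that only the standard library is needed. The function name `summarize_table` and the column names in the sample text are hypothetical; the real field names are not known from this listing.

```python
import csv
import io


def summarize_table(csv_text):
    """Report row count, column names, and empty cells per column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = list(reader.fieldnames or [])
    rows = list(reader)
    # A cell counts as missing if it is absent, empty, or whitespace only.
    missing = {
        col: sum(1 for row in rows if not (row.get(col) or "").strip())
        for col in columns
    }
    return {"rows": len(rows), "columns": columns, "missing": missing}


if __name__ == "__main__":
    # Hypothetical sample; the real column names must be read from the file.
    sample = "temp,humidity,count\n21.5,40,3\n,55,0\n"
    print(summarize_table(sample))
```

Running the same function on the exported file's contents gives the row count and a first view of which fields are sparsely populated.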
Provenance
- Source: figshare
- Collection Method: Population observations of Cimbex quadrimaculata
- Time Range: 2020-2022
- Freshness: Last updated 2026-04-03 17:49:02; freshness should be verified
- Geography: Diyarbakır (Eğil) and Elazığ (Keban) provinces in Türkiye