General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
141,875 datasets
185 sporadic pulmonary stenosis patients and 100 healthy controls were analyzed using whole-exome sequencing. The dataset contains results from gene-level burden tests and three machine learning algorithms, leading to the prioritization of 17 candidate genes. The data was published by Yuting Liu on figshare and last updated in June 2026.
185 sporadic pulmonary stenosis patients and 100 healthy controls were analyzed using whole-exome sequencing. Three machine learning algorithms—LASSO, random forest, and XGBoost—were applied to prioritize 17 candidate genes associated with the condition. The dataset, created by Yuting Liu and last updated in June 2026, provides a basis for investigating the genetic architecture of this congenital heart disease.
Yuting Liu published a dataset on 2026-06-04 containing prioritized candidate genes associated with pulmonary stenosis. The data was generated from whole-exome sequencing of 185 sporadic PS patients and 100 healthy controls, analyzed using gene-level burden tests and three machine learning algorithms. The final list includes 17 candidate genes prioritized through protein-protein interaction network analysis.
185 sporadic pulmonary stenosis patients and 100 healthy controls were analyzed via whole-exome sequencing to identify rare pathogenic variants. Three machine learning algorithms—LASSO, random forest, and XGBoost—were applied to prioritize 17 candidate genes associated with the condition. The dataset, shared by Yuting Liu under a CC-BY-4.0 license, provides a basis for investigating the genetic architecture of this congenital heart disease.
A genomic analysis dataset from a study of 185 sporadic pulmonary stenosis patients and 100 healthy controls. The data was generated by Yuting Liu and last updated in June 2026. It contains prioritized candidate genes identified through whole-exome sequencing, gene-level burden tests, and three machine learning algorithms.
185 sporadic pulmonary stenosis patients and 100 healthy controls were analyzed using whole-exome sequencing. The dataset contains results from gene-level burden tests and three machine learning algorithms (LASSO, RF, XGBoost) that prioritized 17 candidate genes. The data was published by Yuting Liu on figshare under a CC-BY-4.0 license and last updated on 2026-06-04.
A list of 17 candidate genes associated with pulmonary stenosis, prioritized from whole-exome sequencing data of 185 patients and 100 controls. The dataset was created by Yuting Liu and published on figshare in June 2026. It results from applying gene-level burden tests and three machine learning algorithms to identify rare pathogenic single nucleotide variants.
185 sporadic pulmonary stenosis patients and 100 healthy controls were analyzed using whole-exome sequencing and multiple machine learning algorithms. The dataset contains results prioritizing 17 candidate genes associated with the condition, published by Yuting Liu in June 2026. It is shared under a CC-BY-4.0 license on figshare as a 663.5 KB Excel file.
England's Integrated Care Boards (ICBs) are listed with official names and codes. The dataset contains two versions: one reflecting the structure as of April 2023 and another as of April 2026. Each record includes a 9-character code (ICBxxCD), a 3-character code (ICBxxCDH), and a name field (ICBxxNM) up to 77 characters long.
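The field widths described for the ICB records can be checked with a small validator. This is a minimal sketch, not an official schema check: it assumes the `xx` in the field names is a two-digit year suffix (for example `23` for the April 2023 version), and the sample record is made up purely for illustration.

```python
def validate_icb_record(record, year_suffix="23"):
    """Return a list of problems found in one ICB record (empty if none).

    Assumption: field names follow the ICBxxCD / ICBxxCDH / ICBxxNM pattern,
    with xx replaced by a two-digit year suffix.
    """
    code = record.get(f"ICB{year_suffix}CD", "")
    short_code = record.get(f"ICB{year_suffix}CDH", "")
    name = record.get(f"ICB{year_suffix}NM", "")

    problems = []
    if len(code) != 9:
        problems.append("code must be exactly 9 characters")
    if len(short_code) != 3:
        problems.append("short code must be exactly 3 characters")
    if not 1 <= len(name) <= 77:
        problems.append("name must be 1 to 77 characters")
    return problems


# Hypothetical record; the values are placeholders, not real ICB codes.
sample = {
    "ICB23CD": "E00000000",
    "ICB23CDH": "XXX",
    "ICB23NM": "Example Integrated Care Board",
}
print(validate_icb_record(sample))
```

Passing `year_suffix="26"` would apply the same checks to the April 2026 version, again assuming the naming pattern holds.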
FERDI researchers constructed a novel fiscal effort index for 120 developing countries from 1990 to 2012, integrating UN-developed composite indices for economic vulnerability and human capital. Their analysis reveals that economic vulnerability negatively impacts tax revenues, while human capital improves them. This panel dataset provides a specific measure for evaluating tax performance while accounting for structural constraints.
Daily global surface reflectance data at 500-meter resolution, produced using a 16-day rolling window of VIIRS/NPP observations. The dataset applies the RossThick/Li-Sparse-Reciprocal BRDF model to correct view-angle effects, providing Nadir BRDF-Adjusted Reflectance for imagery bands I1, I2, and I3. It includes six science data layers for reflectance, albedo, and mandatory quality flags.
LBA-ECO CD-05 contains estimates of understory fuel loads (forest litter) at six locations near Paragominas in Northeastern Amazonia. Samples were collected from three forest conditions—primary, logged, and burned forest—with volumes and weights provided by size and condition class. Means and standard errors are reported from three transects per forest-condition class, based on the planar transect method.
Survey data from 33 evening postgraduate students and qualitative interviews from 19 participants at the Eastern Africa Statistical Training Centre (EASTC) in Tanzania. The study assessed perceived stress levels and academic performance, finding 75.8% of students experienced moderate stress. The dataset was created by Salum Nambwanga and last updated on 2026-05-17.
7,097 residents from the Shanghai colorectal cancer screening program were enrolled, with participants completing both a quantitative fecal immunochemical test (qFIT) and colonoscopy. The study, authored by Xiaocong Zhang and last updated in May 2026, analyzes the association between fecal hemoglobin concentration and the risk of advanced colorectal neoplasm. Logistic regression models were used to calculate odds ratios for risk across different f-Hb concentration thresholds.
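For a single binary predictor (for example, f-Hb above versus below one threshold), the unadjusted odds ratio from a logistic regression equals the cross-product ratio of the 2x2 table. The sketch below computes that ratio with a Wald 95% confidence interval. It is not the study's own analysis, which may adjust for covariates, and the counts are invented for illustration.

```python
import math


def odds_ratio_with_ci(exposed_cases, exposed_noncases,
                       unexposed_cases, unexposed_noncases, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table."""
    a, b = exposed_cases, exposed_noncases
    c, d = unexposed_cases, unexposed_noncases
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: above-threshold group vs. below-threshold group.
or_value, ci_low, ci_high = odds_ratio_with_ci(20, 80, 10, 90)
print(f"OR = {or_value:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

Repeating the calculation for each f-Hb cut-off gives one odds ratio per threshold, which is the shape of result the description refers to.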
Global daily satellite observations of formaldehyde (HCHO) total vertical column density, gridded at a 0.25x0.25 degree resolution. The OMHCHOG product bins pixel-level data from the OMI/Aura instrument without averaging, preserving raw data points for user-defined filtering. Scientists can use this Level-2G product to generate custom Level-3 global maps for atmospheric chemistry research.
0.25-degree gridded daily data provides global surface UVB irradiance and erythemal dose measurements from the Aura-OMI satellite. The Level-2G product bins, but does not average, swath pixel data into global grids, preserving all ancillary parameters like latitude, longitude, time, and solar angles for each observed scene. Scientists can apply custom filtering and averaging schemes to these binned 'candidate' pixels to create derived Level-3 products.
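Because Level-2G products keep every candidate pixel instead of averaging, deriving a Level-3 map comes down to filtering the pixels and then averaging within each grid cell. The sketch below illustrates that idea on plain dictionaries. The keys `lat`, `lon`, `value`, and `sza`, and the solar-zenith-angle cut-off, are assumptions made for this example and are not the products' actual field names or recommended screening criteria.

```python
import math
from collections import defaultdict


def grid_average(pixels, cell_deg=0.25, max_sza=70.0):
    """Average filtered candidate pixels into (row, col) grid cells.

    Rows count up from 90S and columns from 180W in steps of cell_deg.
    """
    sums = defaultdict(lambda: [0.0, 0])  # cell -> [running total, count]
    for p in pixels:
        if p["sza"] > max_sza:
            continue  # user-defined filter: drop high solar zenith angles
        row = math.floor((p["lat"] + 90.0) / cell_deg)
        col = math.floor((p["lon"] + 180.0) / cell_deg)
        acc = sums[(row, col)]
        acc[0] += p["value"]
        acc[1] += 1
    return {cell: total / count for cell, (total, count) in sums.items()}


# Hypothetical candidate pixels falling in the same 0.25-degree cell.
pixels = [
    {"lat": 0.1, "lon": 0.1, "value": 2.0, "sza": 30.0},
    {"lat": 0.2, "lon": 0.2, "value": 4.0, "sza": 40.0},
    {"lat": 0.2, "lon": 0.2, "value": 100.0, "sza": 80.0},  # filtered out
]
print(grid_average(pixels))
```

A weighted mean or a different quality filter can be swapped in at the marked line, which is the flexibility the binned format is meant to provide.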
Avalúos catastrales vigencia 2019 provides cadastral land valuations per hectare for 2019, expressed in the 2019 legal minimum monthly wage (SMMLV) for areas conditioned and included in the land market. The dataset, from www.datos.gov.co, is intended to guide, formulate, monitor, and evaluate national and departmental public policies. It was last updated on 2026-05-18 18:25:21.
January 2002 edition of the Magnetic Anomaly Grid of the Australian Region, representing the first integrated onshore/offshore grid for the complete Australian margin. The grid contains 3,022,656 data points, has a cell size of 0.01 degree (approximately 1 km), and was created by combining levelled and unlevelled marine sectors with an onshore grid. The data were processed by the Australian Ocean Data Network, with marine data levelled in collaboration with Intrepid Geophysics.
A 2026 costing dataset from a cross-sectional mixed-methods study in Ethiopia's Sidama, Amhara, Oromia, and South Ethiopia regions. It compares financial and programmatic inputs for digital and paper-based contraceptive counseling tools for adolescent girls and young women, drawing data from 275 health posts and 55 health facilities. The dataset was authored by Meghan Cutherell and is shared under a CC-BY-4.0 license.
Mixed-methods data from 275 health posts and 55 health facilities across four Ethiopian regions compares digital and paper-based contraceptive counseling tools. The dataset includes quantitative client exit interviews, routine aggregate HMIS statistics, and qualitative key informant interviews with health workers. Meghan Cutherell published this 1.5 MB dataset on figshare under a CC-BY-4.0 license in May 2026.