Synthetic E-Recruitment Candidate Profiles with Controlled Bias Injection
by Carvalho · Updated 1mo ago
1.1 MB · 1 file
Available on 1 platform
Description
A synthetic dataset simulating candidate profiles for technology jobs in a Brazilian context, combining technical and demographic features. It includes nine partitions across three sizes (1k, 5k, 10k instances) and three bias conditions (debiased, biased, extreme bias). The dataset was created by Carvalho and last updated on May 11, 2026.
Use Cases
Auditing algorithmic fairness under controlled demographic bias injection.
Benchmarking how bias mitigation techniques scale across partitions of 1k, 5k, and 10k instances.
Evaluating explainability (XAI) methods against the relationship between technical features and suitability scores.
Testing multi-objective re-ranking algorithms using candidate suitability scores and demographic attributes.
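As a rough illustration of the fairness-auditing use case, the sketch below compares the mean suitability score across the groups of one demographic attribute. Only the name suitability_score comes from this listing; the column name gender, the group labels, and the toy rows are assumptions made for the example, since the actual schema is undocumented.

```python
from collections import defaultdict
from statistics import mean


def group_mean_gap(rows, group_key, score_key="suitability_score"):
    """Return (per-group mean score, max minus min gap) for one attribute."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[group_key]].append(float(row[score_key]))
    means = {group: mean(scores) for group, scores in buckets.items()}
    return means, max(means.values()) - min(means.values())


# Toy rows invented for illustration; they are NOT taken from the dataset.
# "gender" and the labels "A"/"B" are hypothetical names.
toy_rows = [
    {"gender": "A", "suitability_score": 0.80},
    {"gender": "A", "suitability_score": 0.60},
    {"gender": "B", "suitability_score": 0.50},
    {"gender": "B", "suitability_score": 0.30},
]

means, gap = group_mean_gap(toy_rows, "gender")
print(means, round(gap, 3))
```

Running the same function on the debiased, biased, and extreme-bias partitions of one size would be one simple way to check that the gap grows with the injected bias level.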
Strengths
Provides explicit parametric control over bias injection across gender, race, and location dimensions.
Offers nine dataset partitions combining three sizes (1k, 5k, 10k) and three bias conditions.
Includes a continuous target variable (suitability_score) derived deterministically from technical features in fair scenarios.
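The strengths above can be pictured with a minimal sketch, assuming a simple additive penalty model. This is not the author's generation pipeline: the scoring formula, the weights, the penalty values, the feature names skill_match and experience_years, the group label "B", and the partition labels are all hypothetical. Only suitability_score, the three sizes, the three bias conditions, and the location category Interior come from this listing.

```python
from itertools import product

SIZES = ("1k", "5k", "10k")
CONDITIONS = ("debiased", "biased", "extreme_bias")

# Three sizes times three bias conditions gives nine partitions.
# The label format is hypothetical, not the dataset's file naming.
partitions = [f"{size}_{cond}" for size, cond in product(SIZES, CONDITIONS)]


def fair_score(skill_match, experience_years, max_years=10):
    """Deterministic score from technical features only (hypothetical formula)."""
    experience = min(experience_years, max_years) / max_years
    return 0.7 * skill_match + 0.3 * experience


def inject_bias(score, candidate, penalties):
    """Subtract a fixed penalty for each (attribute, value) pair the candidate matches."""
    total = sum(
        penalty
        for (attribute, value), penalty in penalties.items()
        if candidate.get(attribute) == value
    )
    return max(0.0, min(1.0, score - total))


# Hypothetical candidate and penalty table, for illustration only.
candidate = {"skill_match": 0.8, "experience_years": 5,
             "gender": "B", "location": "Interior"}
penalties = {("gender", "B"): 0.10, ("location", "Interior"): 0.05}

base = fair_score(candidate["skill_match"], candidate["experience_years"])
biased = inject_bias(base, candidate, penalties)
unbiased = inject_bias(base, candidate, {})
```

With an empty penalty table the score depends on the technical features alone, which mirrors the fair scenario described above; larger penalties would correspond to stronger bias conditions.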
Limitations
Exact row counts are not reported; only the nominal partition sizes (1k, 5k, 10k) are given, which may limit suitability assessment.
Column-level documentation is absent; field semantics must be inferred after download.
Data is synthetic and may not reflect real-world distributions.
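Because column-level documentation is absent, a first step after download is usually to infer each field's type from a sample. The sketch below does this with the standard library only. It assumes the file is a CSV with a header row, which this listing does not confirm, and the toy content (including the column name location) is invented for illustration.

```python
import csv
import io


def _is_number(text):
    """Return True if the text parses as a float."""
    try:
        float(text)
        return True
    except (TypeError, ValueError):
        return False


def summarize_columns(handle, sample_size=1000, max_examples=10):
    """Guess numeric vs. categorical per column from the first rows of a CSV."""
    reader = csv.DictReader(handle)
    names = reader.fieldnames or []
    numeric = {name: True for name in names}
    examples = {name: set() for name in names}
    for index, row in enumerate(reader):
        if index >= sample_size:
            break
        for name in names:
            value = row.get(name) or ""
            if not _is_number(value):
                numeric[name] = False
            if len(examples[name]) < max_examples:
                examples[name].add(value)
    return {
        name: {
            "kind": "numeric" if numeric[name] else "categorical",
            "examples": sorted(examples[name]),
        }
        for name in names
    }


# Toy CSV invented for illustration; the real file layout is an assumption.
toy_csv = io.StringIO(
    "suitability_score,location\n0.71,Capital\n0.56,Interior\n"
)
summary = summarize_columns(toy_csv)
print(summary)
```

For the real file, the same function can be called on an open file handle, for example one created with open(path, newline=""), where path is wherever the download was saved.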
Provenance
Source
figshare
Collection Method
Controlled statistical generation pipeline and a FairGAN-based neural generator.
Freshness
Last updated 2026-05-11 06:18:10 (time zone not stated); verify freshness before use.
Geography
Brazil (location is encoded as three categories, Capital, Metropolitan Area, and Interior, which serve as a geographic proxy).