BeTraC 2026 is a synthetic dataset of 7,600 doctor-patient dialog sessions, each with corresponding audio, a transcript, and a SOAP (Subjective, Objective, Assessment, Plan) note summary. The dataset was created by BeTraC and released in April 2026. It is divided into training and development splits; the training set contains 7,200 dialogs, which leaves 400 for the development set if the two splits together cover the full dataset.
Data is packaged in the WebDataset format (tar archives), so reading it requires either a WebDataset-compatible library or standard tar tooling. The license is unspecified and should be verified before use.
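Because WebDataset shards are ordinary tar archives, they can be inspected with Python's standard `tarfile` module alone. The sketch below follows the usual WebDataset convention that files sharing a basename up to the first dot belong to one sample. The file names and extensions used here (`session_0001.wav`, `.transcript.txt`, `.soap.txt`) are hypothetical placeholders, not the dataset's documented layout, and the helper names `split_key`, `read_samples`, and `build_demo_shard` are introduced only for illustration.

```python
import io
import tarfile
from collections import defaultdict


def split_key(name):
    """Split a tar member name into (sample key, extension) at the first dot of the basename."""
    dirname, _, basename = name.rpartition("/")
    stem, _, ext = basename.partition(".")
    key = f"{dirname}/{stem}" if dirname else stem
    return key, ext


def read_samples(fileobj):
    """Group the files in a WebDataset-style tar into {key: {extension: bytes}}."""
    samples = defaultdict(dict)
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories and other non-file entries
            key, ext = split_key(member.name)
            samples[key][ext] = tar.extractfile(member).read()
    return dict(samples)


def build_demo_shard():
    """Build a tiny in-memory shard; the names and payloads are made up for this demo."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, payload in [
            ("session_0001.wav", b"RIFF"),
            ("session_0001.transcript.txt", b"Doctor: How are you feeling?"),
            ("session_0001.soap.txt", b"S: ..."),
        ]:
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


demo = read_samples(build_demo_shard())
print(sorted(demo["session_0001"]))  # extensions found for the one demo sample
```

To inspect a real shard, pass an open binary file instead of the demo buffer, for example `read_samples(open(path, "rb"))`, and check which extensions actually appear before assuming any particular layout. This approach loads every file into memory, so it suits inspection of a single shard rather than streaming the full dataset.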