News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation
10,908 datasets
A preview dataset contains 50 sample entries from a larger release of 2 million records designed for PII masking tasks. It was created by AI4Privacy and was last updated in April 2026. The full dataset focuses on European personal work and HR information.
A preview of 50 sample entries from the PII-Masking-2M European release focused on Personal Location and Travel Information (PLI). The dataset is produced by AI4Privacy, with the preview published in April 2026. Full access to the complete 200,000-entry dataset requires contacting the authors.
European Personal Location and Travel Information (PLI) data with personally identifiable information redacted. This preview contains 50 sample entries from the larger PII-Masking-2M release by AI4Privacy. The dataset was last updated in April 2026.
50 sample entries from the PII-Masking-2M European release by AI4Privacy. The preview contains redacted source text and PII values, with the full dataset containing 200k entries. The dataset was last updated in April 2026.
50 sample entries from the PII-Masking-2M European release provide a preview of financial documents with personally identifiable information redacted. The dataset is created by AI4Privacy and was last updated in April 2026. The full dataset contains 200,000 entries.
A training dataset for an autonomous AI video editor and social media content strategist. It contains synthetic video editing trajectories and real viral TikTok content converted to an agent training format. The dataset was authored by ryu34 and last updated on Hugging Face in April 2026.
A preview of 50 sample entries from the PII-Masking-2M dataset for European languages, created by AI4Privacy. The dataset contains text with personally identifiable information redacted. The full dataset was referenced in 2026.
50 sample entries provide a preview of a larger 200,000-record dataset for PII masking tasks. The dataset is created by AI4Privacy, with a preview last updated in April 2026. It focuses on European personal digital information.
Geoscience Australia provides spatially continuous raster grids of seabed gravel, mud, and sand content for the Browse region within the Australian continental EEZ. The data, expressed as weight percentages from 0 to 100%, is presented at a 0.0025 decimal degree resolution. This 2014 dataset supersedes previous predictions with improved accuracy for basin-scale analysis.
Seabed sand content data for the Australian continental Exclusive Economic Zone is provided as a spatially continuous raster at 0.01 decimal degree resolution. The dataset, produced by Geoscience Australia, supersedes previous predictions with improved accuracy for national and regional scale analysis.
Spatially continuous predictions of seabed gravel, mud, and sand content for the Vlaming sub-basin in the Australian continental EEZ. The data is expressed as weight percentages on raster grids at 0.0025 decimal degree resolution. It supersedes previous predictions with improved accuracy, though artifacts exist in areas with insufficient samples.
Spatially continuous raster data of seabed mud content, expressed as a weight percentage from 0 to 100%. The dataset covers the Australian continental Exclusive Economic Zone at a 0.01 decimal degree resolution, superseding previous predictions with improved accuracy.
Seabed gravel content data provides a spatially continuous raster of sediment fraction weight percentages from 0 to 100% across the Australian continental Exclusive Economic Zone. The dataset, created by Geoscience Australia, supersedes previous predictions with improved accuracy and is intended for national and regional scale analysis.
Spatially continuous raster grids at 0.0025 decimal degree resolution detail the weight percentage of seabed gravel, mud, and sand in the Petrel sub-basin. The dataset supersedes previous predictions with improved accuracy, though artifacts occur in regions with insufficient sample density. It is intended for basin-scale analysis of sediment distribution.
Shortnews is a collection of five Ukrainian news articles from Ukrainska Pravda. The dataset was created by author a-l-o and was last updated on the platform on 2026-04-29. It was used as source material for a study measuring Ukrainian text entropy using Shannon's Guessing Game.
Supplementary material from a systematic review and meta-analysis comparing rituximab and cyclophosphamide with steroids for primary membranous nephropathy. The data file is 26.4 KB in size and was published on figshare by Fangjiao Huang under a CC-BY-4.0 license. The record was last updated on 2026-04-24.
A 13.8 KB Excel file contains detailed experimental results from the NeuroAbs research group, supporting a 2026 manuscript on LLM-guided hardware verification. It includes the data used to generate Figure 7 and Table 4 in the associated paper. The dataset was published on the figshare platform in April 2026.
Experimental results accompany the NeuroAbs research manuscript on accelerating hardware property checking. The Excel file contains detailed data visualized in Figure 7 and Table 4 of the paper. The NeuroAbs group created the dataset, which was last updated in April 2026.
Restricted-access data from a study on near-peer mentoring in a studio-based higher education context. The package contains de-identified questionnaire responses and mentor reflective reports, authored by DİLEK YASAR and last updated in March 2026. Materials are currently intended for editorial and peer review purposes only.
A staging package of skill-card corpora for the 'Thinking with Reasoning Skills' project, last updated on 2026-04-24. The data is associated with a paper accepted to ACL 2026's Industry Track and is hosted by author stallone0000. It includes compact skill-card corpora and manifest entries for larger local sources.