News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation
11,019 datasets
Text2CAD-Bench is a benchmark for evaluating text-to-CAD generation, comprising 600 human-curated examples organized into four levels. It was created by AICAD and released in February 2026. The benchmark is designed to assess performance across geometric complexity and application diversity.
Twenty-seven peer-reviewed articles were selected for an integrative review on purpose in life (PiL) and aging. The review, conducted by Cristina Cristóvão Ribeiro, analyzed data from large prospective longitudinal, cross-sectional, and experimental studies. It found robust associations between high PiL scores and lower risks of death, Alzheimer's disease, and other health conditions.
A dataset titled 'Reddit' hosted on HuggingFace, uploaded by PRISON-TARTURUS. The dataset was last updated on April 12, 2026. Its specific content, size, and structure cannot be determined from the provided metadata.
A dataset of customer reviews, likely containing textual feedback and associated metadata. It is published on the Kaggle platform. The specific source, collection method, and volume of data are not detailed in the available metadata.
A dataset on Kaggle containing user profile information reportedly from a 2021 Facebook data leak. The raw description indicates it likely contains first names, last names, and gender information. The dataset's origin, scale, and completeness are not detailed in the provided metadata.
Arnold Shankman's book analyzes Jewish immigrant entrepreneurship in New York and London between 1880 and 1914. It likely contains historical statistics, synagogue marriage records, and analyses of cultural assimilation and economic behavior. The work is structured into sections on enterprise and culture, Jewish mass migration, and entrepreneurship in Britain.
A review of the book 'Railroaded: The Transcontinentals and the Making of Modern America' by historian Richard White. The book analyzes the political and economic impact of transcontinental railroads in the Gilded Age. The dataset likely contains the text of this review.
Kaggle hosts a dataset titled 'review-chekpoints--2026-05-11--13250-13250'. The title suggests it likely contains evaluation data or metrics related to model checkpoints. The dataset's author, organization, and specific contents are unknown.
A dataset of customer reviews, likely containing textual feedback and associated ratings. It was published on Kaggle, but the specific source, size, and creation date are unknown. The columns suggest it is intended for analysis of customer opinions and satisfaction.
The Urdu fake and true news dataset is a collection of Urdu-language news articles labeled for authenticity, hosted on Kaggle. It likely contains text entries categorized as either fake or true news. The dataset's specific size, authorship, and update date are unknown.
A dataset of news articles in Turkish, sourced from Kaggle. The dataset's title suggests it contains text data for natural language processing. Specific details on volume, source, and time period are unavailable from the provided metadata.
Oregon's 2025 Consolidated Annual Performance Report (CAPER) evaluates the state's use of HUD formula and CARES Act funds for affordable housing and community development. The report covers activities from January 1, 2025, through December 31, 2025, and is the final report for the 2021-2025 Consolidated Plan period. It was published by the State of Oregon and last updated on March 8, 2026.
Over 1,300 listings in Maryland, including some 200 historic districts, are recognized for their historical, architectural, or cultural significance. The U.S. National Park Service program, administered in the state by the Maryland Historical Trust, includes properties ranging from prehistoric sites to recent buildings. Data is available in multiple formats, including XML, JSON, RDF, and CSV.
An interactive Power BI dashboard provides analysis of Netflix's movie and TV show catalog. The dataset likely contains titles, genres, release years, and other metadata for content available on the streaming platform. The original source and specific data volume are not detailed in the provided description.
The 'Shoppee reviews' dataset is published on Kaggle. It likely contains user-generated text reviews from the Southeast Asian e-commerce platform Shopee. Specific details on volume, time range, and collection method are unavailable from the provided metadata.
A dataset concerning press freedom, likely containing metrics or rankings for different countries or regions. It was published on Kaggle in 2019, as indicated by the title's 'MM 2019' prefix. The specific source organization and data collection methodology are not provided in the available metadata.
A 2020 survey dataset from DataIQ, likely capturing organizational data assets and cultural metrics. It was published on Kaggle, but the specific sample size, author, and detailed methodology are unknown. The dataset appears to focus on business intelligence and data management practices.
Shoppe-review-aspect is a dataset hosted on Kaggle. The title suggests it contains customer reviews from an e-commerce platform, likely Shopee, with a focus on aspect-based analysis. The dataset's author, organization, and specific details such as size and license are unknown.
fb_reddit_ig is a dataset hosted on Kaggle. Its title suggests it contains data related to Facebook, Reddit, and Instagram. The dataset's specific content, size, and origin are not detailed in the available metadata.
Kaggle hosts a dataset titled 'Fake & True News', contributed by Yousra Chtouki. The dataset likely contains text articles or headlines labeled for veracity. The specific volume, source, and time period of the news articles cannot be determined from the provided metadata.