News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation
11,012 datasets
DigiData consists of mobile interaction trajectories released by Meta Research in 2025 to train mobile control agents. The data follows a specialized collection protocol designed to systematically cover application features, though the total record count is not specified in the primary metadata.
A dataset named 'beer_reviews' sourced from the OpenML platform. No information is available regarding its contents, size, or structure.
Top Rated TMDB Movies Dataset was collected using the TMDB API and cleaned with Python. The dataset's author, organization, and specific size are not provided. Its last update date is unknown.
A dataset of 1,882 Indian film entries, likely focused on Malayalam-language cinema. It covers 2015 to 2026, which suggests it includes both past releases and projected future ones. It is published on Kaggle, but the author, organization, and data collection method are unknown.
A dataset titled 'Expression' published on Kaggle. The specific content, size, and collection details are unknown from the provided metadata. Its nature must be inferred from the title, which suggests it may relate to forms of expression, such as emotional, artistic, or social communication.
A dataset of news headlines hosted on Kaggle. Its size, source, and time period are not detailed in the provided metadata. It likely consists of headline text suitable for text analysis.
A historical and political analysis by H. Bruce Franklin of the POW/MIA issue in the United States following the Vietnam War. The work investigates the fabrication and evolution of the belief in live prisoners of war into a powerful cultural myth. It was authored by a scholar from the University of California, Irvine, and is sourced from the paperswithcode platform.
Land without Ghosts presents English translations of Chinese writings on America spanning 150 years. The collection includes extracts from 19th-century travel diaries and first-hand accounts from the 1930s. It was compiled by Edward J. M. Rhoads of The University of Texas at Austin.
A review of published information regarding the psychometric properties and utility of the English version Geriatric Depression Scale (GDS) over the past decade. The review, authored by Paul G. Stiles, assesses reliability and validity studies comparing the GDS to clinical diagnoses and other depression scales. It concludes the GDS is a valid screening tool for depression in the elderly and offers recommendations for its use.
Open-AgentRL GRPO 2K is a compact dataset containing approximately 2,000 samples for GRPO training. It was created by y-ohtani and last updated on February 28, 2026. The dataset is constructed by balanced sampling from five sources: DeepScaleR-Preview (374 math items), NuminaMath-1.5 (359 math items), Omni-MATH (366 math items), GPQA Diamond (198 science items), and LeetCodeDataset (351 code items). The five listed counts sum to 1,648 items.
A dataset titled 'Application Review' sourced from the Kaggle platform. The dataset's specific content, size, and origin are not detailed in the provided metadata. Its nature suggests it likely contains records related to the evaluation of applications, such as for jobs, loans, or programs.
Merged_bangla_fake_news_dataset.csv is a Kaggle-hosted collection likely containing text data for identifying misinformation in the Bengali language. The dataset's specific size, authorship, and creation date are unknown from the provided metadata. Its title suggests it is an aggregation from multiple sources, potentially useful for natural language processing tasks.
Bottom sediment and water samples were collected near Scott Base, Antarctica, to assess the effects of human effluents on the marine environment. The dataset includes samples from a grid pattern drilled through sea ice at Pram Point, with control sites at Cape Armitage and Turtle Rock. It was created by the organization SCIOPS and last updated in December 1994.
A dataset of Vietnamese news articles, published on Kaggle. The title suggests it contains historical or older news content. The author, organization, and specific collection details are unknown.
An audio dataset published on Kaggle. The title suggests it contains podcast recordings, but specific details like the number of files, recording length, and topics are unknown. The dataset's author, organization, and collection methodology are not provided in the available metadata.
A dataset of Reddit posts and comments related to mental health topics, sourced from the Kaggle platform. The specific volume, time range, and collection methodology are not detailed in the available metadata. The content likely consists of user-generated text discussing various mental health conditions and experiences.
A collection of customer reviews for mobile phones from the Chinese e-commerce platform JD.com. The dataset is hosted on Kaggle, but its size, time range, and specific content are not detailed in the available metadata. The author, organization, and license information are also unknown.
News_Category_Dataset contains news articles published by HuffPost between September 12 and 23, 2022. The articles cover categories such as U.S. News, World News, Politics, and Entertainment. The dataset's author, organization, and exact size are unknown.