News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation
11,022 datasets
A dataset for classifying news articles in the Bangla language, sourced from Kaggle. The specific volume of articles, labeling scheme, and creation details are not provided in the available metadata. Further inspection after download is required to confirm the data's scope and structure.
Image_compression_LIF_T=1 is a dataset hosted on Kaggle. Its title suggests it contains data related to image compression, likely involving compressed images or associated metrics. The dataset's specific content, size, and creation details are not provided in the available metadata.
Twitter Wrmmm520 2026.03.20 2034970421603471732 Hk5Qx Waiczbldjh Part1 is a dataset uploaded by author daaxila to the Hugging Face platform. The dataset was last updated on 2026-04-03 12:02:19. Its specific content and structure are unknown, but the title suggests it likely contains data related to Twitter activity.
A dataset of Twitter content uploaded by user 'daaxila' to Hugging Face. The title suggests it contains data related to media, likely from September 2025. The dataset was last updated on the platform in April 2026.
Twitter Wheresmymedia 2025.09.19 1968885375524393449 Wdq9Eyn8Ricufjy9 Part3 is a dataset of social media content uploaded to Hugging Face by daaxila. The dataset was last updated on April 3, 2026. Its specific content and scale are unknown from the provided metadata.
This repository contains 2D embeddings generated from four dimensionality reduction (DR) algorithms applied to four distinct cultural and humanities collections. Created by Giacomo Alliata for the Journal of Cultural Analytics in 2026, the data serves as a replication set for evaluating visualization techniques at scale. The original high-dimensional source data is withheld due to copyright restrictions, leaving only the coordinate outputs for analysis.
A personal dataset for understanding artistic preferences through Instagram usage. The data was sourced from the Kaggle platform, but specific details about its creation, size, and authorship are not provided. The description suggests the data is used to analyze personal taste in art via social media interactions.
Kaggle hosts a dataset for predicting viral TikTok trends through search pattern data. The description suggests it contains data related to user searches, likely used to forecast content popularity. Author, organization, and specific data details are not provided.
Over 9,000 records of metadata and popularity scores for trending film professionals. The dataset is sourced from Kaggle and likely contains tabular data on individuals in the cinema industry. The author, organization, and last update date are unknown.
tmdb_top_rated_movies is a dataset from Kaggle. It likely contains a list of films with high user or critic ratings sourced from The Movie Database (TMDB). The specific number of movies, included features, and time period covered are unknown from the provided metadata.
Kaggle hosts an English news classification dataset. The dataset likely contains news articles or headlines labeled with categories. Its specific size, origin, and creation date are unknown from the provided metadata.
Bangla news classification dataset published on Kaggle. The dataset likely contains news articles in the Bengali language with associated category labels. Specific details regarding the number of articles, source, and collection period are unavailable from the provided metadata.
A British Geological Survey research project quantifies the effects of biofilm growth on the hydraulic properties of sediment and rock across microscopic (10^-10 m) to macroscopic (10^2 m) scales. The study also investigates how biofilms influence oxide precipitate formation and the sorption behavior of trace impurities such as uranium, technetium, and strontium. The data is intended as input for industrial field-scale models of fluid and contaminant migration.
Scanned images of onshore Great Britain site investigation reports held in the British Geological Survey archives. Scanning began in 2002 and is ongoing: the entire Edinburgh collection has been scanned, and new reports from Keyworth have been added since 2002. Images are stored in TIFF format and indexed via the site investigation and borehole databases.
A 2026 project report from the British Geological Survey details research on Chemical Looping Combustion with oxygen uncoupling. The work includes materials testing at Cambridge, a scaled reactor test at Cranfield, and reactor modeling at Imperial College London. It contains text on flowsheet development, reactor design, and novel material performance.
Multi-anvil experiments were performed in static and deformation geometries on olivine polymorphs and Mid-Ocean Ridge Basalt-composition garnetite. Faults were induced by uniaxial compression, with garnetite showing evidence of thermal runaway to melting, suggesting a mechanism for deep earthquakes in subducted slabs.
The British Geological Survey provides a text file containing input parameters for running ab initio molecular dynamics simulations of water at non-ambient conditions using the CP2K software. The data supports the computational methods described in a linked geochemistry paper on mineral-water reactions in Earth's mantle.
A systematic review dataset covering research on wearable biometric authentication from 2017 to 2025. The dataset likely contains structured summaries of academic literature, including methods, modalities, and performance metrics. It was sourced from the Kaggle platform.
Mobile Legends Play Store Reviews (2.5M Rows) is a dataset of 2.5 million unfiltered user reviews for the Mobile Legends: Bang Bang (MLBB) game. The raw description indicates the data is intended for NLP tasks, specifically tracking mentions of the 'Dark System' and toxicity. The dataset's author, organization, and last update date are unknown.
Customer reviews from Amazon and Google platforms, providing unified insights from e-commerce and location-based services. The dataset's author, organization, specific size, and last update date are unknown.