News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation
10,980 datasets
SLM 100M training checkpoints and data are hosted on Kaggle. The raw description suggests the dataset relates to training a language model on news and historical text. The specific content, size, and structure require verification after download.
Only_movie_datasets is a dataset published on Kaggle. Its specific content, size, and origin are not detailed in the available metadata. The title suggests it likely contains information related to films.
Assessment records for cultural and creative products with a focus on red heritage design. The dataset is published on Kaggle, but the author, organization, and specific collection details are unknown. The raw description indicates it contains records related to heritage design assessment.
Trending movies over the years is a dataset hosted on Kaggle. The dataset likely contains information about movies and their popularity metrics across different years. Metadata is minimal; actual content requires verification after download.
An unbalanced panel covering 137 countries from 2010 to 2022. It combines data on welfare inequality, institutional quality, cultural characteristics, and macroeconomic controls from several international sources. The dataset was authored by James Adolphus and is hosted by Harvard Dataverse.
The North Pacific (NP) Index quantifies area-weighted sea level pressure over the region 30N-65N, 160E-140W. The dataset includes monthly, winter (November-March), and winter anomaly indices for analyzing decadal climate patterns. It is provided by the SCIOPS organization via the NCAR Climate Data Guide.
A dataset from figshare containing differentially expressed genes identified in a comparison between clone 2 and clone 3. The data, authored by Chenxuan Zang and last updated in March 2026, is provided as a 609 KB Excel file under a CC-BY-4.0 license. Its description suggests a focus on tumor heterogeneity and the tumor microenvironment.
Differentially expressed genes identified from a comparison of two tumor cell clones, clone 1 versus clone 3. The dataset was authored by Chenxuan Zang and is available as a 688.8 KB XLSX file under a CC-BY-4.0 license, last updated on March 17, 2026.
Differentially expressed genes identified from a comparison of two tumor cell clones. The dataset was authored by Chenxuan Zang and last updated on March 17, 2026. It is available as a 481.9 KB XLSX file under a CC-BY-4.0 license.
Freedom of Information Act (FOIA) requests received by the Chicago Department of Cultural Affairs & Special Events (DCASE) as of January 2023. The dataset is published by the City of Chicago on the Data.gov platform. The data was last updated on March 22, 2026.
Seattle's Department of Construction and Inspections provides data on the procedural steps taken during building plan reviews for construction permits. The dataset was last updated on March 22, 2026. It is available in multiple formats including CSV, JSON, RDF, and XML.
Seattle's Department of Construction and Inspections (SDCI) provides records of all code compliance complaints and violations currently under review and processing. The dataset is published by the City of Seattle on the Data.gov platform and was last updated on March 22, 2026. It is available in multiple formats including CSV, JSON, RDF, and XML.
Every instance of the f-word and every on-screen death in seven Quentin Tarantino films, each recorded with its timestamp. The dataset likely contains a structured list of events for each film. The author, organization, and last update date are unknown.
Film dataset is a collection of data related to movies, published on Kaggle. The dataset's specific contents, size, and creation details are not described in the provided metadata. Its potential applications relate to the film and entertainment domain.
Permit application actions from Arlington County, Virginia, specifically for submissions that include plan review. The dataset was last updated on March 22, 2026, and is provided by the county government. It likely contains records of administrative decisions and status changes for construction and development permits.
Permit Application Comments is a dataset from Arlington County, Virginia, containing comments from plan reviewers on submitted permit applications that require plan review. The data is provided in JSON and ZIP formats and was last updated on March 22, 2026.
Expression fold changes of primitive streak, nascent mesoderm, and endoderm genes under different morphogen treatment combinations, relative to a basal medium. The dataset was authored by Chenyang Ma and last updated on March 17, 2026. It is available as a 28.3 KB XLSX file under a CC-BY-4.0 license.
Product_review is a dataset hosted on Kaggle. Its specific content, size, and origin are not detailed in the available metadata. The dataset likely contains user-generated feedback on various products.
Aggregate counts per million for 5' UTR barcodes corresponding to cis-regulatory sequences in an MPRA library, demultiplexed by sample identity. The dataset was authored by Kyle Leix and last updated on March 17, 2026. It is a 41.6 KB XLSX file shared under a CC-BY-4.0 license.
A logistic regression analysis of factors associated with depressive symptoms, likely among patients with Hepatitis C Virus (HCV). The dataset was authored by Sarah E. Kelly and last updated on March 17, 2026. It is available as a 5.5 KB XLS file under a CC-BY-4.0 license.