News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation
10,419 datasets
TPBench is a dataset for evaluating long-dialogue compression around lifecycle turning points. It was submitted as an artifact for the NeurIPS 2026 Evaluations and Datasets Track by the author '4papersubmission'. The dataset includes probe JSONL files, result aggregates, scorer/reader code, license disclosures, and Croissant metadata with Responsible AI fields.
Over 30 global and regional news syndicates provide the source material for this passive data warehouse. HEDA permanently stores the 1000+ word raw text extractions from these sources as part of Project VEDA. The dataset was created by ravikiranoffl and last updated on 2026-05-05.
Geoscience Australia produced a short video summarizing the value of its work to the management of Australia's marine jurisdictions. The video is part of a series of six films produced to communicate the agency's value to the nation. Further information about the agency's work in this area is available via a provided URL.
Hobsons Bay City Council in Australia provides a 2017 inventory of public playgrounds. The dataset lists each playground's park name, specific play area, suburb, and geographic coordinates (Easting and Northing). It was created and published by the Hobsons Bay City Council.
August 2017 records detail playgrounds managed by the Hobsons Bay City Council. The dataset includes park names, playground area designations, suburb locations, and Easting/Northing coordinates. It was created and published by the Hobsons Bay City Council.
August 2018 records of drinking fountains located within open spaces in Hobsons Bay City Council. The dataset includes Location, Suburb, Material, and X and Y coordinates. It was created and is maintained by Hobsons Bay City Council.
Hobsons Bay City Council provides a 2018 inventory of drinking fountains located within its open spaces. The dataset includes attributes such as Location, Suburb, Material, and X and Y coordinates. It was created and is maintained by the Hobsons Bay City Council.
Hobsons Bay City Council provides data on its internal political subdivisions and elected representatives. The dataset lists Ward Name, Ward Area in square kilometers, the number of Councillors per ward, and Councillors' names. This information was published by Hobsons Bay City Council and was last updated in April 2026.
Wards and Councillors in Hobsons Bay City Council. The dataset includes Ward Name, Ward Area in square kilometers, the number of Councillors per ward, and Councillor names. It is provided and maintained by the Hobsons Bay City Council, with a last recorded update in April 2026.
City of Ballarat provides polygon data detailing the construction years of properties within its jurisdiction up to 2005. The dataset is no longer maintained, although its record was last updated in April 2026. It includes multiple geospatial file formats.
Polygon data contains the construction years of properties within the City of Ballarat, Australia. The dataset covers properties built up to the year 2005 and was created by the City of Ballarat. Data collection was discontinued after 2005.
A custom word list for KH Coder analysis of translation-related discourse in China's People's Daily newspaper. The data supports a study of topics and trends over a 74-year period from 1949 to 2023. It was contributed by Shi, Xinyu and is hosted on the Harvard Dataverse platform.
August 2018 snapshot of seating infrastructure within public open spaces managed by Hobsons Bay City Council. The dataset includes location, suburb, seat type, material, and X and Y coordinates for each bench. It was created and published by the Hobsons Bay City Council.
Seats located within open spaces in Hobsons Bay City Council as of August 2018. The dataset includes attributes such as location, suburb, seat type, material, and X and Y coordinates. It was created and published by the Hobsons Bay City Council.
A literature review and spatial analysis of sedimentology and geomorphology for the Northwest Marine Region, as defined in 2007. Sedimentology data is based on consistent quantitative point assays of grainsize and carbonate content from the MARS database as of August 1, 2007. The dataset is provided by Geoscience Australia Data and was last updated on April 20, 2026.
City of Ballarat provides geospatial data on graffiti defects reported on municipal assets. Records span from July 2013 to May 2015. The dataset is published by the City of Ballarat council.
Graffiti defects recorded on assets managed by the City of Ballarat. The dataset covers incidents from July 2013 to May 2015. It was created and published by the City of Ballarat.
33,679 naturally occurring Reddit-based human experiences paired with self-disclosed emotion labels form the EXPRESS benchmark. Created by author bangzhao and detailed in a 2025 arXiv paper, this dataset uses emotions explicitly disclosed by the original authors as ground-truth labels. The dataset was last updated on Hugging Face in May 2026.
A multilingual news retrieval benchmark containing synthetic multihop queries across 10 languages. The dataset is sourced from recent news articles and is designed for evaluating dense retrieval and text embedding models on content post-dating typical model training cutoffs. It was created by jinaai and last updated on April 30, 2026.
Québec government press releases published on the open_canada platform. The dataset is licensed under CC-BY-4.0 and was last updated on 2026-04-17. The specific volume, time range, and content details require verification after download.