Traffic data, public transit, aviation, shipping, ride-hailing, accident records
8,913 datasets
2018 to 2024 fatal crashes involving trucks in the United States, derived from the Fatality Analysis Reporting System (FARS). The dataset likely contains records of incidents where a truck was involved and at least one fatality occurred. Its specific features and scale must be inferred from the FARS source.
The Wallops Flight Facility Lightning Mapping Array (WFFLMA) dataset collection is used to validate lightning detection instruments on satellites, including the ISS Lightning Imaging Sensor (LIS) and the Geostationary Lightning Mapper (GLM). Data files are available from February 1, 2025 through January 10, 2026 and are provided in ASCII format. The dataset is managed by the GHRC_DAAC and is actively being updated to backfill older data.
Observational data from 428 dyadic encounters on Seattle public transit platforms measures interpersonal distance as a behavioral indicator of racial avoidance. The study, by Joshua Corona for PS: Political Science & Politics, reveals asymmetrical avoidance patterns, such as East Asian and Hispanic individuals maintaining 18-19 additional feet from Black first-arrivers.
Calgary Transit provides annual service hour totals for its bus, CTrain, and community shuttle fleets. The dataset tracks paid operating hours, including both in-service and out-of-service time. Data is published by data.calgary.ca and was last updated in March 2026.
Kaggle hosts a dataset focused on maritime vessel trajectories, likely derived from Automatic Identification System (AIS) signals. The dataset appears designed for training or evaluating a trajectory foundation model. Its specific size, temporal coverage, and geographic scope require verification after download.
Daily bicycle and pedestrian counts are collected from sensors on the Peace Bridge, cycle tracks, and Stephen Avenue in Calgary. The dataset provides a breakdown of counts on a daily, weekly, monthly, and yearly basis. It is published by data.calgary.ca and was last updated in February 2026.
A compilation of U.S. traffic safety data released by the Bureau of Transportation Statistics. It includes 20 years (1975-1994) of Fatal Accident Reporting System (FARS) data and seven years (1988-1994) of General Estimates System data. The CD-ROM also contains summary information in the form of Traffic Safety Facts 1994 and 12 topical fact sheets from that year.
Dayton Aviation Heritage National Historical Park tract and boundary data consists of ESRI shape files created by the National Park Service's Land Resources Division. The data delineates properties owned by the NPS and areas where the NPS holds interests such as scenic easements or rights of way. The dataset was last updated on March 4, 2026.
1,037 world cities are analyzed for the relationship between rail transit, car use, and tree canopy cover from 2005 to 2020. The dataset includes attributes for car use per capita, rail transit presence, traffic expansion, and controls for income, density, and environmental policy. Benjamin Leffel created this replication data for a study on how public rail shields urban forests.
1,946 synthetically generated chat sessions simulate a four-turn interaction between an AI VTuber and multiple viewers. The dataset was created in March 2026 by DataPilot using the SDG-Nexus pipeline, blending AI VTuber personas with user personas from the nvidia/Nemotron-Personas-Japan dataset.
Data from 4 cycle lane recorders in Calderdale, showing counts of bicycles. It includes fields for the date and time of the count, site ID, count period, lane details, direction, and volume, with additional flag text for events like roadworks. The dataset's update status is under investigation as of June 2024.
Uber ride data is hosted on Kaggle. The dataset likely contains records of ride-hailing trips. Metadata is minimal; the specific content, volume, and features require verification after download.
The Extreme Ultraviolet Imager (EUVI) on NASA's STEREO mission captures solar images in four spectral channels. It uses a 2048 x 2048 pixel detector with a field of view out to 1.7 solar radii, observing emission lines from 17.1 nm to 30.4 nm. The instrument offers improved resolution and cadence over its predecessor, SOHO-EIT.
New York City trip record data from the Taxi and Limousine Commission for green taxis in December 2016. The dataset was transformed for a tabular regression benchmark, using 'tip_amount' as the target variable and focusing on credit card payment trips. It was downloaded in November 2018.
Trip record data from the New York City Taxi and Limousine Commission for green taxi trips in December 2016, used in a tabular data benchmark. The dataset was transformed for regression tasks, with 'tip_amount' selected as the target variable and trips filtered to credit card payments only. String datetime information was extracted to numeric columns, and certain fare-related variables were removed to increase the importance of categorical location IDs.
Trip record data from the New York City Taxi and Limousine Commission (TLC) covering green taxi trips in December 2016. The dataset was downloaded on November 3, 2018 and transformed for a tabular data benchmark, with 'tip_amount' as the target variable. It includes only credit card payment trips and excludes certain fare-related columns to emphasize categorical location features.
Trip record data from the New York City Taxi and Limousine Commission for green taxi trips in December 2016. The dataset was transformed for a tabular data benchmark, with 'tip_amount' as the target variable and trips restricted to credit card payments. Data was downloaded on November 3, 2018.
Trip record data for New York City green taxi trips in December 2016, provided by the New York City Taxi and Limousine Commission. The dataset was transformed for a tabular regression benchmark, with 'tip_amount' as the target variable and a focus on categorical location features. It includes only trips paid by credit card to ensure tip data is present.
December 2016 trip records for New York City's green taxis, originally provided by the NYC Taxi and Limousine Commission. The dataset was transformed for a tabular regression benchmark, with 'tip_amount' as the target variable and specific columns removed to alter feature importance. It includes only credit card payment trips, as tips are not reliably recorded for other payment types.
December 2016 trip records for New York City green taxis, provided by the NYC Taxi and Limousine Commission. The dataset was transformed for a tabular data benchmark, focusing on predicting tip amounts from credit card trips. It includes numeric features derived from datetime strings and location IDs, with certain financial columns removed to adjust feature importance.