by Durbin, Philip / Open Source at Harvard · Updated 10d ago
Description
A tabular file containing information on known Harvard repositories on GitHub. Its fields include the number of stars, programming language, date last updated, number of open issues, size, number of forks, repository URL, creation date, and description. The dataset was created by Philip Durbin and last updated on June 25, 2026.
Use Cases
Predict repository popularity based on features like star count and fork count.
Analyze programming language trends within an academic institution's open-source projects.
Study the relationship between repository activity (open issues, last updated) and other metrics.
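As a rough illustration of the language-trend use case, the sketch below counts repositories per programming language. The column name `language`, the tab delimiter, and the sample rows are all assumptions made for illustration; the real file's layout is undocumented and should be checked after download.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows. The real column names and delimiter are not
# documented, so treat these as placeholders.
SAMPLE = (
    "name\tlanguage\tstars\tforks\n"
    "repo-a\tPython\t120\t30\n"
    "repo-b\tJava\t45\t12\n"
    "repo-c\tPython\t8\t1\n"
    "repo-d\t\t3\t0\n"
)


def language_counts(text, language_column="language", delimiter="\t"):
    """Count repositories per language, grouping blank values as 'Unknown'."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    counts = Counter()
    for row in reader:
        language = (row.get(language_column) or "").strip() or "Unknown"
        counts[language] += 1
    return counts


counts = language_counts(SAMPLE)
for language, n in counts.most_common():
    print(language, n)
```

To run this against the downloaded file, read its contents into a string and pass the actual language column name and delimiter once they are known.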
Strengths
Includes multiple repository metrics such as stars, forks, open issues, and programming language.
Associated JSON files provide raw data retrieved via the GitHub API.
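For readers working from the raw JSON, the following is a minimal sketch of flattening repository objects into the tabular fields described above. It assumes the JSON files hold a list of repository objects in the shape returned by the GitHub REST API (keys such as `stargazers_count` and `forks_count`); the output column names and the sample record are hypothetical and are not taken from the dataset.

```python
import json

# Output column name (hypothetical) -> key in a GitHub REST API repository object.
FIELDS = {
    "stars": "stargazers_count",
    "language": "language",
    "updated": "updated_at",
    "open_issues": "open_issues_count",
    "size": "size",
    "forks": "forks_count",
    "url": "html_url",
    "created": "created_at",
    "description": "description",
}

# Hypothetical sample payload, not real data from the dataset.
SAMPLE_JSON = """
[
  {
    "html_url": "https://github.com/example-org/example-repo",
    "description": "Example repository",
    "language": "Python",
    "stargazers_count": 5,
    "forks_count": 2,
    "open_issues_count": 1,
    "size": 340,
    "created_at": "2015-01-01T00:00:00Z",
    "updated_at": "2020-01-01T00:00:00Z"
  }
]
"""


def flatten_repo(repo):
    """Pick the listed fields from one repository object; missing keys become None."""
    return {column: repo.get(key) for column, key in FIELDS.items()}


rows = [flatten_repo(repo) for repo in json.loads(SAMPLE_JSON)]
print(rows[0])
```

Using `dict.get` keeps the sketch tolerant of records that lack a field, which seems prudent when the exact schema is not documented.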
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row count is not reported, which makes it harder to assess suitability before download.
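Both gaps can be closed quickly once the file is in hand. The sketch below reads the header and counts data rows; the tab delimiter and the sample content are assumptions for illustration only.

```python
import csv
import io

# Hypothetical sample content standing in for the downloaded file.
SAMPLE = (
    "name\tlanguage\tstars\n"
    "repo-a\tPython\t120\n"
    "\n"
    "repo-b\tJava\t45\n"
)


def summarize_table(text, delimiter="\t"):
    """Return the header fields and the number of non-empty data rows."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader, [])
    row_count = sum(1 for row in reader if row)  # blank lines parse as []
    return header, row_count


header, row_count = summarize_table(SAMPLE)
print("columns:", header)
print("rows:", row_count)
```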
Provenance
Source
GitHub API, via the IQSS/open-source-at-harvard repository.
Collection Method
Data retrieved using code from the linked GitHub repository.
Freshness
Last updated 2026-06-25 02:46:13
License is unknown and should be verified before use.