Description
The origin directory holds 393 raw data files totaling 0.95 GiB, and a further 52 estimation files total 2.56 GiB. The dataset is hosted by MLL-Lab. Its primary table is manifest.jsonl, a file index for browsing the uploaded artifacts.
Use Cases
Browse the file repository using the manifest.jsonl index.
Analyze file type distribution based on the counts of .json, .log, and .yaml files.
Manage data artifact storage based on the distinction between 'warehouse' and 'newwarehouse' source directories.
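As a rough illustration of the first two use cases, the sketch below tallies file extensions from a JSON Lines manifest. The column layout of manifest.jsonl is not documented, so the field name "path" is an assumption, and the sample records are made up for the example; inspect the real file before relying on either.

```python
import json
from collections import Counter
from pathlib import PurePosixPath


def count_extensions(lines, path_field="path"):
    """Count file extensions across manifest records (one JSON object per line)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # "path" is an assumed field name, not taken from the dataset.
        suffix = PurePosixPath(record[path_field]).suffix.lower()
        counts[suffix or "(none)"] += 1
    return counts


# Made-up records standing in for the real manifest.jsonl.
sample = [
    '{"path": "origin/a.json"}',
    '{"path": "origin/b.json"}',
    '{"path": "origin/run.log"}',
    '{"path": "origin/config.yaml"}',
]
ext_counts = count_extensions(sample)
print(dict(ext_counts))
```

For the downloaded file, the same function can be given an open file handle, for example count_extensions(open("manifest.jsonl", encoding="utf-8")), once the actual field name has been confirmed.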
Strengths
Contains 393 files in the origin directory, providing a substantial collection of raw artifacts.
Includes 52 estimation files totaling 2.56 GiB, offering a separate, larger data component.
File types are explicitly listed: 346 JSON, 33 log, and 14 YAML files in the origin set.
Limitations
Description metadata is limited; actual data quality requires manual inspection after download.
Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, which may limit suitability assessment.
Provenance
Source
MLL-Lab
Collection Method
Uploaded from two source directories; files starting with 'warehouse' were removed.
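One possible reading of this filter is sketched below. It assumes the rule was applied to the start of each relative path string, which the listing does not confirm; the function name and sample paths are hypothetical. Under this reading, paths beginning with 'newwarehouse' are kept, because they do not start with 'warehouse'.

```python
def drop_prefixed(paths, prefix="warehouse"):
    """Keep only the paths that do not start with the given prefix."""
    # Assumption: the prefix test applies to the whole relative path string.
    return [p for p in paths if not p.startswith(prefix)]


# Hypothetical paths, not taken from the dataset.
sample_paths = [
    "warehouse/run1.json",
    "newwarehouse/run1.json",
    "newwarehouse/config.yaml",
]
kept = drop_prefixed(sample_paths)
print(kept)
```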
Freshness
Last updated 2026-05-28 03:39:26; freshness should be verified.
License
License is unknown; users should verify terms before use.