A collection of scripts by uv-scripts for GPU inference using the Transformers library's native continuous batching. It offers efficient model serving without depending on a specialized inference engine such as vLLM. The scripts are designed to run on Hugging Face Jobs and support any model architecture available in the Transformers library.
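To illustrate why continuous batching helps, here is a minimal toy simulation in plain Python. It is not the Transformers implementation and does not call the library; the function names, the request lengths, and the "one token per request per step" cost model are all assumptions made for illustration. It counts decoding steps when a batch must wait for its longest request (static batching) versus when a finished request's slot is refilled immediately (continuous batching).

```python
from collections import deque


def static_batching_steps(lengths, batch_size):
    """Toy model: each fixed batch runs until its longest request finishes."""
    steps = 0
    for i in range(0, len(lengths), batch_size):
        steps += max(lengths[i:i + batch_size])
    return steps


def continuous_batching_steps(lengths, batch_size):
    """Toy model: a finished request frees its slot for the next waiting one."""
    waiting = deque(lengths)  # tokens still to generate, per queued request
    active = []               # tokens still to generate, per in-flight request
    steps = 0
    while waiting or active:
        # Refill any free slots before the next decoding step.
        while waiting and len(active) < batch_size:
            active.append(waiting.popleft())
        # One decoding step produces one token for every in-flight request.
        active = [remaining - 1 for remaining in active]
        # Drop requests that have produced all their tokens.
        active = [remaining for remaining in active if remaining > 0]
        steps += 1
    return steps


# Hypothetical workload: number of tokens each request needs to generate.
lengths = [2, 8, 3, 8, 2, 2]
static_steps = static_batching_steps(lengths, batch_size=2)
continuous_steps = continuous_batching_steps(lengths, batch_size=2)
print(static_steps, continuous_steps)
```

In this toy workload the continuous scheduler needs fewer steps because short requests no longer hold a slot idle while a long request in the same batch finishes. Real gains depend on the model, hardware, and request mix.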
This resource is a set of executable scripts, not a tabular or structured dataset. See the full description on the Hugging Face page for implementation details and prerequisites.