Rex-Omni is a 3-billion-parameter multimodal large language model (MLLM) that frames object detection and other visual perception tasks as next-token prediction. The model was authored by qq-2, and its AWQ-quantized version was released on October 31, 2025. The dataset page was last updated on March 4, 2026.
Use Cases
- Fine-tuning a model for object detection framed as next-token prediction.
- Evaluating model performance across the visual perception tasks named in the description.
- Deploying the released AWQ-quantized variant for inference.
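To make the next-token framing concrete, the following is a minimal sketch of how a bounding box could be written as a short sequence of discrete tokens and read back. The bin count of 1000, the `<bin_k>` token spelling, and the helper names `box_to_tokens` and `tokens_to_box` are assumptions for illustration only; the source does not describe Rex-Omni's actual vocabulary or API.

```python
from typing import List, Tuple

NUM_BINS = 1000  # assumed number of coordinate bins; not stated in the source

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixels


def box_to_tokens(box: Box, width: int, height: int) -> List[str]:
    """Quantize pixel coordinates into hypothetical `<bin_k>` tokens."""
    scales = (width, height, width, height)
    tokens = []
    for value, scale in zip(box, scales):
        # Map the coordinate to an integer bin, clamped to [0, NUM_BINS - 1].
        k = int(value * NUM_BINS / scale)
        k = min(max(k, 0), NUM_BINS - 1)
        tokens.append(f"<bin_{k}>")
    return tokens


def tokens_to_box(tokens: List[str], width: int, height: int) -> Box:
    """Invert box_to_tokens, returning the center of each bin in pixels."""
    scales = (width, height, width, height)
    coords = []
    for token, scale in zip(tokens, scales):
        k = int(token[len("<bin_"):-1])  # strip the "<bin_" prefix and ">" suffix
        coords.append((k + 0.5) / NUM_BINS * scale)
    return tuple(coords)


if __name__ == "__main__":
    tokens = box_to_tokens((64, 48, 320, 240), width=640, height=480)
    print(tokens)
    print(tokens_to_box(tokens, width=640, height=480))
```

Under this scheme a detector only has to emit four tokens per box, so detection output can share a decoder with ordinary text. The round trip loses at most one bin width of precision per coordinate.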
Strengths
- The model size is stated explicitly: 3 billion parameters.
- An AWQ-quantized version is available, which reportedly cuts storage space by 50%.
- Fine-tuning code is available, as noted in the description.
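As a rough sanity check on the reported 50% storage saving, the arithmetic can be sketched as below. It assumes the original weights are stored at 16 bits per parameter, which the source does not state, and it ignores non-weight files and quantization metadata such as scales. The function names are hypothetical.

```python
def weight_size_gb(num_params: int, bits_per_param: float) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


def size_after_savings(original_gb: float, savings_fraction: float) -> float:
    """Size remaining after removing the given fraction of storage."""
    return original_gb * (1.0 - savings_fraction)


NUM_PARAMS = 3_000_000_000  # 3 billion parameters, from the description
ASSUMED_BITS = 16           # assumption: 16-bit original weights

original_gb = weight_size_gb(NUM_PARAMS, ASSUMED_BITS)
quantized_gb = size_after_savings(original_gb, 0.50)  # reported 50% saving

if __name__ == "__main__":
    print(f"original:  {original_gb:.1f} GB")   # 6.0 GB under the 16-bit assumption
    print(f"quantized: {quantized_gb:.1f} GB")  # 3.0 GB if the 50% figure holds
```

These figures are estimates only; the actual file sizes should be read from the repository listing.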
Limitations
- Description metadata is limited; actual data quality requires manual inspection after download.
- Column-level documentation is absent; field semantics must be inferred after download.
- The specific dataset content, size, and format are unknown from the provided input.
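Because content, size, and format are undocumented, a first inspection pass after download can simply inventory what is on disk. The sketch below walks a local directory and tallies files by extension; the function name `summarize_download` and the example path are hypothetical, and nothing here depends on how the files were fetched.

```python
from collections import Counter
from pathlib import Path
from typing import Dict, Tuple


def summarize_download(root: str) -> Dict[str, Tuple[int, int]]:
    """Map each file extension to (file count, total bytes) under root."""
    counts: Counter = Counter()
    sizes: Counter = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Group case-insensitively; files without a suffix share one bucket.
            ext = path.suffix.lower() or "(none)"
            counts[ext] += 1
            sizes[ext] += path.stat().st_size
    return {ext: (counts[ext], sizes[ext]) for ext in sorted(counts)}


if __name__ == "__main__":
    # "./downloaded_repo" is a placeholder for wherever the files were saved.
    for ext, (n, total) in summarize_download("./downloaded_repo").items():
        print(f"{ext:>12}  files={n:<6} bytes={total}")
```

The extension breakdown indicates which loaders to try first, after which field semantics can be inferred from a few sample records.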
Provenance
- Source: Hugging Face
- Freshness: Last updated 2026-03-04 17:38:34; freshness should be verified.
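One way to verify freshness is to compare the recorded timestamp against the current date. The sketch below assumes the timestamp is timezone-naive and in the same zone as the reference time, since the source gives no timezone; the 90-day threshold and the function names are arbitrary choices for illustration.

```python
from datetime import datetime


def age_in_days(last_updated: str, now: datetime) -> int:
    """Whole days elapsed between the recorded timestamp and `now`."""
    stamp = datetime.strptime(last_updated, "%Y-%m-%d %H:%M:%S")
    return (now - stamp).days


def is_stale(last_updated: str, now: datetime, max_age_days: int = 90) -> bool:
    """True if the record is older than the (assumed) threshold."""
    return age_in_days(last_updated, now) > max_age_days


if __name__ == "__main__":
    recorded = "2026-03-04 17:38:34"
    print(age_in_days(recorded, datetime.now()), is_stale(recorded, datetime.now()))
```

A stale result is only a prompt to re-check the source page; it says nothing about whether the underlying files changed.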