Rex-Omni: A Multimodal Model for Visual Perception via Next-Token Prediction | DataSalon

Name: Rex-Omni: A Multimodal Model for Visual Perception via Next-Token Prediction
Creator: qq-2
Published: 2026-03-04T17:31:10
Keywords: Multimodal AI, Computer Vision, Object Detection, Large Language Model, Multimodal

Available on 1 platform

Description

Rex-Omni is a 3-billion-parameter multimodal large language model that frames object detection and other visual perception tasks as next-token prediction problems. The model was authored by qq-2, and its AWQ-quantized version was released on October 31, 2025. The dataset page was last updated on March 4, 2026.

Use Cases

Fine-tuning a model for object detection under the next-token prediction paradigm.
Evaluating model performance on object detection and other visual perception tasks.
Deploying the released AWQ-quantized variant for inference.

Strengths

The model has 3 billion parameters.
An AWQ quantized version is available, which reportedly saves 50% of storage space.
Fine-tuning code is available, according to the source description.
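
To put the parameter count and the reported storage saving in concrete terms, here is a rough back-of-the-envelope sketch. The 16-bit baseline precision is an assumption, not something the listing states, and the function name is hypothetical; the 50% figure is taken as reported and has not been verified against the actual files.

```python
# Rough storage estimate. The 16-bit baseline is an ASSUMPTION (not stated in the listing).
PARAMS = 3_000_000_000           # "3-billion-parameter" from the description
BYTES_PER_PARAM_BASELINE = 2     # assumed 16-bit weights (hypothetical baseline)
REPORTED_SAVING = 0.50           # "reportedly saves 50% of storage space"


def estimate_sizes_gb(params, bytes_per_param, saving):
    """Return (baseline_gb, quantized_gb) using decimal gigabytes."""
    baseline = params * bytes_per_param / 1e9
    quantized = baseline * (1 - saving)
    return baseline, quantized


baseline_gb, quantized_gb = estimate_sizes_gb(
    PARAMS, BYTES_PER_PARAM_BASELINE, REPORTED_SAVING
)
print(f"baseline ~{baseline_gb:.1f} GB, quantized ~{quantized_gb:.1f} GB")
```

Under these assumptions the weights would take roughly 6 GB before quantization and roughly 3 GB after; the real download size may differ and should be checked on the hosting platform.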

Limitations

Description metadata is limited; actual data quality requires manual inspection after download.
Column-level documentation is absent; field semantics must be inferred after download.
The specific dataset content, size, and format are unknown from the provided input.

Provenance

Source: huggingface
Freshness: Last updated 2026-03-04 17:38:34; freshness should be verified.

License is unknown.

Quality Score

C42

Community

833 downloads

1 like

0 views

Dataset Info

Author: qq-2
Created: Mar 4, 2026
Updated: Mar 4, 2026
Last synced: Jun 7, 2026
