DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Llava Stvg Data: A Vision-Language Dataset for Spatio-Temporal Video Grounding | DataSalon

Multimodal & LLM

Name: Llava Stvg Data: A Vision-Language Dataset for Spatio-Temporal Video Grounding
Creator: zaiquan
Published: 2025-12-04T20:18:54
Keywords: Vision Language, Multimodal AI, Video Understanding, Multimodal

Available on 1 platform

Description

Published on Hugging Face by zaiquan and last updated on 2025-12-04. The dataset likely contains multimodal data for spatio-temporal video grounding, a task that links a language query to a specific object and time segment in a video. Its content, scale, and collection methodology must be verified after download.

Use Cases

Train a model for spatio-temporal video grounding (inferred from domain, verify after download)
Benchmark video-language understanding systems (inferred from domain, verify after download)
Fine-tune large vision-language models on video-text alignment tasks (inferred from domain, verify after download)

Strengths

Published on Hugging Face, a major hub for AI datasets and models.
Last updated on 2025-12-04, the same day it was created, so the release is recent.

Limitations

Metadata is minimal; actual content requires verification after download.
Column-level documentation is absent; field semantics must be inferred after download.
Row count, file formats, and license are unknown, which makes suitability hard to assess before download.
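
One way to start closing these gaps is to list the repository's files before downloading anything and see which formats are present. The sketch below is an illustration under assumptions, not a documented workflow for this dataset: `summarize_repo_files` and `list_dataset_files` are hypothetical helper names, and `repo_id` is a placeholder because this page does not state the exact repository identifier. The only library call used is `list_repo_files` from `huggingface_hub`.

```python
from collections import Counter
from pathlib import PurePosixPath


def summarize_repo_files(paths):
    """Summarize a file listing: counts per extension plus README/license hints."""
    ext_counts = Counter(
        PurePosixPath(p).suffix.lower() or "(no extension)" for p in paths
    )
    names = {PurePosixPath(p).name.lower() for p in paths}
    return {
        "extensions": dict(ext_counts),
        "has_readme": "readme.md" in names,
        # A LICENSE file is only a hint; the dataset card may state terms instead.
        "has_license_file": any(n.startswith("license") for n in names),
    }


def list_dataset_files(repo_id):
    """Fetch the file listing of a Hugging Face dataset repo (needs network access).

    repo_id is a placeholder such as "<author>/<dataset-name>"; the real
    identifier must be taken from the dataset's Hugging Face page.
    """
    # Imported lazily so summarize_repo_files works without huggingface_hub installed.
    from huggingface_hub import list_repo_files

    return list_repo_files(repo_id, repo_type="dataset")
```

Calling `summarize_repo_files(list_dataset_files(repo_id))` with the real identifier gives a first view of the file formats and of whether a README or license file exists. Row counts and field semantics still require opening the data files themselves.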

Provenance

Source: Hugging Face
Freshness: Last updated 2025-12-04 20:29:37

License is unknown; users must verify the terms of use before using the data.

Quality Score

D28

Community

877 downloads

2 likes

0 views

Dataset Info

Author: zaiquan
Created: Dec 4, 2025
Updated: Dec 4, 2025
Last synced: Apr 30, 2026
