RSICD: Remote Sensing Image Captioning Dataset

Name: RSICD: Remote Sensing Image Captioning Dataset
Creator: 201528014227051
Published: 2017-08-31T08:48:05


Description

RSICD contains 10,921 high-resolution remote sensing images collected from satellite imagery sources, each paired with 5 descriptive natural language captions. The dataset covers 30 distinct scene categories, including airports, bridges, and residential areas, for a total of 54,605 caption-image pairs (10,921 images × 5 captions).

Use Cases

Train image captioning models to generate descriptive text based on visual features in remote sensing imagery
Develop cross-modal retrieval systems to identify specific satellite images using natural language queries
Fine-tune vision-language models for scene classification across the 30 provided land-use categories
Benchmark automated text-to-image synthesis for geographic and environmental monitoring contexts
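
For the cross-modal retrieval use case, results are commonly reported as Recall@K: the fraction of caption queries whose ground-truth image appears among the K highest-scoring images. The following is a minimal sketch of that metric, not part of the dataset itself. The function name `recall_at_k`, the similarity matrix, and the toy scores are all illustrative assumptions; in practice the scores would come from whatever model is being evaluated.

```python
from typing import List, Sequence


def recall_at_k(
    similarity: Sequence[Sequence[float]],
    caption_to_image: Sequence[int],
    k: int,
) -> float:
    """Fraction of captions whose ground-truth image ranks in the top k.

    similarity[c][i] is a (hypothetical) model score between caption c
    and image i; caption_to_image[c] is the index of the correct image.
    """
    hits = 0
    for caption_idx, scores in enumerate(similarity):
        # Rank image indices from highest to lowest score.
        ranked: List[int] = sorted(
            range(len(scores)), key=lambda i: scores[i], reverse=True
        )
        if caption_to_image[caption_idx] in ranked[:k]:
            hits += 1
    return hits / len(similarity)


# Toy example (made-up scores): 4 caption queries against 3 images.
similarity = [
    [0.9, 0.1, 0.3],  # caption 0, correct image 0
    [0.2, 0.4, 0.8],  # caption 1, correct image 1
    [0.1, 0.2, 0.7],  # caption 2, correct image 2
    [0.6, 0.5, 0.1],  # caption 3, correct image 2
]
caption_to_image = [0, 1, 2, 2]

r1 = recall_at_k(similarity, caption_to_image, 1)
r2 = recall_at_k(similarity, caption_to_image, 2)
r3 = recall_at_k(similarity, caption_to_image, 3)
print(r1, r2, r3)
```

Because each RSICD image has 5 captions, several queries map to the same image, which is why the mapping is kept as a per-caption list rather than assumed to be one-to-one.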

Strengths

10,921 remote sensing images sourced from Google Earth, Baidu Map, MapABC, and Tianditu
54,605 natural language captions, 5 per image
30 distinct scene categories including 'airport', 'playground', 'viaduct', and 'beach'
Standardized image dimensions of 224x224 pixels for consistent model training
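
The figures above fit together as simple arithmetic, and a training pipeline usually flattens each image's captions into individual caption-image pairs. The sketch below illustrates both under stated assumptions: the record layout (`CaptionedImage`), the file name, and the sample captions are hypothetical placeholders, since this page does not describe the dataset's on-disk format.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Figures taken from the dataset description.
NUM_IMAGES = 10_921
CAPTIONS_PER_IMAGE = 5
IMAGE_SIZE = (224, 224)  # width, height in pixels


@dataclass
class CaptionedImage:
    """Hypothetical in-memory record; not the dataset's actual schema."""
    filename: str
    category: str
    captions: List[str]


def to_pairs(records: List[CaptionedImage]) -> List[Tuple[str, str]]:
    """Flatten records into (filename, caption) training pairs."""
    pairs: List[Tuple[str, str]] = []
    for record in records:
        if len(record.captions) != CAPTIONS_PER_IMAGE:
            raise ValueError(
                f"{record.filename}: expected {CAPTIONS_PER_IMAGE} captions, "
                f"got {len(record.captions)}"
            )
        pairs.extend((record.filename, caption) for caption in record.captions)
    return pairs


# Placeholder record with made-up file name and captions.
sample = CaptionedImage(
    filename="airport_example.jpg",
    category="airport",
    captions=[f"placeholder caption {n}" for n in range(1, 6)],
)

sample_pairs = to_pairs([sample])
expected_total_pairs = NUM_IMAGES * CAPTIONS_PER_IMAGE
print(len(sample_pairs), expected_total_pairs)
```

The length check makes a record with a missing or extra caption fail loudly instead of silently skewing the pair count.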

Quality Score

D18

Description: 17
Source: 19
Reputation: 17
Access: 22

Community

228 likes

0 views

Dataset Info

Author: 201528014227051
Created: Aug 31, 2017
Updated: Nov 28, 2021
Last synced: May 19, 2026
