Advanced Search
Turn off MathJax
Article Contents

doi: 10.11999/JEIT250818 cstr: 32379.14.JEIT250818
Funds:  Science and Disruptive Technology Program, AIRCAS (2025-AIRCAS-SDTP-04)
  • Accepted Date: 2026-01-12
  • Rev Recd Date: 2026-01-12
  • Available Online: 2026-01-27
  • loading
  • [1]
    YAN Qiwei, DENG Chubo, LIU Chenglong, et al. ReCon1M: A large-scale benchmark dataset for relation comprehension in remote sensing imagery[J]. IEEE Transactions on Geoscience and Remote Sensing, 2025, 63: 4507022. doi: 10.1109/TGRS.2025.3589986.
    [2]
    LEE J, MIYANISHI T, KURITA S, et al. CityNav: A large-scale dataset for real-world aerial navigation[C]. Proceedings of the IEEE/CVF International Conference on Computer Vision, Honolulu, The United States of America, 2025: 5912–5922. (查阅网上资料, 未找到本条文献信息, 请确认).
    [3]
    LIU Chenyang, ZHAO Rui, CHEN Hao, et al. Remote sensing image change captioning with dual-branch transformers: A new method and a large scale dataset[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 5633520. doi: 10.1109/TGRS.2022.3218921.
    [4]
    XIA Guisong, BAI Xiang, DING Jian, et al. DOTA: A large-scale dataset for object detection in aerial images[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 3974–3983. doi: 10.1109/cvpr.2018.00418.
    [5]
    LI Ke, WAN Gang, CHENG Gong, et al. Object detection in optical remote sensing images: A survey and a new benchmark[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2020, 159: 296–307. doi: 10.1016/j.isprsjprs.2019.11.023.
    [6]
    LI Yuxuan, LI Xiang, LI Weijie, et al. SARDet-100K: Towards open-source benchmark and toolkit for large-scale SAR object detection[C]. Proceedings of the 38th International Conference on Neural Information Processing Systems, Vancouver, Canada, 2024: 4079.
    [7]
    SUO Jiashun, WANG Tianyi, ZHANG Xingzhou, et al. HIT-UAV: A high-altitude infrared thermal dataset for Unmanned Aerial Vehicle-based object detection[J]. Scientific Data, 2023, 10(1): 227. doi: 10.1038/s41597-023-02066-6.
    [8]
    李春柳, 王水根. 红外海上船舶数据集[EB/OL]. https://openai.raytrontek.com/apply/Sea_shipping.html, 2021. (查阅网上资料,未找到对应的英文翻译信息,请确认补充).
    [9]
    刘晴, 徐召飞, 金荣璐, 等. 红外安防数据库[EB/OL]. https://openai.raytrontek.com/apply/Infrared_security.html, 2021. (查阅网上资料,未找到对应的英文翻译信息,请确认补充).
    [10]
    刘晴, 徐召飞, 王水根. 红外航拍人车检测数据集[EB/OL]. http://openai.raytrontek.com/apply/Aerial_mancar.html, 2021. (查阅网上资料,未找到对应的英文翻译信息,请确认补充).
    [11]
    李钢强, 王建生, 王水根. 双光车载场景数据库[EB/OL]. http://openai.raytrontek.com/apply/Double_light_vehicle.html, 2021. (查阅网上资料,未找到对应的英文翻译信息,请确认补充).
    [12]
    李永富, 赵显, 刘兆军, 等. 远海(10–12km)船舶的目标检测数据集[EB/OL]. http://www.core.sdu.edu.cn/info/1133/2174.htm, 2020.(查阅网上资料,未找到对应的英文翻译信息,请确认补充).
    [13]
    XIA Guisong, HU Jingwen, HU Fan, et al. AID: A benchmark data set for performance evaluation of aerial scene classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2017, 55(7): 3965–3981. doi: 10.1109/TGRS.2017.2685945.
    [14]
    CHENG Gong, HAN Junwei, and LU Xiaoqiang. Remote sensing image scene classification: Benchmark and state of the art[J]. Proceedings of the IEEE, 2017, 105(10): 1865–1883. doi: 10.1109/JPROC.2017.2675998.
    [15]
    GRATTAFIORI A, DUBEY A, JAUHRI A, et al. The llama 3 herd of models[J]. arXiv: 2407.21783, 2024. doi: 10.48550/arXiv.2407.21783.(查阅网上资料,不确定文献类型及格式是否正确,请确认).
    [16]
    LOBRY S, MARCOS D, MURRAY J, et al. RSVQA: Visual question answering for remote sensing data[J]. IEEE Transactions on Geoscience and Remote Sensing, 2020, 58(12): 8555–8566. doi: 10.1109/TGRS.2020.2988782.
    [17]
    CHEN Jun, ZHU Deyao, SHEN Xiaoqian, et al. MiniGPT-v2: Large language model as a unified interface for vision-language multi-task learning[J]. arXiv: 2310.09478, 2023. doi: 10.48550/arXiv.2310.09478.(查阅网上资料,不确定文献类型及格式是否正确,请确认).
    [18]
    WU Zhiyu, CHEN Xiaokang, PAN Zizheng, et al. Deepseek-vl2: Mixture-of-experts vision-language models for advanced multimodal understanding[J]. arXiv: 2412.10302, 2024. doi: 10.48550/arXiv.2412.10302.(查阅网上资料,不确定文献类型及格式是否正确,请确认).
    [19]
    PENG Zhiliang, WANG Wenhui, DONG Li, et al. Kosmos-2: Grounding multimodal large language models to the world[J]. arXiv: 2306.14824, 2023. doi: 10.48550/arXiv.2306.14824.(查阅网上资料,不确定文献类型及格式是否正确,请确认).
    [20]
    CHEN Keqin, ZHANG Zhao, ZENG Weili, et al. Shikra: Unleashing multimodal LLM's referential dialogue magic[J]. arXiv: 2306.15195, 2023. doi: 10.48550/arXiv.2306.15195.(查阅网上资料,不确定文献类型及格式是否正确,请确认).
    [21]
    HU Huiyang, WANG Peijin, FENG Yingchao, et al. RingMo-Agent: A unified remote sensing foundation model for multi-platform and multi-modal reasoning[J]. arXiv: 2507.20776, 2025. doi: 10.48550/arXiv.2507.20776.(查阅网上资料,不确定文献类型及格式是否正确,请确认).
    [22]
    ANDERSON P, WU Q, TENEY D, et al. Vision-and-language navigation: Interpreting visually-grounded navigation instructions in real environments[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 3674–3683. doi: 10.1109/CVPR.2018.00387.
    [23]
    LIU Shuobo, ZHANG Hongsheng, QI Yuankai, et al. AeriaLVLN: Vision-and-language navigation for UAVs[C]. Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2023: 15338–15348. doi: 10.1109/ICCV51070.2023.01411.
    [24]
    WANG Peijin, HU Huiyang, TONG Boyuan, et al. RingMoGPT: A unified remote sensing foundation model for vision, language, and grounded tasks[J]. IEEE Transactions on Geoscience and Remote Sensing, 2025, 63: 5611320. doi: 10.1109/TGRS.2024.3510833.
    [25]
    CHENG Qimin, HUANG Haiyan, XU Yuan, et al. NWPU-captions dataset and MLCA-net for remote sensing image captioning[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 5629419. doi: 10.1109/TGRS.2022.3201474.
    [26]
    LU Xiaoqiang, WANG Binqiang, ZHENG Xiangtao, et al. Exploring models and data for remote sensing image caption generation[J]. IEEE Transactions on Geoscience and Remote Sensing, 2018, 56(4): 2183–2195. doi: 10.1109/TGRS.2017.2776321.
    [27]
    QU Bo, LI Xuelong, TAO Dacheng, et al. Deep semantic understanding of high resolution remote sensing image[C]. 2016 International Conference on Computer, Information and Telecommunication Systems (CITS), Kunming, China, 2016: 1–5. doi: 10.1109/CITS.2016.7546397.
    [28]
    LI Kun, VOSSELMAN G, and YANG M Y. HRVQA: A visual question answering benchmark for high-resolution aerial images[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2024, 214: 65–81. doi: 10.1016/j.isprsjprs.2024.06.002.
    [29]
    SUN Xian, WANG Peijin, LU Wanxuan, et al. RingMo: A remote sensing foundation model with masked image modeling[J]. IEEE Transactions on Geoscience and Remote Sensing, 2023, 61: 5612822. doi: 10.1109/TGRS.2022.3194732.
    [30]
    HU Huiyang, WANG Peijin, BI Hanbo, et al. RS-vHeat: Heat conduction guided efficient remote sensing foundation model[C]. Proceedings of the IEEE/CVF International Conference on Computer Vision, Honolulu, The United States of America, 2025: 9876–9887. (查阅网上资料, 未找到本条文献信息, 请确认).
    [31]
    CHANG Hao, WANG Peijin, DIAO Wenhui, et al. Remote sensing change detection with bitemporal and differential feature interactive perception[J]. IEEE Transactions on Image Processing, 2024, 33: 4543–4555. doi: 10.1109/TIP.2024.3424335.
    [32]
    LIU Haotian, LI Chunyuan, LI Yuheng, et al. Improved baselines with visual instruction tuning[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2024: 26286–26296. doi: 10.1109/CVPR52733.2024.02484.
    [33]
    HU Yuan, YUAN Jianlong, WEN Congcong, et al. RSGPT: A remote sensing vision language model and benchmark[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2025, 224: 272–286. doi: 10.1016/j.isprsjprs.2025.03.028.
    [34]
    LUO Junwei, PANG Zhen, ZHANG Yongjun, et al. SkySenseGPT: A fine-grained instruction tuning dataset and model for remote sensing vision-language understanding[J]. arXiv: 2406.10100, 2024. doi: 10.48550/arXiv.2406.10100.
    [35]
    LI Xiang, DING Jian, and MOHAMED E. VRSBench: A versatile vision-language benchmark dataset for remote sensing image understanding[C]. Proceedings of the 38th International Conference on Neural Information Processing Systems, Vancouver, Canada, 2024: 106.
    [36]
    SHI Qian, HE Da, LIU Zhengyu, et al. Globe230k: A benchmark dense-pixel annotation dataset for global land cover mapping[J]. Journal of Remote Sensing, 2023, 3: 0078. doi: 10.34133/remotesensing.0078.
    [37]
    HU Fengming, XU Feng, WANG R, et al. Conceptual study and performance analysis of tandem multi-antenna spaceborne SAR interferometry[J]. Journal of Remote Sensing, 2024, 4: 0137. doi: 10.34133/remotesensing.0137.
    [38]
    MEI Shaohui, LIAN Jiawei, WANG Xiaofei, et al. A comprehensive study on the robustness of deep learning-based image classification and object detection in remote sensing: Surveying and benchmarking[J]. Journal of Remote Sensing, 2024, 4: 0219. doi: 10.34133/remotesensing.0219.
  • 加载中

Catalog

    通讯作者: 陈斌, bchen63@163.com
    • 1. 

      沈阳化工大学材料科学与工程学院 沈阳 110142

    1. 本站搜索
    2. 百度学术搜索
    3. 万方数据库搜索
    4. CNKI搜索

    Figures(6)  / Tables(15)

    Article Metrics

    Article views (265) PDF downloads(53) Cited by()
    Proportional views
    Related

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return