International Journal For Multidisciplinary Research

E-ISSN: 2582-2160     Impact Factor: 9.24

A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal

Comparative Evaluation of GPT-4o, Gemini, Llama, and Grok on Remote Sensing Imagery

Author(s) Mr. Abbdulmumini Imam Ibrahim, Mr. Abdullahi Muhammad Auwal, Mr. Jidda Harun Abba
Country India
Abstract This study presents an in-depth comparative evaluation of four Multimodal Large Language Models (MLLMs), namely GPT-4o, Gemini 2.5 Pro, Llama 4, and Grok 3, on satellite image captioning and classification using the Remote Sensing Image Captioning Dataset (RSICD). Using structured prompts and expert human judgment, we assessed each model on four qualitative metrics: accuracy, relevance, depth of understanding, and classification precision. Our findings show that MLLMs, while not replacements for specialized remote sensing tools, offer substantial support as analytical partners, producing context-aware interpretations and reliable classifications. Distinct performance profiles emerged across the models, and we outline critical directions for future research in quantitative benchmarking, advanced prompt engineering, and hybrid model architectures.
Keywords: Multimodal Large Language Models, Satellite Imagery, Remote Sensing, Image Captioning, Image Classification
Field Computer > Artificial Intelligence / Simulation / Virtual Reality
Published In Volume 7, Issue 6, November-December 2025
Published On 2025-12-31
DOI https://doi.org/10.36948/ijfmr.2025.v07i06.65292
