LandGPT: Multi-modal model for parcel land use classification using multi-source data

Title: LandGPT: a multimodal large language model for parcel-level land use classification with multi-source data


Abstract

Real-world land parcels vary widely in size and complexity, and earlier studies of fine-grained land use classification were constrained by the technical methods available. Multimodal large language models offer new techniques for image classification, but their application to land use classification remains unexplored. This study presents LandGPT, a multimodal large language model trained on the CN-MSLU-100K dataset for fine-grained land use classification of irregular parcels. It also proposes a trans-level discrimination framework to improve LandGPT’s ability to distinguish fine-grained land use categories. Under this framework, LandGPT achieves a discrimination accuracy of 89.7% and a Kappa coefficient of 0.85 on fine-grained land use categories, a 48.33% accuracy improvement over state-of-the-art models. In some challenging categories, the improvement reaches nearly 1500%. Training with multi-source remote sensing image data improves LandGPT’s accuracy by 15.79% compared with single-image data. The study further explores prompt engineering with LandGPT: the optimal prompt paradigm supplies the fine-grained categories and guides the model toward an accurate classification, reducing errors caused by LLM hallucinations. This study pioneers the application of large language models in the land use domain and offers a new solution for fine-grained land use classification.
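For readers less familiar with the two headline metrics, the following is a minimal sketch of how overall accuracy and a Kappa coefficient are commonly derived from a confusion matrix. It assumes the standard Cohen's kappa definition, which the abstract does not spell out. The function name `accuracy_and_kappa` and the toy two-class matrix are hypothetical illustrations, not the authors' evaluation code or data.

```python
from typing import Sequence, Tuple


def accuracy_and_kappa(confusion: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] counts parcels whose true class is i and predicted class is j.
    Assumption: the standard Cohen's kappa formula; the paper may compute it differently.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")

    # Observed agreement: fraction of parcels on the diagonal (overall accuracy).
    observed = sum(confusion[i][i] for i in range(n)) / total

    # Chance agreement: product of matching row and column marginals, summed over classes.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical toy matrix (two classes, 100 parcels), for illustration only.
toy_confusion = [
    [45, 5],
    [10, 40],
]
toy_accuracy, toy_kappa = accuracy_and_kappa(toy_confusion)
print(f"accuracy={toy_accuracy:.2f}, kappa={toy_kappa:.2f}")
```

Kappa discounts the agreement expected by chance, so it is typically lower than raw accuracy, as with the 0.85 and 89.7% reported above.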