High spatial resolution (HSR) imagery scene classification has been the subject of increased interest in recent years, and has great potential for many applications, such as urban functional analysis. Rooted in natural information processing, the use of the probabilistic topic model (PTM) to capture latent topics to represent HSR images has been an effective way to bridge the semantic gap. However, how to effectively discover discriminative information to recognize the HSR scenes is a challenging task. In this paper, the sparse homogeneous-heterogeneous topic feature model (SHHTFM) is proposed for HSR image scene classification. Differing from the conventional PTM-based scene classification methods, which utilize only heterogeneous features, SHHTFM explores the effect of the homogeneous information. Based on the union of uniform grid sampling and simple linear iterative clustering superpixel sampling, SHHTFM exploits both the heterogeneous and homogeneous information. After separately mining different types of low-level features and latent topics, the sparse topic inference procedure of SHHTFM further improves the fusion of the sparse heterogeneous and homogeneous topics. In addition, multisource geographical data are effectively integrated, where the water and vegetation boundaries define a more accurate way to restrict the boundaries of different scenes, and are then combined with the road network data to further improve the scene annotation performance. This provides more reliable and applicable results for us to better understand the complex scenes. The experimental results obtained with two HSR image classification data sets and an HSR image annotation data set demonstrate that the proposed SHHTFM framework can solve the scene classification problem, with a high classification accuracy as well as a high time efficiency