Cross-view image geo-localization aims to determine the location of a street-view query image by searching a GPS-tagged database of aerial-view reference images. One fundamental challenge is the dramatic viewpoint and domain difference between street-view query images and aerial-view reference images. Recent works have made great progress in bridging this domain gap with advanced deep learning techniques and geometric prior knowledge, namely that the query is aligned at the center of one aerial-view reference image (spatial alignment) and that the orientation relationship between the two views is known (orientation alignment). However, such prior knowledge of the geometric correspondence between the two views is usually not available in real-world scenarios. In this dissertation, we first explore how current models perform in real-world scenarios, where spatial or orientation alignment is not available and methods built on geometric prior knowledge (e.g., polar transform) do not work well. For spatial alignment, we collect a new dataset with a real-world protocol for this scenario and propose a better solution, which is the first to explore multiple reference correspondence and GPS offset prediction beyond image-level retrieval. For orientation alignment, we demonstrate better metric learning techniques for this scenario and propose to estimate the orientation without explicit supervision. Then we propose a novel visual explanation method, as well as the first quantitative analysis of visual explanation for deep metric learning, to gain a deeper understanding of the model with improved orientation estimation. Finally, we propose the first pure transformer-based method, which does not rely on geometric prior knowledge (polar transform) and generalizes well to real-world scenarios without orientation or spatial alignment. We also provide quantitative measurements of computational cost to show that our model is more efficient than previous methods.
In summary, we push cross-view image geo-localization toward real-world application with more realistic settings, higher accuracy, lower computational cost and better understanding/interpretation.
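The image-level retrieval setting described in the abstract can be illustrated with a minimal sketch. This is not the dissertation's method: it assumes that street-view and aerial-view images have already been mapped to fixed-length embeddings by some model, and it simply ranks the GPS-tagged references by cosine similarity to the query. The function name `localize`, the toy embeddings, and the coordinates are all hypothetical.

```python
import numpy as np

def localize(query_emb, ref_embs, ref_gps, k=1):
    """Return the GPS tags and scores of the k references most similar to the query.

    query_emb: (d,) embedding of the street-view query (assumed precomputed).
    ref_embs:  (n, d) embeddings of the aerial-view references (assumed precomputed).
    ref_gps:   (n, 2) latitude/longitude tag of each reference.
    """
    # L2-normalize so that the dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb)
    r = ref_embs / np.linalg.norm(ref_embs, axis=1, keepdims=True)
    sims = r @ q
    # Indices of the k highest similarities, best match first.
    top = np.argsort(-sims)[:k]
    return ref_gps[top], sims[top]

# Toy data (hypothetical values, for illustration only).
ref_embs = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
ref_gps = np.array([[28.60, -81.20], [28.61, -81.19], [28.62, -81.21]])
query_emb = np.array([0.9, 0.1])

gps, scores = localize(query_emb, ref_embs, ref_gps, k=2)
```

In this baseline the predicted location is simply the GPS tag of the top-ranked reference, which implicitly assumes the query sits at the center of that aerial image. The abstract's points about multiple reference correspondence and GPS offset prediction address the case where that assumption does not hold.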



Graduation Date





Chen, Chen


Doctor of Philosophy (Ph.D.)


College of Engineering and Computer Science


Computer Science

Degree Program

Computer Science


CFE0009845; DP0027786





Release Date

June 2024

Length of Campus-only Access

1 year

Access Status

Doctoral Dissertation (Campus-only Access)

Restricted to the UCF community until June 2024; it will then be open access.