Methodologies for low-rank analysis and regionalization for multi-scale spatial datasets
Abstract
[EMBARGOED UNTIL 5/1/2024] This dissertation comprises three chapters that develop low-rank modeling and spatial aggregation techniques to overcome the computational and storage challenges of analyzing large spatial datasets. One well-known low-rank approach uses a basis expansion of the data to approximate covariance matrices with a reduced-rank representation. In Chapter 2, we develop the REDS methodology, which combines statistical and machine learning techniques to build such basis functions and provides prediction and uncertainty quantification for large spatial datasets. Chapters 3 and 4 focus on spatial aggregation, which also reduces the size of spatial data. It is well known that such aggregation methods are susceptible to the ecological fallacy, in which inferences drawn from the aggregated data may contradict those drawn from the original data. We propose a loss function based on the Karhunen-Loève expansion of the data to minimize the ecological fallacy. Chapter 3 incorporates this loss function within a minimum spanning tree framework to generate explicitly local partitions. Chapter 4 extends this methodology to the multivariate domain and illustrates examples of different modeling choices, clustering algorithms, and computation methods for the multivariate Karhunen-Loève expansion.
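As a generic illustration of the low-rank idea mentioned in the abstract, the sketch below truncates a discrete Karhunen-Loève (eigen) expansion of a covariance matrix and measures how well the reduced-rank version approximates the original. It is not the REDS method or the loss function proposed in the dissertation. The helper name `truncated_kl`, the exponential covariance, the 1-D grid, the range parameter 0.3, and the rank of 10 are all assumptions chosen only for illustration.

```python
import numpy as np

def truncated_kl(cov, rank):
    """Return the leading `rank` eigenvalues and eigenvectors of a
    symmetric covariance matrix (a discrete Karhunen-Loeve basis)."""
    eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:rank]        # indices of the largest `rank`
    return eigvals[order], eigvecs[:, order]

# Hypothetical setting: exponential covariance on a 1-D grid (assumed, not from the source).
s = np.linspace(0.0, 1.0, 200)
cov = np.exp(-np.abs(s[:, None] - s[None, :]) / 0.3)

lam, phi = truncated_kl(cov, rank=10)

# Reduced-rank covariance: Phi diag(lambda) Phi^T
cov_lowrank = (phi * lam) @ phi.T

explained = lam.sum() / np.trace(cov)
rel_err = np.linalg.norm(cov - cov_lowrank) / np.linalg.norm(cov)
print(f"variance explained: {explained:.3f}, relative Frobenius error: {rel_err:.3f}")
```

Storing `lam` and `phi` requires 10 + 200 x 10 numbers instead of the 200 x 200 entries of the full matrix, which is the kind of storage saving that motivates low-rank approaches.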
Degree
Ph.D.