Modeling Multivariate Spatio-Temporal Remote Sensing Data with Large Gaps
Qiang Lou, Zoran Obradovic
Prediction models for multivariate spatio-temporal functions in geosciences are typically developed by supervised learning on attributes that are collected by remote sensing instruments and collocated with an outcome variable measured at sparsely located sites. Such collocated data often contain large temporal gaps, because attribute values are missing at sites where outcome labels are available. Our objective is to develop more accurate spatio-temporal predictors by enlarging the collocated data, imputing missing attributes at the times and locations where outcome labels are available. The proposed method for large gap estimation in space and time (called LarGEST) exploits temporal correlation of attributes, correlations among multiple attributes collected at the same time and location, and spatial correlations among attributes from multiple sites. LarGEST outperformed alternative methods in imputing up to 80% of randomly missing observations in a synthetic spatio-temporal signal and in a model of fluoride content in a water distribution system. LarGEST was also applied to impute 80% of nonrandomly missing values in data from one of the most challenging Earth science problems, the estimation of aerosol properties. Using the enlarged data, we developed a predictor of aerosol optical depth that was much more accurate than predictors based on alternative imputation methods when tested rigorously over the entire continental US for the year 2005.
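To make the three sources of information concrete, the following is a minimal illustrative sketch and not the LarGEST algorithm itself, whose details are not given in this abstract. It assumes one target attribute with a gap at a single site, fully observed companion attributes at that site, and the same attribute observed at neighboring sites. All function names, the linear interpolation, the least-squares regression, the inverse-distance weights, and the equal-weight averaging are assumptions chosen for simplicity.

```python
import numpy as np

def temporal_estimate(x):
    """Fill gaps by linear interpolation along time (temporal correlation)."""
    t = np.arange(len(x))
    known = ~np.isnan(x)
    return np.interp(t, t[known], x[known])

def attribute_estimate(target, others):
    """Regress the target on other attributes observed at the same site
    and time (inter-attribute correlation). `others` has shape (T, K)."""
    known = ~np.isnan(target)
    design = np.column_stack([others, np.ones(len(target))])
    coef, *_ = np.linalg.lstsq(design[known], target[known], rcond=None)
    return design @ coef

def spatial_estimate(neighbors, dists):
    """Inverse-distance weighted mean of the same attribute at neighboring
    sites (spatial correlation). `neighbors` has shape (T, S)."""
    w = 1.0 / np.asarray(dists, dtype=float)
    return neighbors @ w / w.sum()

def impute(target, others, neighbors, dists):
    """Average the three estimates (an assumed, equal-weight combination)
    and use the result only where the target is missing."""
    estimate = np.mean(
        [
            temporal_estimate(target),
            attribute_estimate(target, others),
            spatial_estimate(neighbors, dists),
        ],
        axis=0,
    )
    filled = target.copy()
    gap = np.isnan(target)
    filled[gap] = estimate[gap]
    return filled
```

A practical method would learn how much to trust each source rather than average them equally, since long gaps weaken the temporal estimate and nonrandom missingness can affect neighboring sites at the same time.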