Outlier Detection & Reconstruction of Lost Big Earth Data Using Machine Learning

PhD Thesis


Adnan, M. 2025. Outlier Detection & Reconstruction of Lost Big Earth Data Using Machine Learning. PhD Thesis. https://doi.org/10.48773/qywqy
Authors: Adnan, M.
Type: PhD Thesis
Qualification name: PhD
Abstract

This dissertation examines how to improve outlier detection and reconstruction techniques for Earth Observation (EO) datasets, with a specific focus on Land Surface Temperature (LST) values. Both tasks are crucial for EO and LST data analysis: undetected anomalies can distort temperature patterns, and incomplete data reduce the reliability of environmental assessments.

This research addresses key difficulties in the collection, processing, and analysis of LST data, which are in high demand for environmental monitoring and decision-making. In particular, the high variability of EO data, the presence of noise and missing values, and the large volumes of satellite imagery pose significant challenges that require robust and scalable methods.

The work begins by identifying and delimiting the study area, specifically the Beijing-Tianjin-Hebei (BTH) region. At a high level, the research method processes image raster data in ArcGIS and then transforms them into a tabular format suitable for machine learning, enabling systematic detection and correction of anomalies.
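The raster-to-table step described above can be illustrated with a minimal sketch. It assumes the raster has already been exported from ArcGIS as a 2-D array and that missing pixels carry a no-data value; the function name `raster_to_table`, the column layout, and the `-9999.0` no-data value are hypothetical choices made for illustration, not details taken from the thesis.

```python
import numpy as np

def raster_to_table(raster, nodata=-9999.0):
    """Flatten a 2-D raster into rows of (row, col, value, is_missing).

    Hypothetical layout: one table row per pixel, with missing pixels
    stored as NaN and flagged by 1.0 in the last column.
    """
    raster = np.asarray(raster, dtype=float)
    rows, cols = np.indices(raster.shape)            # pixel coordinates
    values = raster.ravel()
    missing = np.isnan(values) | (values == nodata)  # NaN or no-data marker
    values = np.where(missing, np.nan, values)
    return np.column_stack(
        [rows.ravel(), cols.ravel(), values, missing.astype(float)]
    )

# Toy 2 x 3 grid of LST values in kelvin with one missing pixel (assumed data).
grid = np.array([[300.1, 301.4, -9999.0],
                 [299.8, 302.0, 300.9]])
table = raster_to_table(grid)
```

Each pixel becomes one row, so any tabular learner can treat the coordinates as features and the LST value as the target.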

This thesis presents new techniques that improve the accuracy of temperature intensity representations and enable effective statistical learning, based on processing image raster data in ArcGIS and converting them into a tabular format suitable for machine learning analysis. These techniques significantly reduce reconstruction errors, enhancing both data completeness and usability.

The use of self-supervised learning models, specifically the TabNet regressor, is a major advance in predicting and correcting anomalies in LST datasets. Empirical tests show a notable increase in anomaly detection precision and a reduction in data gaps, indicating that these methods are effective.
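The detect-then-reconstruct idea can be sketched in a few lines. This is only an illustration under stated assumptions: a least-squares linear model stands in for the TabNet regressor so that the example needs nothing beyond NumPy, and the median/MAD rule with a 3.5 threshold is an assumed outlier criterion, not the one used in the thesis. All function names and data are hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with intercept (stand-in for the TabNet regressor)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([X, np.ones(len(X))]) @ coef

def detect_and_reconstruct(X, y, threshold=3.5):
    """Flag outliers with a robust z-score, then refill flagged and missing values."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = ~np.isnan(y)
    med = np.median(y[observed])
    mad = np.median(np.abs(y[observed] - med))
    z = np.zeros_like(y)
    if mad > 0:  # with zero spread, nothing is flagged
        z[observed] = 0.6745 * (y[observed] - med) / mad
    outlier = observed & (np.abs(z) > threshold)
    clean = observed & ~outlier
    coef = fit_linear(X[clean], y[clean])        # train on trusted rows only
    repaired = y.copy()
    repaired[~clean] = predict_linear(coef, X[~clean])
    return repaired, outlier

# Assumed toy data: a linear LST trend with one spike and one gap.
X = np.arange(8.0).reshape(-1, 1)
y = 300.0 + 0.5 * X[:, 0]
y[3] = 350.0      # injected outlier
y[6] = np.nan     # missing value
repaired, outlier = detect_and_reconstruct(X, y)
```

In a setting closer to the thesis, the two helper functions would be replaced by a trained tabular regressor, while the masking and refilling logic would stay the same.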

The study also addresses problems arising from the complexity of EO data and from the model's adaptability to varied datasets and conditions. Despite these challenges, developing and validating a unique tabular dataset for the study area has been crucial in establishing a standard for anomaly detection, improving the usefulness and reliability of LST data for environmental research and monitoring. The dissertation thereby demonstrates how robust outlier detection and data reconstruction methods can effectively support environmental monitoring tasks.

The developed techniques have been tested on EO data, but they will also be applicable to other image-based data with similar underlying characteristics, such as obscured areas. This immediate applicability underscores the real-world impact and relevance of the contributions within the scope of this thesis.

This dissertation makes a substantial contribution to the field of Big Earth Data analysis by introducing novel methods for identifying outliers and reconstructing data. These methods enhance the quality and dependability of Land Surface Temperature datasets and serve as a validated solution for improving data integrity in current EO applications.

Keywords: Land surface temperature, Outlier Detection, LST, Earth Observation Data
Year: 2025
Publisher: College of Science and Engineering, University of Derby
Digital Object Identifier (DOI): https://doi.org/10.48773/qywqy
Publication process dates
Deposited: 10 Jul 2025
Permalink: https://repository.derby.ac.uk/item/qywqy/outlier-detection-reconstruction-of-lost-big-earth-data-using-machine-learning

File: 100503747Final.pdf
License: CC BY-NC-ND 4.0
