Remote sensing, total organic carbon, microcystin, integrated data mining and fusion technique, idfm, genetic programming, machine learning, data fusion, data mining


Monitoring water quality on a near-real-time basis to address water resources management and public health concerns in coupled natural systems and the built environment is by no means an easy task. Furthermore, this emerging societal challenge will continue to grow, due to the ever-increasing anthropogenic impacts upon surface waters. For example, urban growth and agricultural operations have led to an influx of nutrients into surface waters stimulating harmful algal bloom formation, and stormwater runoff from urban areas contributes to the accumulation of total organic carbon (TOC) in surface waters. TOC in surface waters is a known precursor of disinfection byproducts in drinking water treatment, and microcystin is a potent hepatotoxin produced by the bacteria Microcystis, which can form expansive algal blooms in eutrophied lakes. Due to the ecological impacts and human health hazards posed by TOC and microcystin, it is imperative that municipal decision makers and water treatment plant operators are equipped with a rapid and economical means to track and measure these substances. Remote sensing is an emergent solution for monitoring and measuring changes to the earth’s environment. This technology allows for large regions anywhere on the globe to be observed on a frequent basis. This study demonstrates the prototype of a near-real-time early warning system using Integrated Data Fusion and Mining (IDFM) techniques with the aid of both multispectral (Landsat and MODIS) and hyperspectral (MERIS) satellite sensors to determine spatiotemporal distributions of TOC and microcystin. Landsat satellite imageries have high spatial resolution, but such application suffers from a long overpass interval of 16 days. On the other hand, free coarse resolution sensors with daily revisit times, such as MODIS, are incapable of providing detailed water quality information because of low spatial resolution. This iv issue can be resolved by using data or sensor fusion techniques, an instrumental part of IDFM, in which the high spatial resolution of Landsat and the high temporal resolution of MODIS imageries are fused and analyzed by a suite of regression models to optimally produce synthetic images with both high spatial and temporal resolutions. The same techniques are applied to the hyperspectral sensor MERIS with the aid of the MODIS ocean color bands to generate fused images with enhanced spatial, temporal, and spectral properties. The performance of the data mining models derived using fused hyperspectral and fused multispectral data are quantified using four statistical indices. The second task compared traditional two-band models against more powerful data mining models for TOC and microcystin prediction. The use of IDFM is illustrated for monitoring microcystin concentrations in Lake Erie (large lake), and it is applied for TOC monitoring in Harsha Lake (small lake). Analysis confirmed that data mining methods excelled beyond two-band models at accurately estimating TOC and microcystin concentrations in lakes, and the more detailed spectral reflectance data offered by hyperspectral sensors produced a noticeable increase in accuracy for the retrieval of water quality parameters.


If this is your thesis or dissertation, and want to learn how to access it or for more information about readership statistics, contact us at

Graduation Date





Chang, Ni-bin


Master of Science in Environmental Engineering (M.S.Env.E.)


College of Engineering and Computer Science


Civil, Environmental, and Construction Engineering

Degree Program

Environmental Engineering








Release Date


Length of Campus-only Access

3 years

Access Status

Masters Thesis (Open Access)


Dissertations, Academic -- Engineering and Computer Science, Engineering and Computer Science -- Dissertations, Academic