Keywords

Databases -- Quality control, Search and rescue operations

Abstract

A tremendous volume of data is generated in today’s world. As organizations around the globe recognize their data as a valuable asset for gaining a competitive edge in a fast-paced, dynamic business world, more and more attention is being paid to data quality. Advances in data mining, predictive modeling, text mining, web mining, business intelligence, health care analytics, and related fields all depend on clean, accurate data. It comes as no surprise that dirty data cannot be mined effectively. This research is an exploratory study of data sets from different domains: it addresses the data quality issues specific to each domain, identifies the challenges faced, and arrives at techniques or methodologies for measuring and improving data quality. The primary focus of the research is the Search and Rescue (SAR) dataset: identifying its key data quality issues and developing an algorithm for improving its data quality. SAR missions, which are routinely conducted all over the world, show a trend of increasing mission costs. Retrospective studies of historic SAR data not only allow for a detailed analysis and understanding of SAR incidents and patterns, but also form the basis for generating probability maps, analytical data models, etc., which allow for efficient use and distribution of valuable SAR resources. One of the challenges with the SAR dataset is that the collection process is not perfect. Often, the Last Known Position (LKP) is not known or cannot be determined. The goal is to fully or partially geocode the LKP for as many data points as possible, identify those data points where the LKP cannot be geocoded at all, and highlight the underlying data quality issues.
The SAR Algorithm developed in this research makes use of partial or incomplete information, cleans and validates the data, and extracts address information from relevant fields to successfully geocode the data. The algorithm improves geocoding accuracy and has been validated using a set of approaches.

Notes

If this is your thesis or dissertation and you want to learn how to access it, or want more information about readership statistics, contact us at STARS@ucf.edu

Graduation Date

2011

Semester

Summer

Advisor

Hua, Kien

Degree

Doctor of Philosophy (Ph.D.)

College

College of Engineering and Computer Science

Department

Electrical Engineering and Computer Science

Format

application/pdf

Identifier

CFE0004050

URL

http://purl.fcla.edu/fcla/etd/CFE0004050

Language

English

Length of Campus-only Access

None

Access Status

Doctoral Dissertation (Open Access)

Subjects

Dissertations, Academic -- Engineering and Computer Science, Engineering and Computer Science -- Dissertations, Academic
