Keywords

FPGA; Fault Tolerance; Reconfiguration; Genetic Algorithms

Abstract

In this dissertation, a novel self-repair approach based on Consensus Based Evaluation (CBE) for autonomous repair of SRAM-based Field Programmable Gate Arrays (FPGAs) is developed, evaluated, and refined. An initial population of functionally identical (same input-output behavior), yet physically distinct (alternative design or place-and-route realization) FPGA configurations is produced at design time. At run time, the CBE approach ranks these alternative configurations by evaluating their discrepancy relative to the consensus formed by the population. Through runtime competition, faults in the logic resources become occluded from subsequent FPGA operations. Meanwhile, offspring formed through crossover and mutation of faulty and viable configurations are selected at a controlled re-introduction rate for evaluation and refurbishment. Refurbishments are evolved in situ, with online, real-time, input-based performance evaluation, enhancing system availability and sustainability and creating an Organic Embedded System (OES). A fault tolerance model called N Modular Redundancy with Standby (NMRSB) is developed, which combines the two popular fault tolerance techniques of NMR and Standby in order to facilitate the CBE approach. This dissertation develops two instances of the NMRSB system: Triple Modular Redundancy with Standby (TMRSB) and Duplex with Standby (DSB). A hypothetical Xilinx Virtex-II Pro FPGA model demonstrates their viability for various applications, including a 3-bit x 3-bit multiplier and the MCNC-91 benchmark circuits. Experiments conducted on the model evaluate the performance of three new genetic operators and demonstrate progress towards a completely self-contained single-chip implementation, so that the FPGA can refurbish itself without requiring a PC host to execute the Genetic Algorithm.
This dissertation presents results from simulations of multiple applications with a CBE model implemented in the C++ programming language. Starting with an initial population of 20 and 30 viable configurations for TMRSB and DSB respectively, a single stuck-at fault is introduced in the logic resources. Fault refurbishment experiments are conducted under supervision of CBE using a fitness state evaluation function based on competing outputs, fitness adjustment, and different threshold levels. The device remains online throughout the process, and a complete repair is realized with the Hamming Distance and Bitweight voting schemes. The results indicate that a Hamming Distance TMRSB approach can prevent the most pervasive fault impacts and realize complete refurbishment. Experimental results also show that the Autonomic Layer demonstrates 100% faulty component isolation for both Functional Elements (FEs) and Autonomous Elements (AEs) with randomly injected single and multiple faults. Using logic circuits from the MCNC-91 benchmark set, availability during repair phases averaged 75.05%, 82.21%, and 65.21% for the z4ml, cm85a, and cm138a circuits respectively under stated conditions. In addition to simulation, the proposed OES architecture, synthesized from HDL, was prototyped on a Xilinx Virtex-II Pro FPGA device supporting partial reconfiguration to demonstrate the feasibility of intrinsic regeneration of the selected circuit.
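
As a rough illustration of ranking configurations by their discrepancy from a population consensus, the following C++ sketch forms a bitwise-majority consensus over the outputs of the active configurations and scores each configuration by its Hamming distance from that consensus. It is a minimal sketch under stated assumptions, not the dissertation's implementation: the function names (`consensus`, `hamming_distance`, `discrepancies`, `rank_by_discrepancy`), the 32-bit output word, the bitwise-majority rule, and the tie handling are all hypothetical choices made for illustration. The fitness adjustment, threshold levels, and Bitweight voting mentioned in the abstract are not modeled here.

```cpp
#include <algorithm>
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Assumption: the consensus is a bitwise majority vote over the outputs of
// the active configurations. Ties (even population size) resolve to 0 here.
std::uint32_t consensus(const std::vector<std::uint32_t>& outputs) {
    std::uint32_t result = 0;
    for (int bit = 0; bit < 32; ++bit) {
        std::size_t ones = 0;
        for (std::uint32_t out : outputs) {
            ones += (out >> bit) & 1u;
        }
        if (2 * ones > outputs.size()) {
            result |= (std::uint32_t{1} << bit);
        }
    }
    return result;
}

// Hamming distance: number of bit positions in which a and b differ.
int hamming_distance(std::uint32_t a, std::uint32_t b) {
    return static_cast<int>(std::bitset<32>(a ^ b).count());
}

// Discrepancy of each configuration's output relative to the consensus.
std::vector<int> discrepancies(const std::vector<std::uint32_t>& outputs) {
    const std::uint32_t agreed = consensus(outputs);
    std::vector<int> result;
    result.reserve(outputs.size());
    for (std::uint32_t out : outputs) {
        result.push_back(hamming_distance(out, agreed));
    }
    return result;
}

// Configuration indices ordered from least to most discrepant.
// stable_sort keeps the original order among equally discrepant entries.
std::vector<std::size_t> rank_by_discrepancy(const std::vector<int>& scores) {
    std::vector<std::size_t> order(scores.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&scores](std::size_t a, std::size_t b) {
                         return scores[a] < scores[b];
                     });
    return order;
}
```

For example, with three outputs `0x7`, `0x2`, and `0x2` for the same input, the majority consensus is `0x2`, the first configuration has discrepancy 2, and it is ranked last. In a fuller model such scores would presumably be accumulated over many inputs before a configuration is demoted to standby or selected for refurbishment.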

Notes

If this is your thesis or dissertation and you want to learn how to access it, or want more information about readership statistics, contact us at STARS@ucf.edu

Graduation Date

2008

Advisor

DeMara, Ronald

Degree

Doctor of Philosophy (Ph.D.)

College

College of Engineering and Computer Science

Department

Electrical Engineering and Computer Science

Degree Program

Computer Engineering

Format

application/pdf

Identifier

CFE0002280

URL

http://purl.fcla.edu/fcla/etd/CFE0002280

Language

English

Release Date

September 2008

Length of Campus-only Access

None

Access Status

Doctoral Dissertation (Open Access)

Restricted to the UCF community until September 2008; it will then be open access.
