A long-standing goal in the field of artificial intelligence (AI) is to develop agents that can perceive a richer problem space and plan their activity effortlessly and in minimal time. Several strides have been made toward this goal over the last few years, owing to simultaneous advances in compute power, optimized algorithms, and, most importantly, the evident success of AI-based machines in nearly every discipline. Progress has been especially rapid in the area of reinforcement learning (RL), where computers can now plan ahead and outperform their human rivals in complex problem domains such as chess or Go. However, despite this encouraging progress, most advances in RL-based planning still take place in a deterministic context (e.g., constant grid size, known action sets), which does not adapt well to stochastic variations in the problem domain. In this dissertation, we develop techniques that enable self-adaptation of an agent's behavioral policy when it is exposed to variations in the problem domain. In particular, we first introduce an initial model that loosely realizes the problem domain's characteristics. The domain characteristics are embedded into a common multi-modal embedding space set. This embedding space set then allows us to identify initial beliefs and establish prior distributions without being constrained to only a finite collection of the agent's state-action-reward experiences. We describe a learning technique that adapts to variations in the problem domain by retaining only the salient features of preceding domains and inferring the posterior for a newly introduced variation as a direct perturbation to the aggregated priors. Besides providing theoretical guarantees, we demonstrate an end-to-end solution by building an FPGA-based recurrent neural network that can change its synaptic architecture temporally, thus eliminating the need to maintain dual networks.
We argue that our hardware-based neural implementation has practical benefits because it uses only a sparse network architecture and multiplexes it at the circuit level to exhibit recurrence; this can reduce circuit-level inference latency while maintaining equivalence to a dense neural architecture.
Doctor of Philosophy (Ph.D.)
College of Engineering and Computer Science
Electrical and Computer Engineering
Doctoral Dissertation (Open Access)
Raza, Sayyed Jaffar Ali, "Self Adaptive Reinforcement Learning for High-Dimensional Stochastic Systems with Application to Robotic Control" (2021). Electronic Theses and Dissertations, 2020-. 919.