ORCID
https://orcid.org/0009-0008-7805-0311
Keywords
StarCraft, Macromanagement and Micromanagement, TransMix, Generative AI in Games, SC Text Dataset, Multi-agent Reinforcement Learning
Abstract
Transformers have emerged as the driving force behind innovative advancements in artificial intelligence (AI), particularly in the domain of generative AI. Originally designed for natural language processing, transformers have since demonstrated remarkable versatility across various domains, including computer vision and reinforcement learning. This dissertation investigates the application of transformer-based models to decision-making in the real-time strategy game StarCraft II. StarCraft II (SC2) serves as an ideal platform for developing and testing AI agents due to its complexity and strategic depth. SC2 requires players not only to fight skirmishes (micromanagement) but also to build an army and construct fortifications (macromanagement). The research presented in this thesis utilizes advanced transformer-based architectures to address critical SC2 challenges in both macromanagement and micromanagement tasks.
The initial contribution of this thesis focuses on macromanagement prediction in SC2, where we introduce a transformer-based neural architecture designed to predict the global game state and build orders. This architecture surpasses traditional models, such as Gated Recurrent Units (GRUs), as demonstrated by its superior performance on the MSC dataset. Extensive ablation studies validate the design choices, highlighting the model's ability to generalize effectively in transfer learning scenarios.
The research then extends into the domain of Multi-Agent Reinforcement Learning (MARL), with a focus on cooperative and competitive micromanagement tasks in SC2. Leveraging the StarCraft Multi-Agent Challenge (SMAC) as a benchmark, we propose TransMix, a transformer-based joint action-value mixing network for cooperative MARL. TransMix learns an intricate mixing function, significantly outperforming state-of-the-art cooperative MARL algorithms across a wide range of SMAC scenarios, including challenging ones. Moreover, TransMix exhibits enhanced robustness when evaluated under adverse conditions, such as states corrupted with Gaussian noise to simulate the fog-of-war.
A key insight uncovered during the exploration of transformer-based RL methods is the challenge of sample efficiency, particularly in environments with sparse reward structures. To address this, we propose a self-attention and bootstrapping-based approach that improves sample efficiency in ensemble Q-networks. Our method not only reduces overestimation bias but also achieves state-of-the-art performance, even under constrained update-to-data ratios.
Finally, the thesis explores the broader applicability of generative AI in gaming. We introduce SC2-Phi2, a multimodal model that combines Microsoft's small language model Phi-2 with a vision transformer to extract textual representations from spatial features. This model is fine-tuned on a single GPU and can generate SC2 game actions based on the current game state and resource availability, surpassing previous benchmarks in terms of efficiency and performance.
Completion Date
2024
Semester
Fall
Committee Chair
Sukthankar, Gita
Degree
Doctor of Philosophy (Ph.D.)
College
College of Engineering and Computer Science
Department
Computer Science
Degree Program
Computer Science
Format
Identifier
DP0029001
Language
English
Release Date
12-15-2024
Access Status
Dissertation
Campus Location
Orlando (Main) Campus
STARS Citation
Khan, Muhammad Junaid, "Transformer-based Methods for StarCraft Macromanagement And Multi-Agent Reinforcement Learning" (2024). Graduate Thesis and Dissertation post-2024. 38.
https://stars.library.ucf.edu/etd2024/38
Accessibility Status
PDF accessibility verified using Adobe Acrobat Pro Accessibility Checker