ORCID

https://orcid.org/0009-0008-7805-0311

Keywords

StarCraft, Macromanagement and Micromanagement, TransMix, Generative AI in Games, SC Text Dataset, Multi-agent Reinforcement Learning

Abstract

Transformers have emerged as the driving force behind innovative advancements in artificial intelligence (AI), particularly in the domain of generative AI. Originally designed for natural language processing, transformers have since demonstrated remarkable versatility across various domains, including computer vision and reinforcement learning. This dissertation investigates the application of transformer-based models to decision-making in the real-time strategy game, StarCraft II. StarCraft II (SC2) serves as an ideal platform for developing and testing AI agents due to its complexity and strategic depth. SC2 requires players not only to fight skirmishes (micromanagement) but also to build an army and construct fortifications (macromanagement). The research presented in this thesis utilizes advanced transformer-based architectures for addressing critical SC challenges in both macromanagement and micromanagement tasks.

The initial contribution of this thesis focuses on macromanagement prediction in SC2, where we introduce a transformer-based neural architecture designed to predict global game state and build orders. This architecture surpasses traditional models, such as Gated Recurrent Units (GRUs), as demonstrated by its superior performance on the MSC dataset. Extensive ablation studies validate the design choices, highlighting the model’s ability to generalize effectively in transfer learning scenarios.

The research then extends into the domain of Multi-Agent Reinforcement Learning (MARL), with a focus on cooperative and competitive micromanagement tasks in SC2. Leveraging the StarCraft Multi-Agent Challenge (SMAC) as a benchmark, we propose TransMix, a transformer-based joint action-value mixing network for cooperative MARL. TransMix learns an intricate mixing function, significantly outperforming state-of-the-art cooperative MARL algorithms across a wide range of SMAC scenarios, including challenging ones. Moreover, TransMix exhibits enhanced robustness when evaluated under adverse conditions, such as states corrupted with Gaussian noise to simulate the fog-of-war.

A key insight uncovered during the exploration of transformer-based RL methods is the challenge of sample efficiency, particularly in environments with sparse reward structures. To address this, we propose a self-attention and bootstrapping-based approach that improves sample efficiency in ensemble Q-networks. Our method not only reduces overestimation bias but also achieves state-of-the-art performance, even under constrained update-to-data ratios.

Finally, the thesis explores the broader applicability of generative AI in gaming. We introduce SC2-Phi2, a multimodal model utilizing Microsoft’s Small Language Model (Phi-2) and a vision transformer to extract textual representations from spatial features. This model is fine-tuned on a single GPU and can generate SC2 game actions based on the current game stage and resource availability, surpassing previous benchmarks in terms of efficiency and performance.

Completion Date

2024

Semester

Fall

Committee Chair

Sukthankar, Gita

Degree

Doctor of Philosophy (Ph.D.)

College

College of Engineering and Computer Science

Department

Computer Science

Degree Program

Computer Science

Format

PDF

Identifier

DP0029001

Language

English

Release Date

12-15-2024

Access Status

Dissertation

Campus Location

Orlando (Main) Campus

Accessibility Status

PDF accessibility verified using Adobe Acrobat Pro Accessibility Checker
