The desire to understand the causes of complex societal phenomena is fundamental to the social sciences. At the macro scale, society has many measurable characteristics in the form of statistical distributions and aggregate measures, and such data are increasingly abundant with the proliferation of online social media, mobile devices, and the Internet of Things. However, the decision-making processes and limits of the individuals who interact to generate these statistical patterns are often difficult to unravel. Furthermore, multiple causal factors often interact to determine the outcome of a particular behavior. Quantifying how important these causal factors and their interactions, which together make up a particular decision-making process, are to a societal outcome of interest helps extract explanations that provide a deeper understanding of social behavior. Holistic, generative modeling techniques, in particular agent-based modeling, are able to 'grow' artificial societies that replicate emergent patterns seen in the real world. Driving the autonomous agents of these models are rules, generalized hypotheses of human behavior, which, upon validation against real-world data, help assemble theories of human behavior. Yet often, multiple hypothetical causal factors can be suggested for the construction of these rules. With traditional agent-based modeling, it is often up to the modeler's discretion to decide which combination of factors best represents the rule at hand. Yet, due to the aforementioned lack of insight, the modeled agent rule is often one out of a vast space of possible rules. In this dissertation, I introduce Evolutionary Model Discovery, a novel framework for automated causal inference, which treats such artificial societies as sandboxes for rule discovery and causal factor importance evaluation. Evolutionary Model Discovery consists of two major phases.
Firstly, a rule of interest of a given agent-based model is genetically programmed with combinations of hypothesized factors, in an attempt to find rules that enable the agent-based model to more closely mimic real-world phenomena. Secondly, the data produced through genetic programming, which relate the presence of each factor in the rule to fitness, are used to train a random forest regressor for importance evaluation. Besides its scientific contributions, this work has also produced two open-source Python libraries for high-performance computing with NetLogo: Evolutionary Model Discovery and NL4Py. The results of applying Evolutionary Model Discovery to causal inference in three very different cases of human social behavior are discussed, revisiting the rules underlying two widely studied models in the literature, the Artificial Anasazi and Schelling's Segregation, as well as an ensemble model of diffusion of information and information overload. First, previously unconsidered factors driving the socio-agricultural behavior of an ancient Pueblo society are discovered, assisting in the construction of a more robust and accurate version of the Artificial Anasazi model. Second, factors that contribute to the coexistence of mixed patterns of segregation and integration are discovered on a recent extension of Schelling's Segregation model. Finally, causal factors important to the prioritization of social media notifications under loss of attention due to information overload are discovered on an ensemble of a model of Extended Working Memory and the Multi-Action Cascade Model of conversation.
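The two phases described above can be illustrated with a deliberately simplified, standard-library-only sketch. It is not the Evolutionary Model Discovery library and does not use its API. Everything in it is an assumption made for illustration: the factor names are hypothetical, `toy_fitness` stands in for running an agent-based model and scoring it against real-world data, a rule is reduced to a set of present factors rather than a genetically programmed expression tree, and the difference in mean error with and without a factor stands in for the random forest importance evaluation.

```python
import random
from statistics import mean

# Hypothetical factor names, chosen only for illustration.
FACTORS = ["distance", "yield", "social_ties", "noise"]


def toy_fitness(rule, rng):
    """Stand-in for simulating the model and measuring its error (lower is better).

    By construction, 'yield' helps most, 'distance' helps somewhat,
    'social_ties' helps only together with 'yield', and 'noise' does nothing.
    """
    error = 10.0
    if "yield" in rule:
        error -= 4.0
    if "distance" in rule:
        error -= 2.0
    if "yield" in rule and "social_ties" in rule:
        error -= 1.0  # interaction between two factors
    return error + rng.gauss(0, 0.1)


def random_rule(rng):
    """A rule is simplified to the set of factors it contains."""
    return frozenset(f for f in FACTORS if rng.random() < 0.5)


def mutate(rule, rng):
    """Toggle the presence of one randomly chosen factor."""
    return rule ^ {rng.choice(FACTORS)}


def evolve(generations=30, pop_size=20, seed=0):
    """Phase 1 (simplified): evolutionary search that logs every evaluated rule."""
    rng = random.Random(seed)
    population = [random_rule(rng) for _ in range(pop_size)]
    log = []
    for _ in range(generations):
        scored = [(toy_fitness(rule, rng), rule) for rule in population]
        log.extend(scored)
        scored.sort(key=lambda pair: pair[0])
        survivors = [rule for _, rule in scored[: pop_size // 2]]
        population = survivors + [mutate(rule, rng) for rule in survivors]
    return log


def presence_effect(log, factor):
    """Phase 2 (stand-in): mean error without the factor minus mean error with it.

    A positive value means rules containing the factor tended to fit better.
    Returns None if the factor was always present or always absent.
    """
    with_f = [fit for fit, rule in log if factor in rule]
    without_f = [fit for fit, rule in log if factor not in rule]
    if not with_f or not without_f:
        return None
    return mean(without_f) - mean(with_f)


if __name__ == "__main__":
    run_log = evolve()
    for name in FACTORS:
        print(name, presence_effect(run_log, name))
```

Because the search concentrates on well-performing rules, this mean-difference measure is biased by which rules happen to be sampled and does not capture factor interactions, which is one reason a regressor trained on the full presence-to-fitness data, as the abstract describes, is the more appropriate tool.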



Graduation Date





Garibay, Ivan


Doctor of Philosophy (Ph.D.)


College of Engineering and Computer Science

Degree Program

Modeling & Simulation




CFE0008276; DP0023647



Release Date


Length of Campus-only Access

1 year

Access Status

Doctoral Dissertation (Open Access)

Included in

Sociology Commons