evolutionary computation, genetic algorithms, proportional representations, proteomics approach


As the complexity of our society and of our computational resources increases, so does the complexity of the problems that we approach using evolutionary search techniques. Several recent approaches address the problem of scaling evolutionary methods to cope with highly complex, difficult problems. Many of these approaches are biologically inspired and share an underlying principle: a problem representation based on basic representational building blocks that interact and self-organize into complex functions or designs. The observation from the central dogma of molecular biology that proteins are the basic building blocks of life, together with recent advances in proteomics on the analysis of the structure, function, and interaction of entire protein complements, leads us to propose a unifying framework of thought for these approaches: the proteomics approach. This thesis proposes to investigate whether the self-organization of protein-analogous structures at the representation level can increase the degree of complexity and "novelty" of solutions obtainable using evolutionary search techniques. To do so, we identify two fundamental aspects of this transition: (1) proteins interact in a three-dimensional medium analogous to a multiset; and (2) proteins are functional structures. The first aspect is foundational for understanding the second. This thesis analyzes the first aspect. It investigates the effects of using a genome-to-proteome mapping on evolutionary computation (EC). This analysis is based on a genetic algorithm (GA) with a string-to-multiset mapping that we call the proportional genetic algorithm (PGA), and it focuses on the feasibility and effectiveness of this mapping. This mapping leads to a fundamental departure from typical EC methods: using a multiset of proteins as an intermediate mapping results in a completely location-independent problem representation where the location of the genes in a genome has no effect on the fitness of the solutions.
Completely location-independent representations, by definition, do not suffer from traditional EC hurdles associated with the location of the genes, or positional effects, in a genome. Such representations have the ability to self-organize into a genomic structure that appears to favor positive correlations between the form and the quality of represented solutions. Completely location-independent representations also introduce new problems of their own, such as the need for large alphabets of symbols and the theoretical need for larger representation spaces than traditional approaches. Overall, these representations perform as well as or better than traditional representations, and they appear to be particularly good for the class of problems involving proportions or multisets. This thesis concludes that the use of protein-analogous structures as an intermediate representation in evolutionary computation is not only feasible but in some cases advantageous. In addition, it lays the groundwork for further research on proteins as functional self-organizing structures capable of building increasingly complex functionality, and as basic units of problem representation for evolutionary computation.
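The string-to-multiset idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the PGA's actual encoding, so everything here is an assumption made for illustration: the alphabet `"ABC"`, the helper names `genome_to_multiset` and `proportions`, and the choice to read each symbol's share of the genome as a proportion. The only point it shows is that a representation built from symbol counts does not change when genes are reordered.

```python
from collections import Counter
import random


def genome_to_multiset(genome):
    """Map a genome string to a multiset of symbols; gene order is discarded."""
    return Counter(genome)


def proportions(genome, alphabet):
    """Return each alphabet symbol's share of the genome (hypothetical decoding)."""
    counts = genome_to_multiset(genome)
    total = sum(counts[symbol] for symbol in alphabet)
    if total == 0:
        # No symbols from the alphabet are present, so no proportion is defined.
        return {symbol: 0.0 for symbol in alphabet}
    return {symbol: counts[symbol] / total for symbol in alphabet}


if __name__ == "__main__":
    genome = "AABACCBA"
    # Reordering the genes leaves the decoded proportions unchanged.
    shuffled = "".join(random.sample(genome, len(genome)))
    print(proportions(genome, "ABC"))
    print(proportions(shuffled, "ABC") == proportions(genome, "ABC"))
```

Any fitness function computed only from the output of `proportions` would, under these assumptions, be independent of gene location, which is the property the abstract calls complete location independence.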



Graduation Date





Wu, Annie


Doctor of Philosophy (Ph.D.)


College of Engineering and Computer Science


Computer Science

Degree Program

Computer Science








Release Date

December 2004

Length of Campus-only Access


Access Status

Doctoral Dissertation (Open Access)