Keywords

Ethical AI Decision-Making; Reinforcement Learning from Human Feedback (RLHF); AI Ethics and Human Values; Robust; Deep Reinforcement Learning; Artificial Intelligence

Abstract

Reinforcement learning from human feedback (RLHF) has made great strides toward enabling AI decision-making systems to learn from external human advice. In general, reinforcement learning is concerned with producing agents that learn to achieve a goal through interactions with an environment, guided by feedback in the form of a quantifiable reward. This thesis seeks to merge the intricate realms of AI robustness, ethical decision-making, and RLHF. Because human values cannot be truly quantified, human feedback is an essential bridge in the learning process, allowing AI models to better reflect ethical principles rather than merely replicate human behavior. By exploring the transformative potential of RLHF in AI-human interactions, acknowledging the dynamic nature of human behavior beyond simplistic models, and emphasizing the necessity for ethically framed AI systems, this thesis constructs a deep reinforcement learning framework that is not only robust but also well aligned with human ethical standards. Through a methodology that incorporates simulated ethical dilemmas and evaluates AI decisions against established ethical frameworks, it aims to contribute to the understanding and application of RLHF in creating AI systems that embody robustness and ethical integrity.

Thesis Completion Year

2024

Thesis Completion Semester

Fall

Thesis Chair

Wang, Yue

College

College of Engineering and Computer Science

Department

Electrical and Computer Engineering

Thesis Discipline

Computer Engineering

Language

English

Access Status

Open Access

Length of Campus Access

None

Campus Location

Orlando (Main) Campus

Subjects

Reinforcement learning (Machine learning); Artificial intelligence--Moral and ethical aspects; Artificial intelligence--Research; Human-robot interaction; Decision making--Moral and ethical aspects--Study and teaching

STARS Citation

Plasencia, Marco M., "Reinforcement Learning From Human Feedback For Ethically Robust AI Decision-Making" (2024). Honors Undergraduate Theses. 212.
https://stars.library.ucf.edu/hut2024/212

Included in

Computational Engineering Commons, Computer and Systems Architecture Commons

Accessibility Statement

This item was created or digitized prior to April 24, 2027, or is a reproduction of legacy media created before that date. It is preserved in its original, unmodified state specifically for research, reference, or historical recordkeeping. In accordance with the ADA Title II Final Rule, the University Libraries provides accessible versions of archival materials upon request. To request an accommodation for this item, please submit an accessibility request form.
