Abstract

Incident reporting systems are an integral part of any organization seeking to increase the safety of its operations by gathering data on past events, which can then be used to identify ways of mitigating similar events in the future. To analyze trends and common issues with regard to the human element in the system, reports are often classified according to a human factors taxonomy. In recent years, machine learning algorithms have become popular tools for automated text classification; however, their performance varies and depends on several factors. In supervised machine learning tasks such as text classification, the algorithm is trained on features and labels, where the features are derived from the incident reports themselves and the labels are supplied by a human annotator, whether the reporter or a third party. Aside from the intricacies of building and tuning machine learning models, subjective classification according to a human factors taxonomy can introduce considerable noise and bias. I examined the interdependencies among the features of incident reports, the subjective labeling process, the constraints that the taxonomy itself imposes, and basic characteristics of human factors taxonomies that can influence human, as well as automated, classification. To evaluate these challenges, I trained a multi-label machine learning classifier on 17,253 incident reports from the NASA Aviation Safety Reporting System (ASRS), and I collected labels from six human annotators for a subset of 400 incident reports each, for a total of 2,400 individual annotations. Results show that, overall, annotation reliability for the set of incident reports selected in this study was comparatively low. Some human factors labels elicited more agreement than others, sometimes because the reports contained keywords that map directly to the label.
Performance of machine learning annotation followed the patterns of human agreement on labels. The high variability in the content and quality of narratives was identified as a major source of annotation difficulty. Suggestions for improving the data collection and labeling process are provided.
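To illustrate the kind of per-label agreement analysis the abstract refers to, the sketch below computes Cohen's kappa for two annotators' binary judgments on a single human factors label. The annotation vectors and the "Fatigue" label are invented toy data for illustration only, not the study's actual annotations or taxonomy.

```python
def cohens_kappa(a, b):
    """Cohen's kappa for two binary annotation vectors of equal length."""
    n = len(a)
    # Observed agreement: fraction of reports where both annotators agree.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Expected chance agreement from each annotator's marginal label rates.
    p_a1 = sum(a) / n
    p_b1 = sum(b) / n
    expected = p_a1 * p_b1 + (1 - p_a1) * (1 - p_b1)
    if expected == 1.0:  # both annotators fully uniform: kappa is undefined
        return 1.0 if observed == 1.0 else 0.0
    return (observed - expected) / (1 - expected)

# Ten hypothetical reports, each judged for one label (e.g., "Fatigue"):
# 1 = annotator applied the label, 0 = did not.
annotator_1 = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
annotator_2 = [1, 0, 0, 1, 0, 0, 1, 0, 1, 1]

kappa = cohens_kappa(annotator_1, annotator_2)  # 0.6 for this toy pair
```

Repeating this per label and per annotator pair is one way such a study could surface which taxonomy categories are agreed upon more than others; labels tied to explicit keywords in the narratives would tend to show higher kappa than labels requiring inference.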

Graduation Date

2020

Semester

Fall

Advisor

Jentsch, Florian

Degree

Doctor of Philosophy (Ph.D.)

College

College of Sciences

Department

Psychology

Degree Program

Psychology; Human Factors Cognitive Psychology

Format

application/pdf

Identifier

CFE0008302

Language

English

Release Date

December 2021

Length of Campus-only Access

1 year

Access Status

Doctoral Dissertation (Campus-only Access)