Abstract

Free content websites that provide free books, music, games, movies, etc., have existed on the Internet for many years. While it is a common belief that such websites might be different from premium websites providing the same content types in terms of their security, a rigorous analysis that supports this belief is lacking from the literature. In particular, it is unclear if those websites are as safe as their premium counterparts. In this dissertation, we set out to investigate the similarities and differences between free content and premium websites, including their risk profiles. Moreover, we analyze and quantify through measurements the potential vulnerability of free content websites. For this purpose, we compiled a dataset of free content websites offering books, games, movies, music, and software. For comparison purposes, we also sampled a dataset of premium content websites, where users need to pay for using the service for the same type of content. For our modality of analysis, we use the SSL certificate's public information, HTTP header information, reported privacy and data sharing practices, top-level domain information, and website files and loaded scripts. The analysis is not straightforward, and en route, we address various challenges, including labeling and annotation, privacy policy understanding through a highly accurate pre-trained language model using advanced ensemble-based classification technique at the sentence and paragraph level, and data augmentation through various sources. This dissertation delivers various significant findings and conclusions concerning the security of free content websites. Our findings raise several concerns, including that the reported privacy policies may not reflect the data collection practices used by service providers, and pronounced biases across privacy policy categories. Overall, our study highlights that while there are no explicit costs associated with those websites, the cost is often implicit, in the form of compromised security and privacy.

Notes

If this is your thesis or dissertation, and want to learn how to access it or for more information about readership statistics, contact us at STARS@ucf.edu

Graduation Date

2023

Semester

Summer

Advisor

Mohaisen, David

Degree

Doctor of Philosophy (Ph.D.)

College

College of Engineering and Computer Science

Department

Computer Science

Degree Program

Computer Science

Format

application/pdf

Identifier

CFE0009641; DP0027465

URL

https://purls.library.ucf.edu/go/DP0027465

Language

English

Release Date

February 2024

Length of Campus-only Access

1 year

Access Status

Doctoral Dissertation (Open Access)

Share

COinS