Abstract
Free content websites, which provide books, music, games, movies, etc. at no cost, have existed on the Internet for many years. Although such websites are commonly believed to differ in security from premium websites offering the same types of content, the literature lacks a rigorous analysis supporting this belief. In particular, it is unclear whether those websites are as safe as their premium counterparts. In this dissertation, we investigate the similarities and differences between free content and premium websites, including their risk profiles. Moreover, we analyze and quantify, through measurements, the potential vulnerability of free content websites. For this purpose, we compiled a dataset of free content websites offering books, games, movies, music, and software. For comparison, we also sampled a dataset of premium content websites, where users must pay to access the same types of content. As modalities of analysis, we use the SSL certificate's public information, HTTP header information, reported privacy and data sharing practices, top-level domain information, and website files and loaded scripts. The analysis is not straightforward, and en route we address various challenges, including labeling and annotation, privacy policy understanding through a highly accurate pre-trained language model using an advanced ensemble-based classification technique at the sentence and paragraph levels, and data augmentation through various sources. This dissertation delivers several significant findings and conclusions concerning the security of free content websites. Our findings raise several concerns, including that the reported privacy policies may not reflect the data collection practices used by service providers, and that there are pronounced biases across privacy policy categories.
Overall, our study highlights that while there are no explicit costs associated with those websites, the cost is often implicit, in the form of compromised security and privacy.
Graduation Date
2023
Semester
Summer
Advisor
Mohaisen, David
Degree
Doctor of Philosophy (Ph.D.)
College
College of Engineering and Computer Science
Department
Computer Science
Degree Program
Computer Science
Format
application/pdf
Identifier
CFE0009641; DP0027465
URL
https://purls.library.ucf.edu/go/DP0027465
Language
English
Release Date
February 2024
Length of Campus-only Access
1 year
Access Status
Doctoral Dissertation (Open Access)
STARS Citation
Alabduljabbar, Abdulrahman, "Towards a Holistic and Comparative Analysis of the Free Content Web: Security, Privacy, and Performance" (2023). Electronic Theses and Dissertations, 2020-2023. 1499.
https://stars.library.ucf.edu/etd2020/1499