Abstract

The first part of the dissertation studies a density deconvolution problem with small Berkson errors. In this setting, the data are not available directly but rather in the form of a convolution, and one needs to estimate the convolution of the unknown density with Berkson errors. While it is known that Berkson errors improve the precision of the reconstruction, this is not necessarily the case when the Berkson errors are small. Furthermore, the choice of bandwidth in density estimation has been an open problem so far. In this dissertation, we provide an in-depth study of the choice of the bandwidth that leads to the optimal error rates.

The second part of the dissertation studies a generative network model, the so-called Popularity Adjusted Block Model (PABM) introduced by Sengupta and Chen (2018). The PABM generalizes popular graph generative models such as the Stochastic Block Model (SBM) and the Degree Corrected Block Model (DCBM). The advantage of the PABM is that, unlike mixed membership models or the DCBM, it does not rely on any identifiability conditions, and it leads to more flexible spectral properties. We expand the theory of the PABM to the case of an arbitrary number of communities, which possibly grows with the number of nodes in the network and is not assumed to be known. We produce estimators of the probability matrix and the community structure and provide non-asymptotic upper bounds for the estimation and clustering errors.

The majority of real-life networks are sparse, in the sense that they have a few high-degree nodes while the rest of the nodes have low degrees. Since the SBM and the DCBM do not allow setting any connection probabilities to zero, they model sparsity by forcing the maximum connection probability to be bounded above by a small quantity, which precludes the existence of high-degree nodes. By contrast, the PABM allows some of the connection probabilities between nodes to be modeled as identically zero while keeping the rest of the probabilities non-negligible. This leads to the Sparse Popularity Adjusted Block Model (SPABM). The SPABM reduces the size of the parameter set and leads to improved precision of estimation and clustering. We produce estimators of the probability matrix and the community structure in the SPABM. Finally, we provide non-asymptotic upper bounds for the estimation and clustering errors.
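As a rough illustration of the sparsity idea described in the abstract, the sketch below builds a connection probability matrix and zeroes out some popularity parameters. It assumes the usual PABM parameterization from Sengupta and Chen (2018), in which the probability of an edge between nodes i and j is the product of node i's popularity in the community of node j and node j's popularity in the community of node i. The network size, number of communities, random seed, 50% zeroing rate, and all function and variable names are hypothetical choices for illustration; this is not the estimation or clustering procedure developed in the dissertation.

```python
import numpy as np

def pabm_probability_matrix(popularity, labels):
    """Assumed PABM form: P[i, j] = popularity[i, labels[j]] * popularity[j, labels[i]]."""
    a = popularity[:, labels]   # a[i, j] = popularity of node i in the community of node j
    return a * a.T              # a.T[i, j] = popularity of node j in the community of node i

def sample_adjacency(prob, rng):
    """Draw a symmetric 0/1 adjacency matrix with no self-loops."""
    n = prob.shape[0]
    upper = np.triu(rng.random((n, n)) < prob, k=1)  # independent edges above the diagonal
    return (upper | upper.T).astype(int)

rng = np.random.default_rng(0)
n, k = 12, 3                                      # hypothetical sizes
labels = rng.integers(0, k, size=n)               # community label of each node
popularity = rng.uniform(0.2, 0.9, size=(n, k))   # dense popularity parameters

# Sparse variant (illustrative): set about half of the popularity parameters
# to exactly zero, so the corresponding connection probabilities vanish
# while the remaining ones stay non-negligible.
zero_mask = rng.random((n, k)) < 0.5
sparse_popularity = np.where(zero_mask, 0.0, popularity)

prob = pabm_probability_matrix(sparse_popularity, labels)
adj = sample_adjacency(prob, rng)
```

In this sketch, a zero popularity parameter removes every potential edge that depends on it, whereas the nonzero entries of `prob` are unaffected and can remain large.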

Notes

If this is your thesis or dissertation and you want to learn how to access it, or you would like more information about readership statistics, contact us at STARS@ucf.edu

Graduation Date

2020

Semester

Summer

Advisor

Pensky, Marianna

Degree

Doctor of Philosophy (Ph.D.)

College

College of Sciences

Department

Mathematics

Degree Program

Mathematics

Format

application/pdf

Identifier

CFE0008225; DP0023579

URL

https://purls.library.ucf.edu/go/DP0023579

Language

English

Release Date

August 2020

Length of Campus-only Access

None

Access Status

Doctoral Dissertation (Open Access)

STARS Citation

Rimal, Ramchandra, "Estimation and Clustering in Network and Indirect Data" (2020). Electronic Theses and Dissertations, 2020-2023. 276.
https://stars.library.ucf.edu/etd2020/276

Included in

Mathematics Commons

Electronic Theses and Dissertations, 2020-2023

Estimation and Clustering in Network and Indirect Data
