Title
Efficient Methods For Sampling Responses From Large-Scale Qualitative Data
Keywords
Consumer-generated content; Consumer-generated media; Customer feedback on the web; Large-scale qualitative data sets; Qualitative comments; Sampling open-ended questions; Text mining
Abstract
The World Wide Web contains a vast corpus of consumer-generated content that holds invaluable insights for improving the product and service offerings of firms. Yet the typical method for extracting diagnostic information from online content, text mining, has limitations. As a starting point, we propose analyzing a sample of comments before initiating text mining. Using a combination of real data and simulations, we demonstrate that a sampling procedure that selects respondents whose comments contain a large amount of information is superior to the two most popular sampling methods, simple random sampling and stratified random sampling, in gaining insights from the data. In addition, we derive a method that determines the probability of observing diagnostic information repeated a specific number of times in the population, which will enable managers to base sample size decisions on the trade-off between obtaining additional diagnostic information and the added expense of a larger sample. We provide an illustration of one of the methods using a real data set from a website containing qualitative comments about staying at a hotel and demonstrate how sampling qualitative comments can be a useful first step in text mining. © 2011 INFORMS.
Publication Date
5-1-2011
Publication Title
Marketing Science
Volume
30
Issue
3
Number of Pages
532-549
Document Type
Article
Personal Identifier
scopus
DOI Link
https://doi.org/10.1287/mksc.1100.0632
Copyright Status
Unknown
Scopus ID
79957661901 (Scopus)
Source API URL
https://api.elsevier.com/content/abstract/scopus_id/79957661901
STARS Citation
Singh, Surendra N.; Hillmer, Steve; and Wang, Ze, "Efficient Methods For Sampling Responses From Large-Scale Qualitative Data" (2011). Scopus Export 2010-2014. 3524.
https://stars.library.ucf.edu/scopus2010/3524