Title

Efficient Methods For Sampling Responses From Large-Scale Qualitative Data

Keywords

Consumer-generated content; Consumer-generated media; Customer feedback on the web; Large-scale qualitative data sets; Qualitative comments; Sampling open-ended questions; Text mining

Abstract

The World Wide Web contains a vast corpus of consumer-generated content that holds invaluable insights for improving the product and service offerings of firms. Yet the typical method for extracting diagnostic information from online content, text mining, has limitations. As a starting point, we propose analyzing a sample of comments before initiating text mining. Using a combination of real data and simulations, we demonstrate that a sampling procedure that selects respondents whose comments contain a large amount of information is superior to the two most popular sampling methods, simple random sampling and stratified random sampling, in gaining insights from the data. In addition, we derive a method that determines the probability of observing diagnostic information repeated a specific number of times in the population, which will enable managers to base sample size decisions on the trade-off between obtaining additional diagnostic information and the added expense of a larger sample. We provide an illustration of one of the methods using a real data set from a website containing qualitative comments about staying at a hotel and demonstrate how sampling qualitative comments can be a useful first step in text mining. © 2011 INFORMS.

Publication Date

5-1-2011

Publication Title

Marketing Science

Volume

30

Issue

3

Pages

532-549

Document Type

Article

Personal Identifier

scopus

DOI Link

https://doi.org/10.1287/mksc.1100.0632

Scopus ID

79957661901 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/79957661901
