Efficient Methods for Sampling Responses from Large-Scale Qualitative Data

    Authors

    S. N. Singh; S. Hillmer; Z. Wang

    Abbreviated Journal Title

    Mark. Sci.

    Keywords

    consumer-generated media; consumer-generated content; customer feedback on the Web; text mining; qualitative comments; large-scale qualitative data sets; sampling open-ended questions; Business

    Abstract

    The World Wide Web contains a vast corpus of consumer-generated content that holds invaluable insights for improving the product and service offerings of firms. Yet the typical method for extracting diagnostic information from online content, text mining, has limitations. As a starting point, we propose analyzing a sample of comments before initiating text mining. Using a combination of real data and simulations, we demonstrate that a sampling procedure that selects respondents whose comments contain a large amount of information is superior to the two most popular sampling methods (simple random sampling and stratified random sampling) in gaining insights from the data. In addition, we derive a method that determines the probability of observing diagnostic information repeated a specific number of times in the population, which will enable managers to base sample size decisions on the trade-off between obtaining additional diagnostic information and the added expense of a larger sample. We provide an illustration of one of the methods using a real data set from a website containing qualitative comments about staying at a hotel and demonstrate how sampling qualitative comments can be a useful first step in text mining.
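
The sample-size trade-off mentioned in the abstract can be illustrated with a standard hypergeometric calculation. The sketch below is not the authors' derivation, which this record does not reproduce. It assumes simple random sampling without replacement and asks: if a piece of diagnostic information appears in `k` of `N` comments, how likely is a sample of `n` comments to contain it at least once? The function names `prob_observed` and `min_sample_size` are hypothetical, chosen for illustration only.

```python
from math import comb


def prob_observed(N: int, k: int, n: int) -> float:
    """Probability that an item carried by k of N comments shows up at
    least once in a simple random sample of n comments drawn without
    replacement (assumed sampling scheme, not taken from the article)."""
    if not (0 <= k <= N and 0 <= n <= N):
        raise ValueError("need 0 <= k <= N and 0 <= n <= N")
    # P(miss) = C(N - k, n) / C(N, n); comb returns 0 when n > N - k.
    return 1.0 - comb(N - k, n) / comb(N, n)


def min_sample_size(N: int, k: int, target: float) -> int:
    """Smallest n whose detection probability reaches `target`.
    Raises ValueError if no sample size up to N achieves it."""
    for n in range(N + 1):
        if prob_observed(N, k, n) >= target:
            return n
    raise ValueError("target probability is not reachable")


if __name__ == "__main__":
    # Hypothetical numbers: 5,000 comments, an issue mentioned by 25 of them.
    for n in (50, 200, 500):
        print(n, round(prob_observed(5000, 25, n), 3))
    print(min_sample_size(5000, 25, 0.95))
```

Under these assumptions, a manager could read the output as a cost curve: each increase in `n` buys a higher chance of seeing a rarely repeated comment, and `min_sample_size` gives the point at which a chosen detection probability is first met.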

    Journal Title

    Marketing Science

    Volume

    30

    Issue/Number

    3

    Publication Date

    1-1-2011

    Document Type

    Article

    Language

    English

    First Page

    532

    Last Page

    549

    WOS Identifier

    WOS:000291010200012

    ISSN

    0732-2399
