Elisabed Mamradze & Leonie Gehrmann
Understanding consumer needs and preferences is crucial for several reasons. For instance, it allows firms to develop market segmentation strategies based on the differences and similarities in consumer opinion (Timoshenko and Hauser, 2019 p. 1). Also, it gives firms the ability to evaluate current performance and recognize opportunities to improve existing products or develop new ones (Timoshenko and Hauser, 2019 p. 1). All in all, consumer opinion is a major driver of firm decisions, ultimately determining profit.
Recognizing the importance of accurately assessing such attitudes, this article compares challenges and advantages of traditional methods used to capture consumer opinion with newer approaches that make use of user-generated content (UGC). In the analysis, biases are considered as the main determinant of data quality. Commonly, such biases are grouped into those associated with the design and structure of the data collection method and those explained by the social context and consumer characteristics or behavior.
Traditional Data Collection Approaches
Overall, traditional methods can be grouped into oral interviews and written surveys. Initially, researchers used face-to-face or telephone interviews, as well as pen-and-paper questionnaires (Timoshenko and Hauser, 2019 p. 2). Later, with the increased availability of the internet, online surveys were also conducted to collect data efficiently. Compared with face-to-face interviews, telephone interviews are cheaper and easier to organize (Locander and Burton, 1976 p. 189). Also, respondents tend to be more honest because the process of interviewing is more impersonal and provides a greater sense of anonymity (Falthzik, 1972 p. 451). Nonetheless, if trust between the interviewer and respondent is established, sensitive personal subjects might be revealed more readily in the face-to-face context (Herman, 1977 p. 403). In contrast to interviews, written surveys do not involve human interaction, providing a higher degree of anonymity and allowing researchers to address large and widely dispersed samples with potentially sensitive issues (Labrecque, 1978 p. 82; Dickson and MacLachlan, 1996 p. 113). Among written surveys, online surveys are cheaper and faster, and they generate a higher response rate than those administered by mail (Deutskens, de Ruyter and Wetzels, 2006, p. 346).
Most biases that affect the quality of data gathered through these approaches are explained by the method design. One important issue is nonresponse bias, which occurs if participants differ substantially from those who do not respond or who drop out of the study (Armstrong and Overton, 1977 p. 396). Interviews most often suffer from this bias when difficulties reaching respondents arise. Studies show a significant relationship between the day of the week and telephone interview completion rates (Falthzik, 1972 p. 451). The results suggest that the response rate is highest in the morning and lowest on Fridays, and that it is likely to increase with second calls, although these cost additional time (Falthzik, 1972 p. 451; Boyd and Westfall, 1970 p. 249). Nonresponse rates generally tend to be very high for surveys administered by mail (Labrecque, 1978 p. 82; Dickson and MacLachlan, 1996 p. 113).
Another commonly encountered phenomenon is interpretation bias (Gal and Rucker, 2011 p. 186). It occurs when respondents fail to correctly understand, or are confused about, what they are asked to do. It can be caused, for instance, by the wording of questions in a survey or by the interviewer misinterpreting the respondent (Locander and Burton, 1976 p. 192; Bailar, Bailey and Stevens, 1977 p. 337).
Finally, interviewer bias refers to the effect of the interviewer’s personal or physical characteristics and behavior on respondents (Lovelock, Stiff, Cullwick and Kaufman, 1976 p. 362). To account for this, specific instructions should be established in advance to ensure consistency among interviewers. Indeed, studies show that data is more precise if the interviewer has clear guidelines on the presentation of survey goals and on the provision of clear feedback to respondents on the adequacy of their answers (Cannell, Oksenberg and Converse, 1977 p. 310).
A bias that is independent of the collection method and instead explained by consumer behavior can also occur. Social desirability bias refers to respondents’ tendency to conform to social norms and expectations instead of providing answers that reflect their actual attitudes, beliefs, and thoughts (Schoenmueller, Netzer and Stahl, 2020 p. 867; Fisher, 1993 p. 303). A study shows that evaluations of socially sensitive outcomes differ between indirect and direct questioning (Fisher, 1993 p. 313). Furthermore, researchers find that the greater feeling of anonymity in online surveys can reduce the importance respondents assign to the social context, driving them to provide more impulsive and self-centred answers (Deutskens, de Ruyter and Wetzels, 2006, p. 352). These results suggest that the occurrence of social desirability bias depends on both the questioning method and the degree of anonymity, and that it can be reduced by indirect questioning (Fisher, 1993 p. 313).
Data Collection Approaches Using UGC
Rapid technological change has tremendously affected the development of markets and shaped the interactions of firms and consumers. With the growing variety and availability of digital channels, it has become increasingly challenging to track and structure consumer-related data, as well as to capture and interpret consumer opinion. The increasing popularity of social media has given rise to an explosion of text data generated by consumers, for instance in the form of reviews, tweets, or blogs (Humphreys and Wang, 2017 p. 1274). Such information shared online offers firms unique possibilities to learn about the underlying values and preferences of consumers. Beyond giving insight into actual opinion, these texts also reflect authors’ personal characteristics and the underlying context (Berger et al., 2019 p. 2).
In contrast to more traditional approaches, the use of UGC to capture consumer opinions is mainly affected by biases that are explained by consumer behavior. Indeed, consumer reviews often exhibit self-selection bias and a polarized distribution of ratings, which calls into question their ability to reflect the overall opinion of the entire population (Schoenmueller, Netzer and Stahl, 2020 p. 873). This bias is reflected in the observation that a consumer’s decision to share an opinion depends on a variety of factors, including the product experience, existing reviews, and the costs associated with publishing the review (Schoenmueller, Netzer and Stahl, 2020 p. 856). Authors of reviews tend to be consumers who have already purchased the product and have had either extremely positive or extremely negative experiences (Moe and Schweidel, 2012 p. 383).
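To make the idea of a polarized review distribution concrete, the following minimal sketch computes the share of ratings at the two ends of a rating scale. It assumes a 1-to-5 star scale; the function name `extreme_share` and the sample ratings are hypothetical and purely illustrative, not data or a measure taken from the cited studies.

```python
from collections import Counter

def extreme_share(ratings, low=1, high=5):
    """Return the fraction of ratings that sit at either end of the scale.

    Assumption: ratings are integers on a low..high star scale.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    counts = Counter(ratings)
    return (counts[low] + counts[high]) / len(ratings)

# Made-up ratings for a single product, for illustration only.
ratings = [5, 5, 5, 1, 5, 4, 1, 5, 2, 5]
share = extreme_share(ratings)
print(f"Share of 1- and 5-star ratings: {share:.0%}")
```

A high share would be consistent with the pattern described above, in which mostly consumers with extreme experiences choose to post, but on its own it does not show how far the reviews deviate from opinion in the wider population.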
Another common phenomenon is negativity bias, which captures the tendency to discount positive information and place more emphasis on negative aspects (Chen and Lurie, 2013 p. 464). Individuals tend to think that positive reviews are published to please others, whereas negative information is believed to be more objective (Chen and Lurie, 2013 p. 464). On average, active reviewers tend to be negative and offer a differentiated opinion (Schoenmueller, Netzer and Stahl, 2020 p. 874; Moe and Schweidel, 2012 p. 385). Also, negative reviews can easily become contagious, spreading quickly and going viral (Herhausen et al., 2019 p. 16). Besides potentially impacting purchase decisions, negative reviews might even lead others to discount positive attributes of a product when publishing their own review (Chen and Lurie, 2013 p. 4).
In the online context, the decision to post a review, as well as the content and language used, is often also affected by social desirability bias (Humphreys and Wang, 2017 p. 1280; Schoenmueller, Netzer and Stahl, 2020 p. 867). The opinions consumers share with their closest social connections are less affected by self-presentation motives and differ from what is posted online for the public to see (Berger et al., 2019 p. 5). Additionally, consumers might publish reviews with certain content solely for the purpose of becoming part of a particular community or winning a certain reward (Hughes, Swaminathan and Brooks, 2019 p. 92).
Nonetheless, the use of UGC to capture consumer opinion can also fall prey to a bias arising from the applied method. In the online environment, interpretation bias refers to failures in the text analysis and interpretation of consumer reviews (Gal and Rucker, 2011 p. 186). The selection of an appropriate text analysis approach is necessary to avoid such issues. For example, word extraction is used frequently because it is relatively simple: it counts and analyzes the words used. However, it might lead to misinterpretation, as the same word can have different meanings (Berger et al., 2019 p. 10). Topic extraction identifies the underlying topics of discussions. In this context, a topic is defined as a distribution of words that often occur together (Berger et al., 2019 p. 11). This approach overcomes the limitations of word extraction, but interpreting the topics and identifying their number remain challenging (Berger et al., 2019 p. 12). Furthermore, researchers must consider that review length is affected by factors other than consumer intent. Several platforms restrict text length by setting a character limit (Berger et al., 2019 p. 5). Studies also show that tweets posted from a smartphone are typically shorter and contain more emotional words than those posted from a computer (Melumad, Inman and Pham, 2019 p. 272).
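As a rough illustration of word extraction and its main weakness, the sketch below counts word frequencies across a few reviews using only the Python standard library. The function name `word_counts`, the stopword list, and the two sample reviews are assumptions made for this example; real studies typically rely on more careful tokenization and dedicated text analysis tools.

```python
import re
from collections import Counter

def word_counts(reviews, stopwords=frozenset()):
    """Count how often each word occurs across all reviews, ignoring stopwords."""
    counts = Counter()
    for text in reviews:
        # Very simple tokenizer: lowercase, keep runs of letters and apostrophes.
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in stopwords)
    return counts

# Made-up reviews, for illustration only.
reviews = [
    "The battery is light and lasts all day.",
    "The screen light is too dim at night.",
]
STOPWORDS = frozenset({"the", "is", "and", "at", "too", "all"})

counts = word_counts(reviews, STOPWORDS)
print(counts.most_common(3))
```

Here "light" is counted twice, although it describes weight in the first review and illumination in the second. A pure count cannot separate the two senses, which is the kind of misinterpretation described above.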
Having reviewed various approaches to capture consumer opinion and considered potential biases affecting the data quality, this article ends by providing a comparison of the methods in terms of their cost, objectivity, reliability, and validity.
Generally speaking, newer approaches are cheaper, faster and more easily accessible for those interested in conducting such research (Deutskens, de Ruyter and Wetzels, 2006, p. 346; Humphreys and Wang, 2017, p. 1274). While face-to-face interviews are the most expensive, they can provide deeper insight by additionally capturing respondent behavior and body language. Telephone interviews tend to have higher response rates paired with lower costs than mail surveys, but the interviewer can bias the results (Locander and Burton, 1976 p. 189). Written surveys, in contrast, usually reach a broader audience and provide respondents with a sense of anonymity (Dickson and MacLachlan, 1996 p. 113). Obtaining UGC is comparatively easy, but filtering the relevant reviews from the large amount of data and analyzing them afterwards can be complex, costly, and time-consuming (Humphreys and Wang, 2017, p. 1274).
With UGC, objectivity is achieved in a relatively straightforward manner, since the content is created by users themselves and is independent of those collecting the data. In interviews, however, different interviewers can obtain different results due to interpretation or social desirability bias (Lovelock, Stiff, Cullwick and Kaufman, 1976 p. 362). Ideally, surveys written to investigate a certain question should yield the same results regardless of who designs them, but studies find that a survey designed by a different researcher can produce different results because of differences in question wording or layout (Gal and Rucker, 2011 p. 186).
The interviewer effect and human interaction can lead to changing results in the case of interviews conducted repeatedly under the same conditions. Since these effects are missing in written surveys, such approaches are likely more reliable. However, studies show that online opinions are characterized by sequential dynamics. The difficulty for users in identifying relevant reviews, coming to the purchase decision, and publishing a review increases with the number of reviews existing on the platform (Godes and Silva, 2012 p. 446). Hence, repeated data collection might yield different results and patterns of opinion development (Godes and Silva, 2012 p. 446).
Finally, all methods can suffer from validity issues. For traditional approaches, the misinterpretation of questions and social desirability bias can lead to invalid results, although researchers generally retain a high degree of control over the survey design (Gal and Rucker, 2011 p. 186). In the case of UGC, validity is especially problematic since structuring the huge amount of text data, filtering the relevant parts and interpreting the underlying context can be very challenging (Berger et al., 2019 p. 6). A critical researcher should always question whether the data obtained actually address the research questions as originally designed.
In summary, each approach to capturing consumer opinion is associated with both challenges and benefits. While the general effort to understand the needs and preferences of consumers is important, carefully selecting the appropriate method to ensure unbiased inferences is equally crucial.