December 2009
Practitioners' Place
Want more actionable segments? Maybe try Max-Diff. In this recently published Quirk's article TRC's Chief Research Officer Rajan Sambandam explores the use of Max-Diff to achieve greater discrimination in your data and, ultimately, to develop better and more useful segment solutions.
Significant differences are not always what they seem. We all run cross-tabs and t-tests searching for segments that stand out from the pack. But what looks to be significant in isolation may not be once you control for other more important factors, and this reality can make a big difference to researchers looking for real insights. Read more about how to avoid "significant mistakes" in our most recent TRC Market Research white paper, Beyond Crosstabs: The Usefulness of Covariate Analysis.
R² (Our CEO, Rich Raquet, on Research)
Are Researchers Too Ethical? Seems like a silly question. But are our efforts to build a wall around our industry, and to keep out the ideas and practices of telemarketers at all costs, always serving us or respondents well? Read Rich's thoughts on the matter here.
Insightology
Size matters: reflections on our (in)significance. No grand research lessons here - just a couple of nifty blog posts that put things in perspective. What exactly does "big" mean? What are we - big or small? Check out these and other Insightology posts, including a very cool data visualization device.
Beyond Crosstabs: The Usefulness of Covariate Analysis
Consider a very common occurrence in marketing research: the need to tell if two groups are different on a given measure. Examples could be differences between men and women in an awareness study, or current and former customers in a satisfaction study. The usual approach in such situations is to use cross-tabulation, where the groups in question are positioned as banner points and the measure is set up as a stub. The objective is to see if there are statistically significant differences between the two groups on the given measure. If there are, the groups are then deemed to be different from each other on that measure. This is a commonly used procedure and, in fact, many cross tabulations are run after research studies to identify if differences exist between specified groups. But could there be a problem with the conclusion that the two groups are different? Are there situations where the cross tabulations could show a difference that did not exist in reality? Yes, and in this article we will look at one major issue and the solution.
Imagine that a company is conducting a satisfaction study and is interested in understanding whether satisfaction varies by ethnicity. Let’s say it is a financial services company that has branches in different parts of the country, some of which are more likely to serve customers of certain ethnicities. Hence, understanding the relationship between satisfaction and ethnicity is important to the company. Let’s also say that the company is particularly concerned about Hispanic customers because it sees them as a rapidly growing market and one that is potentially vulnerable to low satisfaction scores. The common approach here would be to run cross tabulations to identify if differences exist between Hispanic and non-Hispanic customers on satisfaction. Let’s say the company did run this analysis and it showed a significant difference between the satisfaction scores of these two groups. If the analysis stopped there the recommendation to the marketing team may have been to explore further the reasons for this discrepancy and perhaps to spend considerable resources trying to address the problem.
Bivariate and Multivariate Analysis
But was there really a difference between the two groups? This is where multivariate analysis comes into play. To understand it, let’s first take a step back and consider an old but reliable technique: multiple regression analysis. This technique is normally used in situations where one needs to identify the key drivers of a given target variable, say, overall satisfaction. The general approach would be to run a regression model with overall satisfaction as the outcome (or dependent) variable and a group of attribute satisfaction questions as predictor (or independent) variables. While most or all of those attributes may have a positive and significant correlation with overall satisfaction, regression works in such a way that it identifies the individual, adjusted impact of each predictor variable. Let’s consider this in a bit more detail.
Let’s say there are two variables A and B that have large, positive and significant correlations with overall satisfaction (see Figure 1). Let’s also say that A and B are largely independent of each other, that is, they have very low correlation with each other. The regression works in such a way that, more often than not, both A and B will show up as key drivers of overall satisfaction. That is because they are independently contributing to changes in overall satisfaction. To put it another way, a change in A could influence overall satisfaction but not affect B, while a change in B could influence overall satisfaction without affecting A. So it is reasonable that they should have independent impacts on overall satisfaction with neither being overly affected by the other.
Now consider two more variables C and D that also have large, positive and significant correlations with overall satisfaction (see Figure 2). Let’s also say that C and D have a strong positive correlation with each other, that is, they are not independent. More likely than not, one of these variables will show up as a strong driver of overall satisfaction, while the other may not show up as a driver at all. Again this is reasonable, since having one variable is sufficient to explain changes in overall satisfaction and having both would be over-counting the impact of the construct that the two variables jointly represent.
So, what we see here is that multiple regression does a nice job of looking at the relationship between predictor variables and telling us which ones truly drive the outcome variable. It is able to do this because it calculates the impact of each variable when controlling for the impact of the other variables. Hence the impact of A is the unique impact of A alone, holding all other variables constant. The same holds for B, C and the other variables. This is why it is known as an additive model: the unique impact of each variable can be added to the others to give the total impact on overall satisfaction. If one wanted to improve overall satisfaction, one could make efforts to influence, say, A and B and know that both will have an impact.
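For readers who like to see the arithmetic, here is a minimal sketch of the two cases in Python. Everything in it is made up for illustration: the variable names, the weights (1.5, 1.0 and 2.0) and the tiny balanced data set are assumptions, not figures from any study. In the second case D is deliberately built as a noisy copy of C with no effect of its own, which is a stylized version of two variables that represent one construct.

```python
from itertools import product


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Population covariance of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def simple_slope(x, y):
    """Slope from regressing y on x alone (the bivariate view)."""
    return cov(x, y) / cov(x, x)


def adjusted_slopes(x1, x2, y):
    """Slopes from regressing y on x1 and x2 together (least squares)."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det


# A balanced grid, so the three building blocks are exactly uncorrelated.
grid = list(product([0, 1, 2, 3], [-0.5, 0.5], [-1, 1]))
noise = [g[2] for g in grid]

# Case 1: A and B are independent and both feed overall satisfaction.
a = [g[0] for g in grid]
b = [g[1] for g in grid]
sat_ab = [1.5 * ai + 1.0 * bi + ni for ai, bi, ni in zip(a, b, noise)]

# Case 2: D is a noisy copy of C; only C feeds overall satisfaction.
c = [g[0] for g in grid]
d = [g[0] + g[1] for g in grid]
sat_cd = [2.0 * ci + ni for ci, ni in zip(c, noise)]

print("A alone, B alone:", simple_slope(a, sat_ab), simple_slope(b, sat_ab))
print("A and B together:", adjusted_slopes(a, b, sat_ab))
print("C alone, D alone:", simple_slope(c, sat_cd), simple_slope(d, sat_cd))
print("C and D together:", adjusted_slopes(c, d, sat_cd))
```

In this toy data A and B keep the same weights whether they are examined alone or together. D looks like a strong driver on its own, but its weight drops to zero once C is in the model.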
Control
The idea of control is what is important for our situation. Because regression is able to control for the impact of other variables and identify the true impact of a given variable, it’s a useful tool to help understand if what we saw in the cross tabulations was real or not. Imagine that in our problem we decided to run a multiple regression to understand the problem more clearly. The outcome variable would be overall satisfaction. If Hispanic or not were to be used as the only predictor variable, it would come out as a significant key driver, just as seen in the cross tabulation. Now let’s say we decide to add other variables (generally called covariates) to the regression model. The covariates added here represent the states where the data were collected. Now we have a regression model with overall satisfaction as the outcome variable, and predictor variables comprising ethnicity and geography. Now let’s say the ethnicity variable is no longer a significant key driver. What’s the explanation?
By including states as additional predictor variables or covariates, we are controlling for state-based differences in overall satisfaction. If it so happens that there are profound differences between the states on overall satisfaction then, depending on how the patterns fall out, controlling for them may cancel out the apparent impact of the differences between Hispanic and non-Hispanic customers.
This, in fact, is a real example. In doing this analysis, we did find that the impact of ethnicity disappeared when controlling for geography, and there was a reason for that. One of the state-based independent variables (Florida) came out as a strong key driver. That is, respondents from Florida were generally very dissatisfied with the company compared to those in other states. The company was in the insurance business and hence the result was understandable as Florida is in the hurricane zone. Since most of the Hispanic respondents in the study happened to be located in Florida, it appeared that Hispanics were less satisfied than other ethnic groups. Since cross tabular analysis is bivariate in nature (two variables at a time), what else is it going to show?
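The pattern described above can be reproduced with a small, entirely synthetic data set. The sketch below is not the study's data or code: the cell counts, the 10-point-style scores and the helper names (`make_respondents`, `ols`) are assumptions chosen so that satisfaction depends only on whether a respondent is in Florida, while most Hispanic respondents happen to be in Florida.

```python
def make_respondents():
    """Synthetic respondents: (florida, hispanic, satisfaction)."""
    # Hypothetical cell sizes: most Hispanic respondents are in Florida.
    cells = [(1, 1, 240), (1, 0, 60), (0, 1, 60), (0, 0, 240)]
    data = []
    for florida, hispanic, count in cells:
        for i in range(count):
            wobble = (-1, 0, 1)[i % 3]  # balanced spread within each cell
            # Satisfaction depends on state only, not on ethnicity.
            data.append((florida, hispanic, 8 - 3 * florida + wobble))
    return data


def ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X)b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    m = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(k):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][k] / m[i][i] for i in range(k)]


data = make_respondents()
y = [sat for _, _, sat in data]

# Bivariate view: intercept + Hispanic indicator (what the cross-tab sees).
unadjusted = ols([[1, h] for _, h, _ in data], y)

# Covariate view: intercept + Hispanic indicator + Florida indicator.
adjusted = ols([[1, h, f] for f, h, _ in data], y)

print("Hispanic effect, no covariates:  %.2f" % unadjusted[1])
print("Hispanic effect, Florida added:  %.2f" % adjusted[1])
print("Florida effect:                  %.2f" % adjusted[2])
```

Without the covariate, the Hispanic indicator carries a gap of about 1.8 points. With the Florida indicator in the model, that gap essentially vanishes and the whole 3-point difference is attributed to the state.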
What we have done here is apply experimental principles to survey research. Commonly in experiments, every variable is controlled except for one and that provides information on what truly influences what. So, if one wanted to know if an attractive face has an impact on the take rates of a banking brochure, then one could create two brochures that are identical in all respects except for the presence of an attractive face in one. If that brochure has a higher take rate, then the attractive face must be the reason (this is also a real example). In survey research, by contrast, there are plenty of uncontrolled variables and we have a hard time asserting that X causes Y. One way to eliminate a good level of uncertainty is to conduct the covariate analysis outlined above, where we seek to control for as many variables as possible in order to identify the true impact of the variable in question.
In what situations can one apply this analysis? In pretty much any case where previously a cross tabulation would have been used to form conclusions about something. Studies for understanding take rates, satisfaction, awareness, ad effectiveness, etc. are all good candidates for this kind of analysis.
Are Researchers Too Ethical?
Got an interesting question in my LinkedIn morning update about the ethics of Market Researchers doing Market Intelligence work. While the question was vague enough to be unanswerable (what sort of Market Intelligence are you talking about?), it got me thinking about ethics. Specifically, I’ve been thinking that researchers are too often focused on strict ethical rules rather than on doing the right thing.
So, right off, let me state that I totally believe in obeying laws, regulations and, yes, ethics. This extends to our dealings with clients, vendors and, most importantly, respondents. I wouldn’t want to work in an industry that doesn’t take ethical responsibility seriously. I'm concerned, however, that we don't apply ethical standards intelligently. This, in turn, works counter to the principles our ethics claim to protect and harms our effectiveness as an industry.
Telemarketers: Our Nemesis?
For example, researchers talk a lot about how important respondents are. When I was starting out in the 80's we spoke with contempt about telemarketers and their wanton disregard for the rights of those they were calling. We rightly pushed for legislation that outlawed SUGGING, and held ourselves to a standard of confidentiality that ensured our work would never be used as a direct sales tool.
I’ve come to believe that most of these beliefs or assumptions were incorrect.
In the 90’s I got to know a number of telemarketers, and I found the experience was far from what I expected. Their call centers were as nice as, or nicer than, those used in research. They screened and trained their employees thoroughly and held them to a very high professional standard. Most shocking to me was the fact that they recorded all their sales on tapes (they had rooms full of them). I realized they validated their sales better than we did our interviews. As a result, TRC started recording all our work (couldn’t allow telemarketers to do something better than us), a practice that is more common now, but 15+ years later still far from universal.
I’m sure there are unethical telemarketers…but there are plenty of unethical researchers too. Meanwhile, our contempt for all things telemarketing has led to an even greater focus on never using survey efforts to directly sell to respondents. On the surface, this makes sense…valuing respondents means keeping our word and respecting their privacy in every form. Often, however, it gets in the way of doing what respondents want.
Two examples
When I was an interviewer in college, the firm I worked for did a product placement for Post It Notes. As you might imagine, this new and innovative product was garnering rave reviews from those we called. Many asked us if they could order more…even if they had to pay for them. Lots of discussion ensued before we were provided with a name and number we could give consumers who wanted to call and order more. Respondents didn't understand. Why didn't we just have the company call them? Even with this compromise, many old school researchers at the firm thought we were “selling out.”
Within the last decade I attended an annual CASRO meeting where a CRM firm talked about how they took negative responses from survey cards (the ones left in, say, a hotel room) and passed them on to the people responsible for the problems cited. In turn, those people would follow up with the customer to try and fix their problems. Researchers were appalled at this violation of respondent privacy, but then the CRM rep asked a few simple questions, "What would a respondent expect? Will he or she be more upset if a complaint they've made (and signed their name to) is addressed directly, or if he or she never hears from the company about the problem?" In other words and in both cases, strict adherence to firm ethical rules would not have served the respondent well.
Acting ethically, and wisely
Ultimately, shouldn’t our goal be to do what respondents want? If so, then directly passing on information is often the right thing to do. I believe this should be limited to situations where the respondent agrees to it, and I don’t think researchers should ever go directly from conducting a survey to selling (that’s both illegal and plain wrong). There’s no question that replacing firm rules with the principle of putting the respondent first could lead to abuse…but those who would rationalize their way around this principle are probably the same ones who find a way to wiggle through the rules. I’d go so far as to suggest that if we all started to put the needs of the respondent first, we’d be holding ourselves to a higher standard than we do now.
Think I’m wrong? Then answer this question: Do you think respondents want to do 25-minute repetitive surveys?
Copyright © 2013 TRC.com All Rights Reserved.