Sam Winstanley’s Blog

Market Research Technology News and Views


Teaching computers to think backwards….

September 15th, 2009 · 2 Comments

Did you ever stop to think about what really makes survey data unique? I talk with database people often; some of them work on enormous, ultra-important data-warehousing systems for banks. They pick on me. They say:

“Survey data is simple. You should try building an analysis database from our customer records… my life is hell.”

I shrug this stuff off; it’s easy to pick on the little guy who isn’t building systems with 7-figure budgets and 100-strong programming teams.

Thinking Backwards…

Researchers actually think backwards. They know they are digging for information about how things perform against each other, and they come up with the instrument to measure it, possibly in terms of questions and answers, possibly in terms of qualitative analysis, web mining and so on. However, the starting point is the business problem. Often this turns into some kind of interview script with modules, questions, sample variables and so on. What just happened is that something very business-specific was turned into something deeply generic (questions and answers). This is what I call “downgrading”.

For years, software vendors have been working very hard to build systems for collecting and analysing survey data, and all of them have focussed on ease of creating surveys. In other words, all of them help convert business-minded ideas into generic questions and answers.

Ease of Authoring Might Be a BAD Thing…

You see, the truth is that ease of authoring is another name for data hell. It means that a relative novice can create an immense survey with no thought whatsoever for the real structure of the data that comes out. An entire DP industry and countless DP tools have been founded on creating order from the chaos that is surveys.

We Revolve Around Cross-Tab Reports

I refuse to do an extensive industry study to establish how many cross-tabulation tools exist for market research versus the rest of industry. Whatever the hard numbers, I’m certain the count is extremely disproportionate to the market size.

It’s clear why the market research and survey research industry loves this kind of technology so much, though. It allows primitive information like questions and answers to be “upgraded” back into a business context so that it can be analysed.

So are cross tabs a bad thing?

Cross tabs have a role. They are an analysis tool more than a reporting tool: a fairly neat way to compare X with Y in quite an ad-hoc fashion, which enables exploration of results.
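To make "compare X with Y" concrete, here is a minimal sketch of a cross tab built from flat question-and-answer records. The variable names (`region`, `satisfaction`) and the five respondents are made up purely for illustration; no real tabulation package is being reproduced here.

```python
from collections import Counter

# Hypothetical respondent records: each one is a flat set of
# question -> answer pairs, i.e. the "downgraded" form of the data.
responses = [
    {"region": "North", "satisfaction": "High"},
    {"region": "North", "satisfaction": "Low"},
    {"region": "South", "satisfaction": "High"},
    {"region": "South", "satisfaction": "High"},
    {"region": "North", "satisfaction": "High"},
]


def cross_tab(records, row_var, col_var):
    """Count records for every (row value, column value) combination."""
    counts = Counter((r[row_var], r[col_var]) for r in records)
    rows = sorted({r[row_var] for r in records})
    cols = sorted({r[col_var] for r in records})
    # Counter returns 0 for combinations that never occurred.
    return {row: {col: counts[(row, col)] for col in cols} for row in rows}


table = cross_tab(responses, "satisfaction", "region")
print(table)
# {'High': {'North': 2, 'South': 2}, 'Low': {'North': 1, 'South': 0}}
```

Swapping the two variable names is all it takes to ask a different question of the same records, which is roughly what makes the cross tab handy for ad-hoc exploration.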

However, cross tabs bring baggage in a couple of ways when you use them as a reporting platform:

  • Cross tab reports nearly always show a single source of data, and that data is questionnaire structured, not business structured. This means every piece of information you put into this data source gets “downgraded”, losing its richness and its context.
  • Cross tabs naturally limit analysis. We, “the techies who provide to the MR industry”, try to force statistical models into these reports in a very unnatural way. For instance, nobody outside surveys has really made use of a Column Proportions test, which compares column A with B/C/D/E. That doesn’t mean nobody else in the world ever does that kind of analysis; it just means it is a pretty absurd way to implement the result of a statistical test.
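For readers who haven't met a Column Proportions test, the sketch below shows the general idea under some stated assumptions: each pair of columns is compared with a pooled two-proportion z-test, and the higher column is tagged with the letter of the column it beats. The counts, the 0.05 threshold and the function names are hypothetical, and real tab packages may differ in details such as multiple-comparison corrections or small-base rules.

```python
import math
from itertools import combinations


def two_proportion_z(hits_a, base_a, hits_b, base_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = hits_a / base_a
    p_b = hits_b / base_b
    pooled = (hits_a + hits_b) / (base_a + base_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / base_a + 1 / base_b))
    if se == 0:  # both columns are 0% or 100%: nothing to test
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def column_letters(columns, alpha=0.05):
    """For each column, list the columns it is significantly higher than."""
    flags = {name: [] for name in columns}
    for a, b in combinations(columns, 2):
        z, p_value = two_proportion_z(*columns[a], *columns[b])
        if p_value < alpha:
            higher, lower = (a, b) if z > 0 else (b, a)
            flags[higher].append(lower)
    return flags


# Hypothetical (respondents choosing the answer, column base) per column.
columns = {"A": (60, 100), "B": (40, 100), "C": (55, 100)}
flags = column_letters(columns)
print(flags)  # {'A': ['B'], 'B': [], 'C': ['B']}
```

The output is exactly the row of letters that ends up printed under the percentages in a traditional tab report, which is the presentation the bullet above takes issue with.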

Preaching to the converted

Much of the survey research world has moved on from cross tabs, at least at some level, and I’m not out to teach people to suck eggs. But I see so much evidence that traditional DP and interactive tab packages still provide lots of underpinnings for the industry.

The industry is bound in chains but contains many smart people who do understand that if you think up-front about the way to arrange data, you can provide a much more valuable analysis and reporting experience on the back of it. What we lack are the 7-figure budgets that make it so easy for banks and blue chips to roll out elegant warehouse systems and dynamic business dashboards.

Back to You Mr Database Guru

At the start of the post, I described my poor treatment by my peers in the banking world. The truth is they live in a world of luxury. If they think they have a hard life now, whatever would they do if hundreds of people in the organisation were able to add a new data field to the customer records? Yet that’s virtually the same thing as adding a new question to a questionnaire. Seriously, those guys would suck at surveys in every way…

At forgetdata we spend a lot more time thinking about and designing the strategy for managing and storing survey data than most people. We understand that perfect standardisation of questionnaires isn’t the real world; it’s a rare luxury, and 6-figure data architecture budgets are rarer still.

Our passion is helping the industry solve this kind of problem without having to reinvent every wheel it’s based on: a sensible, business-based, highly controlled kind of chaos.
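The comparison between a new data field and a new question can be sketched with an in-memory SQLite database. This is only an illustration under assumed table and question names (`customers`, `answers`, `Q1_brand_awareness` and so on are all made up); it is not a description of how any particular bank or survey platform stores its data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Warehouse style: one column per field, so the schema is tightly controlled.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")

# Survey style: one row per (respondent, question, answer).
conn.execute(
    "CREATE TABLE answers (respondent_id INTEGER, question TEXT, answer TEXT)"
)

# Adding a field to the customer records is a schema change.
conn.execute("ALTER TABLE customers ADD COLUMN loyalty_score REAL")

# Adding a question to a questionnaire is just more rows; nothing stops
# anyone from inventing a new one.
conn.executemany(
    "INSERT INTO answers VALUES (?, ?, ?)",
    [
        (1, "Q1_brand_awareness", "Yes"),
        (1, "Q2_new_question", "Maybe"),
    ],
)

# PRAGMA table_info returns one row per column; index 1 is the column name.
customer_columns = [row[1] for row in conn.execute("PRAGMA table_info(customers)")]
questions = [
    row[0]
    for row in conn.execute("SELECT DISTINCT question FROM answers ORDER BY question")
]

print(customer_columns)  # ['id', 'name', 'loyalty_score']
print(questions)         # ['Q1_brand_awareness', 'Q2_new_question']
```

The first change is the kind of thing a warehouse team would gate behind a review; the second happens every time someone edits a questionnaire.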

Tags: Data Warehouse · Interviewing · Market Research Industry · Tabulation

2 responses so far ↓

  • 1 Grant // Sep 17, 2009 at 10:12 pm

    Interesting comments, but cross-tabs can combine data from multiple sources, and don’t have to be ‘techo’.

    I am aware, having worked on some ;-), that some cross-tab programs can achieve a good balance of reporting and analysis.

    I think other industries could learn from MR and apply statistical tests in cross-tab like programs; I see other industries like your banking industry example making the same mistakes such as dashboards that don’t apply statistical tests (thus show trends where there may just be noise) and don’t allow analysis.

  • 2 sam // Sep 18, 2009 at 5:52 am

    Hi Grant,

    You make some excellent points… I very much believe that dashboards are a reporting medium rather than an analysis medium. I don’t really agree that they can’t contain statistical tests; sometimes the very essence of what a dashboard is reporting is a statistically derived score. More specifically, most banking dashboards include scores that are derived from models, whether they be “Customer Loyalty”, “Customer Potential Value” etc. When the analyst or data miner was working out the right models to use to generate those scores, it’s very possible that cross tabs were one of the ways that the data was explored (although it’s equally likely that they used some graphical representation which correlates predicted and actual (historic) values).

    In terms of your comments more centered on cross tabs, they are decent exploratory analysis tools in some cases. However, for visualising the results of stat tests I think they do a bad job of showing a picture (even to an analyst). For instance, I think that visual diagrams (web diagrams, heat maps or even regular histograms) are generally a better way to show the output of most of the statistical tests used on research data than a cross tab augmented with stats scores. After all, cross tabs show facts and measurements, while people are looking for actions and insights; cross tabs have quite a small part to play in that overall picture, in my eyes.
