How to Clean and Prepare Open-ended Feedback Data

Open-ended responses are a crucial part of customer feedback surveys. Analyzing the responses can be costly and time-consuming, but the payoffs are clear. The responses are more detailed and often more valuable than closed-ended responses, and they can help you gain an in-depth understanding of your customers. Here are five ways to clean and prepare your data for analysis.

Coding feedback

The most effective way to analyze open-ended customer feedback survey data is to code your responses manually. This involves reading through your feedback data and assigning each response to a category. For example, you could have a category for “price,” and any comment that mentions the cost of your product would fall under that category. Manual coding is costly and time-consuming, but a human coder reads each response in context. A keyword-based algorithm only matches the words it is given, so it is susceptible to misinterpreting feedback that uses unexpected wording or mentions a keyword only in passing.
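To make the contrast concrete, here is a minimal Python sketch of the keyword matching described above, used only as a first pass that suggests categories and leaves everything else for a human coder. The codebook, its keywords, and the function name `suggest_codes` are illustrative assumptions, not part of any particular tool.

```python
import re

# Hypothetical codebook: category name -> keywords that suggest it.
CODEBOOK = {
    "price": {"price", "cost", "expensive", "cheap", "affordable"},
    "support": {"support", "helpdesk", "agent"},
}

def suggest_codes(response, codebook=CODEBOOK):
    """Return the sorted categories whose keywords appear in the response."""
    words = set(re.findall(r"[a-z']+", response.lower()))
    return sorted(cat for cat, keywords in codebook.items() if words & keywords)

responses = [
    "The cost is too high for what you get.",
    "Your support agent was brilliant.",
    "It keeps crashing on startup.",
]
for text in responses:
    codes = suggest_codes(text)
    # Anything without a suggested category is left for a person to code.
    print(codes or ["NEEDS MANUAL REVIEW"], "-", text)
```

A sketch like this only sees exact words, which is the weakness the paragraph describes: a person still needs to review the suggestions and code whatever the keywords miss.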

Removing spam and nonsensical responses

Unfortunately, not all customer feedback is useful. Some responses will be nonsensical, either intentionally or unintentionally, and others will be completely unrelated to your business or product. Removing these responses will improve your analysis by stripping a lot of the noise from your data. Text analysis algorithms are especially vulnerable to irrelevant data because they process every response they are given. Sentiment analysis algorithms will assign values to junk responses, and topic modeling algorithms will identify keywords from nonsensical comments.
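As a rough illustration, the Python sketch below flags responses that are probably junk so that a person can review them before they are removed. The function name `looks_like_junk` and all three rules and thresholds are assumptions chosen for this example. They will not catch off-topic comments written in ordinary sentences, and they may need tuning for your data.

```python
import re

def looks_like_junk(response, min_letters=2):
    """Heuristic check for empty, symbol-only, or keyboard-mash responses."""
    text = response.strip()
    # Rule 1: too few letters (empty, or only digits and punctuation).
    if len(re.findall(r"[A-Za-z]", text)) < min_letters:
        return True
    # Rule 2: one character repeated six or more times in a row.
    if re.search(r"(.)\1{5,}", text):
        return True
    # Rule 3: no vowels at all, which suggests keyboard mashing.
    if not re.search(r"[aeiouy]", text.lower()):
        return True
    return False

responses = ["The price is fair.", "sdfghjkl", "!!!", "zzzzzzzzzz", ""]
flagged = [r for r in responses if looks_like_junk(r)]
kept = [r for r in responses if not looks_like_junk(r)]
print("Flagged for review:", flagged)
print("Kept:", kept)
```

Flagging rather than deleting keeps the decision with a human reader, since a short or oddly typed response can still be genuine feedback.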

Stripping HTML and other formatting issues

Most surveying tools will format your survey responses correctly, and analysis software will usually fix common formatting issues when you import your data. But it’s always good to check that your data looks the way it should before beginning any analysis. HTML tags can sometimes slip through as raw text, and certain characters can be encoded in unpredictable ways, such as an ampersand arriving as “&amp;”.
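If you do find stray markup, a small script can strip it. The Python sketch below uses the standard library's `html.parser.HTMLParser` to drop tags and decode character references such as `&amp;`, then collapses leftover whitespace. The class and function names are illustrative, and the sketch assumes the responses contain ordinary HTML fragments and not full pages with scripts or styles.

```python
import re
from html.parser import HTMLParser

class _TextExtractor(HTMLParser):
    """Collects the text between tags and ignores the tags themselves."""

    def __init__(self):
        # convert_charrefs=True decodes references like &amp; and &nbsp;.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

def clean_markup(raw):
    """Return the response text with tags removed and whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(raw)
    parser.close()
    text = "".join(parser.parts)
    # Collapse runs of whitespace, including non-breaking spaces.
    return re.sub(r"\s+", " ", text).strip()

print(clean_markup("<p>Great product &amp; fast delivery!</p>"))
print(clean_markup("Too&nbsp;expensive<br>"))
```

Note that text from adjacent tags is joined without a separator, so it is worth spot-checking a few cleaned responses against the originals.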

Checking for spelling errors

This is a simple but important task. If you want to find the number of customers who are interested in the price of your product, you could target words like “price”, “cost”, and “affordable.” However, you would miss responses that misspell any of those words. Most programs have a built-in spell-checker, so this process doesn’t have to take up a lot of time. You could have a text-processing program automatically make the changes for you, or quickly do it yourself.
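An alternative to correcting every misspelling is to match words approximately. The Python sketch below uses `difflib.get_close_matches` from the standard library to find responses that mention one of the target words, even when it is misspelled. The function name `mentions_target` and the `cutoff` value of 0.8 are assumptions for illustration. This is approximate matching, not a spell-checker, so a low cutoff can produce false matches on unrelated words.

```python
import difflib
import re

# Target words taken from the example in the text above.
TARGET_WORDS = ["price", "cost", "affordable"]

def mentions_target(response, targets=TARGET_WORDS, cutoff=0.8):
    """Return the target words the response mentions, allowing for typos."""
    found = set()
    for word in re.findall(r"[a-z]+", response.lower()):
        # Best approximate match for this word, if it is similar enough.
        match = difflib.get_close_matches(word, targets, n=1, cutoff=cutoff)
        if match:
            found.add(match[0])
    return sorted(found)

for text in ["Not very afordable", "Your prices keep rising", "It crashes a lot"]:
    print(mentions_target(text), "-", text)
```

Raising `cutoff` toward 1.0 makes matching stricter, while lowering it catches more typos at the risk of matching different words with similar spellings.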

Making a note of jokes, sarcasm, and idioms

If you are planning to run text mining techniques on your open-ended responses, be aware that many algorithms struggle to detect sarcasm, humor, and figures of speech. Sentiment analysis algorithms are likely to mislabel feedback responses like, “This program is as useful as an umbrella in a tsunami.” Customers leave feedback with the assumption that another person will read it, so they write for a human reader and not for an algorithm.

Even if you are not planning to use algorithms to analyze your responses, it’s still a good idea to leave a note alongside feedback that can be misinterpreted. That way, if someone is unfamiliar with a particular turn of phrase or misses a joke, they can still understand what the customer has said.

About Kris Tonthat

Kris is a writer and editor at Displayr. He is also a former sportswriter and a recovering economics graduate. Despite all his writing experience, he still struggles to craft a decent profile bio.