Is machine learning getting to grips with sarcasm? Sentiment and Semantics



It’s no secret that qualitative data is harder to analyze than quantitative data. Swapping ticked boxes for a plethora of data from verbatims, videos, group conversations and observations makes participant answers vastly more time-consuming to categorize. Not only this, but the context and motivations of the individuals need to be considered in addition to their responses and behaviors. But the effort is worth it; you can work to really understand consumer behavior and get to the ‘why’ behind the ‘what’.

Of the many techniques used to analyze qualitative information, sentiment analysis and semantic analysis are two we often come across. These provide insight into the data by looking at attitudes and themes. And as new technology continues to make its way into qualitative analysis, we need to understand how it can impact these two methods of analysis, and how it can help develop more meaningful insight.

As a reminder, sentiment analysis categorizes language based on opinions, often checking for a positive or negative viewpoint. This can include, for example, the views of consumers expressed on social media or online communities toward a specific product, idea or service. Although this analysis can be done by humans, it is quicker with technology, most often machine learning, which can work through large quantities of text and make language-based judgements that would be far too time-consuming for researchers to make by hand.
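To make the idea of positive or negative categorization concrete, here is a minimal sketch in Python. It is a simple word-counting scorer rather than a trained machine learning model, and the word lists, the `sentiment` function and the sample posts are all invented for illustration.

```python
import re

# Hypothetical mini-lexicons, invented for this illustration only.
POSITIVE = {"love", "great", "good", "excellent", "happy"}
NEGATIVE = {"hate", "bad", "poor", "terrible", "disappointed"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def sentiment(text):
    """Label text by counting positive words minus negative words."""
    words = tokenize(text)
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

# Made-up social media posts about a product.
posts = [
    "I love the new flavour, great job",
    "Terrible packaging and poor value",
    "It arrived on Tuesday",
]
labels = [sentiment(p) for p in posts]
print(labels)
```

A real tool would learn its vocabulary from labelled examples instead of a hand-written list, but the output has the same shape: one label per piece of text, produced far faster than a researcher could read it.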

However, it’s important that sentiment analysis is not entirely reliant on technology. We are often asked to analyze data in a binary way, but the expressed sentiment is rarely that black and white. Technology often makes assumptions about the sentiment of a response to make it fit a binary format, while humans would be more cautious about forcing content into a category. Moreover, while humans can contextualize written language and understand colloquialisms, contradictions, turns of phrase and humor such as sarcasm, machines struggle with all of these. Machines simply cannot yet handle the nuances of language and its grammatical patterns with the accuracy of a human. In spite of the claims of some technology providers, we are still far from the point where text can be analyzed by a machine as well as by a human. It is therefore important to use a joint approach to sentiment analysis; machines may make things quicker, but they’re also more likely to create inaccurate data through generalization, a lack of contextual knowledge and a lack of natural language skills.
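One way to picture a joint approach is to let the machine label only the clear-cut responses and pass everything else to a researcher. The sketch below assumes a simple word-counting scorer; the word lists and the `triage` function are hypothetical, not any vendor's method.

```python
import re

# Hypothetical word lists, invented for this illustration only.
POSITIVE = {"love", "great", "good", "brilliant"}
NEGATIVE = {"hate", "bad", "poor", "broke", "delay"}

def triage(text):
    """Return (label, needs_human_review) for one response."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos and neg:
        # Conflicting signals: do not force a binary label.
        return "mixed", True
    if pos == 0 and neg == 0:
        # No signal at all: nothing for the machine to go on.
        return "unclear", True
    return ("positive" if pos > neg else "negative"), False

print(triage("Oh great, another delay. Just what I love."))
print(triage("Oh brilliant, exactly what I needed today"))
```

The first response is flagged for review because it mixes positive and negative words. The second, which a human would probably read as sarcastic, is still labelled positive with no flag, because it contains no negative word for the scorer to catch. That is exactly the kind of error a human check is there to pick up.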

Semantic analysis, on the other hand, generates meaning by grouping together the themes in qualitative data. It is often referred to as topic or keyword analysis. This is a valuable approach when trying to identify the main subjects in the field of study, regardless of how a participant feels about them. It is often useful in the early stages of a research project, when exploring a broad idea and looking for related, more specific ideas that need to be examined further. As with sentiment analysis, this can easily be done by a computer, and indeed, it is far more efficient to do so, leaving the researcher with more time for deeper analytical thinking. Furthermore, a machine learning process can take place with no prior knowledge of the documents, again saving time. However, there are limitations, as the grouping of themes needs to go beyond basic classification to give a meaningful representation of the qualitative information.
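As a rough illustration of keyword-style theme grouping, the sketch below counts how many responses mention each word and groups the responses under the most common ones. It is plain keyword counting, not a full topic model, and the stopword list, function names and sample responses are assumptions made for the example.

```python
import re
from collections import Counter, defaultdict

# A tiny, hypothetical stopword list; real tools use much longer ones.
STOPWORDS = {"the", "is", "a", "and", "too", "i", "it", "was", "but", "of", "to"}

def keywords(text):
    """Lowercase, split into words and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def group_by_theme(responses, top_n=2):
    """Group responses under the top_n keywords mentioned by the most responses."""
    doc_freq = Counter()
    for response in responses:
        doc_freq.update(set(keywords(response)))  # count each word once per response
    themes = [word for word, _ in doc_freq.most_common(top_n)]
    groups = defaultdict(list)
    for response in responses:
        found = set(keywords(response))
        for theme in themes:
            if theme in found:
                groups[theme].append(response)
    return dict(groups)

# Made-up open-ended survey answers.
responses = [
    "The price is too high",
    "Price was fine but delivery was slow",
    "Delivery took a week",
    "Love the taste, hate the price",
]
result = group_by_theme(responses)
for theme, members in result.items():
    print(theme, "->", members)
```

Notice that the grouping needs no prior knowledge of the answers, and that it ignores sentiment entirely: the complaint about price and the answer calling the price fine land in the same theme. It also shows the limitation of basic classification, since a bare keyword such as "price" says little on its own about what participants actually mean.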

So machine learning speeds up sentiment and semantic analysis; for text-based analyses this can open up a whole new world of information about consumers from social media, communities and other online resources. The tech guys are working hard to handle the nuances of language such as sarcasm, machine learning’s biggest barrier in sentiment analysis. And with semantic analysis comes the challenge of context, such as the ability to distinguish Mars the chocolate bar from Mars the planet. Technology is certainly getting closer to solving these problems, but in the meantime, it’s nice to keep the humans around.
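The Mars example can be sketched as a context check: look at the words around "Mars" and see which meaning they fit best. The vocabularies and the `disambiguate_mars` function below are invented for illustration, and real systems use far richer context than a handful of words.

```python
import re

# Hypothetical context vocabularies for each meaning of "Mars".
SENSES = {
    "chocolate bar": {"chocolate", "snack", "eat", "caramel", "sweet", "taste"},
    "planet": {"planet", "nasa", "rover", "orbit", "space", "mission"},
}

def disambiguate_mars(text):
    """Pick the meaning whose vocabulary overlaps most with the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {sense: len(words & vocab) for sense, vocab in SENSES.items()}
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (best, top), (_, runner_up) = ranked[0], ranked[1]
    # No context clues, or a tie: leave the decision to a human.
    if top == 0 or top == runner_up:
        return "unknown"
    return best

print(disambiguate_mars("I always eat a Mars as a snack after lunch"))
print(disambiguate_mars("NASA landed another rover on Mars"))
print(disambiguate_mars("Mars is amazing"))
```

The last sentence gives the function nothing to work with, so it returns "unknown" rather than guessing, which is where keeping the humans around pays off.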
