Multilingual Sentiment Analysis: How to Do It Right
Sentiment analysis (the identification, extraction, analysis, and labelling of consumers’ feelings and opinions on social media) is becoming an increasingly important part of social listening because of the growing number of people actively using social media: 4.48 billion, to be precise.
When you consider that only 25.9% of internet users speak English natively, sentiment analysis in languages other than English becomes increasingly important. The diversity of languages and cultures on the world wide web strongly influences social analytics and social listening. That’s why sentiment analysis in English alone isn’t enough.
By acknowledging that sentiment is inherently linked to language and culture, multilingual sentiment analysis enables social analytics and social listening in different languages, ensuring that companies are able to break down language barriers and catch valuable insights in real-time.
So, let’s review how sentiment analysis works and how to analyse sentiment in a multilingual environment.
What is sentiment analysis?
Sentiment analysis, also called opinion mining, is a subset of social listening. Social listening is the monitoring and analysis of conversations across social media to identify sentiment, causes, public opinion, and social trends.
For example, a company may use social listening to conduct analyses of content trends. Artificial intelligence models are trained to monitor thousands of conversations at once and tune themselves to the different topics discussed on social media.
When the variable to be identified is sentiment, it can concern a company, product, or service; an industry; a celebrity; a brand; or even politics.
Why is sentiment analysis important?
Sentiment analysis is important because sentiment drives action.
If people love a product, they’re going to keep buying it, and they might even tell their friends and family about it. If people love the ideas of a politician, they might elect them into office; if they hate a public figure, they might discredit them completely. If public sentiment related to a future demonstration or parade indicates potential violence, governments might take precautions to prevent that sentiment from becoming reality.
How does sentiment analysis apply to marketing?
In marketing, sentiment analysis allows companies to see how customers feel about their products and services. This information can be used to improve offerings, fine-tune marketing campaigns, and reach out to audiences who might not know about the brand yet. This makes sentiment analysis a fabulous tool for:
- Reputation management
- Consumer behaviour analysis
- Brand management
- Brand awareness
- Competitive intelligence
- Customer service
For example, if a brand finds that the sentiment around their latest product launch is negative, they can take a step back and see what went wrong. Is it the price point? The colour combination? The ingredient list? With sentiment information, companies can develop better products for their customers.
In customer service, as another example, sentiment analysis provides insights into how people feel about the customer support of a company. If sentiment is low, companies can use that information to improve their support. If sentiment is high, they can thank the customers for spreading the word and consider making sentiment information public with a testimonial campaign.
Why is multilingual sentiment analysis essential?
Multilingual sentiment analysis means performing sentiment scoring in more than one language. The tricky thing about sentiments is that our emotions and consumer behaviour are greatly influenced by our culture and language. It’s a bit like SEO and SEO translation: if you don’t have a good grasp of the user’s cultural context, your efforts are likely to fail.
Therefore, for organisations with an international customer or user base, sentiment analysis cannot be an English-only game. The sentiment of your customers from Portugal, for instance, will require analysis in Portuguese if you want to avoid sentiment inaccuracies and misinterpretations.
How to do sentiment analysis
Analysing sentiment (in one language or across several languages) is a task that requires machine learning (trained models), data analysis techniques, and natural language processing (NLP) to derive quantitative sentiment scores from raw text. There are several techniques to carry this out.
While a person could do the job manually by browsing the web, finding relevant posts, reading them, and assessing the sentiment or emotion behind them, in practical terms an algorithm will be able to do the job much faster and more consistently across large volumes of posts.
When sentiment scoring is done by machine learning, sentiment models learn from data. They start with training samples, which are texts marked up with sentiment annotation so that they can be used to train sentiment analysis algorithms. These samples are extracted and annotated by human experts; in the case of multilingual sentiment analysis, you’d need a marketing translation specialist.
The models use this training sample data to find patterns, identify associations between sentiment and sentiment-carrying words, and create sentiment scores.
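To make the training step more concrete, here is a minimal sketch of a model that learns word-sentiment associations from annotated samples. It uses a small Naive Bayes classifier written in plain Python, which is only one of many possible techniques and not necessarily what any given sentiment tool uses. The sample texts, labels, and function names are made up for illustration.

```python
import math
from collections import Counter, defaultdict

PUNCTUATION = ".,!?;:\"'()"

def tokenize(text):
    """Lowercase the text and split it into words without punctuation."""
    words = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in words if w]

def train(samples):
    """Count words per label from (text, label) pairs annotated by humans."""
    word_counts = defaultdict(Counter)  # label -> word frequencies
    label_counts = Counter()            # label -> number of samples
    for text, label in samples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab

def predict(model, text):
    """Return the most probable label using add-one smoothing."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical annotated training samples (real projects need far more data).
SAMPLES = [
    ("I love this phone, the camera is great", "positive"),
    ("Outstanding service and friendly staff", "positive"),
    ("Great value, would buy again", "positive"),
    ("Terrible battery, I hate it", "negative"),
    ("The delivery was late and the box was damaged", "negative"),
    ("Awful support, never again", "negative"),
]

model = train(SAMPLES)
print(predict(model, "great camera"))
print(predict(model, "terrible support"))
```

With only six samples the model is a toy, but the workflow is the same at scale: humans annotate texts, the model counts which words go with which label, and new texts are scored from those counts.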
Sentiment analysis algorithms and models
Sentiment analysis systems can be rule-based, automatic, or hybrid.
- Rule-based: These algorithms perform the analysis by applying rules programmed by experts. For example, sentiment analysis rules might look for specific words or phrases that indicate sentiment, such as “great”, “outstanding”, and “terrible”. An example is VADER (Valence Aware Dictionary and sEntiment Reasoner), a lexicon- and rule-based sentiment analysis model.
- Automatic: These models perform sentiment analysis without human interaction, using machine learning and training samples to infer sentiment scores.
- Hybrid: These sentiment scoring systems combine both sentiment scoring approaches.
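As an illustration of the rule-based approach in the list above, the following sketch scores a text with a tiny hand-written word list and a single negation rule. It is not VADER and does not reproduce its lexicon or rules; the words, weights, and function names are assumptions chosen for the example.

```python
# Hypothetical lexicon: word -> sentiment weight chosen by an "expert".
LEXICON = {"great": 1.0, "outstanding": 1.5, "terrible": -1.5}

# Words that flip the sentiment of the word immediately after them.
NEGATIONS = {"not", "never", "no"}

def rule_based_score(text):
    """Sum lexicon weights, flipping the sign of a word right after a negation."""
    tokens = [w.strip(".,!?;:").lower() for w in text.split()]
    score = 0.0
    for i, token in enumerate(tokens):
        value = LEXICON.get(token, 0.0)
        if i > 0 and tokens[i - 1] in NEGATIONS:
            value = -value
        score += value
    return score

def polarity(score):
    """Turn a numeric score into a positive / negative / neutral label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

for post in ["The food was great", "Not great, frankly terrible", "It arrived on Monday"]:
    print(post, "->", polarity(rule_based_score(post)))
```

A hybrid system could, for instance, fall back on rules like these when a trained model is unsure, although how the two are combined varies from tool to tool.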
Multilingual sentiment analysis: does sentiment analysis work in all languages?
In sentiment analysis, language detection is a prerequisite to sentiment interpretation. Without identifying the language of a social media post, an algorithm will not be able to decipher sentiment information.
This ability, however, needs to be built into the sentiment analysis model; it’s just not feasible to involve human linguists to translate into English the whole pool of foreign-language social media posts that require sentiment scoring.
This leads us to one of the most frequent challenges in sentiment analysis for companies operating internationally: their sentiment scoring systems aren’t trained in sentiment analysis for languages other than English.
That said, it’s fair to point out that multilingual sentiment analysis involves a lot of resources, and it’s still difficult to find sentiment tools that actually do sentiment analysis of social media posts in more than one language.
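The idea of detecting the language first and then applying a language-specific scorer can be sketched as follows. This is a toy under stated assumptions: the language detector simply counts overlaps with a few common function words, and the per-language lexicons contain two words each. Production systems would rely on a proper language identification model and on resources built for each language; all names and word lists here are hypothetical.

```python
# Hypothetical function-word lists used for a very rough language guess.
STOPWORDS = {
    "en": {"the", "is", "and", "was", "this", "of"},
    "pt": {"o", "é", "não", "muito", "foi", "este", "uma"},
}

# Hypothetical per-language sentiment lexicons.
LEXICONS = {
    "en": {"great": 1, "terrible": -1},
    "pt": {"ótimo": 1, "péssimo": -1},
}

def tokenize(text):
    """Lowercase the text and split it into words without punctuation."""
    words = (w.strip(".,!?;:").lower() for w in text.split())
    return [w for w in words if w]

def detect_language(text):
    """Return the language whose function words overlap most, or None."""
    tokens = set(tokenize(text))
    overlaps = {lang: len(tokens & words) for lang, words in STOPWORDS.items()}
    best = max(overlaps, key=overlaps.get)
    return best if overlaps[best] > 0 else None

def score_post(text):
    """Detect the language, then score with that language's lexicon."""
    lang = detect_language(text)
    if lang is None or lang not in LEXICONS:
        return lang, None  # unknown language: no sentiment score
    total = sum(LEXICONS[lang].get(t, 0) for t in tokenize(text))
    return lang, total

print(score_post("The service was great"))
print(score_post("O atendimento foi péssimo"))
print(score_post("12345"))
```

The important design point is the routing: a post whose language cannot be identified gets no score at all rather than a misleading one from the wrong lexicon.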
What about machine translation?
Is machine translation good for multilingual sentiment analysis? Yes and no.
While neural machine translation has come a long way and produces relatively accurate translations, sentiment and emotion cannot be fully addressed by machine translation.
Machine translation is, for example, unable to detect sarcasm or irony. Because sentiment analysis requires understanding the text in context and sentiment cannot be seen as separate from context, this task is still better performed by sentiment scoring models trained in sentiment analysis for each language.
However, if the analysis only seeks some basic sentiment information (positive vs negative), sentiment analysis tools that use machine translation can learn sentiment from translated text quite effectively.
The ideal scenario would involve input from sentiment experts who can train sentiment models for sentiment scoring in each language.
Types of sentiment analysis
To establish the different types of sentiment analysis, we first need to establish the criteria on which the classification is made.
Monolingual sentiment analysis vs multilingual sentiment analysis
The first distinction that can be made in sentiment analysis is whether it’s conducted in one language (monolingual sentiment analysis) or more (multilingual sentiment analysis).
As sentiment is affected by language, it’s important to be able to conduct sentiment analysis in the native language of your target audience. The sections above look at how this is done to ensure maximum sentiment accuracy.
Polarity-based sentiment analysis vs advanced sentiment analysis
The second distinction concerns how advanced the analysis is.
Polarity-based sentiment analysis is the most basic form: using natural language processing (NLP), sentiment is classified as positive, negative or neutral by looking at single words.
Advanced sentiment analysis is more complex and uses advanced linguistic methods, such as syntax and semantics, to take into account other factors such as:
- The strength of the sentiment: Polarity categories are expanded (very positive, positive, neutral, negative, very negative). This is called fine-grained sentiment analysis.
- Emotion detection: The analysis goes beyond the positive/negative dichotomy to look at emotional states such as enjoyment, happiness, frustration, anger, surprise, disgust, etc.
- Context: The analysis takes into account sentiment in other parts of the text, such as whether sentiment is strong or weak elsewhere, what sentiment words are nearby, etc. This is useful to identify sarcasm and avoid false sentiment results.
- The time or place in which the sentiment was expressed: Sentiment changes depending on the time and situation, such as sentiment expressed during a national tragedy or after an election.
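To illustrate the first item in the list above, fine-grained sentiment analysis can be thought of as mapping a numeric polarity score onto five categories instead of three. The sketch below assumes a score between -1 and 1; the cut-off values are arbitrary choices for the example, not an industry standard.

```python
def fine_grained_label(score):
    """Map a polarity score in [-1, 1] to one of five sentiment categories.

    The thresholds (0.6 and 0.2) are illustrative assumptions.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must be between -1 and 1")
    if score >= 0.6:
        return "very positive"
    if score >= 0.2:
        return "positive"
    if score > -0.2:
        return "neutral"
    if score > -0.6:
        return "negative"
    return "very negative"

for value in (0.9, 0.3, 0.0, -0.3, -0.8):
    print(value, "->", fine_grained_label(value))
```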
General sentiment analysis vs aspect-based sentiment analysis
The last distinction to be made is whether sentiment analysis relates to an entity (product, event, person, etc.) as a whole or focuses on particular features or aspects of that entity.
For example, a tweet can simultaneously refer to a hotel’s excellent location and mediocre food: aspect-based sentiment analysis would look at the sentiment for each of those aspects separately.
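The hotel example can be sketched in a few lines. This is a simplified illustration that splits the text into clauses and scores each clause that mentions an aspect keyword; real aspect-based systems use much richer linguistic analysis. The aspect keywords, the lexicon, and the decision to treat “mediocre” as negative are all assumptions made for the example.

```python
import re

# Hypothetical aspect keywords and sentiment lexicon.
ASPECTS = {"location": {"location"}, "food": {"food", "breakfast"}}
LEXICON = {"excellent": 1, "great": 1, "mediocre": -1, "terrible": -1}

def aspect_sentiment(text):
    """Return a score per aspect, based on the clause that mentions it."""
    results = {}
    # Split into clauses on commas, semicolons, "but", and "and".
    clauses = re.split(r",|;|\bbut\b|\band\b", text.lower())
    for clause in clauses:
        tokens = re.findall(r"[a-z']+", clause)
        score = sum(LEXICON.get(t, 0) for t in tokens)
        for aspect, keywords in ASPECTS.items():
            if keywords & set(tokens):
                results[aspect] = results.get(aspect, 0) + score
    return results

print(aspect_sentiment("Excellent location but the food was mediocre"))
```

A general (entity-level) analysis of the same tweet would add the two scores together and report a neutral result, which hides exactly the detail the hotel would want to act on.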
Final remarks about multilingual sentiment analysis
Multilingual sentiment analysis is not a topic to be taken lightly. The reason is pretty simple: language matters. Not only are some languages spoken more frequently on social media than others, but sentiment itself is culturally unique.
To conduct accurate multilingual sentiment analysis, you need tools that have been designed for sentiment analysis in more than one language, as well as the support of specialist marketing translators who can train sentiment scoring models specific to each language.
If you’re looking for professional support in multilingual sentiment analysis, get in touch with Crisol Translation Services!