We partnered with a well-known tech university from the United States to build a fully automated, NLP-powered social media monitoring tool for sentiment analysis.
Codete's researchers are involved in both internal and external projects. We cooperate with leading technology universities around the world on research and development (R&D). Our work focuses mostly on (but is not limited to) data science, image and video processing, and NLP.
In this project, we used NLP to create a tool for monitoring social media content and analyzing sentiment in the posts.
Challenge: Real-time sentiment analysis with machine learning
The growth of social media usage changed the world of marketing. It has never been easier to deliver content than it is now. Any information can spread across the globe in less than a second, so it is critical to recognize what is happening with your product or brand as soon as possible.
This is especially important if something goes wrong or if you want to see how other people perceive what you do. That is why we believe that a continuous, fully automated social media monitoring tool is a must-have for all businesses, and sentiment is one of the most important factors to monitor.
Solution: Custom NLP solution for Twitter
Our partner's CoreNLP is a cutting-edge NLP library that we intended to use for our content monitoring tool, but its accuracy was only about 50%, which fell short of our expectations. The library was designed to work with longer, usually grammatically correct texts. Twitter limits message length, and the platform's users don't care that much about linguistic correctness, so the library did not perform satisfactorily on the texts we had.
We created our own solution by combining the TF-IDF vectorization method, principal component analysis (PCA), and Random Forest classifiers. For the training phase, we collected publicly available datasets of tweets labeled with their sentiment. As a result, our method achieved over 75% accuracy.
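To make the combination of TF-IDF, PCA, and Random Forest more concrete, here is a minimal sketch in Python using scikit-learn. The choice of scikit-learn is an assumption, as are the function names, the parameter values, and the handful of made-up example tweets; the sketch only illustrates how the three steps can be chained, not the configuration used in the project.

```python
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer


def to_dense(matrix):
    """Convert the sparse TF-IDF matrix into a dense array for PCA."""
    return matrix.toarray()


def build_sentiment_pipeline(n_components=2, n_trees=50, seed=0):
    """Chain TF-IDF, PCA, and a Random Forest (all values are illustrative)."""
    return make_pipeline(
        TfidfVectorizer(lowercase=True),                    # text -> TF-IDF weights
        FunctionTransformer(to_dense, accept_sparse=True),  # sparse -> dense
        PCA(n_components=n_components, random_state=seed),  # reduce dimensionality
        RandomForestClassifier(n_estimators=n_trees, random_state=seed),
    )


# Made-up toy data; a real training set would be a labeled tweet corpus.
tweets = [
    "love this phone so much",
    "best update ever, great job",
    "really happy with the new release",
    "terrible service, never again",
    "worst app I have ever used",
    "so disappointed, it keeps crashing",
]
labels = ["positive", "positive", "positive",
          "negative", "negative", "negative"]

model = build_sentiment_pipeline()
model.fit(tweets, labels)
predictions = model.predict(["great phone", "awful crashing app"])
```

With a real dataset, the number of PCA components and trees would need tuning, and accuracy should be measured on tweets held out from training.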