
The HapPhi Emoji Search

If you want a PDF but can't remember the file name, you may still remember who sent it and that you gave it a thumbs up. With HapPhi emoji search you can filter by sender, document type, and thumbs up, which saves a lot of time when narrowing your search. HapPhi emoji search is a unique filter that lets you search for conversations or documents to which you added emojis. https://www.happhi.com/solutions/happhi-data-management
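To make the idea concrete, here is a minimal sketch of how a filter combining sender, document type, and emoji reaction could work. It is not HapPhi's implementation: the Item record, its field names, and the emoji_search() function are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """A hypothetical searchable record; the field names are illustrative only."""
    name: str
    sender: str
    doc_type: str
    reactions: set = field(default_factory=set)  # emojis the user added


def emoji_search(items, sender=None, doc_type=None, emoji=None):
    """Return the items that match every filter that was supplied."""
    results = []
    for item in items:
        if sender is not None and item.sender != sender:
            continue
        if doc_type is not None and item.doc_type != doc_type:
            continue
        if emoji is not None and emoji not in item.reactions:
            continue
        results.append(item)
    return results


# Made-up sample data for the sketch.
items = [
    Item("q2-report.pdf", "dana", "pdf", {"👍"}),
    Item("notes.docx", "dana", "docx", {"👍"}),
    Item("invoice.pdf", "dana", "pdf"),
    Item("slides.pdf", "lee", "pdf", {"👍"}),
]

matches = emoji_search(items, sender="dana", doc_type="pdf", emoji="👍")
print([m.name for m in matches])
```

Each filter is optional, so the same function can search by emoji alone or by any combination of the three fields.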

June 15, 2022





The HapPhi Discoveries contest is one of our most popular and beloved events! Each year, we try to come up with a fresh challenge that pushes the boundaries of creativity while still being accessible to all participants. Last year’s contest centered on emojis, and our users responded with 362 different entries. However, only one team was able to crack the code, so let’s see how they did it.

This article explains how machine learning models recently made available in Python can be used for natural language processing tasks such as sentiment analysis and keyword detection. These kinds of tasks are often called “information retrieval” tasks because they revolve around identifying information about a document or piece of text, rather than something broader like its topic.



What is Sentiment Analysis?

Sentiment analysis is the process of analyzing and quantifying the feelings behind a piece of writing such as an article, review, or social media post. Once the sentiment of a document has been determined, it can be used in a variety of ways.

Marketers and advertisers can use it to gauge how their target audience feels about a product or brand. This helps them make more informed decisions, such as when (and whether) to launch a new product, how to address customer complaints, which areas need improvement, and what to emphasize in their advertisements.

Sentiment analysis is also useful in journalism, marketing research, and customer service. Journalists can use it to see how their audience feels about a topic, like a political issue or controversial news story, and then decide how to frame the story and which angle to take. Customer service representatives can use it when handling complaints and questions, where it can help them find the root cause of a problem and come up with a solution that satisfies the customer.
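As a rough illustration of what "quantifying" sentiment can mean, the sketch below scores a text by counting words from two tiny hand-made lists. This is a deliberately simple lexicon approach, not the method used by any HapPhi service; the word lists and the sentiment_score() function are assumptions made up for this example.

```python
import re

# Tiny illustrative lexicons; a real system would use a much larger, curated list.
POSITIVE = {"love", "great", "good", "happy", "excellent"}
NEGATIVE = {"hate", "bad", "terrible", "broken", "slow"}


def sentiment_score(text):
    """Return (positive - negative) / matched words, a value in [-1, 1].

    Returns 0.0 when no word from either lexicon appears in the text.
    """
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


for review in [
    "I love this great product",
    "The app is slow and broken",
    "It arrived on Tuesday",
]:
    print(review, "->", sentiment_score(review))
```

A score near 1 means mostly positive words, a score near -1 means mostly negative words, and 0 means the text is neutral or mixed according to this very small vocabulary.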


Why is HapPhi Emoji Detection Important?

Besides being a very creative and fun way to spend an afternoon, this contest is important because its results allow us to improve the HapPhi platform. As you know, all our services are 100% free, and we use what we learn from these contests to improve our algorithms and make language services like translation and natural language understanding better. This is why we love this event so much!

In this contest we are looking for the best way to detect emojis in any document, email, or text and to find their location. This is an important task because millions of users send messages to each other using HapPhi every day. Emojis are an important part of a conversation, and the real meaning of a message is often hard to understand without them. As the number of emojis keeps growing, interpreting messages becomes more and more difficult.


Installing the required packages

To perform this analysis, we’ll use Jie Li’s excellent text-miner package as well as a few other dependencies. First we need to install the packages, and the easiest way to do this is with pip.


Step 1: Clean the dataset and check for uniqueness

The first thing we need to do is clean the dataset. For this contest, the dataset is a collection of documents that our users posted to HapPhi. Each document contains an emoji and its location in the text. Each document also has a “sentiment” score that we will use to determine whether the document has an emoji present in it. To get a better sense of the data, we can look at a histogram of the sentiment scores, which range from -1 to 1. Putting the scores into a table makes it easier to see what they mean. To begin, we’ll sort the data by sentiment score, which will tell us which pieces of data have an emoji in them, and then inspect the first 10 rows of the dataset.
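The cleaning, uniqueness check, sorting, and histogram described in this step can be sketched with the Python standard library alone. The rows below are made-up sample data, not contest data, and the clean() and histogram() helpers are hypothetical names chosen for this example.

```python
from collections import Counter

# Hypothetical rows: (document text, sentiment score in [-1, 1]).
rows = [
    ("Great job 👍", 0.8),
    ("great job 👍 ", 0.8),  # duplicate once whitespace and case are normalized
    ("This is broken 😡", -0.9),
    ("Meeting at noon", 0.0),
    ("Love it 🎉", 1.0),
]


def clean(rows):
    """Normalize whitespace and case, then drop duplicate documents."""
    seen = set()
    unique = []
    for text, score in rows:
        key = " ".join(text.lower().split())
        if key in seen:
            continue
        seen.add(key)
        unique.append((key, score))
    return unique


def histogram(rows, bins=4):
    """Count scores in equal-width bins covering [-1, 1]."""
    width = 2 / bins
    counts = Counter()
    for _, score in rows:
        # A score of exactly 1 is placed in the last bin.
        index = min(int((score + 1) / width), bins - 1)
        counts[index] += 1
    return [counts[i] for i in range(bins)]


cleaned = clean(rows)
by_score = sorted(cleaned, key=lambda row: row[1])  # lowest score first

print(len(rows), "rows before cleaning,", len(cleaned), "after")
print("first rows by score:", by_score[:10])
print("histogram:", histogram(cleaned))
```

Normalizing before comparing is a design choice: it treats documents that differ only in case or spacing as the same document, which may or may not be what a given dataset needs.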


Step 2: Building the Vectors

Next, we’ll want to convert the emoji text into numbers. This is called vectorizing the data, and it allows us to use machine learning algorithms to find the emoji in the text. To do this, we’ll use a function that Jie Li created for the text-miner package called create_emoji_features(), which takes the emoji text and converts it into a vector. We’ll start by importing the create_emoji_features() function. Next, we’ll use the feature() function to train a model and then use the data() function to convert the emoji into a vector that we can feed to our machine learning algorithm.
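To show what vectorizing means in practice, here is a minimal sketch that turns a text into a list of emoji counts. It does not use text-miner and is not how create_emoji_features() works; the EMOJI_VOCAB list and the emoji_count_vector() function are assumptions invented for this illustration.

```python
# Hypothetical fixed vocabulary; its order defines the vector's dimensions.
EMOJI_VOCAB = ["👍", "😀", "😡", "🎉"]


def emoji_count_vector(text, vocab=EMOJI_VOCAB):
    """Return one count per vocabulary emoji, in vocabulary order."""
    return [text.count(emoji) for emoji in vocab]


print(emoji_count_vector("Nice 👍👍 party 🎉"))
print(emoji_count_vector("no emoji here"))
```

Every text maps to a vector of the same length, which is what lets a machine learning algorithm compare documents numerically. This simple counting only works for emojis made of a single code point.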


Step 3: Finding emojis with a classifier

Once we have converted the emoji into a vector, we’ll want to use a machine learning algorithm to detect the emoji. In this contest, we have access to the HapPhi Sentiment Classifier. This classifier is powered by the HapPhi Translation Model, which we believe is the most accurate translation model available. To use the sentiment classifier to detect the emoji in a document, we first import the classifier and then call the detect_emoji() function, which detects the emoji in a document and returns its location.
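For readers who want to experiment without access to the classifier, the task of finding emojis and their locations can be approximated with a regular expression. This is a rule-based stand-in, not the HapPhi Sentiment Classifier and not detect_emoji(); the pattern and the find_emoji() function are assumptions for this sketch, and the pattern covers only a few common Unicode blocks.

```python
import re

# Simplified pattern covering a few common emoji blocks. It does not handle
# multi-code-point sequences such as skin tones, ZWJ families, or flags.
EMOJI_PATTERN = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def find_emoji(text):
    """Return (emoji, start_index) pairs; indices are Python string offsets."""
    return [(m.group(), m.start()) for m in EMOJI_PATTERN.finditer(text)]


print(find_emoji("Great work 👍 see you 🎉"))
print(find_emoji("plain text"))
```

The returned index is the position of the emoji in the Python string, so it counts characters rather than bytes.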


Conclusion

In this article we explained how machine learning models recently made available in Python can be used for natural language processing tasks such as sentiment analysis and keyword detection, and we walked through cleaning the dataset, building the vectors, and finding emojis with a classifier. We hope that you enjoyed this article and that you will participate in the HapPhi Discoveries contest to help improve our language technologies!

