Using AI to Identify Fake Social Media Accounts

Published date: 10.01.2023

Read time: 6 min

Fake accounts on social media are a serious problem because they erode trust between users and the platform. The top social media platforms, such as Facebook, Twitter, and LinkedIn, have millions of accounts, so it is not possible to sift through them manually to identify fake ones. This is why social media companies are using AI in their moderation workflows to flag and remove fake accounts. In this article, we will talk about why fake social media accounts are so problematic, the AI technology used to identify them, and the types of data annotation used to train these AI systems.

Why are Fake Accounts Dangerous for Users?

One of the biggest reasons fake accounts are so dangerous is that criminals often use them as a launchpad for fraudulent activity. For example, a fraudster created a fake account posing as Keanu Reeves and demanded a staggering 400,000 USD from an innocent victim in California. Fake accounts give fraudsters a treasure trove of opportunities: they can be used for promo abuse, payment fraud, and identity theft, to name just a few. They are also instrumental in content spam, which ruins the experience for other users on the platform.

In addition, companies might use fake accounts to skew marketing analytics. Online advertisers are contractually bound to deliver a certain number of clicks and impressions for their customers’ ads. If they realize they are not going to reach their target, they may use bots and fake accounts to click on advertisements until the guaranteed number of clicks is reached. The result is invalid traffic, which gives businesses inaccurate data on customer acquisition and advertisement performance.

Finally, fake accounts pose big problems for social media companies themselves. For example, the majority of Facebook’s income comes from advertising. Revenue can stay steady as long as the advertisements are viewed, clicked, shared, or liked. If a high percentage of these ads are being viewed by fake accounts, however, Facebook becomes a less desirable platform for placing advertisements, which results in lost revenue. Therefore, it’s in everybody’s best interest to spot and remove fake accounts.

How Can AI Help Flag Fake Accounts?

AI is a very effective tool for content moderation and for identifying fake accounts, but it needs to be well trained to be effective. The training relies on data annotation, which we cover in greater detail in the next section. First, let’s take a look at the training datasets that are used. Researchers try to identify patterns that differ between human users and bots. A great example is research by Emilio Ferrara and his colleagues at the University of Southern California, who trained an AI to detect bots on Twitter based on differences in activity patterns between real and fake accounts. The team analyzed two separate data sets of Twitter users, in which each account had been classified as bot or human, either manually or by a pre-existing algorithm.

The manually verified data set consisted of 8.4 million tweets from 3,500 human accounts and 3.4 million tweets from 5,000 bots. The researchers found that human users replied to other tweets four to five times more often than bots did. Real users also gradually became more interactive, with the fraction of replies increasing over the course of an hour-long session of Twitter use.
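
To make the reply-based signal concrete, here is a minimal Python sketch of how such features could be computed for a single session. It is not the researchers' actual method: the `Tweet` record, the `reply_fraction` and `reply_trend` helpers, the 30-minute split, and the toy sessions are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Tweet:
    # Hypothetical minimal record; real datasets carry many more fields.
    minutes_into_session: float  # time elapsed since the session started
    is_reply: bool               # True if the tweet replies to another tweet


def reply_fraction(tweets: List[Tweet]) -> float:
    """Share of tweets that are replies (0.0 for an empty list)."""
    if not tweets:
        return 0.0
    return sum(t.is_reply for t in tweets) / len(tweets)


def reply_trend(tweets: List[Tweet], split_minute: float = 30.0) -> float:
    """Reply fraction late in the session minus the fraction early on.

    A positive value means the account replies more as the session goes on.
    The 30-minute split is an arbitrary choice for this sketch.
    """
    early = [t for t in tweets if t.minutes_into_session < split_minute]
    late = [t for t in tweets if t.minutes_into_session >= split_minute]
    return reply_fraction(late) - reply_fraction(early)


# Toy, made-up sessions for illustration only.
human_like = [Tweet(2, False), Tweet(10, False), Tweet(25, True),
              Tweet(35, True), Tweet(50, True), Tweet(58, False)]
bot_like = [Tweet(m, False) for m in range(0, 60, 10)]

print(reply_fraction(human_like), reply_trend(human_like))
print(reply_fraction(bot_like), reply_trend(bot_like))
```

In a real system, values like these would be only two of many features passed to a trained classifier, and any thresholds would be learned from annotated data rather than set by hand.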

What Types of Data Annotation Are Necessary to Create AI That Spots Bots?

For AI to tell the difference between text written by humans and text written by bots, it needs to understand the nuances of language. It must understand not only each word in a sentence but also the role that word plays in context and in the overall thought the user is trying to express. This calls for entity annotation, which is the act of locating, extracting, and tagging entities in text. Later on, entity linking will be necessary to connect those entities to larger repositories of data about them.
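
As a rough illustration of what entity annotation and entity linking produce, the Python sketch below stores each entity as a character span with a label and an optional knowledge-base ID. The `EntitySpan` structure, the `annotate` and `extract_entities` helpers, the label names, and the ID `kb:example-person-001` are hypothetical and do not come from any particular annotation tool.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class EntitySpan:
    start: int                   # character offset where the entity begins
    end: int                     # character offset just past the entity
    label: str                   # entity type chosen by the annotator
    kb_id: Optional[str] = None  # hypothetical ID added later by entity linking


def annotate(text: str, phrase: str, label: str,
             kb_id: Optional[str] = None) -> EntitySpan:
    """Build a span for the first occurrence of `phrase` in `text`.

    A human annotator would mark the span by hand; string search stands in
    for that step in this toy example.
    """
    start = text.index(phrase)
    return EntitySpan(start, start + len(phrase), label, kb_id)


def extract_entities(
    text: str, spans: List[EntitySpan]
) -> List[Tuple[str, str, Optional[str]]]:
    """Return (surface text, label, kb_id) for each annotated span."""
    results = []
    for span in spans:
        if not (0 <= span.start < span.end <= len(text)):
            raise ValueError(f"span out of range: {span}")
        results.append((text[span.start:span.end], span.label, span.kb_id))
    return results


text = "Keanu Reeves will never message you from California asking for money."
spans = [
    annotate(text, "Keanu Reeves", "PERSON", kb_id="kb:example-person-001"),
    annotate(text, "California", "LOCATION"),  # not linked yet
]

for surface, label, kb_id in extract_entities(text, spans):
    print(surface, label, kb_id)
```

Storing offsets rather than copied strings keeps each annotation tied to an exact position in the source text, which matters when the same name appears more than once.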

More detailed linguistic annotation might also be necessary, depending on the specifications of the project. Here, annotators identify and flag grammatical, semantic, or phonetic elements in text or audio data.

Trust Mindy Support With All of Your Data Annotation Needs

Mindy Support is a global company for data annotation and business process outsourcing, trusted by Fortune 500 and GAFAM companies, as well as innovative startups. With nine years of experience under our belt and offices and representatives in Cyprus, Poland, Romania, The Netherlands, India, UAE, and Ukraine, Mindy Support’s team now stands strong with 2000+ professionals helping companies with their most advanced data annotation challenges.