Fake account detection on social media platforms

People are increasingly exposed to all sorts of abuse on Social Media Platforms (SMPs). The malicious intent of humans deceiving other humans is aggravated by the number of different types of SMPs and by the vulnerabilities present in them, such as poor design and construction, large volumes of unstructured content, and the opportunities they give people to act maliciously. Together, these factors make SMPs extremely vulnerable to cyber threats caused by malicious users.

This paper addresses the detection of deceptive accounts created with malicious intent. These deceptive accounts are generated by humans or bots and can be used to defame someone’s character or to bully people online. For this research, the attributes found on authentic Twitter accounts were used to detect human identity deception. Twitter data was mined and cleaned to create an initial corpus of social media accounts, into which deceptive accounts were then injected. The authors focused on supervised machine learning, since the problem at hand is one of classifying human accounts as “deceptive” or “not.”
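The corpus construction described above can be pictured with a minimal sketch. It assumes accounts are simple records; the `build_corpus` helper, the field names, and the sample accounts are hypothetical illustrations, not the authors' pipeline or data.

```python
import random

def build_corpus(authentic, injected, seed=0):
    """Combine authentic and injected accounts into one labeled, shuffled list.

    Hypothetical helper: the labels mirror the "deceptive" / "not" classes
    named in the text; everything else is an assumption for illustration.
    """
    # Copy each record and attach its class label (inputs stay unmodified).
    corpus = [dict(account, label="not") for account in authentic]
    corpus += [dict(account, label="deceptive") for account in injected]
    # Shuffle with a fixed seed so injected accounts are not grouped at the end.
    random.Random(seed).shuffle(corpus)
    return corpus

# Made-up sample records, not real Twitter data.
authentic_accounts = [
    {"screen_name": "user_a", "friends_count": 120},
    {"screen_name": "user_b", "friends_count": 45},
]
injected_accounts = [
    {"screen_name": "user_c", "friends_count": 3},
]

corpus = build_corpus(authentic_accounts, injected_accounts)
labels = [account["label"] for account in corpus]
```

A labeled list of this kind is what a supervised classifier would be trained on: each record supplies the input attributes and the `label` field supplies the target class.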

The authors then used the corpus to train machine learning models to detect identity deception in two different experiments. In the first experiment, only data from the corpus was used to detect identity deception. This data was based on the original attributes as found in Twitter, for example, an account's “number_of_friends”, also denoted as “FRIENDS_COUNT” on Twitter. In the second experiment, the original attributes used in the first experiment were extended with newly engineered features derived from psychological principles that identify deception, together with previously engineered features that had been applied to detect non-human or bot accounts. An example of such a feature is “gender.”
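The difference between the two experiments' feature sets can be sketched as follows. This is only an illustration under stated assumptions: the attribute list, the `guess_gender` function, and its name lookup table are hypothetical stand-ins, and the paper's actual method for deriving features such as “gender” is not described here.

```python
# Original attributes as they might come from Twitter (assumed subset).
ORIGINAL_ATTRIBUTES = ["friends_count", "followers_count"]

# Hypothetical lookup used only to make the example runnable.
HYPOTHETICAL_NAME_GENDER = {"alice": "female", "bob": "male"}

def guess_gender(account):
    """Hypothetical engineered feature: infer gender from the first name."""
    first_name = account["name"].split()[0].lower()
    return HYPOTHETICAL_NAME_GENDER.get(first_name, "unknown")

def experiment_one_features(account):
    """Experiment 1: only the original attributes."""
    return {name: account[name] for name in ORIGINAL_ATTRIBUTES}

def experiment_two_features(account, engineered):
    """Experiment 2: original attributes plus engineered features."""
    features = experiment_one_features(account)
    for name, compute in engineered.items():
        features[name] = compute(account)
    return features

# Made-up sample record, not real Twitter data.
account = {"name": "Alice Example", "friends_count": 120, "followers_count": 80}

features_one = experiment_one_features(account)
features_two = experiment_two_features(account, {"gender": guess_gender})
```

Keeping the engineered features in a separate mapping makes it easy to train the same model on both feature sets and compare the results, which is the comparison the two experiments describe.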

From those two experiments, an Identity Deception Detection Model (IDDM) was proposed to not only detect but also interpret perceived deceptiveness. The model consists of two sub-models. The first sub-model, IDDMLM, is a machine learning model that detects identity deception on SMPs with a score of 86.24%. The second sub-model, IDDSM, highlights which features a specific user was found to be most likely deceptive about.

Cybersecurity, in general, can benefit from the research work presented in this paper, which deals with the development of intelligent identity deception detection using machine learning models.


Cite: van der Walt, E., Eloff, J. H. P. and Grobler, J. (2018). Cyber-security: Identity deception detection on social media platforms. Computers & Security, 78, 76-89.

Source: https://www.sciencedirect.com/science/article/pii/S0167404818306503