Presented by Pavan Kumar as part of the 2020 SERENE-RISC Workshop on The State of Canadian Cybersecurity Conference: Human-Centric Cybersecurity
About the presentation
Recent advances in natural language processing and deep learning have enabled security researchers and government agencies to analyze large-scale text data emerging on social networking platforms such as Twitter, by reducing the need for a large corpus of examples from each user to learn patterns within user profiles. Moreover, a user’s personality traits have been shown to strongly influence their activity and security decisions in online social networks. In this presentation, we will introduce a novel contextual language embedding mechanism obtained from a deep neural network trained to classify the personality traits of a user from a sequence of their tweets. We will discuss the potential of using this text-to-vector representation to encode tweets, retweets, and replies as edges in a graph-based social network representation of users. Finally, we will share the findings from our study, in which we applied the personality-aware language embedding to a given set of user profiles to construct and train graph neural networks tailored to the task of closed-set user identification. We show that a user can be recognized based on the patterns in their online social activity, their personality traits, and the personality traits of the users they mention and follow.
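As a rough illustration of the graph representation described above, the following Python sketch stores each tweet, retweet, or reply as a typed edge between two users, with a text vector attached. It is not the presenters' implementation: toy_embed is a hypothetical hash-based stand-in for the personality-aware embedding, the dimension of 8 is arbitrary, the sample interactions are invented, and the averaging step is only a crude placeholder for what a graph neural network would learn.

```python
import hashlib
from collections import defaultdict

EMBED_DIM = 8  # hypothetical size; the abstract does not state the real dimension


def toy_embed(text, dim=EMBED_DIM):
    """Hypothetical stand-in for the personality-aware embedding.

    Produces a deterministic vector from a hash of the text; it carries
    no linguistic or personality information.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]


def build_graph(interactions):
    """Map each (source, target) user pair to a list of typed, embedded edges."""
    edges = defaultdict(list)
    for source, target, kind, text in interactions:
        edges[(source, target)].append({"kind": kind, "vector": toy_embed(text)})
    return edges


def user_profile(edges, user, dim=EMBED_DIM):
    """Average the vectors on a user's outgoing edges.

    A crude placeholder for the neighbourhood aggregation a GNN would learn.
    """
    vectors = [
        edge["vector"]
        for (src, _), edge_list in edges.items()
        if src == user
        for edge in edge_list
    ]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Invented sample data: (source user, target user, interaction type, text)
interactions = [
    ("alice", "bob", "reply", "Thanks for sharing this!"),
    ("alice", "carol", "retweet", "New paper on graph neural networks"),
    ("bob", "alice", "mention", "Great talk yesterday"),
]

graph = build_graph(interactions)
alice_profile = user_profile(graph, "alice")
```

In a closed-set identification setting, per-user vectors of this kind would be fed to a classifier whose output classes are the known users; here the averaging only shows where the edge features enter the pipeline.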
About the speaker
Pavan Kumar K. N. received his Bachelor of Engineering degree in Information Science and Engineering from Visvesvaraya Technological University, Belgaum, India, in 2015. He worked at Mu Sigma Business Solutions Pvt. Ltd., Bangalore, as a Research Analyst from August 2015 to October 2018. He is currently pursuing an M.Sc. degree in Computer Science at the University of Calgary, Canada, under the supervision of Prof. M. L. Gavrilova. His research interests include social behavioral biometrics, social network analysis, natural language processing, text mining, computer vision, and machine learning.