You are your Metadata

Metadata is data that provides information about other data. In online social networks (OSNs), metadata is associated with most of the information we produce in our daily interactions and communications. Privacy concerns are rising among OSN users and, in response, OSN platforms have introduced controls that let users manage their data. Metadata has also become an important component of the services offered by OSN platforms. For example, Twitter metadata records which users are mentioned in a post, how many times a message has been retweeted, and when a document was uploaded, to name a few.

In this study, Beatrice Perez and her colleagues used Twitter as a case study to quantify the association between metadata and user identity. The authors argue that a user's interactions with a system can be used for identification, and they focus their research on metadata. The goal of the study is to determine whether an account can be correctly identified from a set of features extracted from the available metadata. To that end, the authors developed and tested user identification strategies that analyse metadata with machine learning algorithms such as Multinomial Logistic Regression, Random Forest and K-Nearest Neighbours (KNN).

Their results showed that KNN performs best, with an accuracy of approximately 96.7%. They also showed that obfuscation strategies, which alter the values of the metadata being analysed, are ineffective: users can still be classified with an accuracy of 95% after 60% of the training data has been perturbed.
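To make the idea concrete, here is a minimal sketch of identification by nearest neighbours followed by a perturbation test. It is not the authors' code or data. The accounts are synthetic, the three numeric features are hypothetical stand-ins for metadata fields, and "perturbing" is assumed to mean overwriting the features of a randomly chosen fraction of training records with random values. All function names are invented for illustration.

```python
import math
import random
from collections import Counter

def make_accounts(n_users=5, samples_per_user=40, seed=0):
    """Build toy (features, user_id) records; NOT real Twitter metadata."""
    rng = random.Random(seed)
    data = []
    for user in range(n_users):
        # Assumption: each account has a typical profile over three
        # hypothetical numeric metadata features.
        centre = (20.0 * user, 50.0 - 10.0 * user, 5.0 + 3.0 * user)
        for _ in range(samples_per_user):
            features = tuple(c + rng.gauss(0.0, 1.0) for c in centre)
            data.append((features, user))
    rng.shuffle(data)
    return data

def knn_predict(train, query, k=3):
    """Return the majority label among the k training records closest to query."""
    nearest = sorted(train, key=lambda rec: math.dist(rec[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(train, test, k=3):
    """Fraction of test records whose account is identified correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)

def perturb(train, fraction, seed=1):
    """Overwrite the features of a random fraction of records (assumed scheme)."""
    rng = random.Random(seed)
    chosen = set(rng.sample(range(len(train)), round(fraction * len(train))))
    out = []
    for i, (features, label) in enumerate(train):
        if i in chosen:
            features = tuple(rng.uniform(0.0, 100.0) for _ in features)
        out.append((features, label))
    return out

data = make_accounts()
train, test = data[:160], data[160:]
clean_acc = accuracy(train, test)
noisy_acc = accuracy(perturb(train, 0.6), test)
print(f"clean: {clean_acc:.3f}  after perturbing 60%: {noisy_acc:.3f}")
```

The toy accounts are deliberately well separated, so the numbers it prints say nothing about Twitter. The sketch only shows the mechanism: as long as a few unperturbed records of an account remain close to a query, a majority vote among neighbours can still recover the account.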

As people become increasingly concerned about protecting their data, this paper helps raise awareness of the privacy risks associated with metadata.


Cite: Perez, B., Musolesi, M. and Stringhini, G. (2018). You are your Metadata: Identification and Obfuscation of Social Media Users using Metadata Information. Proceedings of the 12th International AAAI Conference on Web and Social Media (ICWSM 2018).