The workplace has changed radically since Information Systems (IS) became one of the strongest enablers of its processes. But IS can be misused, and security incidents in IS are more prevalent than ever. Beyond the significant damage these incidents cause, the stakeholders deemed accountable for them have faced severe consequences in cost and reputation.
The insider threat is a growing issue in organizations, and it is poised to increase further with the growing number of users and interconnected systems. An insider threat can be defined as a threat originating from users who have been given access rights to an IS and who misuse their privileges, impacting the confidentiality, availability or integrity of information, either deliberately or through non-compliance. A thorough and timely review of the logs from the various IS in use may help detect abnormal behaviours that signal a misuse of users’ credentials. However, this task is complex and resource-intensive, and it may not suffice to identify the threat, because logs usually store information in an unstructured manner and capture myriad events that may be unrelated to the user’s behaviour in the IS.
In this study, Eduardo Lopez and Kamran Sartipi explored an anonymized, very large dataset of log files containing user behaviour, including both regular usage and misuse. They presented a set of features that capture user behaviour and that can be analyzed to detect information systems misuse. They cleansed and aggregated the authentication event data and, from the resulting dataset, extracted the meaningful features of user behaviour in an IS. Finally, to demonstrate that the selected features contain the information needed to detect insider threats, the authors used a logistic regression classifier. The results showed that 82% of IS misuse could be detected.
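The pipeline described above (aggregate authentication events per user, derive behavioural features, fit a logistic regression) can be sketched in a few lines. This is only an illustration under stated assumptions: the event layout, the three features (failure rate, off-hours rate, number of distinct hosts), the toy data and the function names are all hypothetical and are not the features or dataset used by Lopez and Sartipi.

```python
import math
from collections import defaultdict

# Hypothetical authentication events: (user, hour_of_day, target_host, succeeded).
EVENTS = [
    ("alice", 9, "A", True), ("alice", 11, "A", True),
    ("alice", 14, "B", True), ("alice", 16, "A", True),
    ("bob", 10, "A", True), ("bob", 11, "A", False),
    ("bob", 14, "A", True), ("bob", 20, "A", True),
    ("carol", 2, "A", False), ("carol", 3, "B", True),
    ("carol", 23, "C", False), ("carol", 10, "D", True),
    ("dave", 1, "B", False), ("dave", 22, "C", False),
    ("dave", 23, "D", False), ("dave", 4, "B", True),
]
# Hypothetical ground truth: 1 = misuse, 0 = regular usage.
LABELS = {"alice": 0, "bob": 0, "carol": 1, "dave": 1}


def extract_features(events):
    """Aggregate raw events into one feature vector per user:
    [failure rate, off-hours rate, number of distinct hosts]."""
    stats = defaultdict(lambda: {"n": 0, "fail": 0, "off": 0, "hosts": set()})
    for user, hour, host, ok in events:
        s = stats[user]
        s["n"] += 1
        s["fail"] += 0 if ok else 1
        s["off"] += 1 if (hour < 7 or hour >= 19) else 0  # assumed office hours
        s["hosts"].add(host)
    return {
        u: [s["fail"] / s["n"], s["off"] / s["n"], len(s["hosts"])]
        for u, s in stats.items()
    }


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression with plain batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return 1 (flag as possible misuse) when the estimated probability is >= 0.5."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


features = extract_features(EVENTS)
users = sorted(features)
X = [features[u] for u in users]
y = [LABELS[u] for u in users]
w, b = train(X, y)
for u in users:
    print(u, features[u], "flagged" if predict(w, b, features[u]) else "ok")
```

On real logs the aggregation step would dominate the effort, and the classifier would be evaluated on held-out users rather than on the data it was trained on, as it is in this toy example.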
The paper showed that researchers must apply critical thinking and a practical focus to ensure that the data can produce the knowledge required to detect an insider threat.
Cite: Lopez, E. and Sartipi, K. (2018). Feature Engineering in Big Data for Detection of Information Systems Misuse. In Jennifer B. Sartor, Theo D’Hondt, and Wolfgang De Meuter (Eds.) International Conference on Computer Science and Software Engineering (CASCON’18). New York, NY. p. 145