What Phishing E-mails Reveal: An Exploratory Analysis of Phishing Attempts Using Text Analyzes

36 Pages Posted: 2 Aug 2019

Daniel E. O'Leary

University of Southern California - Marshall School of Business; University of Southern California - Leventhal School of Accounting

Date Written: July 26, 2019

Abstract

Phishers appear particularly interested in accounting and tax, and accountants and auditing firms are frequent targets because of their proximity to organizational resources. Since phishing is typically done by email, we use text analysis to explore differences between phishing emails and other emails. Comparing a database of phishing messages with a database of the Enron emails, we find that the phishing messages differ statistically significantly across a large number of univariate text variable categories. Further, we generate a model of phishing as “power.” With power as the dependent variable, the independent variables of friend (who they pretend to be), achievement (of their goal), (to take your) money and (typically done at) work are used to estimate power in both the phishing and non-phishing messages, and we find differences in the signs of the independent variables between the two. Finally, using the output of a text analysis, we examine the ability of neural network models to differentiate between phishing emails and Enron emails, using size-matched samples.
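As a rough illustration of the "power" model described in the abstract, the sketch below regresses a power score on the four predictors and reports which coefficients change sign between two corpora. It rests on several assumptions that are not in the abstract: ordinary least squares as the estimation method, per-message numeric scores for each category, and the hypothetical helper names `fit_power_model` and `sign_differences`. The input data are synthetic placeholders generated only so the example runs; they are not the paper's data and do not reproduce its results.

```python
import numpy as np

# Predictor names taken from the abstract; the column order is an assumption.
PREDICTORS = ["friend", "achievement", "money", "work"]


def fit_power_model(scores, power):
    """Fit power ~ intercept + friend + achievement + money + work by OLS.

    scores: array-like of shape (n_messages, 4), columns ordered as PREDICTORS.
    power:  array-like of shape (n_messages,).
    Returns a dict mapping 'intercept' and each predictor name to its coefficient.
    """
    scores = np.asarray(scores, dtype=float)
    power = np.asarray(power, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(scores)), scores])
    coef, *_ = np.linalg.lstsq(design, power, rcond=None)
    return dict(zip(["intercept"] + PREDICTORS, coef))


def sign_differences(model_a, model_b):
    """List the predictors whose coefficient signs differ between two fits."""
    return [
        name
        for name in PREDICTORS
        if np.sign(model_a[name]) != np.sign(model_b[name])
    ]


# Synthetic placeholder scores (NOT the phishing or Enron data).
rng = np.random.default_rng(0)
scores_a = rng.normal(size=(50, 4))
scores_b = rng.normal(size=(50, 4))

# Noise-free toy responses with hand-picked coefficients, chosen so that
# 'friend' and 'work' flip sign between the two placeholder corpora.
power_a = 1.0 + scores_a @ np.array([0.5, 0.3, 0.8, 0.2])
power_b = 1.0 + scores_b @ np.array([-0.5, 0.3, 0.8, -0.2])

phishing_like = fit_power_model(scores_a, power_a)
other_like = fit_power_model(scores_b, power_b)

print(sign_differences(phishing_like, other_like))
```

With real inputs, `scores` would hold the per-message category scores produced by a text analysis tool and `power` the corresponding power score. The sign flips in this example exist only because the placeholder coefficients were chosen that way.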

Keywords: Phishing, Sentiment Analysis, Python NLTK, Text Analysis, LIWC

JEL Classification: M, Y

Suggested Citation

O'Leary, Daniel E., What Phishing E-mails Reveal: An Exploratory Analysis of Phishing Attempts Using Text Analyzes (July 26, 2019). Available at SSRN: https://ssrn.com/abstract=3427436 or http://dx.doi.org/10.2139/ssrn.3427436

Daniel E. O'Leary (Contact Author)

University of Southern California - Marshall School of Business

701 Exposition Blvd
Los Angeles, CA 90089
United States

University of Southern California - Leventhal School of Accounting

Los Angeles, CA 90089-0441
United States
