For all the ambiguity in meaning that’s currently vexing the courts, there’s one way in which emojis are increasing our understanding of who is saying what. Stylometry, the analysis of the composition and structure of writing, is a method by which questions of authorship can be investigated and authenticated. While stylometry has been around for many years, digital forensic examiners are increasingly taking advantage of technology-based tools to conduct probes into who wrote what. Previously limited to word choice, sentence construction, and punctuation, stylometric analysis is now beginning to draw on emoji usage to help reveal the author hiding behind an anonymous post, tweet, text message, electronic form submission, or email.
Stylometry has three functions, and we may be able to increase our accuracy in each by including emoji use in the samples we analyze. Author Attribution, or Identification, is the quest to put a name to a piece of anonymous writing by comparing it with samples from known persons. Author Verification seeks to confirm or correct the assumed author of an anonymously written piece by comparing the anonymous piece to a sample from the supposed author. Profiling attempts to determine various elements of a person’s personality, educational level, native language or regional dialect, and profession by studying an anonymous sample. For digital forensics purposes, we are most often concerned with the first two functions: identifying and confirming who wrote the subject piece when the author is unknown or unconfirmed. While there are algorithms and programming languages, such as Python, that can be used to draw comparisons, that’s more trick than treat as far as this article is concerned. For this article, we’ll focus on how emoji usage can mimic the other tell-tale elements of written composition.
Frequency is one of the major predictors in stylometry; how often a person uses a particular word or phrase across multiple pieces of writing can be nearly as individual as a fingerprint. People use emojis in their social media posts much as they dress: they wear a small portion of their wardrobes frequently and, likewise, tend to use the same emojis repeatedly. According to HR Daily Advisor, “71.2% of users used fewer than 10 emoji in the last 180 days. 50.7% used fewer than 5 emoji in that time” (https://hrdailyadvisor.blr.com/2020/08/07/how-leaders-use-emoji-in-workplace-communication/). By determining that a person uses a particular emoji regularly in other samples, we can increase our confidence that the anonymous sample was written by the same person. If that emoji is a less popular choice, a match is stronger evidence still, because fewer other candidate authors are likely to share the habit. In the case of a Memoji or Bitmoji, the author is either self-identifying, or we may be looking at a ruse of some kind.
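For readers who do want to experiment, the frequency comparison described above can be roughly sketched in Python. This is a minimal illustration, not a forensic tool: the helper names (`is_emoji`, `emoji_profile`, `cosine_similarity`) are hypothetical, the code point ranges are an assumed approximation that covers many common emojis but not all of them, and multi-part emojis (skin-tone or joined sequences) are counted piece by piece. Cosine similarity is one common way to compare two frequency profiles; it is an assumption here, not a method the article prescribes.

```python
from collections import Counter
from math import sqrt

# Assumed, non-exhaustive code point ranges covering many common emojis.
EMOJI_RANGES = [(0x1F300, 0x1FAFF), (0x2600, 0x27BF), (0x1F1E6, 0x1F1FF)]


def is_emoji(ch):
    """Return True if the character falls in one of the assumed emoji ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in EMOJI_RANGES)


def emoji_profile(texts):
    """Map each emoji to its share of all emojis found in the given texts."""
    counts = Counter(ch for text in texts for ch in text if is_emoji(ch))
    total = sum(counts.values())
    return {e: n / total for e, n in counts.items()} if total else {}


def cosine_similarity(p, q):
    """Compare two emoji profiles; 0.0 means no overlap, 1.0 means identical mix."""
    dot = sum(p[e] * q.get(e, 0.0) for e in p)
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


# Illustrative, made-up samples: writing from a known person vs. an anonymous post.
known = emoji_profile(["Great job 🎉", "See you soon 🎉👍"])
anonymous = emoji_profile(["Nice one 🎉"])
score = cosine_similarity(known, anonymous)
```

A higher score only says the two samples favor a similar mix of emojis. On its own it would not identify an author, and it would need to be weighed alongside word choice, punctuation, and the other stylometric signals.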
Emoji positioning within a written piece is also a clue as to who’s behind the sample. Does the anonymous piece contain an emoji within a string of words? The placement of emojis within the text is a cultivated practice. Like the consistent use of punctuation marks within a sentence or paragraph, the placement of an emoji at the start, within, or at the conclusion of a sentence or paragraph is notable. In complex cases, we can drill down and look not only at the placement of emojis relative to text, but also at whether there’s a pattern to which emojis appear within the text and which do not.
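As a rough illustration of the placement idea, the sketch below tallies whether each emoji in a message appears before any words, between words, or after the last word. The function names and the emoji ranges are assumptions made for this example, and the three-way split into start, middle, and end is a simplification of what an examiner would actually weigh.

```python
from collections import Counter

# Assumed, non-exhaustive code point ranges covering many common emojis.
EMOJI_RANGES = [(0x1F300, 0x1FAFF), (0x2600, 0x27BF), (0x1F1E6, 0x1F1FF)]


def is_emoji(ch):
    """Return True if the character falls in one of the assumed emoji ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in EMOJI_RANGES)


def emoji_positions(message):
    """Count emojis by position: before any words, between words, or after them."""
    positions = Counter()
    for i, ch in enumerate(message):
        if not is_emoji(ch):
            continue
        has_text_before = any(c.isalnum() for c in message[:i])
        has_text_after = any(c.isalnum() for c in message[i + 1:])
        if not has_text_before:
            positions["start"] += 1   # also covers emoji-only messages
        elif not has_text_after:
            positions["end"] += 1     # trailing punctuation is ignored
        else:
            positions["middle"] += 1
    return positions


# Illustrative, made-up message with one emoji in each position.
tally = emoji_positions("👍 ok then 🎉 see you 🙏")
```

Summing these tallies across a person's known messages gives a simple placement habit (for example, "almost always at the end") that can be compared with the anonymous sample.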
Finally, one ingrained habit that frequent texters and chatters tend to adopt, and that long-form writers do not, is the replacement or substitution of words or phrases with emojis or strings of emojis. Using emojis in place of words requires the user to hunt for the specific emojis that represent what they want to convey. Doing so in a string can take significantly more time than simply typing words. Thus, we can examine this style of emoji use on the surface to eliminate a potential author, or we can parse the emoji frequency, selection, and string length to help increase the accuracy of our determination. Someone at ease creating messages or posts from strings of emojis must be very familiar with emojis and most likely uses them frequently.
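The string-length aspect can be sketched in a similar way. The example below, which again uses hypothetical helper names and an assumed set of emoji ranges, pulls out runs of back-to-back emojis and summarizes how many there are and how long they run. Emojis separated by a space are treated as separate runs, which is a simplifying choice for this sketch.

```python
from itertools import groupby

# Assumed, non-exhaustive code point ranges covering many common emojis.
EMOJI_RANGES = [(0x1F300, 0x1FAFF), (0x2600, 0x27BF), (0x1F1E6, 0x1F1FF)]


def is_emoji(ch):
    """Return True if the character falls in one of the assumed emoji ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in EMOJI_RANGES)


def emoji_runs(message):
    """Return each unbroken run of consecutive emojis found in the message."""
    return ["".join(group) for flag, group in groupby(message, key=is_emoji) if flag]


def run_stats(messages):
    """Summarize how many emoji runs appear and how long they tend to be."""
    lengths = [len(run) for m in messages for run in emoji_runs(m)]
    if not lengths:
        return {"runs": 0, "mean_length": 0.0, "max_length": 0}
    return {
        "runs": len(lengths),
        "mean_length": sum(lengths) / len(lengths),
        "max_length": max(lengths),
    }


# Illustrative, made-up message where emojis stand in for words.
stats = run_stats(["On my way 🚗💨 see you 🍕"])
```

A candidate author whose known messages never contain runs longer than one emoji would be a weaker match for an anonymous post built largely from emoji strings, though this is only one signal among several.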
Emoji stylometry isn’t the future of digital forensics, but it looks to be another element that we’ll employ as the use of emojis in digital communication increases across all platforms. With that said, bad actors could carefully construct their posts to contradict their habitual writing style in order to hide behind the mask of anonymity. Looking at the use of emojis in our writing can be a fun and interesting exercise. Make no mistake, though: if you suspect that an anonymous rant lambasting your organization’s management in an online review was posted by a vengeful former employee, Digital Mountain may have some tricks to uncover the true author.