Here’s how hard it is to spot a fake email address or phone number

The old line that the only problem with the internet is the human beings on it looks less and less fair—because the real problem has become the non-human beings on it.

Bots, scripts, and other forms of fake accounts routinely wind up as the villains in modern instances of online misbehavior. And, as the scourge of robocalls and spam texts shows, the problem is growing in our less-digital communications channels, too.

Technology can give us tools to deal with this plague of literally inhuman opponents. But developing ways to authenticate users in mediums that weren’t built with verification of individual accounts in mind isn’t easy.

One example comes from an anti-phishing firm called Sublime Security. Last summer, that Washington, D.C., startup launched a service that assigns a trustworthiness score to email addresses, automatically and almost instantly.

Cofounder Joshua Kamdjou explained how this EmailRep service works in a talk at ShmooCon, a hacker conference held in Washington a few weeks ago.

“We all know phishing is a big [expletive] problem,” Kamdjou led off. He pointed to two accelerants: tools for running phishing campaigns are cheap and widely available, and attackers can get around older schemes designed to authenticate email senders, such as Sender Policy Framework and DomainKeys Identified Mail.

EmailRep works by collecting and collating multiple signals: the technical details of the servers behind an email address, the existence of the address in profiles at a variety of social-media sites, how long the address appears to have existed, and even whether it has shown up in older data breaches.
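To make the idea of collating signals concrete, here is a minimal sketch in Python. It is not EmailRep's actual method, which the talk did not spell out at this level of detail. The field names, the point values, and the cutoff of 3 are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class EmailSignals:
    # Hypothetical stand-ins for the kinds of signals described above.
    domain_has_valid_mail_servers: bool
    social_profile_count: int
    first_seen: Optional[date]        # earliest known sighting of the address
    last_breach_seen: Optional[date]  # most recent breach containing the address


def is_suspicious(signals: EmailSignals, today: date) -> bool:
    """Count how many signals suggest a long-lived, real address.

    The one-point-per-signal scoring and the cutoff of 3 are
    illustrative assumptions, not values from any real service.
    """
    score = 0
    if signals.domain_has_valid_mail_servers:
        score += 1
    if signals.social_profile_count >= 2:
        score += 1
    if signals.first_seen is not None and (today - signals.first_seen).days >= 365:
        score += 1
    # Showing up in an old breach implies somebody really used the address.
    if signals.last_breach_seen is not None:
        score += 1
    return score < 3
```

In this sketch, an address with years of history, several social profiles, and an old breach record would pass, while a freshly created address with no footprint would be flagged even if its mail servers check out.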

Kamdjou demonstrated this by testing three Gmail addresses that looked plausible because each included some combination of his first and last name. EmailRep quickly rated each as “suspicious.” His actual email address: “Not suspicious,” as judged from such factors as it being “seen in data breaches or credential leaks dating back to 03/22/2012, but not since 05/24/2019.”

This technique requires serious money to run at scale; in a conversation after the talk, he said each query costs Sublime 3 cents. But although some outside firms have begun to incorporate EmailRep analysis into their workflow, the idea here isn’t to turn EmailRep into a self-contained service.

“When we built it, we did so to fill a need for ourselves for our primary line of business,” Kamdjou said via email. “Our use case is highly optimized for our phishing defense use case.”

That sets Sublime apart from an earlier fraud-detection firm, Emailage, which sells its Email Risk Score service to businesses trying to avoid fraud attempts.

PHONE NUMBER INSECURITY

Phone numbers represent an even gnarlier security scenario (if your phone hasn’t lit up with a robocall since you started reading this, consider yourself lucky). As another ShmooCon speaker, Twilio security advocate Kelley Robinson, observed in a talk about anti-robocall efforts at that conference: “There isn’t a ton of security in the telephony network right now.”

An effort announced Thursday to help companies catch and stop spam text messages therefore had to start with feeding computers a steady diet of SMS spam. Validating unknown numbers remains so difficult—thanks to all the hops a call can take through multiple phone networks—that some firms have found it’s easier to build systems to approve designated calls or texts from companies willing to pay for that vetting and labeling.

RealNetworks (yes, it’s still around) announced a collaboration with Syniverse (yes, the company behind all of those Valentine’s Day texts that resurfaced mysteriously last November) to put its Kontxt machine-learning service to work on Syniverse’s message-routing platform.

“That model has seen many, many hundreds of thousands of spam messages,” says Kontxt general manager Surash Patel. “It looks at the bit map of new messages and says, does that look similar to something I’ve seen before?”
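The question Patel describes, whether a new message looks like something seen before, can be sketched with a simple similarity check. The Python below is an illustration only and is not how Kontxt works: it swaps in Jaccard similarity over three-character fragments for whatever the real model does, and the function names and the 0.5 threshold are assumptions.

```python
from typing import Iterable, Set


def shingles(text: str, n: int = 3) -> Set[str]:
    """Break a message into overlapping n-character fragments."""
    normalized = " ".join(text.lower().split())
    if not normalized:
        return set()
    # Messages shorter than n characters yield a single fragment.
    return {normalized[i:i + n] for i in range(max(len(normalized) - n + 1, 1))}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of fragments the two sets have in common (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def looks_like_known_spam(
    message: str,
    known_spam: Iterable[str],
    threshold: float = 0.5,  # illustrative cutoff, not a tuned value
) -> bool:
    """Flag a message that closely resembles any previously seen spam."""
    fragments = shingles(message)
    return any(jaccard(fragments, shingles(s)) >= threshold for s in known_spam)
```

A check like this catches a lightly reworded copy of a known spam text but has nothing to go on when a message shares little wording with anything it has seen.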

The model also allows for positive ranking of messages—for example, putting the highest priority on those delivering the two-step verification codes that protect many online accounts.

That approach can allow a screening service to get good at stopping known bad content (“It’s always going to be as good as the information as it’s had in the past,” Patel says) but can fall short when it comes to more . . . creative content.

And even the best-trained message-classification systems can still screw up. One example: I initially missed the email tipping me off about this RealNetworks/Syniverse news—even though it was sent by a publicist I’ve known for about 20 years and from an address that person has used to communicate with me for about a dozen years. You see, Google decided to dump it in my spam folder.