Is there any incentive to crack down on programmatic ad fraud?
"Everyone wants it to continue because they're making money." Why people don't want the ad fraud problem solved.
For nine months last year Gannett, publisher of USA TODAY and other news outlets, ran billions of ads in places that weren’t what the buyers wanted. Gannett and the buyers only found out about this after a March report in the Wall Street Journal. Earlier this week The Journal revealed that more than a dozen ad-tech companies failed to detect this, despite having all the information needed to do so.
We talked to cybersecurity and anti-ad fraud consultant Augustine Fou about this. He says the first instance was the result of a mistake. The second was intentional.
What happened at Gannett and why do you think it wasn’t intentional?
What happened was that the USA TODAY domain names were declared local. The reason I say it was a mistake and not deliberate is that the domains were misdeclared in both directions. If this were malicious, where the publisher is trying to make more money, they would always declare the local news sites to be the national one, not the other way around.
The bigger issue is that none of the fraud detection companies called it. None of the exchanges caught it and stopped it, and no advertiser agencies knew it had happened right up until the Wall Street Journal article hit.
Why is that more important?
A real publisher like New York Times, Wall Street Journal, USA TODAY, they have humans that go visit the site. OK? If you have a fake site, like fakesite123.com, no human would have ever heard about it and there’s no humans visiting that site. So how does that site have a ton of traffic and therefore can sell a bunch of ad impressions? Basically the fake site would use fake traffic. It uses a bot that basically is a browser that causes the page to load. When that happens then all the ads get called. So that’s what the advertisers are paying for. But the ads are not being seen by humans. That’s why we call it fraud.
But that’s not what happened here.
Right, this happens on fake sites, not necessarily on USA TODAY or quality journals. But the point is these fraud detection companies, it’s their job to detect the bots and detect other problems, like a fake site claiming to be a real one. You know, if the bad guys have a fake like fakesite123.com, they’re not going to put their own domain in the bid request. They’re going to say they’re USA TODAY or whoever. They’re going to say this is my domain and the advertiser will submit their bids.
But the point is they didn’t catch any of the Gannett stuff. This is a legitimate publisher that made a mistake. So if they can’t catch that, how in the heck are they going to catch the cases where the bad guy deliberately misdeclared the domain?
Why don’t they catch that?
Because they’re not even looking in the right places. I’m going to tell you my hypothesis based on my experience. They would need to run their JavaScript, detect the page, say USA TODAY, and then cross-reference it to the domain that was passed in the bid request. They clearly are not doing that, right? It’s so trivial. It’s so easy. They have code on the page that should be doing that. Their whole point is that they would find these mistakes or deliberate fraud and all that kind of stuff, but they’re failing at even the most basic stuff. So you know the March article from the Wall Street Journal? They missed it. Today’s article says they had code on the page. They shouldn’t have missed it.
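The cross-reference Fou describes can be sketched in a few lines of Python. This is only an illustration of the idea, not any vendor's actual code: the function names and the example domains are assumptions, and a real system would get the page URL from JavaScript running on the page and the declared domain from the bid request.

```python
from urllib.parse import urlparse


def normalize_host(value: str) -> str:
    """Reduce a URL or bare domain to a lowercase host without 'www.'."""
    value = value.strip().lower()
    # urlparse only fills in hostname when a scheme or '//' prefix is present.
    if "://" not in value:
        value = "//" + value
    host = urlparse(value).hostname or ""
    return host[4:] if host.startswith("www.") else host


def domain_matches_page(declared_domain: str, observed_page_url: str) -> bool:
    """True when the page the ad loaded on belongs to the declared domain."""
    declared = normalize_host(declared_domain)
    observed = normalize_host(observed_page_url)
    if not declared or not observed:
        return False
    # Accept exact matches and subdomains of the declared domain.
    return observed == declared or observed.endswith("." + declared)


# Hypothetical example: the bid request says usatoday.com,
# but the on-page code sees a different site.
print(domain_matches_page("usatoday.com", "https://www.usatoday.com/news/"))
print(domain_matches_page("usatoday.com", "https://local-news.example/story"))
```

The first call matches and the second does not, which is the mismatch a detection tag would need to flag, whether the misdeclaration was a mistake or deliberate.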
And they didn’t detect it because they weren’t looking for the right thing.
Correct.
Why aren’t they looking for the right thing?
I build fraud detection technology. I have a developer who actually codes it; I don’t code it myself, but I’ve been tuning the algorithm for the past ten years myself. So I can tell you that what happened, it’s no fault of their engineers. They live in the code. They would not have accounted for these scenarios [like page fraud]. Maybe their code is tuned for looking for bot traffic and not this stuff that occurs on the page itself. [A situation] where they should have run the code to detect the page, where it came from, and then compared it to the domain that was passed in the bid request. So they may simply not have known to do that because they’re coders, they’re not ad tech people. They don’t understand how ad tech works and they don’t understand what constitutes fraud or not. So it’s hard for them to proactively catch any of this stuff.
Most of their work is reactive, like, oh, there’s been this huge botnet, huge amount of fraud that’s so obvious. For example, I’ll tell you something that came up yesterday. Twenty-eight million clicks were delivered on the same day to a particular publisher. OK, how is that possible? It didn’t even pass a gut check. Once they see that kind of stuff, then they go back and figure out what their detection missed, and then they try to catch up. It’s really like the arms race. Bad guys are always ahead and on occasion they mess up and we see something that we missed and then we try to update our algorithms. So, that’s why they’re missing a lot of this stuff. They simply didn’t even know to look for it.
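A "gut check" of the kind Fou mentions could be as simple as comparing one day's clicks with a publisher's recent baseline. The sketch below is an assumption about how such a check might look, not his method. The function name, the tenfold threshold and the baseline figures are made up for illustration; only the 28 million figure comes from the interview.

```python
from statistics import median


def fails_gut_check(clicks_today: int, recent_daily_clicks: list[int],
                    factor: float = 10.0) -> bool:
    """Flag a day whose click count is far above the recent median."""
    if not recent_daily_clicks:
        return False  # no baseline to compare against
    baseline = median(recent_daily_clicks)
    # max(..., 1) avoids flagging everything when the baseline is zero.
    return clicks_today > factor * max(baseline, 1)


# Illustrative baseline only; these daily totals are invented.
recent = [40_000, 55_000, 48_000]
print(fails_gut_check(28_000_000, recent))
print(fails_gut_check(50_000, recent))
```

A check like this only catches fraud that is already obvious, which is his point about detection being reactive.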
So it’s like with computer security software. They can only look for what they know. They’ll miss anything new.
Exactly. So you know once one company sees a malware signature then they share it with everyone else. Everyone else can look for the malware signature.
Does malware play a part in this?
Yes. How does malware make money? Historically, they’ve just harvested people’s passwords and other private information. Because it sits on your mobile phone it can listen to everything and most humans don’t turn it off, and when people are at home they have constant Wi-Fi access.
Now, the malware is loading ad impressions in the background. They’re making money through digital advertising because the advertisers don’t know that they’re paying for ad impressions that end up being loaded by malware. The advertisers want to buy 10 billion ad impressions. There’s not enough humans to generate that much traffic. So then all of these fake sites will come in and will manufacture the quantities out of thin air and sell it to you.
Is this a fundamental problem with ad verification or is this something that can be dealt with?
From the fraud perspective it hasn’t been solved because people don’t want to solve it. Let me be a little more specific. The advertisers who are paying the money, they want to buy hundreds of billions of ad impressions. You can’t buy that much quantity without the fraud. Most humans visit a small quantity of sites repeatedly. That’s where you get the large quantities of human audiences. When you get into the long tail, there’s just not enough humans to generate that many ad impressions. The only way to do that is by using bot activity to repeatedly load the web pages and cause ads to load.
How does this work?
As a result, basically every middleman, every ad exchange, every publisher has incentives to use more fraud. So that’s why I said ad fraud has not been solved because nobody wants to solve it. Even the advertisers, even the middle men. Everyone wants it to continue because they’re making money. The main people that are harmed are the publishers. So the big publishers, newspapers, they now can’t compete against fake sites.
Maybe I’m naive, but I would think that as an advertiser, I’d want to get the actual views I’m paying for.
They don’t know. They think they’re getting it because they’re getting Excel spreadsheets that tell them how many ads they bought and how many clicks they got. They never asked the follow-up question: “Are those real ads seen by real people? And are those clicks real?”
I’ve been writing about it for 10 years. Among the ad purchasers, they know it exists, but basically they’ll say, “Oh well, I think it happens to somebody else because [our ad verification firm] tells us that the fraud is less than 1%.”
In fact, I’ll show you in my article from yesterday: “One way to tell obviously fake bid requests is to see if there’s a deviceID present — Identifier for Advertising (IDFA) or the Google Advertising ID (AAID). So what do bad guys do? They pass a deviceID in the bid request. If the fraud detection doesn’t check if the deviceID is a real one, all they have to do is generate a random deviceID that has the same format as real ones. The fraud detection only checked for the presence of the deviceID, not whether it was real or not. So defeating that kind of fraud detection is laughably simple.”
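The weakness in that quote is easy to demonstrate. The sketch below assumes a simplified bid request represented as a flat dictionary with a hypothetical `device_id` key, which is not a real bid request schema. It relies on the fact that advertising IDs are formatted as UUIDs, so a randomly generated UUID passes both a presence check and a format check. The `has_known_device_id` function and its `known_ids` set are also hypothetical, standing in for whatever evidence a detector might have that an ID belongs to a real device.

```python
import re
import uuid

# Advertising IDs are formatted as UUIDs: 8-4-4-4-12 hex digits.
UUID_PATTERN = re.compile(
    r"^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$", re.IGNORECASE
)


def has_device_id(bid_request: dict) -> bool:
    """Presence-only check: the weak test described in the quote."""
    return bool(bid_request.get("device_id"))


def has_well_formed_device_id(bid_request: dict) -> bool:
    """Format check: still passes for any randomly generated UUID."""
    return bool(UUID_PATTERN.match(str(bid_request.get("device_id") or "")))


def has_known_device_id(bid_request: dict, known_ids: set[str]) -> bool:
    """Hypothetical stronger check against IDs seen on real devices."""
    return str(bid_request.get("device_id") or "").lower() in known_ids


# What a fraudster could send: a random ID in the right format.
fake_request = {"device_id": str(uuid.uuid4())}

print(has_device_id(fake_request))              # passes the presence check
print(has_well_formed_device_id(fake_request))  # passes the format check
print(has_known_device_id(fake_request, set())) # fails only a real lookup
```

Only the third check asks whether the ID is real, and that is the one the quote says the fraud detection skipped.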
Is there any point to asking you what can be done or what should be done?
We can’t incrementally solve this. We have to have the entire house of cards crash so that we can actually get back to real digital advertising and all that means is advertisers like CPG companies, financial services or whomever buying from real publishers like New York Times, Wall Street Journal, Hearst, Condé. That’s where the humans are.
So we’ve had ten years’ worth of fake sites and all the ad exchanges in the middle, basically spewing false metrics to say you got this many ad impressions. You got such a high clickthrough rate, so everyone thought it was working really, really well when it was 100% fabricated. Still, the way to solve this is we have to make this whole thing crash and come down so that we can go back to advertisers buying ads from publishers.