I just read a post by Don Dodge on the Google Click Fraud data, where he quotes a Google report showing that less than 10% of all clicks are fraud and that less than 0.02% of the frauds get through to the advertiser.
That sounds too good to be true. Let me give you some data to prove my point.
We've been tracking ad clicks on Sampa sites. Since Google Adsense doesn't report back to us which pages were the most effective, we created our own tracking solution, which sounds very reasonable (lots of services do the same).
Here are the last 3 days of data that we collected:
Feb/28: 60 clicks on Ads
Feb/27: 45 clicks on Ads
Feb/26: 42 clicks on Ads
Now, this is what Google Adsense tells me:
Feb/28: 29 clicks
Feb/27: 28 clicks
Feb/26: 28 clicks
That is anywhere from 33% to 52% less than what we are measuring (about 42% less over the three days). So why is Google eliminating so many of those clicks from our account?
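For anyone who wants to check the math, here is a quick sketch that works out the gap from the numbers above, day by day and for the three days together. The function name `shortfall` is just something I made up for this example. Over the three days the gap comes to about 42%.

```python
# Daily ad-click counts quoted in this post.
ours = {"Feb/26": 42, "Feb/27": 45, "Feb/28": 60}
adsense = {"Feb/26": 28, "Feb/27": 28, "Feb/28": 29}


def shortfall(measured, reported):
    """Return the percentage of measured clicks missing from the report."""
    return 100.0 * (measured - reported) / measured


# Gap for each day, rounded to one decimal place.
per_day = {day: round(shortfall(ours[day], adsense[day]), 1) for day in ours}

# Gap over the whole three-day period.
overall = round(shortfall(sum(ours.values()), sum(adsense.values())), 1)

for day, pct in per_day.items():
    print(f"{day}: {pct}% fewer clicks reported by Adsense")
print(f"3-day total: {overall}% fewer")
```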
There are some explanations from our side:
Our data is collected in the UTC timezone and Google's data is in PST.
The script that we use to measure clicks might also count a click on the "Ads by Google" or "Announce on this site" links.
First of all, the timezone shift can't really be responsible for a gap that shows up on multiple days. If a click was counted today in our system but yesterday in Google's system, everything should even out at the end of the month (or pretty close to it).
The possibility that users are clicking on "Ads by Google" or "Announce on this site" is pretty real, but it is unreasonable to think that 40% of our clicks are on those links.
So, here is Google telling us that only 10% of clicks are fraud, and I'm seeing them remove more than 40% of the clicks on our sites. Sounds like a pretty big disconnect to me.
And, yes, there is the possibility that Sampa sites have a larger percentage of click fraud than other sites, but it is hard to see the motive, since our users don't make money out of Adsense, and nobody associated with Sampa is allowed to click on any Adsense ad (we are that afraid of Google cutting us off).
Just one final note (for the purists): we do remove multiple clicks from the same IP, because we assume that Google does the same. The number of logged clicks on our side is actually much larger, but we do our own "fraud detection" and that cuts it down by about 50%.
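To give an idea of what that kind of filtering looks like, here is a minimal sketch. It is not our actual script: the rule of keeping one click per IP per UTC day is an assumption for illustration, and the IPs and timestamps are made-up sample data.

```python
from datetime import datetime


def dedupe_clicks(clicks):
    """Keep only the first click per (IP, UTC date) pair.

    `clicks` is a list of (ip, timestamp) tuples. The one-click-per-IP-per-day
    rule is an assumption for this sketch, not a documented Google rule.
    """
    seen = set()
    kept = []
    # Process clicks in time order so the earliest click for each key wins.
    for ip, ts in sorted(clicks, key=lambda c: c[1]):
        key = (ip, ts.date())
        if key not in seen:
            seen.add(key)
            kept.append((ip, ts))
    return kept


# Hypothetical sample log (documentation-range IPs, invented times).
raw = [
    ("192.0.2.1", datetime(2007, 2, 28, 9, 0)),
    ("192.0.2.1", datetime(2007, 2, 28, 9, 1)),   # repeat click, same IP and day
    ("192.0.2.2", datetime(2007, 2, 28, 10, 0)),
    ("192.0.2.1", datetime(2007, 2, 27, 23, 0)),  # same IP, previous day
]

kept = dedupe_clicks(raw)
print(f"{len(raw)} logged clicks, {len(kept)} kept after filtering")
```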
After reading how the Wired reporter Annalee Newitz bought her way into Digg, it has become very clear to me that Digg is done.
My first Digg happened more than a year ago and it was amazing. The crowd was really in control and it was a fair and legit system: users vote for the articles they like the most, and the best bubbles to the top.
Now, "diggers" have grouped into gangs, and "digg" or "bury" stories using obscure agendas, mostly because of monetary rewards.
So, immediately I started thinking about how Digg can be fixed...
TechMeme uses a different method of defining what is popular and what is not: how many people have linked to that page recently. What if Digg used a TechMeme-like technology just to validate the votes?
I mean, when my story got "dugg", quite a few blogs linked to it because it was truly interesting (IMHO). If you see a Digg story with 100 votes but no backlinks, it looks very suspicious.
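Here is a rough sketch of what that check could look like. Everything in it is hypothetical: the function name, the minimum vote count, and the backlinks-per-100-votes threshold are numbers I picked for illustration, not anything Digg or TechMeme actually uses.

```python
def looks_suspicious(votes, backlinks, min_votes=50, min_links_per_100_votes=1.0):
    """Flag a story that has many votes but almost no backlinks.

    Both thresholds are made-up defaults for this sketch.
    """
    # Too few votes to judge: new stories have not had time to be linked.
    if votes < min_votes:
        return False
    links_per_100_votes = backlinks * 100.0 / votes
    return links_per_100_votes < min_links_per_100_votes


print(looks_suspicious(votes=100, backlinks=0))  # 100 votes, nobody linked to it
print(looks_suspicious(votes=100, backlinks=5))  # votes backed up by real links
```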
Another solution is to add a reverse weight to each Digg user based on the number of votes they have. If a user votes just 2 times a day, that is worth more than a user that votes on 20 stories a day. Or, is it? Hummm... Just thinking out loud now.
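Still thinking out loud, the simplest version of that reverse weight would be one divided by the number of votes the user cast that day. This is only a sketch of that one assumption; the function names are invented for the example.

```python
def vote_weight(votes_cast_today):
    """Weight of a single vote: the more a user votes, the less each vote counts."""
    if votes_cast_today < 1:
        raise ValueError("a voter must have cast at least one vote")
    return 1.0 / votes_cast_today


def story_score(voter_daily_counts):
    """Sum the weighted votes for one story.

    `voter_daily_counts` holds, for each user who voted on the story,
    how many votes that user cast in total that day.
    """
    return sum(vote_weight(n) for n in voter_daily_counts)


# Two light voters (2 votes a day each) and one heavy voter (20 votes a day).
score = story_score([2, 2, 20])
print(f"weighted score: {score:.2f}")
```

With this particular formula every user ends up with the same total influence per day, just spread over more or fewer stories, which may or may not be what we want.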