So, after I claaimed we'd have less spammers than others, I wanted to find out how many spammers we've actually had.
mysql> select EXTRACT(YEAR_MONTH FROM from_unixtime(created)) as yearmonth, count(*) as count from users where status = 0 and login != 0 group by yearmonth order by yearmonth desc ;
Year/Month | # of spammers |
---|---|
2009 / 04 | 820 |
2009 / 03 | 710 |
2009 / 02 | 1101 |
2009 / 01 | 371 |
2008 / 12 | 171 |
2008 / 11 | 145 |
2008 / 10 | 136 |
2008 / 09 | 268 |
2008 / 08 | 486 |
2008 / 07 | 639 |
2008 / 06 | 145 |
2008 / 05 | 132 |
2008 / 04 | 149 |
2008 / 03 | 206 |
2008 / 02 | 167 |
2008 / 01 | 105 |
2007 / 12 | 85 |
2007 / 11 | 66 |
2007 / 10 | 96 |
2007 / 09 | 79 |
2007 / 08 | 112 |
2007 / 07 | 206 |
2007 / 06 | 136 |
2007 / 05 | 116 |
2007 / 04 | 98 |
2007 / 03 | 78 |
2007 / 02 | 64 |
2007 / 01 | 81 |
2006 / 12 | 46 |
2006 / 11 | 59 |
2006 / 10 | 67 |
2006 / 09 | 31 |
2006 / 08 | 34 |
2006 / 07 | 29 |
2006 / 06 | 28 |
2006 / 05 | 25 |
2006 / 04 | 17 |
2006 / 03 | 18 |
2006 / 02 | 15 |
2006 / 01 | 17 |
There are two interesting observatiosn to be made:
1) Spammers are most active on drupal.org in summer
2) Spamming on drupal.org is on the rise.
Especially the latter point is of concern.
I know that many readers will think "Why don't they simply deploy mollom on drupal.org?".
Unfortunately, this is not (yet) the answer. Most of these spammers do not create a single post on drupal.org, they merely use the high page rank of the user profiles to redirect the gullible to other sites. Mollom currently does not deal with user profiles at all. I guess that it would be interesting to use it for this purpose once support becomes available. However, I am not sure that the redesign of drupal.org will still use profile.module, so we maybe could standard node form support afterwards.
For the interested readers, here's the SQL query I use to find profile spammers:
select u.name, u.mail, concat('http://drupal.org/user/', u.uid, '/edit'), substr(p.value, 1, 60) from users u inner join profile_values p on u.uid = p.uid where u.uid > 430000 and p.value != '' and length(p.value) > 50 and u.status != 0;
Once a common pattern has been identified, you can confirm it with e.g.
elect u.name, u.mail, concat('http://drupal.org/user/', u.uid, '/edit'), substr(p.value, 1, 60) from users u inner join profile_values p on u.uid = p.uid where u.uid > 430000 and p.value != '' and length(p.value) > 50 and u.status != 0 and p.value like '%spam;
And then the grand finale:
update users set status = 0 where uid in (select p.uid from profile_values p where p.uid > 430000 and p.value != '' and length(p.value) > 50 and p.value like '%spam);