Infrastructure Working Group Update

The Infrastructure Working Group was created a while ago and has since had a couple of meetings to advance the cause of our glorious infrastructure.

The charter and more info about the group can be found here: https://drupal.org/governance/drupalorg-working-groups/infrastructure

We'll continue to meet biweekly; meeting minutes will be posted here: https://drupal.org/node/2001508

Fun day...

Since I am, among other things, a member of the Drupal security team, I sometimes get contacted about the security of particular modules or sites.

Today was such a day. A Drupal site developer had some suspicions about a contrib module being unsafe, since three of his clients' sites got "hacked". I asked about the symptoms and was told that a call to an advertising site got inserted into index.php.

This fact alone told me two things:

1) It is not Drupal specific; index.php is used by many PHP applications.

2) It is unlikely that Drupal was the attack vector used. Most systems do not allow the Apache user to modify PHP files.
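The second point can be checked on a given server. Here is a minimal sketch, assuming plain POSIX permission bits (no ACLs) and that you know the uid and group ids of the web server user; the function names writable_by and modifiable_php_files are made up for this illustration.

```python
import os
import stat


def writable_by(path, uid, gids):
    """Classic POSIX check: may this uid (with these group ids) write the file?

    Ignores ACLs, the root user, and write access to the parent directory,
    which would also let a user replace the file.
    """
    st = os.stat(path)
    if st.st_uid == uid:
        return bool(st.st_mode & stat.S_IWUSR)
    if st.st_gid in gids:
        return bool(st.st_mode & stat.S_IWGRP)
    return bool(st.st_mode & stat.S_IWOTH)


def modifiable_php_files(root, uid, gids):
    """Yield every .php file below root that the given user could modify."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".php"):
                path = os.path.join(dirpath, name)
                if writable_by(path, uid, gids):
                    yield path
```

If this lists nothing for the Apache user, an attacker who only got in through the web application would have a hard time rewriting index.php, which points towards stolen or guessed credentials instead.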

A cursory look at the named module also didn't reveal anything particularly unsafe.

I shared these observations with the concerned developer and I also suggested that somebody guessing the passwords or using a trojan might be responsible.

The internets are like totally evil

This isn't exactly news, but I had always assumed that Drupal people are a bit more honest and reliable. Turns out that I am pretty naïve.

As you know, we have these fancy download stats, thanks to a lot of people's work. On occasion, there is some trouble with them, and then Brandon looks at them in detail. Today, we blocked an IP which was requesting updates very often and with a lot of different keys. Either somebody's Drupal site is broken in a bad way, or somebody was trying to tweak the stats. We can't decide this based on the data we have, but if you find you don't get updates anymore, you probably should check your setup.

The real bummer, however, came when Brandon started to look specifically for requests that were only out to game the stats for certain modules.

He found a module that reported several thousand sites using it, where almost all of the reports came from the same IP address.
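A check like this can be sketched in a few lines. This is not the script Brandon used; it assumes the usage reports are available as (module, site_key, ip) tuples, and the function name and both thresholds are invented for the example.

```python
from collections import defaultdict


def suspicious_modules(reports, min_sites=1000, max_share=0.9):
    """Flag modules whose reported sites mostly come from a single IP.

    reports: iterable of (module, site_key, ip) tuples (hypothetical format).
    Returns {module: (top_ip, sites_from_top_ip, total_sites)}.
    """
    # module -> ip -> set of distinct site keys reported from that ip
    keys_by_ip = defaultdict(lambda: defaultdict(set))
    for module, site_key, ip in reports:
        keys_by_ip[module][ip].add(site_key)

    flagged = {}
    for module, by_ip in keys_by_ip.items():
        counts = {ip: len(keys) for ip, keys in by_ip.items()}
        total = sum(counts.values())
        top_ip = max(counts, key=counts.get)
        if total >= min_sites and counts[top_ip] / total >= max_share:
            flagged[module] = (top_ip, counts[top_ip], total)
    return flagged
```

A high share from one address is only a hint: a large hosting company behind a single proxy would look the same, so a flagged module still needs a human look.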

Git migration shake-up improves average crawl speed for drupal.org

So, you have been wondering what the overall effect of the git migration was on drupal.org's performance but didn't dare to ask?

Here's the answer: I don't really know.

The reason for this is that we made two other changes at the same time: all the CVS-related URLs were temporarily disabled, and the issue statistics pages for each project were restricted to logged-in users.

Why are the old DrupalCon sites blocked?

Some of you may have noticed it, but most probably have not:

The old DrupalCon sites that are still running Drupal are not accessible at the moment; they are locked by an .htaccess file.

This is an unfortunate development, but in the end I didn't have any choice but to do this.

The reason for this is quite simple: the sites are unmaintained. Once the associated DrupalCon was over, the various web teams dispersed and software updates were no longer applied.

This means that the sites are insecure. And since they run on the same webservers as the main drupal.org site and all subsites as well as current DrupalCon sites, I had to act.
I should have acted much earlier. It is unfortunate that this caused trouble for some people who linked to the sites, but you can't really expect such a temporary site to be around forever.

Now, you might think that I should maintain the sites myself. But quite frankly, I don't have the time for this.

What should happen now?

Get up early on Tuesday!

If you are like me, you probably weren't planning to show up at DrupalCon before Dries' keynote.

Here's a reason you might want to set your alarm clock early if you are attending DrupalCon Copenhagen.

We have managed at the last minute to get Scott MacVicar from Facebook to come to Copenhagen to talk about the HipHop PHP compiler.

Facepalm

I've found some time to investigate some drupal.org server logs and found that while everything is generally working, there are some strange things happening.

Every full hour our access stats go up by almost 150%. I looked at the IPs that produce a lot of hits over the day, but they weren't responsible for these spikes. The spike is produced by a lot of different Drupal sites that request our update data when the hour strikes.

And why? Because we tell them to! In line 239 of our INSTALL.txt we instruct people who install Drupal to request our update stats at precisely that time. A classic facepalm.

Thanks to Varnish and the generally robust drupal.org infrastructure, this isn't an actual problem, but with the continued growth of the number of Drupal sites it might become one.
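One way to flatten such a spike is to give every site its own minute instead of minute 0. The following is only a sketch of that idea, not something INSTALL.txt recommends: it assumes the instruction is an hourly crontab entry, and the wget command, the cron.php path and both function names are illustrative.

```python
import hashlib


def cron_minute(site_url):
    """Map a site URL to a stable minute in the range 0..59."""
    digest = hashlib.sha256(site_url.encode("utf-8")).digest()
    # Use 8 bytes of the hash so the modulo bias is negligible.
    return int.from_bytes(digest[:8], "big") % 60


def crontab_line(site_url):
    """Build an hourly crontab entry that fires at the site's own minute."""
    minute = cron_minute(site_url)
    # Illustrative command; adjust it to whatever your cron job really runs.
    return f"{minute} * * * * wget -O - -q -t 1 {site_url}/cron.php"


print(crontab_line("http://www.example.com"))
```

Hashing the URL rather than picking a random number keeps the minute the same every time the line is generated, while different sites still end up spread over the hour.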

We are not alone!

Last week we had some "fun" with Microsoft's msnbot. They were apparently trying out their new beta and it didn't work that well. It ignored drupal.org's robots.txt and kept crawling 20-40 pages per second. You could call that a denial of service attack; drupal.org certainly had problems.

We then resolved this by banning the whole subnet from our webservers.
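Finding the subnet to ban can be sketched roughly like this. It is not the tooling we used; it assumes the access log has already been parsed into (unix_timestamp, ip) pairs, sticks to IPv4 with /24 networks, and uses a made-up function name and thresholds.

```python
import ipaddress
from collections import Counter


def busy_subnets(hits, prefix=24, window_seconds=60, per_second=20):
    """Return subnets that averaged at least per_second hits in some window.

    hits: iterable of (unix_timestamp, ip_string) pairs (hypothetical format).
    """
    counts = Counter()
    for timestamp, ip in hits:
        # strict=False lets us pass a host address and get its network back.
        network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        window = int(timestamp) // window_seconds
        counts[(network, window)] += 1

    limit = per_second * window_seconds
    return sorted({str(net) for (net, _), n in counts.items() if n >= limit})
```

Counting per subnet instead of per address matters here, because a crawler that spreads its requests over many machines never shows up as a single busy IP.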

Today neclimdul pointed out to me that the people running CPAN testers have had the same fun as we did and resolved this in the same manner.

If Microsoft keeps doing this, it will get cut off from indexing many Open Source projects. One could think it was intentional.

The Drupal Association supports OSUOSL

The Oregon State University Open Source Lab (OSUOSL) has been one of the most generous organizations to the Drupal project. In mid-2005 they stepped up and offered to host drupal.org at a time when the website was crashing due to insufficient hardware. OSUOSL generously offered rackspace and bandwidth, all of it donated. Since this donation the drupal.org infrastructure has grown from a single server to more than a dozen, traffic has increased exponentially, and overall growth has exploded. OSUOSL handled all of this in stride and even provided the time of student interns to assist with hosting and infrastructure issues.

More computer power

The Drupal Association has used some of the money that it acquired thanks to the Drupal community and its sponsors to buy more computing power for the infrastructure that all services of drupal.org are hosted on.

Subscribe to RSS - Gerhard Killesreiter's blog