Building the Drupal.org re-design community infrastructure: Administrators wanted

One of the biggest challenges of working in a large community like the Drupal community is removing bottlenecks. All too often the community can seem to come to a grinding halt on a single issue that can only be managed by one person. On Monday Dries gave a presentation at MIT and talked about how some of the Drupal community’s biggest problems have helped create some of our best solutions. In particular, he cited how our drupal.org server meltdown in 2005 led to the creation of the Drupal Association to proactively manage and plan for our infrastructure growth.
Today the community continues to be challenged with redesigning and implementing drupal.org. On the surface, it doesn’t look like much has happened, but improvements to projects and search have been big strides toward making Drupal.org easier to use. The Drupal community works well in parallel and through lots of small iterations. Unfortunately, our drupal.org development and staging infrastructure has not supported dozens of developers and themers working in parallel. We’ve now taken the time to build out that parallel development and staging infrastructure.
It is important to understand why local development is hard for a large site like Drupal.org. First, drupal.org serves about 30 million pages a month and has over 600,000 nodes. It has tens of thousands of attached files, including patches, screenshots, videos, and graphics. Setting up a database larger than 1GB and syncing over 5GB of files is a tough task for a local development environment.
Much of the new design for Drupal.org is focused on making it easier to find what you are looking for on Drupal.org. For the redesigned development sites we also need to have a working Solr server. For example, a page like the new download and extend landing page has Drupal version search filters and at least seven blocks that are search results themselves. For themers to theme pages, their Drupal.org site needs content, graphic assets, patch files which need to be attached to issue queues, and the site must have working functionality.
Also, Drupal.org content is valuable even though it is licensed under a Creative Commons license. If a copy of the site were to fall into the hands of spammers, Drupal.org could suffer abuse. For that reason, we don't casually give away copies of the drupal.org database and code to anyone who asks.
I'll review this new infrastructure and explain how our team plans to work to support at least 10 development sites in parallel.

The Servers

Web server


The dedicated virtual machine at stagingvm.drupal.org contains 10 virtualhosts, and is entirely devoted to the redesign project. The webroot of this VM is /var/www.

Database Server


The database server is separate and lives on stagingdb.drupal.org (a CNAME pointing at civicspace.drupal.org, the Solr slave).

The Sites

There are 10 staging sites for the redesign. Each of them is password protected (drupal/drupal) to keep out search engines and bots.
Staging 1 (staging1.drupal.org) through Staging 10 (staging10.drupal.org) are virtualhosts on the stagingvm.drupal.org server. Each site lives in its own directory, /var/www/staging1 through /var/www/staging10, respectively. Almost all of them are set up with a recent copy of drupal.org.
Each of these virtualhost containers has a separate instance of drupal.org's codebase + sanitized database, and we will give selected volunteers the access to commit theming and administrative changes to an instance for testing purposes.
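As a quick reference, the site layout described above can be sketched as a small shell helper. The function names are hypothetical; only the hostnames and the /var/www paths come from the description.

```shell
#!/bin/sh
# Map a staging site number (1-10) to its hostname and docroot,
# following the layout described above.
# ASSUMPTION: the function names are illustrative, not existing tools.
WEBROOT=/var/www

staging_site() {
  n=$1
  case "$n" in
    [1-9]|10) ;;
    *) echo "usage: staging_site <1-10>" >&2; return 1 ;;
  esac
  printf 'staging%s.drupal.org %s/staging%s\n' "$n" "$WEBROOT" "$n"
}

# Print one "hostname docroot" line for each of the ten sites.
list_staging_sites() {
  i=1
  while [ "$i" -le 10 ]; do
    staging_site "$i"
    i=$((i + 1))
  done
}
```

Calling staging_site 5 prints the hostname and directory for the fifth site, and list_staging_sites prints all ten.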

Backing up a site's database, restoring a site

The redesign administrators have two scripts in their stagingdb.drupal.org home directory, backup-tables and restore-tables. The administrators have sudo and can run each script to back up a database or restore its latest backup. For example, if someone is going to test something on staging5 that will likely break it, you can first run ./backup-tables.sh staging5, which takes about 20 minutes. Afterwards, ./restore-tables staging5 will restore the latest backup.

Database copies of Drupal.org

Sanitized dumps of the Drupal.org database are available on the stagingdb.drupal.org database server. The database server can hold 10 copies at most. The problem right now is that producing a sanitized copy means taking a dump from the drupal.org database server and reloading it into another database before the sanitizing SQL script can be run on it. This continues to be a cumbersome process that can only be done by our two database administrators, Narayan Newton and David Strauss.
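To make the dump, reload, and sanitize sequence concrete, here is a minimal sketch. It is not the script our database administrators use: the database names, the dump path, and the sanitize.sql file name are hypothetical placeholders, and credentials are assumed to come from the user's MySQL client configuration. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the dump -> reload -> sanitize sequence described above.
# ASSUMPTIONS: database names, dump path, and sanitize.sql are
# hypothetical placeholders, not the real drupal.org setup.
PROD_DB=${PROD_DB:-drupal_prod}
SCRATCH_DB=${SCRATCH_DB:-drupal_scratch}
DUMP=${DUMP:-/tmp/drupal-raw.sql}
SANITIZE_SQL=${SANITIZE_SQL:-sanitize.sql}
DRY_RUN=${DRY_RUN:-1}   # 1 = print commands only, 0 = execute them

run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "$*"
  else
    sh -c "$*"
  fi
}

sanitize_copy() {
  # 1. Dump the production database.
  run "mysqldump --single-transaction $PROD_DB > $DUMP" &&
  # 2. Reload the raw dump into a scratch database.
  run "mysql $SCRATCH_DB < $DUMP" &&
  # 3. Run the sanitizing SQL script against the scratch copy.
  run "mysql $SCRATCH_DB < $SANITIZE_SQL" &&
  # 4. Dump the sanitized copy for loading into a staging database.
  run "mysqldump $SCRATCH_DB > ${DUMP%.sql}-sanitized.sql"
}
```

Running sanitize_copy with the defaults prints the four commands in order, which makes it easy to review the sequence before setting DRY_RUN=0.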

Granting theme repository access

http://groups.drupal.org/drupalorg-redesign-implementers/guide
Once the drupal.org redesign developers and themers have signed up for an account on infrastructure.drupal.org, we need to review their request and determine whether they are likely to contribute. We get a lot of requests from people who just want to run the drupal.org theme on their own site. Right now over 35 community members have write access to the SVN repository, and several dozen have read access so they can generate patches. Admittedly, this approval process is a significant bottleneck.
Once a contributor's account has been approved, they can issue the following command:
 svn checkout https://svn.drupal.org/drupal/themes/bluecheese/
They can also log into SVN using their username and password from http://infrastructure.drupal.org.
See the documentation tab on the front page of the Drupal.org redesign implementers group for more information about how to get access and use SVN.
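For contributors with read access, a patch workflow might look like the sketch below. The repository URL is the one given above; the function names and the default patch file name are hypothetical.

```shell
#!/bin/sh
# Sketch of a patch workflow for a contributor with read access.
# ASSUMPTION: the function names and patch file name are placeholders.
REPO_URL=https://svn.drupal.org/drupal/themes/bluecheese/

fetch_theme() {
  # Check out the theme into ./bluecheese
  # (svn prompts for credentials if it needs them).
  svn checkout "$REPO_URL" bluecheese
}

make_patch() {
  # Write local modifications in the working copy to a patch file.
  patch_file=${1:-bluecheese-changes.patch}
  ( cd bluecheese && svn diff ) > "$patch_file"
}

# Typical use:
#   fetch_theme
#   ... edit files under ./bluecheese ...
#   make_patch my-change.patch
```

The resulting patch file can then be shared for review by someone with write access.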

Pushing theme changes to development sites

There is an SVN directory called development-themes that will be checked out to each site's /themes directory. Each themer can set the default theme for any one of the 10 instances to see their changes go live. Since Drupal can support many themes, we may be able to support dozens of themers working on a site simultaneously.

Theme Deployment

We have now set up cron jobs to check out themes from the themers' sandbox every 5 minutes.
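A cron-driven refresh along these lines could look like the following sketch. The actual cron jobs are not shown here, so treat the script path in the crontab line and the use of svn update on an existing working copy as assumptions.

```shell
#!/bin/sh
# Sketch of a script a five-minute cron job could run.
# ASSUMPTIONS: the script path in the crontab line and the use of
# "svn update" on an existing working copy are illustrative guesses.
#
# Example crontab entry (runs every 5 minutes):
#   */5 * * * * /usr/local/bin/update-staging-themes.sh
WEBROOT=${WEBROOT:-/var/www}

update_themes() {
  i=1
  while [ "$i" -le 10 ]; do
    dir="$WEBROOT/staging$i/themes"
    # Skip sites whose themes directory has not been checked out yet.
    if [ -d "$dir" ]; then
      svn update --non-interactive "$dir" || echo "update failed: $dir" >&2
    fi
    i=$((i + 1))
  done
}

update_themes
```

Reporting a failure and moving on to the next site keeps one broken working copy from blocking updates to the other nine.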

Syncing Drupal.org production with a master staging site

Code and configuration changes happen on Drupal.org. We need to push these code and configuration changes to staging sites to keep them in sync. The staging sites are managed via an SVN branch HEAD == redesign and the live site is the SVN branch DRUPAL-6--1 == drupal.org. All configuration changes should be in drupalorg.install updates.
Deployment of code, content, and configuration changes continues to be one of the big challenges in Drupal and might be the big feature of Drupal 8. Many of Drupal’s best core innovations come from drupal.org necessities.

Pushing development configuration changes to all the staging sites

When new features are being built on a Drupal.org staging site, we need to push these same configuration changes to all ten of the staging sites, or they might not work as the developers and themers need them to.

Asset management

Since themers work with so many graphic assets, we need a way for them to share their working assets more easily and make them accessible to the Drupal.org redesign staging sites.

Infrastructure administrators wanted

Now that our redesign infrastructure is built, we need Drupal administrators and developers to help with the development, staging, and deployment cycles for all 10 of these redesign sites. I know we’ve asked for infrastructure administrators before and there’s been a lot of interest. If you are still interested, contact me. A big thank-you to our infrastructure team for making this possible.