Oh no, we have too many links!

Aug 02

by Gerhard Killesreiter

Once in a while I log into Google to look at their Webmaster Tools and what they say about drupal.org.

Yesterday I did that again, after a hiatus of several weeks. I noticed that Google had sent me three emails, which hadn't made it to my inbox because I hadn't configured email forwarding.

After fixing that, I looked at the emails.

One was about their services: they want me to use AdWords and offered some free budget. The other two were almost identical, and they had probably sent the second one because I didn't react to the first:

Google thinks we have too many links!

They said that the huge number of links may keep Googlebot from indexing all of them, and that we should consider whether we could exclude some through robots.txt.

Helpfully, they also gave a number of example links. While some of them are perfectly valid links, others are indeed not needed. Most of these contain some sort of query parameter. Strangely enough, some of the sample URLs are already among the roughly 1 million local URLs that we block through robots.txt.

I have now added "solrsort" as a forbidden parameter in robots.txt and hope that it helps both Google and us. Our search infrastructure has had the occasional hiccup, and if Google now hits our search pages less often, it sure is a win-win situation.
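
For the curious, here is a rough sketch of how such a parameter rule can be checked against sample URLs. The exact line that went into robots.txt is not shown here, so the rule `/*solrsort=` and the example paths below are assumptions for illustration only. The matcher follows the wildcard convention that Googlebot understands (`*` for any run of characters, a trailing `$` to anchor the end) and ignores `Allow` lines and rule precedence.

```python
import re


def rule_to_regex(rule):
    """Translate a Googlebot-style Disallow pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(path, disallow_rules):
    """Return True if any Disallow rule matches the path from its start."""
    return any(rule_to_regex(rule).match(path) is not None
               for rule in disallow_rules)


# Hypothetical rule: block every URL that carries a solrsort parameter.
DISALLOW = ["/*solrsort="]

# Hypothetical sample paths, not taken from Google's mail.
print(is_blocked("/search/apachesolr_search/views?solrsort=created%20desc", DISALLOW))
print(is_blocked("/project/views", DISALLOW))
```

Running the sample URLs from Google's mail through something like this would show which of them are already covered and which still slip through.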