Finally, after several weeks of silence about the problem of pages being dropped from Google's index, which is currently worrying many webmasters (see note), Matt has written the following in a comment on his weblog:
maxD, last week when I checked there was a double-digit number of reports to the email address that GoogleGuy gave (bostonpubcon2006 [at] gmail.com with the subject line of “crawlpages”).
I asked someone to read through them in more detail and we looked at a few together. I feel comfortable saying that participation in Sitemaps is not causing this at all. One factor I saw was that several sites had a spam penalty and should consider doing a reinclusion request (I might do it through the webmaster console) but even that wasn’t a majority. There were a smattering of other reasons (one site appears to have changed its link structure to use more JavaScript), but I didn’t notice any definitive cause so far.
There will be cases where Bigdaddy has different crawl priorities, so that could partly account for things. But I was in a meeting on Wednesday with crawl/index folks, and I mentioned people giving us feedback about this. I pointed them to a file with domains that people had mentioned, and pointed them to the gmail account so that they could read the feedback in more detail.
So my (shorter) answer would be that if you’re in a potentially spammy area, you might consider doing a reinclusion request–that won’t hurt. In the mean time, I am asking someone to go through all the emails and check domains out. That person might be able to reply to all emails or just a sampling, but they are doing some replies, not only reading the feedback.
Interpreting these comments, and assuming that Matt is not hiding some problem with the new infrastructure, such as a lack of storage space, the following can be deduced:
1. Googlebot's sluggishness in visiting many sites would be due to the change in crawl priorities in BigDaddy.
2. The drop in the number of pages that appear indexed with the site: operator would be due to the spam filters that apparently are being applied to the search index.
3. This phenomenon, which is also affecting sites without spam, appears to be an unintended side effect of the new algorithms Google is applying.
Now all that remains is to hope that the engineers at the Googleplex who are analyzing the emails GoogleGuy has received will help resolve the problem quickly.
Yes, and in that very comment Matt addresses this particular issue. You can also read more about this topic here.