Google Data & Algorithm Updates, and Panda
- Index updates: Once upon a time, Google updated its index about once a month. With “everflux” it began refreshing the index pretty much daily, and now, with Caffeine, Google says they “update our search index on a continuous basis, globally.”
- Algorithm updates: Google makes frequent algorithm updates. Most are quite small.
- Factor tweaks: Google frequently plays with different weightings (Danny gives the example of the importance given to the search words appearing in close proximity). It can implement these tweaks quickly and test them frequently.
- “Manual” factors: So named because somebody decides when to run the algorithm that updates each factor. These factors are expensive to calculate and are therefore calculated fairly infrequently. PageRank and Panda are two such factors. Additionally, Google has been working on improving the Panda algorithm, so each time they’ve run Panda it’s been a somewhat different algorithm. Presumably, as computing power gets cheaper and the algorithms get more efficient, today’s “manual” factors will get refreshed more frequently.
There seems to be something extra at play with Panda: a bad Panda score doesn’t seem to be completely overwritten by the next Panda update. Danny initially referred to Panda as a ranking factor, but later referred to it as a penalty. The fact that both terms are accurate helps show how Panda differs from previous factors and penalties, and why it was a combined effort between the search quality and anti-spam teams. Back in the old days Google often fought spam by applying draconian penalties to pages: the dreaded -30, -50, -60, and -950 penalties, to name a few. Panda seems to be a penalty that gets applied as a fairly subtle ranking factor, such that the page can still rank #1 if it appears much better than the competition, but will rank below competing pages that appear to be almost as good. And just dropping a slot or two on many queries is enough to significantly affect a website.
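To make the difference between the two kinds of penalty concrete, here is a toy sketch in Python. Google has not published how Panda is actually applied, so everything here is an assumption for illustration: the function names, the scores, and the 0.9 multiplier are all made up. The first function mimics an old-style penalty that pushes a page down a fixed number of slots; the second treats the penalty as a multiplier on the page's score, which is one plausible way a "subtle ranking factor" could behave.

```python
def position_penalty(ranking, name, drop):
    """Old-style penalty (hypothetical model): push one page down a fixed number of slots."""
    rest = [p for p in ranking if p != name]
    # Clamp so the page lands at the bottom if the drop exceeds the list length.
    new_index = min(ranking.index(name) + drop, len(rest))
    return rest[:new_index] + [name] + rest[new_index:]


def rank_with_multiplier(pages, multiplier=0.9):
    """Penalty as a ranking factor (hypothetical model): scale flagged pages' scores.

    `pages` is a list of (name, base_score, flagged) tuples.
    The 0.9 multiplier is an invented value, not anything Google has stated.
    """
    def adjusted(page):
        _, base_score, flagged = page
        return base_score * (multiplier if flagged else 1.0)

    return [p[0] for p in sorted(pages, key=adjusted, reverse=True)]


# The flagged page is far better than its rivals: it keeps the top slot.
clear_winner = [("flagged", 100.0, True), ("rival_a", 70.0, False), ("rival_b", 60.0, False)]
print(rank_with_multiplier(clear_winner))

# The rivals are almost as good: the flagged page slips below the closest one.
close_race = [("flagged", 100.0, True), ("rival_a", 95.0, False), ("rival_b", 85.0, False)]
print(rank_with_multiplier(close_race))

# A fixed-slot penalty ignores quality entirely and just moves the page down.
print(position_penalty(["flagged", "rival_a", "rival_b", "rival_c"], "flagged", 2))
```

In this toy model the multiplier only costs the flagged page a slot when a competitor is within about 10% of its score, which matches the pattern described above: still #1 when clearly best, but behind pages that are almost as good.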
Panda should mostly make the Web better, though I’ll be happier when there’s a better feedback loop helping sites succeed in delivering quality pages to Googlers.