Godzilla, Snuffleupagus, and the Future of Search Success

I gave this presentation at SMX Israel. I rewrote it after Google launched Search Plus Your World, a few days before the conference.

Snuffleupagus Godzilla

“I’m so excited that our personalization works so well that people are creating dystopian fantasies about it.”

“How stupid would your friend think you are if he asked for the bus schedule and you gave him schedules for far-away cities? How annoyed would you be at your friend if he wasn’t paying attention to you so he answered your question without considering the previous parts of your conversation?”

Google Product Manager Jack Menzel, Best of SMX East

“Google Plus isn’t Google’s attempt to build a third or fourth social network. It’s the centerpiece of our attempt to understand our users better so we can better serve throughout the product line.”

Google Plus Product Manager Christian Oestlein, Best of SMX East

“Google is portrayed as Godzilla but sees itself as Snuffleupagus”

Jeff Jarvis, Best of SMX East

Yesterday’s release of Search Plus Your World was one more giant step for Google. We are living in the era of Google’s grand unifications.

Search, Social, and Serendipity

Every few months, even at search conferences, somebody gets up and declares the imminent demise of search or SEO. Sometimes they’re just looking for attention, sometimes they’re just idiots. But often the problem is they only understand half the story.

Yes, we often trust our friends more than we trust strangers. Yes, sometimes the thing you want can find you, without you looking for it. Yes, we want more than to type into a text box and see the same 10 blue links that everybody else sees.

But there was some mistaken impression that Google couldn’t or wouldn’t do these things. That Facebook or Apple would get there first. And that the end result would be that search, SEOs, and Google would cease to be involved in how people connect with what they want.

“It is easier to reliably detect social spam than link spam.”

Vanessa Fox, A Holistic Look at Panda

“I always tell people not to ask how does Google know. They just know.”

Vanessa Fox, Best of SMX East

Google is studying the social graph. They know who you are. They know what you like, what sites you visited, and how long it took you to return to Google. They know how many people who visited those sites clicked +1 or bounced back quickly and hit “Block this site.”

We can argue about whether they’re Big Brother or your best friend. But they’re not a 1999 search engine that only knows about pages and links.

The Search Quality and Spam Groups

The gap between search and social isn’t the only key gap Google’s been filling.

Google decided years ago that the people should be represented by two separate yet equally important groups (as I wrote that, my Android phone made the Law & Order beeping sound that it makes for incoming messages. I think they’re watching me. I should stop using Chrome for writing blog posts). There’s the search quality team, led by Amit Singhal, who rewrote the Google search algorithm in 2001. Blindfolded. In Assembler. While drunk. And the search spam team led by Matt Cutts, who also serves as Google’s ambassador to the search community. [And of whom I do not make fun, because, well, he runs the search spam team, and is therefore probably more powerful than the president of France. Amit probably is too, but he doesn’t know who I am.]

Anyway, this division of power worked well, but it left some blind spots, which were eventually exposed.

“So we did Caffeine in late 2009. Our index grew so quickly … we basically got a lot of good fresh content, and some not so good.”

Amit Singhal

“It was like, ‘What’s the bare minimum that I can do that’s not spam?’ It sort of fell between our respective groups.”

Matt Cutts

Google had a blind spot, a vulnerability. Sites could achieve trust, and then mass produce highly targeted pages with thin content. Amit’s algorithms couldn’t quite recognize quality at a page level well enough to know when Google was overvaluing a highly relevant (but useless) page that was mass-produced by a site they had trusted. Matt’s team couldn’t help, because there was no spam involved.

SEO professional: I have a few sites where I had already grown the content and the links, and I was just about to start pushing out the rest of the pages when this thing hit.

Me: So that’s what you do? You build the trust with good pages, wait a bit, then exploit it with tons of crap?

SEO professional: Isn’t that what we all do?

Conversation between me and a top SEO at a bar outside SMX West, 2011

SEOs were using this gap between Amit’s and Matt’s groups to earn Google’s trust and then exploit it. Larger sites like eHow, which were once high quality, had a compelling business interest in creating large amounts of content but little interest in paying for high quality. [Yes, I continue to think that eHow was unfairly hit because interested parties raised a deafening roar that prompted Google to modify Panda to crush it. But that’s another story.]

Closing the Gap

So our heroes joined forces and created the Panda Update, which, unless you’re one of the sites affected, is more important for what it foretells than for what it has done.

First, Panda closes the gap between the two teams. They took some ideas from Amit’s bag of tricks. Panda is a ranking factor. It’s a computationally intensive process run every few weeks, like Page Rank. They combined that with some of Matt’s tricks. Panda is a sitewide penalty implemented as a ranking factor and updated like Page Rank.

Back in the old days penalties were applied by sledgehammer. -30. -50. -950. If Amit decided your page should rank #2 but Matt hit your site with a -950 penalty, you dropped to 952. And you know the difference between a -30 penalty and a -950 penalty? Right, 920. But in reality, nothing. Maybe in 1998 there was a tiny practical difference between being on page 4 or on page 96. But in 2012, neither will get you any traffic.
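The sledgehammer arithmetic above can be written out in a few lines. This is only an illustrative sketch: the function names are made up for this post, the penalty is passed as a positive number of positions, and the assumption of ten results per page is mine, not something Google has documented.

```python
RESULTS_PER_PAGE = 10  # assumption: the classic ten results per page


def apply_position_penalty(organic_rank: int, penalty: int) -> int:
    """Return the final position after a fixed positional penalty.

    `penalty` is the number of positions dropped (950 for a "-950").
    """
    return organic_rank + penalty


def results_page(position: int) -> int:
    """Return the results page (1-based) that a given position falls on."""
    return (position - 1) // RESULTS_PER_PAGE + 1


if __name__ == "__main__":
    # A page that would have ranked #2 without any penalty.
    for penalty in (30, 50, 950):
        final = apply_position_penalty(2, penalty)
        print(f"-{penalty}: position {final}, page {results_page(final)}")
```

A -30 lands the page at position 32 on page 4, and a -950 lands it at position 952 on page 96. Either way, nobody scrolls that far.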

The Panda penalty is applied as a ranking factor and is far more subtle. If Google can’t find any pages that it likes better for a particular search, your page may still rank first, even if you’re being weighed down by the Panda. You can still find most eHow pages, for example, but you’ll often find them a few slots lower. Which was enough to do this to its stock price.

Demand Media Stock
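To see why a penalty applied as a ranking factor is subtler than a fixed drop, here is a toy sketch. Google has not published how Panda is folded into its scoring, so the multiplier, the scores, and the site names below are all hypothetical. The only point is that a sitewide demotion costs a page a slot or two when competitors are close, and nothing when they are not.

```python
from typing import Dict, List, Tuple

Page = Tuple[str, str, float]  # (url, site, relevance score) - all hypothetical


def rank_pages(pages: List[Page], site_multiplier: Dict[str, float]) -> List[str]:
    """Order URLs by relevance scaled by a sitewide quality multiplier.

    Sites missing from `site_multiplier` are left unchanged (multiplier 1.0).
    """
    adjusted = [
        (relevance * site_multiplier.get(site, 1.0), url)
        for url, site, relevance in pages
    ]
    adjusted.sort(key=lambda pair: pair[0], reverse=True)
    return [url for _, url in adjusted]


# Hypothetical sitewide demotion: every page on this site loses 20% of its score.
multiplier = {"demoted.example": 0.8}

# Query with a strong competitor: the demoted page slips from first to second.
close_race = [
    ("demoted.example/how-to", "demoted.example", 0.90),
    ("other.example/guide", "other.example", 0.80),
    ("third.example/tips", "third.example", 0.60),
]

# Query with weak competitors: the demoted page still ranks first.
weak_field = [
    ("demoted.example/how-to", "demoted.example", 0.90),
    ("other.example/guide", "other.example", 0.50),
    ("third.example/tips", "third.example", 0.40),
]

if __name__ == "__main__":
    print(rank_pages(close_race, multiplier))
    print(rank_pages(weak_field, multiplier))
```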

Panda’s Polluting Publishers Penalty

With Panda, Google decided to make publishers pay a price for putting out bad pages.

When you expose a page to Google, Google considers it a request to trust you and send you its users. If too many users return unhappy, Google responds as it would to someone who has betrayed its trust.

So your job is to stop that from happening. The two obvious pieces to this are:

  • Satisfy the searchers Google sends you.
  • Don’t ask Google to send you searchers that you can’t satisfy.

Ranking well in Google is becoming less about what’s on the page and more about how Google thinks you’ve treated its users in the past.

Calling All Data

Google’s biggest change in recent years is that they finally declared that they’re considering every signal they can get their hands on. Well – almost. It’s still presumed that they’re not taking data from publisher-side programs like Google Analytics and using it against you. But Google is considering behavioral signals from user activity on Google’s sites, applications, and toolbars.

“In May 2010 I said that we weren’t using Social Signals. Now we are.”

“What we ended up doing is taking multiple data sources and intersecting them, so if one data source had a false negative, the others wouldn’t.”

Matt Cutts, Google Webmaster Video, December 17, 2010

“I always tell people not to ask how does Google know. They just know.”

Vanessa Fox, Best of SMX East

Previously Matt repeatedly answered questions with statements like “we don’t use that signal, it’s too spammable.” Now he answers similar questions by saying that Google considers so much data that even if you fool it on one signal you’re unlikely to fool it on all the corroborating signals. Former black hat SEOs have responded by saying things like “it’s now easier to build a good page than to fake it.” With Panda, Google explicitly stated that they were using some behavioral signals as corroborating signals.
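The idea of corroborating signals can be sketched in code. Nothing here reflects Google's real signals or thresholds: the signal names and the "at least two must agree" rule are assumptions chosen to show why faking a single signal stops working once several independent ones have to line up.

```python
from typing import Dict


def passes_quality_check(signals: Dict[str, bool], min_passing: int = 2) -> bool:
    """Trust a site only if at least `min_passing` independent signals look good.

    Each value in `signals` is True when that (hypothetical) signal looks healthy.
    """
    return sum(signals.values()) >= min_passing


# A site that earns its signals honestly looks good across the board.
honest_site = {"links": True, "social_sharing": True, "user_engagement": True}

# A site that bought links has fooled exactly one signal.
link_buyer = {"links": True, "social_sharing": False, "user_engagement": False}

if __name__ == "__main__":
    print(passes_quality_check(honest_site))
    print(passes_quality_check(link_buyer))
```

With a single signal, the link buyer wins. With three signals and a requirement that two agree, the same trick gets nowhere, which is roughly the shift in Matt's answers.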

User-Friendly Role Model: The IRS

One way that Google makes sense of the boatload of data that they have is by comparing data sets to paradigmatic models of good and bad web sites. Panda is a Document Classifier, which means it looks at things and puts them in Pile A or Pile B.

“We actually came up with a classifier to say, okay, IRS or Wikipedia or New York Times is over on this side, and the low-quality sites are over on this side.”

Matt Cutts, Wired Magazine, The Panda That Hates Farms, March 3, 2011
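For readers who have never seen a document classifier, here is about the smallest one possible. Google has not disclosed Panda's features or model, so treat this as a stand-in: a nearest-centroid classifier over two invented features (share of original content and ad density, both scaled 0 to 1), trained on made-up examples of "good" and "bad" sites.

```python
from math import dist
from statistics import mean
from typing import Dict, List, Sequence


def centroids(examples: Dict[str, List[Sequence[float]]]) -> Dict[str, List[float]]:
    """Average the feature vectors of each labelled pile into one centroid."""
    return {
        label: [mean(column) for column in zip(*rows)]
        for label, rows in examples.items()
    }


def classify(features: Sequence[float], cents: Dict[str, List[float]]) -> str:
    """Put a site in the pile whose centroid is closest (Euclidean distance)."""
    return min(cents, key=lambda label: dist(features, cents[label]))


# Hypothetical training data: (original_content_share, ad_density).
training = {
    "high_quality": [(0.9, 0.1), (0.8, 0.2)],
    "low_quality": [(0.2, 0.7), (0.3, 0.9)],
}
piles = centroids(training)

if __name__ == "__main__":
    print(classify((0.7, 0.3), piles))
    print(classify((0.1, 0.6), piles))
```

The classifier never asks whether a site is good. It only asks which pile of known sites the new one most resembles, which is why the choice of role models matters so much.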

So your site needs to emit the same signals as sites like the IRS, Wikipedia, and the New York Times. So try to be a blood-sucking anti-capitalist rag. Kidding. But when was the last time you heard the IRS named as a pillar of user friendliness? If more of my visitors wanted to kill themselves after visiting me would Panda like me more?

Seriously though, Google’s use of a document classifier is significant. Combine it with the general idea of using many individually spammable signals, and you get a system that suspects any site that doesn’t conform to expected patterns.

“You know the best way to ensure your site has a ‘footprint that sites that focus on users have?’ Focus on users!”

Vanessa Fox, Search Engine Land, Lessons Learned at SMX West, March 12, 2011

I partially agree with Vanessa’s point that if you focus on users you’ll look like you’re focusing on users. But it’s not always true. For example, Answers.com built a great site that spent millions of dollars annually licensing and aggregating quality content, much of it exclusively, from trusted publishers. The site focused on users. But it had the footprint of a scraper. We had some interesting internal arguments about showing Wikipedia content. It was the right thing to do for our users. But, even after we NoIndexed Wikipedia-only pages, it hurt our footprint at Google.

If search traffic is a big part of your business, you need to pay attention to whether you look more like the IRS or like eHow. Even if you think eHow is a better designed and more user-friendly site.

Google’s Timeless Problem, and Their Progress

“‘Junk results’ often wash out any results that a user is interested in …

The number of documents … has been increasing by many orders of magnitude. …

People are still only willing to look at the first few tens of results.”

Sergey Brin and Larry Page, The Anatomy of a Large-Scale Hypertextual Web Search Engine, 1998

Google’s core problem hasn’t changed: publishers keep creating a lot of junk, and users want the best page to jump to the top. In 1998 Larry Page complained that users were “still only willing to look at the first few tens of results.” Three peculiar words here. The word “still” indicates that Page thought that search engines would keep falling further behind in their battle to highlight the best content. But you’d need to eliminate the words “tens of” to get to today’s reality, where it’s hard for us to even fathom that our ancestors would look at tens of results. Probably while they were walking 3 miles in the snow to school every day. Uphill. Both ways. And Google has been so successful that users keep raising the bar, and complaining if the best page isn’t the top result.

The Future

“The job of SEO has been upgraded from SEO to web strategist. Virtually everything you do on the Internet with your website can impact SEO today. That is especially true following Panda.”

Rand Fishkin, How Google’s Panda Update Changed SEO Best Practices Forever

“It’s too hard now to fake a natural footprint well enough to fool Google.”

“Human Engagement is the new Page Rank. Build engagement signals, get links from pages with good engagement signals.”

Greg Boser, Best of PubCon

When I started going to SEO conferences back in 2005 I got the feeling that I was witnessing the end of an era. SEO had been the Wild West, but the great frontier was finally being tamed.

It may take some time for people to acknowledge the new reality. And surely there are still some pockets of rogue activity, especially in the big money spaces like pharmaceuticals and gambling.

But search marketing is increasingly about applying the fundamental principles of marketing to the particular environment of search. And the key strategies are the same.

Specifically:

  • Build your reputation.
  • Build your relationships.
  • Learn what your potential customers want, and how they’re trying to find it.
  • Build pages that satisfy their needs and generate positive attention.
  • Look like a good business. Make sure you’re giving off the same positive vibes as the places that people trust.

As Google closes the gap between search and social, these fundamentals become increasingly important. For all but a few rogue geniuses who manage to stay one step ahead, SEO is increasingly about reputation and relationships. Earn people’s friendship and trust. Understand what they want. And only signal Google that you can deliver what their users are looking for when you really can.

Good luck.