SMX West, Content Farms, and How to Pour Water into a Glass Cup

[Dear reader: A funny thing happened with this post. I wrote it to show how it’s OK for eHow’s page to rank on “how to pour a glass of water” since it will only be found by people searching for that information. But then I was surprised to get a lot of search traffic to this page, using those keywords. Can you PLEASE tell me in the comments why you’re here? Is your current method of pouring water not working for you? Are you switching from ceramic mugs to glass cups and want to make sure the methodology is the same? Are you looking for the latest technological advances and new best practices on the water pouring front? Are you trying to understand content farms or SEO? Please help! Thank you.]

Fantastic discussion about Content Farms at SMX West. Instead of a mob with pitchforks we got an intelligent discussion of what these sites do right, how they’ll likely evolve, and what other sites need to learn from them.

How to pour water

Matt McGee started by discussing how the definition of a content farm is elusive, but it seems to be associated with the following characteristics:

  • Has a lot of ads
  • Targets search terms
  • Produces a large amount of content

Gave examples of ridiculous content on eHow such as

  • How to make toast
  • How to boil water

[Another speaker later added examples like

  • How to pour water into a glass cup
  • How to pour water from a pitcher

Now these pages are indeed funny. But what’s exactly the problem, other than that some people’s parents never taught them how to boil water? Or that people are inexplicably Googling for info about how to pour water? Yes, this page is funny, but as long as it only shows up for searches inquiring about how to pour water, it’s not nefarious, evil, spammy, or low quality. It’s also an interesting display of an inherent flaw in using the social graph to rank search results. 50 people liked this page on Facebook. Google & Bing have no way to know that it was probably Liked as an example of useless content, so they bump up the score for this content, thus risking that it will rank decently for some other searches as well.]

Matt says that these sites are associated with low quality, but in reality also often have higher quality content. He’s successfully relied on content from some of the sites that get labeled as CFs.

For the last couple of months there have been complaints about a decline in Google’s search quality, with the blame placed on CFs.

More recently, the Farmer / Panda update. In the works since Jan 2010. Affects 12% of searches. Still US only.

Immediate analysis: the clear losers were EzineArticles, Associated Content, and HubPages.

Others generally associated with CFs like eHow did not seem to suffer

Next was Luke Beatty, VP / GM of content at Y!, who oversees the Y! Contributor Network, formerly known as Associated Content, which he co-founded.

Y! Contributor Network has the same premise today as when Associated Content was founded:

Create an open democratic platform where people can publish content.

>400K contributors

Has multiple formats (text, video, etc) and both solicited and unsolicited content.

They try to push things onto the Y! network, sites like Shine, Y!Sports, Y!News, Y!Finance

Only fragmented content that doesn’t have a home gets published directly on AC.

Panda results: Google referrals up for 1/3 of their content, down for 2/3. [I suspect the Downs were generally of greater magnitude than the Ups. The primary Ups from Panda seem to be just the small rises that come from some of the sites above you getting smacked]

93% of the site’s assets remain indexed on Google [Very misleading stat. Issue is ranking, not indexing]

Changed on an Asset by Asset basis, not a property basis

Overall Y! Traffic is up. Particularly on Y! Answers. Other crowdsourced content went up.

Puts more pressure on contributors to create higher quality content.

Jury still out on whether or not Google is good at judging which of our content is higher quality than others. [That’s one of the rubs of the Panda update. Seems to be based on factors, including paying for human judgments of a site’s trustworthiness, that would be too noisy on a page level, and are only useful on a site level. So within a site, little correlation between quality and ranking.]

He discussed the evolution of Crowd-Sourced Content Distribution

2005: Context = Contextual Advertising

2008: Context = Hyper-targeted display. Ace Hardware wants 100s of articles on drills and drillbits.

Interesting stat: Consumer influence. 70% of the content you consume on a daily basis is crowd sourced (source EMC). 30% by someone you know (source PEW) [Includes Facebook, Twitter, etc]

2011: Context = Content Marketing. Brand relevant assignments. Create content that big brands want.

Getting a Walmart fan to participate in the conversations that Walmart needs to be in. “Moms like me” program.

Luke’s summary:

  • If content farm = creating content at massive scale then fine.
  • Just like eBay is commerce at massive scale.
  • If I can verify that you’re human and writing reasonable content I can get your content in front of an editor.
  • We’re committed to increasing quality

Next was Tim Ruder from Perfect Market.

He spent his career in marketing content for traditional media, starting in 1995 with Washington Post.

The values that traditional media bring to the table are important values to keep.

But the economics have some challenges

We can look at CFs and see what can be applied.

CFs vs Traditional Media


  1. Systematic processes to write large amounts of content
  2. Data and tools to help them understand what people are interested in
  3. Have data that help them understand what kind of commercial demand exists for each of those topics
  4. Amortize over a long period of time

There’s an organizational disconnect among news sites.

They think in terms of sections. Health gets Health ads. Entertainment gets Movies. Main gets an odd mix of furniture ads, etc.

Need to think about Atomic content with people coming in directly from search engine & deep links

Newspapers: The daily miracle. Then you start fresh in the morning on the next day’s miracle. Content is not considered in the longer term fashion.

The knock against the CFs, which I think is rightfully placed, is that the content can be very low quality and thin.

The market pressure is forcing a change on that.

The Panda / Farmer update puts a lot more impetus on the need for quality for everybody, including CFs

Most of the mainstream press was not dramatically affected

CFs will improve their quality over time. [Agree wholeheartedly. The CFs know that long term success and stability requires significant improvements in quality.]

The news published today by NYT, CNN, Washington Post is being subsidized by the offline versions of those products

That subsidization gives those publishers a dramatic advantage. But it’s going to narrow. Fewer offline purchases, and the competitive landscape is changing.


Traditional publishers have to learn lessons from CFs

  • Understand and leverage consumer interest
  • Capture better revenue streams from thinking about content in a more granular way
  • Understand durability of content produced
  • Take full advantage of the moment in time that these publishers have today to put these things in place before their competitors get better and better and we lose what we have from those mainstream publishers

Next up was Byrne Hobart from Blue Fountain Media

How did we get here? Where did we start?

The old style: Supply Media

  • Content based on outside sources: News, Art, etc.
  • Monetization: get a big audience, sell ads, hope they notice
  • This is a great way to create content that content creators like

“Demand” media

  • Give people what they want. That’s happened for years; we’re just targeting better
  • This is what pop music is, etc.
  • Monetization: Highly targeted contextual ads
  • RPM of $13.45 for 2010 up 26% (from Demand filing) [I need to check this]
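
[A quick way to sanity-check that RPM figure. This is a minimal sketch, assuming RPM means revenue per thousand page views and that "up 26%" is a year-over-year increase over 2009. The function names are mine for illustration, not anything from the Demand filing.]

```python
def rpm(revenue_usd: float, page_views: int) -> float:
    """Revenue per thousand page views."""
    return revenue_usd / page_views * 1000


def value_before_increase(current: float, growth: float) -> float:
    """Value before a fractional increase (growth=0.26 means 'up 26%')."""
    return current / (1 + growth)


# Assumption: $13.45 is the 2010 RPM and 26% is growth over 2009.
implied_2009_rpm = value_before_increase(13.45, 0.26)
print(f"Implied 2009 RPM: ${implied_2009_rpm:.2f}")  # Implied 2009 RPM: $10.67
```

So if both quoted numbers are right, the prior-year RPM should come out around $10.67, which is the figure to look for when checking the filing.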

Worst case scenario for a publisher is that you solved the user’s problem. Better is to give just-enough content and get the user to click on ads for the full answer. [This is another key issue within CFs. Are you creating Mahalo coupon pages that rank and force users to click on ads to get coupons? Or pages with the content users are looking for? I think eHow usually does the latter.]

Ridiculous economies of scale.

Great for old domains or for celebrities: Rachel Ray, Lance Armstrong, Tyra Banks. Celeb gets the attention. Then long tail, scalable, highly monetized. They already have celeb-driven sections for food, health, and beauty. Don’t be surprised to see somebody like Suze Orman do a personal finance site for them. [Makes me think of what Calacanis called “the façade strategy” regarding Huffington Post. The site has an impressive face. But the real money comes from mass produced long tail content that gets traffic because of the celebrity involvement on the showcased pages.]


Winners:

  • Anyone with more money than time
  • Online absentee landlords
  • Lots of nontraditional employees (Demand has a lot of stay at home moms)
  • Searchers looking for ultra long-tail terms
  • Shareholders


Losers:

  • Anyone with more time than money
  • Traditional media
  • Active bloggers
  • Anyone trying to get into the industry
  • Searchers looking for head terms

Congratulations fellow SEOs! We’re now Old Media [Ha!]

This is the Industrial Revolution hits content creation: Content Factories [I love this analogy. And Content Factories was the original term coined by 2 months later ReadWriteWeb changed the term to Content Farms, presumably to make them sound more sinister.]

  • Separates content creation from site ownership
  • Separates content creators from the marketing process
  • More efficiency

[Assembly line vs one proud owner doing it all. Differentiated labor, straight out of Adam Smith, Henry Ford, etc.  Insulting to the craftsmen of the earlier generation. And a key difference is that we may love mass-produced cars and sweaters, but in the case of content, we need differentiation in the result, not many duplicate copies of the same content.]

Biggest labor arbitrage since China, since there’s so much unexploited time [Very interesting point. Sort of insourcing: instead of going abroad, getting people from home]

Common responses from SEOs:

  • Complaining to the press
  • Complaining to Google
  • Complaining on Twitter

Better options:

  • Shoot for quality
  • Sneak by with Social Media
  • Dust off your e-mail newsletter
  • Made-for-Made-for-AdSense. Buy ads that will get you on those sites and satisfy those users. Google will figure this out. This is how to be ready for that.

Finally, Matthew Brown from AudienceWise, the former director of search strategy at the NYT (including, the granddaddy of this model)

Google wasn’t specifically targeting CFs.

They’re looking for low quality across any network.

eCommerce, comparison shopping, scrapers all got hit.


How we got here: Domain authority

  • Ability to rank thousands of pages by domain trust / time
  • Longtail rankings add up to a critical mass of content

NYT empty topic pages ranked very well, for example

Two views of Farmer (aka Panda) update:

  • A sitewide filter on domain authority
  • Farmer factor applied to the site’s normal ranking ability
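
[To make the difference between those two views concrete, here is a toy sketch. Everything in it is hypothetical: the additive relevance-plus-authority score, the function names, and the 0.5 penalty are my own illustration, not anything Google or the panel described.]

```python
def score_view_1(relevance: float, authority: float, flagged: bool,
                 damp: float = 0.5) -> float:
    """View 1: a sitewide filter that dampens only the domain authority."""
    if flagged:
        authority *= damp
    return relevance + authority


def score_view_2(relevance: float, authority: float, flagged: bool,
                 factor: float = 0.5) -> float:
    """View 2: compute the site's normal score, then scale the whole thing."""
    score = relevance + authority
    return score * factor if flagged else score


# Hypothetical page: modest relevance, propped up by strong domain authority.
print(score_view_1(2.0, 6.0, flagged=True))   # 5.0
print(score_view_2(2.0, 6.0, flagged=True))   # 4.0
print(score_view_1(2.0, 6.0, flagged=False))  # 8.0
```

In this toy model both views hit every page on a flagged domain. Under view 1 a page keeps whatever it earns on its own relevance, while under view 2 even the page's own merit gets scaled down.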

Domains that got hit got hit across the domain

Possible signals:

  • Quality vs Quantity ratio
  • Big sites relying on domain authority
  • Small sites with few quality pages
  • Sites with an overload of ads / links

Survivors have that elusive brand smell:

  • Google news inclusion
  • Blog search inclusion
  • Tweets / shares / likes / Reddit / Digg
  • Balanced ratio of deep links vs index page links

“eHow got unfairly swept up in this” [That’s backwards. Demand’s pre-IPO media campaign that began with the Wired article about eHow started the backlash that has now hit many sites, but not eHow]

How do I get out of this? (or not get into this)

  • Clean your site up
  • Build out brand signals
  • Channels / domains
  • Tighten editorial focus
  • Scale promotion


Q: Do Content Farms have a bad model or is it just an execution problem?

A: Byrne: The problem is the quality of the content. Yeah, it’s an execution problem.


Q: Are there other content suggestions recommended to help with top rankings?

A: MattB: I like video. Other things that can get you into the 1-box: images, get into news or at least into blog search. Any signals to show you’re not just targeting a random keyword phrase.


Q: MattM: What’s Google’s role in this? Are they responsible for all of this?

A: Byrne: Sure. It’s not a perfect algo and there will always be an arms race. In quantitative terms the experience hasn’t been degraded. They’re not doing multiple searches or bouncing more. CFs are probably good for Google. We’re not typical users. The fact that we know the difference between Google, the internet, and IE probably puts us in the top 10% of Google users.

MattB: Their business model relies on user trust and their brand. The money coming in from CFs isn’t worth compromising Google’s trust and brand.


Q: Quality signals: A lot of these relate to the design of the website. Is that more of a factor now?

A: MattB: I think it is. Matt Cutts used to be asked, "Are you looking at bounce rates, etc.?" and he said it’s too noisy and too easy to game. I don’t think he’s said that recently. They may be able to extract stuff out of that noise. The Chrome extension. They’re going to start looking at those signals. This is going to get worse as it goes on for sites with poor quality feedback. eHow didn’t get slammed, because their site design is fairly clean even if people think their content is shaky.


Q: Y!CN strategy going forward:

A: Luke: Surface the best CS content and push it to Y! properties. We’re not trying to compete with the high quality sites, we’re trying to cover what isn’t covered.


Q: If you have been hit and can remove low quality pages, should you remove them all at once or phase them out?

A: MattB: I’d do it all at once. Remove the pages. Redirect if that makes sense. Get rid of pages that have lots of ads & links and very poor content. Nobody knows how long it takes to get your traffic back. We’ll start seeing that data.


Q: These CFs rank well even for head terms. Should we avoid them from a linking perspective?

A: Byrne: If they rank, probably not


Q: Is writing 20 very similar articles, each still unique but with different titles, spam?

A: Byrne: Probably not from a user experience.

Luke: Unless the user is being pushed a Related Content bucket with those articles


Q: Did paid search have any role in this update?

A: Luke: We’ve never paid for traffic so I don’t know

MattB: Don’t think it was related at all

Tim: No impact.

[In summary, great session. You can and should learn a lot from content factories, in terms of how to create evergreen content to meet the demands of consumers and advertisers. These factories will become higher quality, or they will go away. And they know it.]

Also see the session coverage from: