MATT CUTTS: OK.
Today’s question comes from Gus in Massachusetts. And Gus asks,
Many sites have a press release section or a news section that re-posts relevant articles. Since it’s all duplicate content, would they be better off removing these sections even with plenty of other unique content?
The answer is probably yes.
But let me give you a little bit of color about the reasoning for that.
So a lot of the times at Google, we’re thinking about a continuum of content, and the quality of that content, and what defines the value add for a user.
So let’s draw a little bit of an axis here and think a little bit about what’s the difference between high quality guys versus low quality guys?
Take somebody like The New York Times. Right?
They write their own original content. They think very hard about how to produce high quality stuff.
They don’t just reprint press releases.
You can’t just automatically get into The New York Times.
It’s relatively hard. Right?
At the other end of this spectrum is the sort of thing that you’re talking about, where you might have a regular site, but then one part of that site, one entire section of that site, is entirely defined by maybe just doing a news search, maybe just searching for keywords in press releases.
Whatever it is, it sounds like it’s pretty auto-generated. Maybe it’s taking RSS feeds and just slapping that up on the site.
So what’s the difference between these?
Well, The New York Times is exercising discretion. It’s exercising curation in terms of what it selects, even when it partners with other people, and whenever it puts other content up on its site.
And most of its content tends to be original.
Most of the time it’s thinking about, OK, how do we have the high quality stuff?
As opposed to this notion: even if you’ve got high quality stuff on the rest of your site, what is the value add of having automatically generated, say, RSS feeds or press releases, where all you do is you say, OK.
I’m going to do a keyword search for Red Widgets and see everything that matches.
And I’m just going to put that up on the page.
So on one hand, you’ve got content that’s yours, original content, and there’s a lot of curation.
On the other hand, you’ve got something that’s automated, something that’s more towards the press release side of things, and it’s not even your content.
So if that’s the case, if you’re just looking for content to be indexed, I wouldn’t go about doing it that way. It’s probably not worth just having automatically generated stuff that could be duplicate content.
Because everybody else has access to the same article bank directories, or the same press releases, or the same scraping of search results, or scraping of news results, or something like that.
It’s probably better to focus on, what is the value add?
What is compelling about your site?
What are the reasons why people will really return to your site?
Because just taking a single keyword like red widgets and saying, OK, the value add of my curation is that I have selected the keyword red widgets, and then everything else just runs in a cron job, and it’s a script, and it’s completely automated, that’s not going to add nearly as much value.
So that’s the spectrum. That’s the continuum.
And then if you’re doing things like, if the content is content that nobody else has access to, or if you’re writing your own content, or if you’re really putting a lot of effort into curation... You can have, like, “Daring Fireball,” which is John Gruber’s blog. And he links to other sites.
But he decides what to link to. And he has his own editorial philosophy where he wants to highlight things that are of interest to him.
Whereas if you just have something that’s completely automated and everything that matches shows up, that’s really not nearly as useful.
There’s no editorial voice there. There’s no distinctive point of view that users would find compelling, and that would cause them to come back, and would make them really interested in your site.
So if all you’re doing is using that kind of a script, I would probably say drop it. It’s not worth the effort.
And sometimes when users land on your site, they’re like, OK, this seems kind of junky.
Why not just concentrate on the good stuff?
But hopefully this spectrum of the kinds of criteria that we would use in thinking about whether that’s useful to users, and thus whether we think it’s useful to have show up in the search results, gives you a little bit of an idea about when you really want to have that kind of stuff and when you would probably rather avoid it.
Hope that helps.
Quick Answer: A re-posting section is not really worth having, unless you’re selective about what you repost and have lots of unique content.