Technical SEO

SEO: Forgotten Content Can Damage Search Performance

Sometimes a site’s own past dampens its bright SEO future. The forgotten designs, old campaigns, and abandoned domains that accumulate unnoticed on a web server can hurt a site’s ability to rank and to drive natural search-referred traffic and sales. Search engines retain records in their indexes long after site owners have forgotten the content exists. That old content could be hurting your natural search performance.

Abandoned Domains

Abandoned domain names can be a source of forgotten content. Companies tend to register many domains to protect their brand names. Occasionally companies host duplicate versions of the primary site on all of the domains they’ve registered. But it’s easy to forget that content. For example, someone in the legal department registers the domains, someone in IT hooks them up, and no one thinks about them for years. But search engines don’t forget.

Coldwater Creek is a case in point. It inadvertently allowed its entire site to be crawled and indexed at https://www.coldwatercreek.org, https://www.coldwatercrek.com, https://www.thecreek.com, and many other variations. Notice that these examples are all on the secure https protocol. Coldwater Creek has optimal 301 redirects on the non-secure http protocol, preventing the duplicate content issue there. Placing the 301 redirects on the secure protocol as well would remedy the situation.
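To make the check concrete, here is a minimal Python sketch of how the home page response for each alternate domain could be tested on both protocols. The function names and result labels are my own for illustration, and the canonical host is assumed to be www.coldwatercreek.com. The domains are the ones named above and may no longer resolve.

```python
import http.client
from urllib.parse import urlsplit

# Assumed canonical host for this illustration.
CANONICAL_HOST = "www.coldwatercreek.com"


def classify_response(status, location, canonical_host=CANONICAL_HOST):
    """Label one response from an alternate domain (labels are illustrative)."""
    if status == 301 and location:
        # A 301 is only useful if it points at the primary site.
        if urlsplit(location).hostname == canonical_host:
            return "ok"
        return "redirects-elsewhere"
    if 300 <= status < 400:
        return "other-redirect"
    if status == 200:
        # The alternate domain serves content itself: a duplicate content risk.
        return "duplicate-risk"
    return "other"


def fetch_status(scheme, host, timeout=10):
    """Request the home page without following redirects."""
    if scheme == "https":
        conn = http.client.HTTPSConnection(host, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        conn.request("HEAD", "/")
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

Calling fetch_status once with "http" and once with "https" for each registered domain, then passing the result to classify_response, would surface a case like the one above: "ok" on http and "duplicate-risk" on https.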

Finding alternate domains that may be live without your knowledge is as easy as searching for a string of text that’s unique to your site and excluding your domain from the results. For example, try Googling the following for Coldwater Creek:
Careers at Coldwater Creek Investor Relations Social Responsibility Terms of Use © 1984 -site:coldwatercreek.com

Note the appearance in the search results of what appear to be multiple domains related to Coldwater Creek: www.coldwatercreek.com, www.thecreek.com, www.coldwatercreek.org, and www.coldwatercreek.net.
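If you want to repeat this search for several text snippets, the query can be assembled programmatically. The sketch below is only an illustration: the function names are hypothetical, and it simply builds the query string and a standard Google search URL without fetching any results.

```python
from urllib.parse import urlencode


def build_exclusion_query(unique_text, own_domain):
    """Combine text unique to the site with a -site: operator.

    There must be no space between "-site:" and the domain,
    or the exclusion is not applied as an operator.
    """
    return f"{unique_text.strip()} -site:{own_domain.strip()}"


def google_search_url(query):
    """Return a search URL for the query (nothing is requested here)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_exclusion_query(
    "Careers at Coldwater Creek Investor Relations", "coldwatercreek.com"
)
url = google_search_url(query)
```

Footer text and legal boilerplate tend to work well as the unique string because they appear on every page of the site.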

Legacy Content

Content becomes “legacy” when a site is redesigned, URLs are rewritten, products are discontinued, or campaigns expire. The optimal SEO action for legacy or discontinued content is to 301 redirect it to the most relevant alternative. More frequently, it is simply removed from the internal linking structure, or promotion of it stops, and it is abandoned in its live state. Because the URLs still load with a 200 OK server header status when requested, the search engines retain them in their indexes.

For example, New Balance has a set of URL rewrites in place at the initial category and product level. Unfortunately, it neglected to 301 redirect its legacy URLs to the newly rewritten URLs. At the category level, we have http://www.newbalance.com/productList.php?cat=2 and http://www.newbalance.com/outdoor/footwear/ loading the same page of content. At the product level we have the legacy http://www.newbalance.com/get_product.php?style=WW811/&cat=6&subcat=5&ptype=1&g=w and the rewritten http://www.newbalance.com/fitness/walking/WW811/ loading the same page of content.
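The missing step is a lookup from each legacy URL to its rewritten equivalent, which the server would then answer with a 301. Here is a rough Python sketch of that lookup using only the two URL pairs above. The tables and function name are hypothetical, and the assumption that cat=6 with subcat=5 corresponds to /fitness/walking/ is inferred from this single example, not from New Balance's actual configuration.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical lookup tables built from the two examples in this article.
CATEGORY_PATHS = {"2": "/outdoor/footwear/"}
PRODUCT_SECTIONS = {("6", "5"): "/fitness/walking/"}  # assumed mapping


def redirect_target(legacy_url):
    """Return the rewritten path to 301 redirect to, or None if unknown."""
    parts = urlsplit(legacy_url)
    params = parse_qs(parts.query)

    def first(name):
        # parse_qs returns a list per parameter; take the first value.
        return params.get(name, [""])[0]

    if parts.path == "/productList.php":
        return CATEGORY_PATHS.get(first("cat"))
    if parts.path == "/get_product.php":
        section = PRODUCT_SECTIONS.get((first("cat"), first("subcat")))
        style = first("style").strip("/")
        if section and style:
            return f"{section}{style}/"
    return None
```

A URL with no known mapping returns None, which leaves room to decide case by case whether to redirect it to a parent category or retire it.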

In other instances, legacy content can be a missed opportunity rather than a duplicate content issue. Contests and promotions that generate buzz can accumulate a decent number of backlinks before they expire. Leaving them live to wither away wastes the chance to harvest that link popularity to strengthen the site. For example, the band Linkin Park had a MySpace contest in June 2008. The contest promotion page is still live at http://linkinpark.com/myspace-contest and has acquired a modest visible PageRank of two, with 25 external links from 10 unique domains pointing to it. A quick 301 redirect to another relevant page on the site would put that link popularity to better use, strengthening content that the band actually wants to rank.

Finding legacy content requires patience and determination. It can be uncovered with a site crawl and by analyzing Google’s indexation data. For example, I uncovered New Balance’s legacy URLs by Googling a product number from their URLs combined with a site query, like this: site:www.newbalance.com inurl:WW811. Then I did an intitle query to dig up the category example: site:www.newbalance.com intitle:"See all Outdoor by New Balance."
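On the site crawl side, one simple way to surface legacy URLs is to group crawled pages by a hash of their content, so that any page served at more than one URL stands out. The following Python sketch assumes you already have a mapping of URL to page body from your own crawler. The function name and the sample page text are placeholders, not real New Balance content.

```python
import hashlib
from collections import defaultdict


def duplicate_groups(pages):
    """Group URLs whose bodies are identical after whitespace normalization.

    pages: dict mapping URL -> page body text (from your own crawl).
    Returns a list of sorted URL lists, one per duplicated page.
    """
    groups = defaultdict(list)
    for url, body in pages.items():
        normalized = " ".join(body.split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Placeholder crawl data for illustration only.
crawl = {
    "http://www.newbalance.com/productList.php?cat=2": "Outdoor  footwear listing",
    "http://www.newbalance.com/outdoor/footwear/": "Outdoor footwear listing",
    "http://www.newbalance.com/fitness/walking/WW811/": "WW811 product page",
}
duplicates = duplicate_groups(crawl)
```

Exact hashing only catches pages that are identical apart from whitespace. Pages that differ by a session ID or timestamp in the markup would need a looser comparison.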

Summing Up

There are multiple sources of forgotten content. It’s in a merchant’s best interest to locate that content and either redirect it or eliminate it. Failing to do so could hurt an ecommerce site’s natural search performance.

Jill Kocher Brown