Reusing Web Content without Being Penalized
"Our organization creates huge amounts of content, created and 'owned' by different internal divisions. Much of this content is re-usable across divisions. However, we have heard that allowing the same content to appear on multiple web properties can cause penalties from search engines. How can we reuse content without getting blacklisted?" -- Keith Seabourn, Campus Crusade for Christ, International
I shared this question with Mike Grehan, author of the highly regarded Search
Engine Marketing: The essential best practice guide. I am reprinting
his answer in full on my website (www.wilsonweb.com/wmt8/se_duplication.htm).
One way search engines "weigh" webpages is by file size, that is, the number of
bytes. (Each letter or space takes up one byte.) If they find webpages that
"weigh" about the same and share the same pathnames and filenames (the part
of the URL that follows the domain name), they may identify them as duplicate
material and penalize the offending websites -- especially if these pages contain
identical hyperlinks.
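The "weight plus pathname" comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular search engine works: the function name looks_like_duplicate, the example URLs, and the 2% size tolerance are all assumptions made for this sketch.

```python
from urllib.parse import urlparse


def looks_like_duplicate(url_a, body_a, url_b, body_b, tolerance=0.02):
    """Return True if two pages share a path and "weigh" about the same.

    The tolerance (relative difference in byte size) is an assumed value
    chosen for illustration only.
    """
    # "Weight" of each page: its size in bytes.
    size_a = len(body_a.encode("utf-8"))
    size_b = len(body_b.encode("utf-8"))

    # The part of the URL that follows the domain name.
    same_path = urlparse(url_a).path == urlparse(url_b).path

    larger = max(size_a, size_b)
    if larger == 0:
        similar_weight = True  # two empty pages weigh the same
    else:
        similar_weight = abs(size_a - size_b) / larger <= tolerance

    return same_path and similar_weight
```

Under these assumptions, two copies of an article at /articles/seo.htm on different domains would be flagged, while the same article saved under a renamed directory and filename would not.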
Search engines see a lot of duplicate material on adult sites in particular, though they recognize there are many legitimate reasons for uploading duplicate material. Renaming the directories and filenames of the syndicated articles will probably help. Also, avoid hosting your site on the same IP address as other sites that carry the same material. (A single IP address is sometimes shared by multiple sites.) You can keep duplicate material from being indexed at all -- and thus avoid any chance of being penalized -- by using a robots.txt file or robots META tags (www.robotstxt.org/wc/exclusion.html).
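As a rough illustration of the robots.txt approach, the Python sketch below defines a small robots.txt that blocks a directory of syndicated articles and then checks it with the standard library's urllib.robotparser. The /syndicated/ directory, the example.org URLs, and the helper name is_crawlable are assumptions made for this example. The META tag alternative is a line such as <meta name="robots" content="noindex"> in the head of each duplicate page.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ask all crawlers to stay out of /syndicated/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /syndicated/
"""


def is_crawlable(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for page in (
        "http://www.example.org/syndicated/article.htm",
        "http://www.example.org/original/article.htm",
    ):
        print(page, is_crawlable(ROBOTS_TXT, page))
```

Checking the rules this way before uploading the file helps confirm that only the duplicate copies are excluded and the original articles remain open to crawlers.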


