For years, spam and duplicate content have been in a battle with Google’s search algorithm. The whole point of Google search is to surface the most relevant and most original content for your query. For example, if you search for “San Francisco bars,” you want the results to be the actual websites of bars in San Francisco, not a spam page that pairs relevant keywords with advertising but has nothing to do with what you’re looking for.
Luckily, Google has continually developed safeguards to distinguish original, relevant content from spammy duplicate content. Google is able to determine with some certainty which source is the ‘original’ and place less weight on everything else. Google’s PageRank algorithm weighs a number of factors, such as incoming links, to determine a page’s relevance.
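To make the link-weighting idea concrete, here is a minimal sketch of the classic PageRank iteration in Python. The tiny link graph, damping factor, and iteration count are illustrative assumptions, not anything close to Google’s production system.

```python
# Minimal PageRank sketch: a page's score depends on the scores of the
# pages linking to it, split across however many links those pages cast.
# The toy graph and damping factor below are illustrative only.

def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    n = len(pages)
    ranks = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        new_ranks = {}
        for page in pages:
            # Sum contributions from every page that links to this one.
            incoming = sum(
                ranks[other] / len(links[other])
                for other in pages
                if page in links[other]
            )
            new_ranks[page] = (1 - damping) / n + damping * incoming
        ranks = new_ranks
    return ranks

# Hypothetical graph: the "original" page earns an incoming link; the copy earns none.
graph = {
    "original.example/sf-bars": ["copy.example/sf-bars"],
    "copy.example/sf-bars": [],
    "blog.example/review": ["original.example/sf-bars"],
}
print(pagerank(graph))
```

Even in this toy graph, the page that attracts incoming links ends up with a higher score than the page that merely copies it, which is roughly the intuition behind weighting originals more heavily.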
Google has also recently taken steps that allow publishers to mark their content as original through page metadata. A partner organization that then reuses the content (like anyone running an AP story) can mark its duplicate version as ‘syndicated’ so that less importance is placed on it. This strategy, of course, only works for publishers that decide to play by the rules; the original-content tag could easily be manipulated by spammers.
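As a rough illustration of what that markup might look like, here is a small Python sketch that generates the meta tags an original publisher and a syndication partner could each embed in a page’s head. The tag names (original-source, syndication-source) and the URLs are assumptions for illustration, not a definitive reference for Google’s scheme.

```python
# Sketch of how a publisher might label original vs. syndicated copies
# with <meta> tags. Tag names and URLs here are assumptions for
# illustration, not a definitive spec of Google's scheme.

def source_meta_tag(original_url, this_url):
    """Return the <meta> tag a page could embed in its <head>."""
    if this_url == original_url:
        # The publisher of the original story points at itself.
        return f'<meta name="original-source" content="{original_url}">'
    # A syndication partner points back at the original story.
    return f'<meta name="syndication-source" content="{original_url}">'

original = "https://news.example.com/2011/sf-bars-story"
partner = "https://partner.example.com/reprints/sf-bars-story"

print(source_meta_tag(original, original))  # tag for the original article
print(source_meta_tag(original, partner))   # tag for the syndicated copy
```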
One of Google’s principal engineers and the head of its webspam team, Matt Cutts, recently shed some light on new changes to the algorithm that will further curb webspam: “we’re evaluating multiple changes that should help drive spam levels even lower, including one change that primarily affects sites that copy others’ content and sites with low levels of original content.”
Hopefully, these new changes to Google’s search algorithm will continue to place more importance on original content. In the end, if Google can more accurately locate original content, the technology as a whole will be far more useful to the billion or so people performing searches every day.