SEO - Optimising WordPress to avoid duplication issues

Anyone who has looked into making their website search engine friendly will understand that Google will strike down upon (with great vengeance and furious anger) those who attempt to poison the web with duplicate content.

This is a good thing. None of us likes to go through all the effort of clicking a mouse only to discover that we have, in effect, been surfing backwards. I’m still quite a newbie to this whole WordPress thing, and one of the first things I noticed was that the default installation of the WordPress (WP) content management system (CMS) seems to spawn duplicate content like the plague.

Like all egocentric hermits - I wanna be noticed on the web. So, it’s time to sort the duplication problem out.

I’ll tell you what I’ve done - you tell me if I done bad!

NOTE: About 3 months after this blog was posted I wrote an SEO performance update to tell you how it all went - so it’s worth reading that one too.

The Plan

  1. Work out how the WordPress code displays page <HEAD> tags
  2. Add one of two robots meta tags to the header code of every page (see below).
  3. Upload the appropriate script, check it and wait for the verdict.

Telling the Robots where to go

  • I have decided to tell robots that they can only “index” and “follow links from” my home page, the list of my most recent blog posts, individual blog posts and WordPress Page Entries using:
    <meta name="robots" content="index,follow">
  • For every other WP document, robots should not index and not follow:
    <meta name="robots" content="noindex,nofollow">

Why NoFollow?

I’m never quite sure whether we are meant to tell them to follow or not. But, I know that they don’t need to follow links on those pages. The robots can find all my blog posts via the “<<previous post” & “next post>>” links that appear at the top of every individual blog post. This means that I’m helping the robots to index every key page of my site with minimal effort. So, in the eyes of the robots, I’m “a good boy”.

This may not be the most successful SEO tactic today - lots of people say duplication can improve rankings. However, I hope it will earn me “early adopter” brownie points in the future.

The one thing that SEO spammers cannot manipulate is time

I may be horribly wrong, but I’m hoping that if I’ve been “a good boy” for, say, 6 years, then that will carry a lot more SEO weight than someone else who has been “a good boy” for 3.

The Process

Before we go any further, I should say that this is what I did. Whether it is right or wrong in principle, it worked for me. I hope it works for you - but there are no guarantees.

  1. Work out how the WordPress code displays page <HEAD> tags

    Use a browser to view the source code of the WP site.

    In Firefox on Windows XP:

    • Go and visit the WP site that needs a bit of SEO lurve and attention.
    • Hover the mouse over an empty area on the page
    • Right-click > View Page Source
    • Click on the new window that opens up
    • Find some text within the <HEAD> tag and drag over it to select it. I selected “<meta http-equiv=”.
    • Press CTRL+C to copy it to the clipboard
    • Close the “Source Text Window” that had been opened by the right click earlier.
    • Use the Windows search tool to look for the documents that contain the text that we have just copied. Restrict the search to “look in” just the “wp-content” folder of the WordPress scripts - otherwise hundreds of pages could appear. If the XP search dog tells you that no pages contain that phrase, don’t believe it straight away. A wise man once said “Never believe an XP search dog - it can be stupidly wrong sometimes and deserves to go for a long walkies off a short pier”.
    • I found the script that serves up the <HEAD> tag for my WordPress pages: /blog/wp-content/themes/default/header.php
    • Search that script for the copied code and you’re there.
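If the XP search dog lets you down, a few lines of PHP can do the same hunt. This is only a rough sketch: the function name find_files_containing is my own invention (it is not part of WordPress), it only looks inside .php files, and the folder path in the example call is a placeholder that you would swap for your own wp-content folder.

```php
<?php
// Hypothetical helper (not a WordPress function): list the .php files
// under $dir whose contents include the text $needle.
function find_files_containing(string $dir, string $needle): array
{
    $matches = [];
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        // Only PHP scripts are of interest when hunting for theme templates.
        if ($file->isFile() && strtolower($file->getExtension()) === 'php') {
            $contents = file_get_contents($file->getPathname());
            if ($contents !== false && strpos($contents, $needle) !== false) {
                $matches[] = $file->getPathname();
            }
        }
    }
    sort($matches);
    return $matches;
}

// Example call (the path is a placeholder, use your own wp-content folder):
// print_r(find_files_containing('/path/to/blog/wp-content', '<meta http-equiv='));
```

Run from the command line, the example call should print the path of every theme or plugin script that contains the copied text.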
  2. Add one of two robots meta tags to the header code of every page

    Now that we know which part of the WP code serves up the <HEAD> tag, we can start to manipulate the code and add the robots meta tag. I’m going to add my robots tag just below the <meta http-equiv../> tag because it’s within the <head> tag and easy to find in the future.

    Because we want to serve up either one tag or the other, we need some kind of conditional logic in the script. Luckily the super folk of WP have provided a list of conditional tags that we can use to help us. This is what my code looks like now (I added the code that sits between the “start of added code…end of added code” comments):

    <head profile="http://gmpg.org/xfn/11">
    <meta http-equiv="Content-Type" content="<?php bloginfo('html_type'); ?>; charset=<?php bloginfo('charset'); ?>" />
    <!-- start of added code... -->
    <?php
    // Index and follow single posts, the home page and WP Pages only
    if ( is_single() || is_home() || is_page() ) {
        echo '<meta name="robots" content="index,follow" />';
    } else {
        // Everything else (archives, categories, search results and so on)
        echo '<meta name="robots" content="noindex,nofollow" />';
    }
    ?>
    <!-- end of added code... -->
    <!-- ...other existing code... -->
    </head>

    You may wish to choose a different set of criteria to instruct the robots. Experiment by trying different conditional tags within the parentheses.

    E.g. to tell robots to index everything but the WP archives, you could try:

    <!-- ...other existing code... -->
    <?php
    if ( is_archive() ) {
        // this time we do "the no-no's" first
        echo '<meta name="robots" content="noindex,nofollow" />';
    } else {
        // index and follow everything else
        echo '<meta name="robots" content="index,follow" />';
    }
    ?>
    <!-- ...you get the idea... -->
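
If you plan to try lots of different criteria, it might be tidier to keep the tag-building in one small function and pass it the result of whichever conditional tags you fancy. A minimal sketch, assuming a made-up helper called robots_meta_tag (it is not a WordPress function) that lives somewhere your theme can reach it:

```php
<?php
// Hypothetical helper (not part of WordPress): build the robots meta tag
// from a single yes/no decision about whether the page should be indexed.
function robots_meta_tag(bool $indexable): string
{
    $content = $indexable ? 'index,follow' : 'noindex,nofollow';
    return '<meta name="robots" content="' . $content . '" />';
}

// In header.php the call could look like this (needs WordPress loaded):
// echo robots_meta_tag( is_single() || is_home() || is_page() );
// Or, to keep robots out of the archives only:
// echo robots_meta_tag( ! is_archive() );
```

The point of the design is that the only thing you change between experiments is the expression inside the parentheses.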

    (On a side note - if you have been following the instructions, the annoying XP search dog will be barking at you right now - so, feel free to kill it!)

  3. Upload the appropriate script, check it and wait for the verdict

    The next stage is to upload the code. Then, go through the various types of WP pages, right-clicking to view the source, and check that the script is performing as expected. Then, sit and wait for the search engines to sniff your pages!
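
Checking lots of pages by hand gets dull, so here is a rough sketch of how the check could be scripted. The helper robots_meta_content is hypothetical (my own name), it uses a simple pattern match rather than a proper HTML parser, and the URL in the example call is a placeholder:

```php
<?php
// Hypothetical helper: return the content of the first robots meta tag
// found in an HTML string, or null when the page has none.
function robots_meta_content(string $html): ?string
{
    // Collect every <meta ...> tag in the page source.
    if (!preg_match_all('/<meta\b[^>]*>/i', $html, $tags)) {
        return null;
    }
    foreach ($tags[0] as $tag) {
        // Keep only the tag named "robots" and pull out its content value.
        if (preg_match('/\bname\s*=\s*["\']robots["\']/i', $tag)
            && preg_match('/\bcontent\s*=\s*["\']([^"\']*)["\']/i', $tag, $m)) {
            return $m[1];
        }
    }
    return null;
}

// Example call (placeholder URL, swap in a page from your own site):
// $html = file_get_contents('http://example.com/blog/');
// var_dump(robots_meta_content($html));
```

Pointing it at one post, one WP Page and one archive page should be enough to see whether each type gets the tag you intended.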

One Response to “SEO - Optimising WordPress to avoid duplication issues”

  1. Eat My Business » Blog Archive » Update On: SEO - Optimizing WordPress to avoid duplication issues Says:

    […] Further to my blog entry that was posted on Friday, April 27th, 2007 about SEO optimizing Wordpress to avoid duplication issues. […]
