Written by Giles Bennett
In amongst the recommendations for a health check of your Magento site contained in our recent post on the subject, checking your sitemap snuck in at number 4. This post expands on the questions of 'what is a sitemap' and 'why is it important to have a sitemap for a Magento store'.
What is a sitemap?
A sitemap is an XML file which sits on your server and should list all the addresses, or URLs, that you want search engines to index. Whilst you can wait for search engine spiders to crawl over the site and hope that they'll pick up on all the pages as they do so, giving them a sitemap provides a definitive list of the pages on the site, rather than leaving them to find the pages for themselves.
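To give a feel for what that XML file contains, here is a minimal sketch in Python that builds a two-entry sitemap following the sitemaps.org format. The product URL and the dates are made up for illustration, and the file Magento generates carries more detail per entry than this.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) for a list of (url, lastmod) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc          # the page address
        ET.SubElement(url, "lastmod").text = lastmod  # when it last changed
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages and dates, purely for illustration.
example = build_sitemap([
    ("http://www.yoursite.com/", "2024-01-15"),
    ("http://www.yoursite.com/some-product.html", "2024-01-14"),
])
print(example)
```

Each page gets its own url entry, with the address in loc and, optionally, the date it last changed in lastmod.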
How do I know if I have a sitemap?
The easiest thing to do is to check - if your site's web address is www.yoursite.com, then your sitemap should be at www.yoursite.com/sitemap.xml. If it's not there, then the chances are that you haven't got one. When you access the sitemap, be sure also to see if there's a date or time on it showing when it was generated - unless your site never changes, having an out-of-date sitemap is almost as bad as having no sitemap, so you should take care to ensure that your sitemap is automatically updated on a daily basis.
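If you'd rather not check by eye, the freshness test can be sketched in Python. This is only a rough approach: it assumes the sitemap's entries carry lastmod dates, and it treats the newest of those as a stand-in for when the file was generated, which won't hold for every site. The sample XML and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import date

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content; in practice this would be the text
# downloaded from www.yoursite.com/sitemap.xml.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.yoursite.com/</loc><lastmod>2024-01-15</lastmod></url>
  <url><loc>http://www.yoursite.com/about</loc><lastmod>2024-01-10</lastmod></url>
</urlset>"""


def newest_lastmod(sitemap_xml):
    """Return the most recent <lastmod> date in the sitemap, or None."""
    root = ET.fromstring(sitemap_xml)
    dates = [
        # Keep only the YYYY-MM-DD part, in case a time is included.
        date.fromisoformat(el.text.strip()[:10])
        for el in root.findall(".//sm:lastmod", NS)
        if el.text
    ]
    return max(dates) if dates else None


def is_stale(sitemap_xml, today, max_age_days=1):
    """True if the newest entry is older than max_age_days (or undated)."""
    newest = newest_lastmod(sitemap_xml)
    return newest is None or (today - newest).days > max_age_days


print(newest_lastmod(SAMPLE))
print(is_stale(SAMPLE, today=date(2024, 1, 20)))
```

To run it against a live site, the text of the sitemap could be downloaded first (for instance with urllib.request.urlopen) and passed in place of SAMPLE, with date.today() as the comparison date.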
How do I set up a sitemap in a Magento store?
Magento comes with the ability to generate sitemaps built in, but you do have to set it up to do so. Under the Catalogue menu in the admin panel, you'll see an entry called 'Google Sitemap' (although sitemaps aren't limited to being for the benefit of Google!). Clicking on that will take you to a further screen where, if you have a sitemap, its details will be shown.
If you've not got a sitemap, then it's very simple to set one up. Click on the 'Add Sitemap' button, then on the following screen enter 'sitemap.xml' in the Filename box and '/' in the Path box. Then click the 'Save and Generate' button. Assuming there are no issues with writing the file to your server, the sitemap should be generated automatically, and you can find it straight away online at www.yoursite.com/sitemap.xml.
How do I arrange automatic sitemap updating in Magento?
Still in the admin panel, under the System menu click on Configuration and then, on the next page, choose 'Google Sitemap' from the left-hand side. On the right-hand side there are four sections - the bottom section is called 'Generation Settings', and in this section you need to do two things. Firstly, change the dropdown box alongside 'Enabled' from 'No' to 'Yes'. Then, as a safety measure, enter your email address in the box alongside 'Error Email Recipient' - you will then be notified by email if there is an error in generating the sitemap.
Finally, click 'Save Config' and you're done - your sitemap will now generate itself daily.
What is a sitemap index?
A sitemap index isn't a sitemap, in that it doesn't contain a list of pages - instead, it contains a list of sitemaps. Using WordPress as an example, rather than having everything in one sitemap, one can create separate sitemaps for blog posts, pages, categories, tags, and so on. The sitemap index would still be at www.yoursite.com/sitemap.xml, but instead of containing a list of all the pages on the site, it would contain the addresses of all the separate sitemaps. The search engine looks at sitemap.xml, gets from it the list of the other sitemaps that need to be looked at, then looks at each of those in turn.
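The first step a search engine takes with a sitemap index can be sketched in a few lines of Python: read the index and pull out the addresses of the sitemaps it points to. The two file names below are hypothetical, and the names a given WordPress plugin actually uses may well differ.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap index with one sitemap for posts and one for pages.
SITEMAP_INDEX = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>http://www.yoursite.com/post-sitemap.xml</loc></sitemap>
  <sitemap><loc>http://www.yoursite.com/page-sitemap.xml</loc></sitemap>
</sitemapindex>"""


def child_sitemaps(index_xml):
    """Return the addresses of the sitemaps listed in a sitemap index."""
    root = ET.fromstring(index_xml)
    return [loc.text.strip() for loc in root.findall("sm:sitemap/sm:loc", NS)]


for address in child_sitemaps(SITEMAP_INDEX):
    print(address)
```

Each address returned would then be fetched and read as an ordinary sitemap in its own right.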
Is a sitemap the same as a robots file?
Short answer, no. The only cross-over is that by convention a robots.txt file will tend to include within it the URL of the main sitemap, which helps search engines find the sitemap (assuming it's not at www.yoursite.com/sitemap.xml, which is where search engines tend to assume it will be). More information on what a robots.txt file is and does follows in our next blog post.
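As a small illustration of that cross-over, the sketch below uses Python's standard robots.txt parser (the site_maps method needs Python 3.8 or later) to read the sitemap address out of a robots.txt file. The file contents here are invented for the example, including the Disallow rule.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only.
ROBOTS_TXT = """User-agent: *
Disallow: /checkout/
Sitemap: http://www.yoursite.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns the list of Sitemap entries, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)
```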