Hey peeps,
Everyone knows that one of the first things to do with your blog is to get it indexed by Google via Google Webmaster Tools, and the second is to create a sitemap.
Creating your sitemap isn’t a hard task when wonderful sites like xml-sitemaps exist. Simply type in your URL, punch the button and their program will create an .xml file for you. This is the file you submit to Google Sitemaps.
All done? Nope.
So many people run along, submit their sites, grab the .xml file and upload it, but honestly, this isn’t the ideal way of doing things. Once you have your .xml file, open it up and have a look at all the URLs inside. Google has to download this file from your site and then proceed to index the various pages. What happens if you have hundreds of URLs inside the file? Google will take a while to download it and, on top of that, won’t have the time to crawl all the URLs, as he is a busy chap.
Open the .xml file in a text editor and remove useless URLs. Then spend some time arranging the most important links towards the top of the file. The smaller the file and the higher up the important links, the better your site will get indexed.
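If you would rather not eyeball a long file by hand, here is a minimal Python sketch that lists the URLs in a sitemap so you can review them. It assumes the file follows the standard sitemaps.org format; the function name and the sample URLs are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files (sitemaps.org protocol 0.9).
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def list_sitemap_urls(xml_text):
    """Return every <loc> value found in the sitemap text, in file order."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS)]


# Hypothetical sample; in practice read the text from your own sitemap file.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/about/</loc></url>
</urlset>"""

urls = list_sitemap_urls(sample)
for url in urls:
    print(url)
print(len(urls), "URLs in the sitemap")
```

Running it against your own file gives you a quick count and a plain list to scan for anything that should not be there.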
I always do this and trust me, it makes a big difference!






I just dropped by to dispel a myth or two :-). The order of your URLs in the sitemap file is completely irrelevant. We process the file as it is – treating each URL with the respect it deserves, no matter where it stands in line. Also, while removing unimportant URLs from the sitemap file may make it easier for you to maintain the file (which may be a good thing), you can still keep them included, perhaps adjusting their relative priority appropriately. Since our Googlebot will still crawl your site even if you use a sitemap file, we will still discover them anyway — if you have them in the sitemap file with a low priority at least we will then know for sure that you think they’re not that hot.
There are two things I would recommend in exchange for the above:
1. Keep your sitemap file current. Don’t just make it once and forget about it.
2. Provide reasonable information about the URLs or just list the URLs without it. If you don’t know the last modification date, change frequency or a priority for your URLs, just leave those values out (don’t use “default” values).
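As a rough illustration of the second point, here is a small Python sketch that writes the optional tags only when a real value is known, rather than filling in defaults. The function name, page list, dates and priorities are all hypothetical; only the tag names (loc, lastmod, changefreq, priority) and the namespace come from the sitemaps.org format.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
OPTIONAL_TAGS = ("lastmod", "changefreq", "priority")


def build_sitemap(pages):
    """Build sitemap XML; optional tags are written only when a value is known."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        for tag in OPTIONAL_TAGS:
            value = page.get(tag)
            if value is not None:  # unknown values are left out, not defaulted
                ET.SubElement(url, tag).text = str(value)
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages: one with known details, one low priority, one bare URL.
pages = [
    {"loc": "http://www.example.com/", "lastmod": "2008-01-15", "priority": 1.0},
    {"loc": "http://www.example.com/archive/", "priority": 0.2},
    {"loc": "http://www.example.com/about/"},
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

The low priority on the archive page is the "not that hot" signal mentioned above, and the last page simply lists its URL with nothing else.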
Hi John, thanks for popping in with some fantastic information. I don’t totally agree with everything you have said, as my post is based strictly upon experience and testing, but you do raise a number of good points.
When it came to removing URLs, I had in mind sites with literally hundreds of URLs, and also URLs which are not friendly, such as some of the URLs Joomla creates. I do agree that leaving most URLs in the list is a wise idea, and I would be a fool to think otherwise. I just feel that not all the URLs are required.
Adjusting priority is important, I couldn’t agree with you more! I normally try to update my sitemap as often as possible, or use a WordPress sitemap plugin to do it, but at this point, the amount of organic Google traffic I get is way more than I really need, so I don’t need to get Google over and peeking too much more ;) I’m actually focusing on Yahoo and MSN at the moment, as they have slightly different techniques and engines, and in my experience I’ve found them a little harder to get indexed with nicely. Perhaps you could share some knowledge on this?
I really appreciate the wonderful opinions and strategies you have laid down here and hope to see you frequent this site a little more often :)
Hi Chris – I see what you mean about those URLs. You’re right, it makes little sense to list multiple URLs for the same content in a sitemap file (depending on the website, that may include URLs with session IDs or other special parameters). However, instead of just removing them all manually, I would work on adjusting your server setup so that these parameters are not shown to search engine crawlers (and your sitemap generator). By doing that you’re not only making sure that your sitemap is clean, but also that all search engines, regardless of whether or not they use sitemaps, will see your clean URLs.
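For the sitemap side only, a rough Python sketch of collapsing such duplicates might look like the following. It does not replace fixing the server setup, since crawlers would still see the original URLs on the site itself. The parameter names in SESSION_PARAMS are assumptions, so swap in whatever your own software appends; the URLs are made up.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed session parameter names; adjust to match your own site.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid"}


def strip_session_params(url):
    """Return the URL without session parameters (and without any fragment)."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def clean_urls(urls):
    """Strip session parameters and drop duplicates, keeping first-seen order."""
    seen = set()
    result = []
    for url in urls:
        cleaned = strip_session_params(url)
        if cleaned not in seen:
            seen.add(cleaned)
            result.append(cleaned)
    return result


# Hypothetical input: two session variants of one page, plus another page.
urls = [
    "http://www.example.com/post?id=7&sid=abc123",
    "http://www.example.com/post?id=7&sid=def456",
    "http://www.example.com/about?PHPSESSID=xyz",
]
cleaned = clean_urls(urls)
print(cleaned)
```

Here the two variants of the same post collapse into a single entry, which is the clean list you would want a sitemap generator to see.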
Yes, I must agree, pages which are protected by sessions are not needed in sitemaps, and many websites have a great many protected pages, which just makes the sitemap file bulky and large.
The problem when it comes to adjusting servers is that most people don’t have access to server settings, and therefore it just isn’t a feasible option. Hence my rather plain and straightforward post: when I write on semi-technical subjects, I prefer to keep things simple, so new bloggers and web fans can follow along without a big hassle.
I had a great read of your site and will most definitely be back again to do some more reading. Some darn useful information you have over there :)