Archive for the ‘SEO’ Category

Not Getting Indexed by Live/MSN? Try This…

Friday, December 28th, 2007

I tried everything I knew to get one of my sites (dougthecook.com) back into the Live Search index. Over a period of a few days it had degraded to the point of being delisted. After a few weeks of being unlisted, a friend of mine suggested I add the RSS feed of dougthecook to my “My MSN” account. A few days later, there was dougthecook.com, back in the index.

Doug

What Good are Google Sitemaps?

Thursday, March 15th, 2007

Don’t get confused: a regular sitemap (links on a webpage which summarize your website) is not a Google sitemap.

A Google Sitemap file will inform Google of the URLs on your site, including the dates when you last changed them. If and when Google is ready to crawl your site, it will take this information into account and use it to optimize its visit. If it already knows your site a bit and you signal that you have changed one of those pages (added a new link to it or just fixed a misspelling), then it will go have a look the next time it crawls your site.
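To make that concrete, here is a minimal sketch of writing such a file with Python's standard library. The example.com URLs, the dates, and the `build_sitemap` helper are made up for illustration; only the `urlset`, `url`, `loc` and `lastmod` element names and the namespace come from the sitemap protocol.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace defined by the sitemap protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML for an iterable of (url, last_modified) pairs.

    build_sitemap is a hypothetical helper, not part of any Google tool.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # lastmod uses the YYYY-MM-DD form of the W3C datetime format.
        ET.SubElement(entry, "lastmod").text = last_modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")


# Placeholder pages; a real site would list its own URLs and change dates.
pages = [
    ("http://www.example.com/", date(2007, 3, 15)),
    ("http://www.example.com/recipes.html", date(2007, 3, 1)),
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

The output would normally be saved as sitemap.xml at the root of the site, with an XML declaration on the first line.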

Microsoft and Yahoo have joined Google to adopt the sitemap as a standard. sitemaps.org has an FAQ on sitemaps along with the XML schema and a few other tidbits.

Publishing a sitemap will not get your site crawled more often; it will just optimize the search bot's visit when it does crawl your site. Likewise, it will *not* get more of your site crawled, but the bot might concentrate on the more important parts. When it does crawl your pages, it will process them as usual, meaning that any content you have on them will usually get used for web search.

How to Create a Sitemap

Google Webmaster Tools offers a Python script that generates a sitemap.

Another popular tool is the GSiteCrawler: it will crawl your site, take a look at all of your pages (and yes, it will make counters count, if your gallery program counts all visits) and use that information to make a Google Sitemap file. In a sense, you are looking at your site with the GSiteCrawler and taking that information so that Google does not have to do as much work (and can concentrate on the important parts).

One advantage of running a sitemap crawler is that it exposes crawl problems: if it gets stuck on your site, Google and other search indexers probably will too.

There are plug-ins for blogs, such as WordPress, that will generate a new sitemap every time a post is created or modified and notify Google of the change.

Google sitemaps are there to help the search bot, not to improve your website’s search result placement.

Doug

Control over Using ODP Information

Saturday, December 2nd, 2006

For site owners included in the ODP (Open Directory Project), the ODP information is displayed by default in search results. Google now gives webmasters the option to add a simple meta tag to their webpages to tell the search engine not to display their ODP information:

Insert the following anywhere between the <head> tags:

<meta name="GOOGLEBOT" content="NOODP" />

This works only for Googlebot, so not every search engine that uses the ODP will immediately follow suit. Here’s the code that applies to all search engines that choose to honor it:

<meta name="ROBOTS" content="NOODP" />

I understand about 90% of the site descriptions contained in the ODP are outdated, so displaying that out-of-date information is detrimental to the website owner.

Doug

Microsoft Joins Bot Identification Bandwagon

Friday, December 1st, 2006

Following in Google’s footsteps, Microsoft has made it possible to verify MSNBot, the Microsoft-sanctioned bot that crawls your website. The Live Search blog has published a way to see if a visitor claiming to be MSNBot is for real or not. A reverse DNS lookup on the bot’s IP address returns its registered host name, which should belong to Live Search. Then a forward DNS lookup on that host name confirms that it resolves back to the same IP address.
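The two-step check can be sketched in Python roughly as follows. This is my own illustration, not code from the Live Search blog: `verify_crawler` is a hypothetical helper, the `.search.live.com` suffix is an assumption (use whatever domain the Live Search blog actually documents), and the IP addresses are reserved documentation addresses. The demo swaps in fake lookups so it runs without touching the network.

```python
import socket


def verify_crawler(ip, allowed_suffixes, reverse_lookup=None, forward_lookup=None):
    """Return True if ip reverse-resolves to an allowed host name that
    forward-resolves back to the same ip.

    The lookup functions are injectable so the check can be tried offline.
    """
    if reverse_lookup is None:
        # Real reverse DNS: IP address -> primary host name.
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        # Real forward DNS: host name -> list of IPv4 addresses.
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]
    try:
        host = reverse_lookup(ip)
        if not host.lower().rstrip(".").endswith(tuple(allowed_suffixes)):
            return False  # the name is not in the crawler's domain
        return ip in forward_lookup(host)  # the name must map back to the IP
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return False


# Offline demo with made-up DNS data.
FAKE_PTR = {
    "192.0.2.10": "livebot-192-0-2-10.search.live.com",
    "192.0.2.66": "livebot-192-0-2-10.search.live.com",  # spoofed PTR record
    "192.0.2.99": "bot.example.net",
}
FAKE_A = {"livebot-192-0-2-10.search.live.com": ["192.0.2.10"]}

SUFFIXES = [".search.live.com"]  # assumed suffix, see note above
for addr in FAKE_PTR:
    genuine = verify_crawler(addr, SUFFIXES, FAKE_PTR.__getitem__, FAKE_A.__getitem__)
    print(addr, "genuine" if genuine else "not verified")
```

The forward lookup matters because anyone who controls the reverse DNS for their own IP range can make it return any host name they like; only the owner of the real domain controls where that name points.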

Doug

Introducing sitemaps.org

Sunday, November 19th, 2006

Yahoo and MSN are jumping on board Google’s sitemap idea. At sitemaps.org the first paragraph describes the use of sitemaps:

Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site.

The sitemap protocol is described on the site along with a comprehensive FAQ.
The upshot of the project is that only one sitemap needs to be published for all search engines (if the others gravitate to its use).
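For a feel of what that metadata looks like in practice, here is a small sketch that reads a sitemap the way a crawler might. The sample file and the `read_sitemap` helper are made up for illustration; the element names and the default priority of 0.5 come from the protocol described at sitemaps.org.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# A made-up two-page sitemap; only <loc> is required for each <url>.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2006-11-19</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>http://www.example.com/about.html</loc>
  </url>
</urlset>"""


def read_sitemap(xml_text):
    """Return one dict per <url> entry (hypothetical helper)."""
    # Parse bytes so the encoding declaration in the file is honored.
    root = ET.fromstring(xml_text.encode("utf-8"))
    entries = []
    for url in root.findall("sm:url", NS):
        entries.append({
            "loc": url.findtext("sm:loc", namespaces=NS),
            "lastmod": url.findtext("sm:lastmod", namespaces=NS),
            "changefreq": url.findtext("sm:changefreq", namespaces=NS),
            # The protocol treats a missing priority as 0.5.
            "priority": float(url.findtext("sm:priority", default="0.5", namespaces=NS)),
        })
    return entries


entries = read_sitemap(SAMPLE)
for entry in entries:
    print(entry)
```

Missing optional fields come back as None, which is a reminder that the extra metadata is a hint to the crawler rather than a requirement.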

Remember that using the sitemap does not guarantee web pages will be included in an index.

Doug