Fine-tuning search engine visits to your website

From: Aron Roberts <aron_at_socrates.berkeley.edu>
Date: Thu, 16 Nov 2006 15:56:24 -0800

   If you manage a campus website - particularly a high-visibility
site, or a site with frequently changing content - and want more
control over how search engines index your site's individual pages,
getting that control became slightly easier today.

   The vendors behind three of the biggest Web search engines, Google,
Yahoo, and Microsoft, have agreed to support a common mechanism, the
Sitemaps protocol, by which you can tell them how you'd like their
robots to crawl your site's pages. These three companies have also
encouraged other search engine vendors to follow their lead. You can
find an overview at
<http://www.techcrunch.com/2006/11/15/google-yahoo-and-microsoft-agree-to-standard-sitemaps-protocol/>.

   This mechanism involves placing a sitemap XML file on your site
- or alternatively, an index file pointing to multiple such files - and
submitting the URL of the sitemap or index file individually to each
search engine supporting this mechanism.

   The primary benefit of using this mechanism is that you may be able
to fine-tune the visits by search engine robots to your site, so that
they more frequently visit your dynamic pages, whose content is
updated more often, or pages which you have marked as having higher
priority - although there can be no guarantee of that. Another
potential advantage is that, by including every page you'd like to
have indexed in your sitemap file(s), you can help make sure that the
search engines find all of them.
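
   As a rough illustration of how update frequency and priority are
expressed, here is a minimal Python sketch that builds a sitemap
using the urlset, url, loc, lastmod, changefreq and priority elements
defined by the protocol. The function name build_sitemap, the
dictionary layout and the example.edu URLs are assumptions made for
this example only; the search engines treat these values as hints.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps protocol (version 0.9).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as a string.

    pages is a list of dicts (a hypothetical layout for this sketch),
    each with a required 'loc' and optional 'lastmod', 'changefreq'
    and 'priority' keys.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        # Emit child elements in the order the protocol lists them.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages: a frequently updated news page marked as higher
# priority, and a rarely changing page with default settings.
example_pages = [
    {"loc": "http://www.example.edu/news/",
     "changefreq": "daily", "priority": 0.8},
    {"loc": "http://www.example.edu/about.html",
     "lastmod": "2006-11-16"},
]
print(build_sitemap(example_pages))
```

   Priority values range from 0.0 to 1.0 and are relative to other
pages on your own site only.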

   The format of the sitemap files and sitemap index files is described at:

   http://www.sitemaps.org

   Note that this is an XML sitemap file format intended for reading
by machines, not an HTML sitemap page intended for human visitors.
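
   For sites that split their pages across several sitemap files, the
index file mentioned earlier can be sketched in Python as follows. The
sitemapindex, sitemap and loc element names come from the protocol;
the function name build_sitemap_index and the example.edu URLs are
assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps protocol (version 0.9).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return sitemap index XML listing the given sitemap file URLs."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = sitemap_url
    return ET.tostring(index, encoding="unicode")


# Hypothetical sitemap files for two areas of one site.
print(build_sitemap_index([
    "http://www.example.edu/sitemap-news.xml",
    "http://www.example.edu/sitemap-courses.xml",
]))
```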

   A quick Google search revealed some tools (untried by me) that
purport to help you generate the XML-based sitemap file(s):

http://www.google.com/webmasters/sitemaps (online, requires free
Google account)
http://www.auditmypc.com/free-sitemap-generator.asp (Java)
http://www.sitemapbuilder.net/ (online and Windows versions)

   To realize the most benefit from this, you'll need to update the
sitemap file(s) as your site's pages are added or moved, and thus you
may want to schedule automatic, scripted updates to your sitemap
file(s).
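
   One possible shape for such a scripted update, sketched in Python
under some loud assumptions: the site consists of static .html files
under a single document root, each file's modification time is a
reasonable stand-in for the page's last change, and the function name
write_sitemap and the paths shown are hypothetical. Dynamically
generated pages would need a different source for their URLs.

```python
import os
import time
import xml.etree.ElementTree as ET
from urllib.parse import quote

# Namespace defined by the Sitemaps protocol (version 0.9).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def write_sitemap(doc_root, base_url, out_path):
    """Walk doc_root, list every .html file in a sitemap written to
    out_path, and return the number of URLs included."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for dirpath, _dirnames, filenames in os.walk(doc_root):
        for name in sorted(filenames):
            if not name.endswith(".html"):
                continue  # assumption: only static HTML pages are listed
            full_path = os.path.join(dirpath, name)
            # Turn the file path into a URL path relative to the site root.
            rel_path = os.path.relpath(full_path, doc_root)
            url_path = quote(rel_path.replace(os.sep, "/"))
            url = ET.SubElement(urlset, "url")
            ET.SubElement(url, "loc").text = (
                base_url.rstrip("/") + "/" + url_path)
            # Use the file's modification time (UTC) as the lastmod date.
            mtime = time.gmtime(os.path.getmtime(full_path))
            ET.SubElement(url, "lastmod").text = time.strftime(
                "%Y-%m-%d", mtime)
    ET.ElementTree(urlset).write(
        out_path, encoding="utf-8", xml_declaration=True)
    return len(urlset)


# Example call with hypothetical paths:
# write_sitemap("/var/www/html", "http://www.example.edu/",
#               "/var/www/html/sitemap.xml")
```

   A script along these lines could be run from a scheduler such as
cron after each content update.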

Aron Roberts
Information Services and Technology
Received on Thu Nov 16 2006 - 16:00:01 PST