The rare collaboration between search rivals Google, Yahoo and Microsoft over site maps has yielded its first result.
The trio have enhanced Sitemap, a protocol designed to simplify how webmasters and online publishers submit their sites' content for indexing in search engines.
Along with the improvements, the vendors announced that IAC/InterActiveCorp's Ask.com will support the protocol, giving it backing from another major search engine operator. IBM also signed up to support the effort.
In November, Google, Yahoo and Microsoft agreed to support Sitemap, an open-source protocol based on XML (Extensible Markup Language).
A site map is a file that webmasters and publishers put on their sites to help the search engines' automated web crawlers properly index web pages. The Sitemap protocol aims to provide a standard format for site maps, which should simplify their creation by web publishers and their discovery and interpretation by search engines.
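To illustrate the format, a minimal site map in the Sitemap protocol's XML schema looks roughly like the following sketch. The URL and date are placeholders, and the optional tags shown (`lastmod`, `changefreq`, `priority`) are hints to crawlers rather than commands:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <!-- The page's full URL; this is the only required child of <url> -->
    <loc>http://www.example.com/</loc>
    <!-- Optional hints: last modification date, expected change rate, relative priority -->
    <lastmod>2007-04-01</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
```

A real site map would list one `<url>` entry per page the publisher wants crawled.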
On Wednesday, the vendors announced that the Sitemap protocol, now in version 0.90, provides a uniform way of telling search index crawlers where site map files are located on a site.
All web crawlers recognise robots.txt, a file that tells crawlers which parts of a site not to index, so webmasters can now indicate the location of their site map file within their robots.txt files. Meanwhile, the Sitemap protocol's official website is now available in 18 languages.
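The autodiscovery mechanism amounts to scanning robots.txt for `Sitemap:` lines. A minimal sketch of how a crawler might do this, assuming a hypothetical robots.txt already fetched as a string:

```python
def find_sitemaps(robots_txt: str) -> list[str]:
    """Return the site map URLs declared in a robots.txt document."""
    urls = []
    for line in robots_txt.splitlines():
        # The directive is matched case-insensitively; a site may declare
        # more than one Sitemap line, each pointing at a separate file.
        if line.strip().lower().startswith("sitemap:"):
            urls.append(line.split(":", 1)[1].strip())
    return urls


# Hypothetical robots.txt content for illustration.
robots = (
    "User-agent: *\n"
    "Disallow: /private/\n"
    "Sitemap: http://www.example.com/sitemap.xml\n"
)
print(find_sitemaps(robots))
```

Because the directive sits alongside the usual `User-agent` and `Disallow` rules, a crawler that already fetches robots.txt needs no extra request to discover the site map.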
John Honeck, a mechanical engineer who runs several small sites and blogs in his spare time, including his personal blog, predicts the new feature will be helpful to webmasters. "Anything that is standardised is helpful for the webmaster. We can spend more time on our sites and less time worrying about setting up different accounts, verification processes, and submissions for all of the multitude of search engines out there," he said.
However, Honeck feels the vendors could clarify some points about the autodiscovery feature, such as how it will work in sites with multiple site maps. Privacy issues may also crop up, because pointing at the site map from the robots.txt file makes the information more easily accessible. "While not normally a problem, it could cause a security risk. As search engines can crawl your site more efficiently, so can scrapers and bad bots as well," wrote Honeck.
The Sitemap protocol was originally developed by Google and is offered under the terms of the Creative Commons Attribution-ShareAlike License.