Thursday, May 26, 2016

How To Ensure Optimal Crawling Of Your Website?


It is one thing to create a website and put up some content, but quite another to get it noticed by Google. Often, the more content you have, the higher your number of crawled and indexed pages in search engines. But that is not always the case. If the crawling process is not optimal, search engines might miss some of your content. Today, we have some guidelines for you from Google, explaining which fields in sitemaps are important, when to use XML Sitemaps and RSS/Atom feeds, and how to optimize them for Google.

XML Sitemaps or RSS feeds?
The first question you might ask is which to use: XML Sitemaps or RSS/Atom feeds? Should you use RSS/Atom feeds alongside XML Sitemaps? XML Sitemaps are an indispensable part of your site, and they describe the whole set of URLs within it. RSS/Atom feeds, on the other hand, describe the most recent changes.
The problem with XML Sitemaps is that they contain complete site information, and hence are much larger than RSS feeds. As a result, they are also downloaded less frequently. So the question is not why, but rather why not use both formats? Each has its own use and complements the other.
XML sitemaps give Google information about all the pages on your site, while RSS/Atom feeds let Google know what has been most recently updated on your site. Google also adds that “submitting sitemaps or feeds does not guarantee the indexing of those URLs.”
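To make the feed side of this concrete, here is a minimal sketch of building an Atom feed that lists recently updated pages, newest first. The function name `build_atom_feed`, its parameters, and the example.com URLs are assumptions made for illustration, not anything Google prescribes, and the sketch leaves out fields a complete Atom feed would normally carry (such as an author).

```python
from datetime import datetime, timezone
from xml.etree import ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def build_atom_feed(feed_url, title, updates):
    """Return an Atom feed as a string (hypothetical helper).

    updates is an iterable of (url, entry_title, updated) tuples, where
    updated is a timezone-aware datetime.
    """
    # Newest changes first, so the most recent update leads the feed.
    updates = sorted(updates, key=lambda item: item[2], reverse=True)

    feed = ET.Element("feed", xmlns=ATOM_NS)
    ET.SubElement(feed, "title").text = title
    ET.SubElement(feed, "id").text = feed_url
    # The feed-level timestamp mirrors the most recent entry.
    newest = updates[0][2] if updates else datetime.now(timezone.utc)
    ET.SubElement(feed, "updated").text = newest.isoformat()

    for url, entry_title, updated in updates:
        entry = ET.SubElement(feed, "entry")
        ET.SubElement(entry, "title").text = entry_title
        ET.SubElement(entry, "id").text = url
        ET.SubElement(entry, "link", href=url)
        # The modification time of the page itself.
        ET.SubElement(entry, "updated").text = updated.isoformat()

    return ET.tostring(feed, encoding="unicode")


if __name__ == "__main__":
    print(build_atom_feed(
        "https://example.com/feed",
        "Example site updates",
        [("https://example.com/new-post", "New post",
          datetime(2016, 5, 26, 9, 30, tzinfo=timezone.utc))],
    ))
```

Each entry carries only a URL and a modification time plus a title, which keeps the feed small enough to be fetched often.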
Sitemap and RSS feeds best practices
In order to optimize the crawl process, you should use XML Sitemaps along with RSS/Atom feeds. Here are some best practices for them from Google.
The two most important pieces of information for Google are the URL itself and its last modification time.
Only include URLs that can be fetched by Googlebot (i.e., don’t include URLs blocked by robots.txt).
Only include canonical URLs.
Specify a last modification time for each URL in an XML sitemap and RSS/Atom feed.
For a single XML sitemap, update it at least once a day and ping Google each time.
For a set of XML sitemaps, maximize the number of URLs in each XML sitemap. The limit is 50,000 URLs or a maximum size of 10MB uncompressed. Ping Google when each XML sitemap is updated.
When a new page is added or an existing page is meaningfully changed, add the URL and the modification time to the RSS/Atom feed.
In order for Google to not miss updates, the RSS/Atom feed should have all updates in it since at least the last time Google downloaded it. The best way to achieve this is by using PubSubHubbub.
Good luck getting your webpages crawled quickly :)
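As a rough illustration of the sitemap guidelines above, the following sketch writes a URL and a last modification time for every page and splits the output so that no single sitemap exceeds 50,000 URLs. The function name `build_sitemaps` and the example.com URLs are assumptions made for this example. The sketch does not check the 10MB size limit, and it assumes the list it receives already contains only canonical, crawlable URLs.

```python
from datetime import datetime, timezone
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # per-sitemap URL limit quoted above


def build_sitemaps(pages, max_urls=MAX_URLS_PER_SITEMAP):
    """Return a list of sitemap XML strings (hypothetical helper).

    pages is an iterable of (url, last_modified) pairs, where
    last_modified is a timezone-aware datetime. Each returned sitemap
    holds at most max_urls URLs.
    """
    pages = list(pages)
    sitemaps = []
    # Fill each sitemap up to the limit before starting the next one.
    for start in range(0, len(pages), max_urls):
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for url, last_modified in pages[start:start + max_urls]:
            entry = ET.SubElement(urlset, "url")
            ET.SubElement(entry, "loc").text = url
            ET.SubElement(entry, "lastmod").text = last_modified.isoformat()
        sitemaps.append(ET.tostring(urlset, encoding="unicode"))
    return sitemaps


if __name__ == "__main__":
    example_pages = [
        ("https://example.com/", datetime(2016, 5, 26, 12, 0, tzinfo=timezone.utc)),
        ("https://example.com/about", datetime(2016, 5, 20, 8, 0, tzinfo=timezone.utc)),
    ]
    for sitemap in build_sitemaps(example_pages):
        print(sitemap)
```

Filling each file to the limit keeps the total number of sitemaps low, which is the point of the "maximize the number of URLs" advice.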
