Created time: 2015-06-07

Invite the Google crawler with a sitemap and an RSS feed

Inviting Google's crawler is an essential factor for SEO. Pages are crawled again and again, and each crawl refreshes their entries in the index.

Fetch as Google is for new pages, not old pages

Google provides Fetch as Google in the Search Console. It invites the crawler immediately, and the page is indexed soon afterwards.

With it, a page is indexed within a few minutes. However, the index entry it generates seems to be one meant for the immediate algorithm, the one that handles freshly published content.

Therefore, it does not seem suitable for pages that are already indexed.

When we use it on an already indexed page, the fetched page is re-indexed very soon and its cache is also updated. The page then appears in the results of time-ranged queries.

But it disappears after a few days. My guess is that Fetch as Google may reset the page's index stage back to the first one. Therefore I no longer use it, even when an indexed page has changed.

We have to prepare other methods to invite the crawler.

Use a sitemap XML file and an RSS feed

The sitemap XML file and the RSS feed are effective ways to invite the Google crawler. Both files carry a timestamp of the updated time.

Register both the sitemap and the RSS feed in the Search Console

In the Search Console, both the sitemap and the RSS feed can be registered in the Sitemaps section.

Register sitemap and RSS on Search Console

By adding both, we can notify Google of new blog articles and updated web pages.

The RSS feed lists the new pages, and the sitemap file lists all of the web pages with their updated time (date).

Whenever you update or add any content, check all of the sitemaps and click the "Resubmit" button.

Sitemap XML file with updated time

The sitemap XML file has the following fields in each url element.

  • loc
  • lastmod

Currently Google uses these two elements.
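The two fields listed above can be written out with a short script. The following is only a minimal sketch, not the CMS's actual generator: the function name build_sitemap, the example.com URLs and the dates are assumptions made up for illustration. The namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a sitemap string from (url, lastmod) pairs.

    lastmod is expected as a W3C date string such as "2015-06-07".
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url          # page address
        ET.SubElement(entry, "lastmod").text = lastmod  # last modified date
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages, for illustration only.
pages = [
    ("https://example.com/", "2015-06-07"),
    ("https://example.com/blog/sitemap-and-rss", "2015-06-05"),
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

A real file would also start with an XML declaration and be saved as UTF-8 before it is registered in the Search Console.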

In this CMS, the last modified date is the time the content changed. For example, the date changes in the following situations.

  • The body content was updated
  • The navigation or recommendation links were changed
  • The template was changed

Simply put, whenever any part of the HTML code changes, the date is updated.

When publishing a web page, the CMS checks whether its HTML code has changed. If it has, the CMS updates the last modified timestamp.
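One way such a check could work is to store a hash of the rendered HTML and compare it on every publish. This is a sketch under assumptions, not the CMS's real code: the publish function, the dictionary keys html_hash and last_modified, and the use of SHA-256 are all hypothetical choices for illustration.

```python
import hashlib
from datetime import datetime, timezone


def publish(page, rendered_html, now=None):
    """Update page["last_modified"] only when the rendered HTML differs.

    page is a plain dict standing in for a stored page record (assumption).
    """
    digest = hashlib.sha256(rendered_html.encode("utf-8")).hexdigest()
    if page.get("html_hash") != digest:
        # Body, navigation links or template changed: record the new state.
        page["html_hash"] = digest
        page["last_modified"] = now or datetime.now(timezone.utc)
    return page


# Hypothetical publish times, for illustration only.
t1 = datetime(2015, 6, 1, tzinfo=timezone.utc)
t2 = datetime(2015, 6, 7, tzinfo=timezone.utc)

page = {}
publish(page, "<html><body>v1</body></html>", now=t1)
# Same HTML again: the timestamp stays at t1.
publish(page, "<html><body>v1</body></html>", now=t2)
# A navigation link was added: the timestamp moves to t2.
publish(page, "<html><nav>new link</nav><body>v1</body></html>", now=t2)
```

Because the hash covers the whole rendered page, a template change updates the date just as a body edit does, which matches the behaviour described above.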

RSS with the timestamp of when the body content was updated

RSS announces that a new article has been added. The feed holds about 10 records of new blog posts or articles, so unlike the sitemap XML file it cannot include all of the web pages.

But it can announce new pages with high priority.

In this CMS, the update time of each article in the RSS feed is the time its body content was updated. You can control it in the content editor.

Updated time

The CMS uses the Last Modified Time of each article. It selects the latest ones and puts them into the RSS feed.
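The selection step can be sketched as sorting articles by their modified time and keeping the newest ones. This is an illustration only: the build_rss function, the article dictionary keys, the channel texts and the example.com URLs are assumptions, and the limit of 10 follows the "about 10 records" mentioned earlier.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_rss(articles, limit=10):
    """Build an RSS 2.0 string holding the most recently modified articles."""
    latest = sorted(articles, key=lambda a: a["modified"], reverse=True)[:limit]

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Example blog"
    ET.SubElement(channel, "link").text = "https://example.com/"
    ET.SubElement(channel, "description").text = "Recent articles"

    for article in latest:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = article["title"]
        ET.SubElement(item, "link").text = article["url"]
        # RSS 2.0 dates use the RFC 822 style produced by format_datetime.
        ET.SubElement(item, "pubDate").text = format_datetime(article["modified"])
    return ET.tostring(rss, encoding="unicode")


# Twelve hypothetical articles, one per day from 2015-06-01.
articles = [
    {
        "title": f"Post {day:02d}",
        "url": f"https://example.com/post-{day:02d}",
        "modified": datetime(2015, 6, day, tzinfo=timezone.utc),
    }
    for day in range(1, 13)
]
feed = build_rss(articles)
```

With twelve articles and a limit of 10, the two oldest posts drop out of the feed, which is why the sitemap is still needed to cover every page.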

Use PubSubHubbub with the RSS feed

PubSubHubbub notifies subscribers when an RSS feed changes. It is like a ping for blog updates.

With it, the crawler comes to fetch the RSS feed soon.
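A publish ping is a small form-encoded POST to the hub, carrying hub.mode=publish and the feed address in hub.url. The sketch below only builds the request and does not send it. It assumes the feed declares Google's public hub at pubsubhubbub.appspot.com; the function name build_publish_ping and the feed URL are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_publish_ping(hub_url, feed_url):
    """Build the POST request that tells a hub the feed has new content."""
    body = urlencode({"hub.mode": "publish", "hub.url": feed_url}).encode("ascii")
    return Request(
        hub_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Assumed hub and hypothetical feed address, for illustration only.
ping = build_publish_ping(
    "https://pubsubhubbub.appspot.com/",
    "https://example.com/feed.xml",
)

# Uncomment to actually send the ping after the feed has been regenerated:
# urlopen(ping)
```

The ping should be sent only after the updated feed is live, because the hub fetches the feed right away to see what changed.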

RSS is not effective for new websites

If your website has enough content and history and is trusted by Google, RSS with PubSubHubbub is a very effective approach. But if your website is not yet recognized, the crawler does not come soon.

Therefore Fetch as Google is an effective way to invite the crawler to new pages. But I think you should use it only for new pages.

For updated pages that are already indexed, the sitemap XML file is the most effective method. The crawler watches the updated time and gives high priority to recently updated pages. It takes a few days, but the crawler does come.

In any case, it is impossible to make the crawler visit a start-up site many times. Therefore you have to concentrate on writing new pages. If the sitemap is set up correctly, the crawler will come in the near future.

But I recommend using Fetch as Google for new pages. That is because, if the page is copied by another person or syndicated by other websites, the earliest indexed page can be treated as the original owner of the content.

Actually, Google checks the original owner by link status, but you should reduce the risk by getting indexed quickly.
