I have a 302 redirect pointing to www. but Googlebot keeps crawling non-www URLs

Question

Is it possible to force robots to crawl www.domaine.com rather than domaine.com? My web app serves cached, prerendered HTML through prerender.io (so crawlers can see the HTML), but only on the www host.

So when a robot crawls domaine.com, it gets no data.

Nginx redirects automatically (domaine.com > http://www.domaine.com), but that has not helped.

I should add that all the URLs in my sitemap use www.

My Nginx redirect:

server {
  listen                *:80;

  server_name           stephane-richin.fr;

  location / {

    if ($http_host ~ "^([^\.]+)\.([^\.]+)$"){
      rewrite ^/(.*) http://www.stephane-richin.fr/$1 redirect;
    }

  }
}

Do you have any ideas?

Thank you!


| seo   | web-crawler   | google-crawlers   | domcrawler   2016-09-21 11:09 2 Answers

Answers to I have a 302 redirect pointing to www. but Googlebot keeps crawling non-www URLs ( 2 )

  1. 2016-09-21 11:09

    Could you have a robots.txt file with

    User-agent: *
    Disallow: /
    

    on domaine.com and a different one with

    User-agent: *
    Disallow:
    

    on www.domaine.com?

  2. 2016-09-21 12:09

    If you submitted a sitemap with the correct URLs a week ago, it seems strange that Google keeps requesting the old ones.

    Anyway, you're sending the wrong status code in your non-www to www redirect. You are sending a 302 but should be sending a 301: in Nginx, the rewrite flag redirect issues a 302, whereas permanent issues a 301. Philippe explains the difference in this answer:

    Status 301 means that the resource (page) is moved permanently to a new location. The client/browser should not attempt to request the original location but use the new location from now on.

    Status 302 means that the resource is temporarily located somewhere else, and the client/browser should continue requesting the original url.
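
    To make the 301 suggestion concrete, here is a minimal sketch of the non-www server block rewritten to send a permanent redirect. It assumes the www host is served by a separate server block (not shown in the question) and that the site is reached over plain HTTP on port 80, as in the original config; adjust the scheme if the site uses HTTPS.

    ```nginx
    # Bare domain only: this block exists solely to redirect to the www host.
    server {
      listen      80;
      server_name stephane-richin.fr;

      # 301 = moved permanently. $request_uri carries the original path and
      # query string, so /page?x=1 is redirected to the same path on www.
      return 301 http://www.stephane-richin.fr$request_uri;
    }
    ```

    Because server_name already limits this block to the bare domain, the if test on $http_host is no longer needed. Reload Nginx after the change and check the response headers for the bare domain to confirm that the status is 301.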
