Multilingual website and bot detection


I have a website where I've implemented multilingual support.

I split my languages across subdomains (the root domain serves a neutral language for bots).

On the subdomains, if no language cookie is set, I use the subdomain as the language code.

On the primary domain (www), if no language cookie is set, then:

  • if it's a bot, I use neutral language
  • if it's not a bot, I detect the user language using the "accept-language" header.
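The flow above can be sketched as a single function. This is only an illustration of the routing logic in the question; the neutral code `'en'`, the parameter names, and the `$isBot` callback are assumptions, not part of the original setup.

```php
<?php
// Sketch of the language-selection flow described above. "en" as the
// neutral language and the $isBot callback are assumptions.
function pickLanguage(
    string $host,            // e.g. "fr.example.com" or "www.example.com"
    ?string $cookieLang,     // value of the language cookie, or null
    string $acceptLanguage,  // raw Accept-Language header
    callable $isBot          // fn(): bool -- bot detection, defined elsewhere
): string {
    if ($cookieLang !== null) {
        return $cookieLang;               // explicit user choice wins
    }

    $sub = explode('.', $host)[0];
    if (preg_match('/^[a-z]{2}$/', $sub)) {
        return $sub;                      // fr.example.com => "fr"
    }

    if ($isBot()) {
        return 'en';                      // neutral language for crawlers
    }

    // First two letters of Accept-Language, e.g. "fr-FR,fr;q=0.9" => "fr".
    $lang = strtolower(substr($acceptLanguage, 0, 2));
    return $lang !== '' ? $lang : 'en';
}
```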

How can I safely detect whether the client is a bot? I've read old topics on the matter where people simply relied on the "Accept-Language" header because bots didn't send it; however, nowadays Google does send this header...

Is it safer to detect that it's a bot, or the inverse, to detect that it's a web browser? Because if a bot goes undetected, the website will be indexed in the wrong language.

Ideas?

php | seo | node.js | web | web-crawler    2016-09-22 18:09    1 Answer

Answers (1)

  1. 2016-09-22 18:09

    Assuming you're using PHP, you can check `HTTP_USER_AGENT` in `$_SERVER` and see whether the user agent contains 'googlebot'.

    if (isset($_SERVER['HTTP_USER_AGENT'])
        && strstr(strtolower($_SERVER['HTTP_USER_AGENT']), "googlebot") !== false) {
        // what to do for Googlebot
    }

    Here's a link to the question from which I pulled the example:

    how to detect search engine bots with php?
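    The same idea extends beyond Googlebot by matching the user agent against a list of well-known crawler tokens. A minimal sketch, assuming a hand-picked token list (illustrative, not exhaustive; any bot that spoofs a browser user agent will slip through):

    ```php
    <?php
    // Substring match against common crawler tokens. The list is an
    // assumption for illustration, not a complete catalogue of bots.
    function looksLikeBot(string $userAgent): bool
    {
        $tokens = ['googlebot', 'bingbot', 'yandexbot', 'duckduckbot',
                   'baiduspider', 'slurp', 'facebookexternalhit'];
        $ua = strtolower($userAgent);
        foreach ($tokens as $token) {
            if (strpos($ua, $token) !== false) {
                return true;
            }
        }
        return false;
    }
    ```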
