I have a web application that generates dynamic pages from our DB. For simplicity, let's say each generated page is about an animal. For every animal (page), you can link to that animal's parents, that animal's children, and that animal's owners. And for each of those links, you can continue the cycle for every animal listed. We have roughly a million animal records.
Until recently, we never allowed Google to crawl these pages. Three weeks ago, GoogleBot picked them up and now appears to be crawling them relentlessly. As mentioned above, there are potentially millions of links to follow.
I'm watching our DB get hit pretty hard, but we are handling the traffic.
I keep waiting for the crawling to hit a peak and stop, but every day it just keeps growing.
Will this type of crawling produce good or bad results? Should I prevent this type of "deep" linking? I was under the impression that having all these pages indexed would be a good thing, but the number of hits we are getting from GoogleBot is starting to concern me.
The way I see it, I have three options:

A) Prevent indexing of our dynamic pages
B) Mark links within our dynamic pages with the nofollow attribute
C) Allow this to continue if we can handle the traffic / set the crawl rate and let it continue
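For reference, options A and B might look roughly like the sketch below. The `/animal/` path is a placeholder, not our actual URL structure. Note the distinction: robots.txt blocks *crawling*, while a `noindex` meta tag is what actually blocks *indexing*.

```
# robots.txt sketch for option A: keep crawlers off the dynamic animal pages
# (substitute whatever path prefix the pages actually use)
User-agent: *
Disallow: /animal/

# Option B keeps pages crawlable but tells Google not to follow the links:
#   <a href="/animal/123" rel="nofollow">parents</a>
# or page-wide, via a meta tag in the <head>:
#   <meta name="robots" content="index, nofollow">
# and to truly prevent indexing (strict option A), the page itself must serve:
#   <meta name="robots" content="noindex">
```

Crawl rate for option C can be adjusted in Google Webmaster Tools rather than in robots.txt.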