
Oh, good to know, thanks. Are you running relatively complete crawls of larger sites at similar rates to Googlebot?

I've definitely seen my share of robots.txt that give special permission to Googlebot, but maybe my corner of the web was unusually aggressive toward crawlers.
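For illustration, here is a minimal sketch of the kind of robots.txt I mean, checked with Python's standard `urllib.robotparser`. The file contents, the `ExampleBot` user agent, and the `example.com` URL are made up for the example and not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Googlebot may crawl everything,
# every other user agent is blocked from the whole site.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if the sample robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# "ExampleBot" is a placeholder name for any non-Google crawler.
googlebot_ok = allowed("Googlebot", "https://example.com/page")
other_ok = allowed("ExampleBot", "https://example.com/page")
print(googlebot_ok, other_ok)
```

With rules like these, a crawler that honors robots.txt and is not Googlebot cannot fetch anything on the site.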



No, we haven't run complete crawls of any sites; we've aimed for wider coverage rather than deeper crawls. This means we have most sites indexed and are now gradually going deeper. We do encounter robots.txt blocks, but we don't see them as a major issue right now.

We have an API, in case that's of interest.



