Russian Yandex Crawler

    Yandex Crawler

    The Yandex Russian crawler is trying to crawl me like this...

    https://vbulletin.org/forum/index.php/read/72116

    Vbulletin.org is showing the same effect as does my board.

    The full URL looks like this on vbulletin.com ...

    https://vbulletin.com/forum/index.ph...ve/archive/arc

    Code:
    https://vbulletin.com/forum/index.php/read/72116&expand/archive/chat/chat/chat/chat/chat/archive/archive/chat/chat/archive/chat/chat/archive/archive/archive/archive/archive/archive/archive/archive/archive/archive/archive/archive/archive/archive/archive/archive/archive/arc
    Any clue about what they are attempting to accomplish?

    Obviously vBulletin 5 has a solution to this.

    IPs
    141.8.43.172
    52.90.254.43
    100.43.90.161
    5.255.250.121





    Last edited by oldengine; Mon 23rd Apr '18, 10:38pm.

  • #2
    Originally posted by oldengine
    Yandex Crawler

    The Yandex Russian crawler is trying to crawl me like this...

    https://vbulletin.com/forum/index.php/read/72116

    Vbulletin.com is showing the same effect as does my board.

    The full URL looks like this...

    https://vbulletin.com/forum/index.ph...ve/archive/arc
    Code:

    https://vbulletin.com/forum/index.ph...ve/archive/arc
    Any clue about what they are attempting to accomplish?

    Any way to halt this behavior?

    IPs
    141.8.43.172
    52.90.254.43
    100.43.90.161
    5.255.250.121

    Isn't Yandex a Russian search engine? If so, they are probably just crawling your site to include it in their search listings. I'm not sure whether Yandex crawlers respect robots.txt. You may want to search online to see whether anyone has successfully written a robots.txt that tells Yandex not to crawl the site. If not, you'll have to ban by IP.

    • #3
      The links in what you quoted are all messed up. Try the links in my post again as vbulletin.org displays the problem the same way as on my site.

      • #4
        Originally posted by oldengine
        The links in what you quoted are all messed up. Try the links in my post again as vbulletin.org displays the problem the same way as on my site.

        If Yandex crawlers respect robots.txt, you can tell them to skip certain URL paths, for example:

        Disallow: /forum/index.php/read/72116&expand/archive/chat/*

        You'll need to put the appropriate URL path for your own board in your robots.txt.

        I really wouldn't worry about it if I were you. However, robots.txt can be used to disallow certain URL paths or directories, at least for crawlers that actually read it.
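
A rule like the one above can be checked offline before it goes live. The following is a minimal sketch using Python's standard `urllib.robotparser`; it only shows how a well-behaved crawler would interpret the file, not what Yandex actually does. The robots.txt text, the `example.com` host, the `build_parser` helper, and the `YandexBot` / `Googlebot` agent strings are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt. The Disallow prefix is modeled on the looping URL
# from the first post and would need adapting to your own board's paths.
ROBOTS_TXT = """\
User-agent: Yandex
Disallow: /forum/index.php/read/72116&expand/

User-agent: *
Disallow:
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text in memory, without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# example.com is a placeholder host, not a real board.
looping_url = "https://example.com/forum/index.php/read/72116&expand/archive/chat/chat/archive"
normal_url = "https://example.com/forum/index.php/read/72116"

# Would a crawler identifying as Yandex be allowed to fetch each URL?
yandex_looping = parser.can_fetch("YandexBot", looping_url)
yandex_normal = parser.can_fetch("YandexBot", normal_url)
# A crawler that only matches the catch-all group.
other_looping = parser.can_fetch("Googlebot", looping_url)

print(yandex_looping, yandex_normal, other_looping)
```

The sketch uses a plain path prefix rather than a trailing `/*`, because `urllib.robotparser` matches rules by prefix and, as far as I know, does not interpret `*` inside a path as a wildcard. Crawlers that do support wildcards may treat the two forms the same, but that is worth confirming against the crawler's own documentation.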
