Fast-Webcrawler??

  • Fast-Webcrawler??

    Does anyone know what Fast-Webcrawler is? It's showing up as a spider on my web logs, and in a day and a half has made (according to the logs) over 23,000 hits on my site -- could this be why my bandwidth is going through the roof?

    It's not something I want, obviously, so any tips on stopping spiders in their tracks?

  • #2
    Check out how to add a robots.txt to your root dir to stop bots.
    --filburt1, vBulletin.org/vBulletinTemplates.com moderator
    Web Design Forums.net: vB Board of the Month
    vBulletin Mail System (vBMS): webmail for your forum users
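One way to sanity-check a robots.txt before uploading it is to run it through Python's standard `urllib.robotparser`. This is only a minimal sketch: the rules below are an assumed example (not FAST's documented syntax), the `FAST-WebCrawler` token is taken from the user-agent string in the log entry posted further down the thread, and `example.com` and `is_allowed` are hypothetical names.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents: shut out FAST-WebCrawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: FAST-WebCrawler
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://example.com/forum/showthread.php"  # hypothetical URL
    print("crawler allowed:", is_allowed("FAST-WebCrawler/3.5", url))
    print("browser allowed:", is_allowed("Mozilla/5.0", url))
```

Keep in mind that robots.txt is advisory: it only helps against bots that choose to honour it.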

    • #3
      I've just added one, thanks. But I understand this webcrawler ignores robots.txt files? Or at least that's what I gather from reading some Usenet posts...

      Here's hoping it's blocked now. Cheers!

      • #4
        http://fast.no/support/

        That may help you, also, if the previous thing didn't.
        - Andrew Pfeifer

        • #5
          I have put in the robots.txt file as well as the NOINDEX meta tag, which has slowed it a little, but it has still made almost 10,000 hits since yesterday.

          • #6
            Hi,
            It may help to look up the netblock of the IP the bot comes from (e.g. at www.all-nettools.com or a similar tool) and contact that ISP.
            Good luck anyway,
            -Tom
            www.MCSEboard.de
            German Windows Server & IT Pro Community dedicated to Windows Client & Server Systems. MVPs inside

            • #7
              You can also try asking your host to block requests from that source. Better yet, add a .htaccess file that bans that IP.

              • #8
                Originally posted by filburt1
                You can also try asking your host to block requests from that source. Better yet, add a .htaccess file that bans that IP.
                What format would the .htaccess file take?

                • #9
                  Originally posted by MarkB


                  What format would the .htaccess file take?
                  Something like this:
                  Code:
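                  # Allow is evaluated first, then Deny; a matching Deny wins
                  <Limit GET>
                  order allow,deny
                  allow from all
                  # single address to block (placeholder, use the bot's real IP)
                  deny from 192.0.2.10
                  # partial address: blocks every IP starting with these octets (placeholder)
                  deny from 198.51.100
                  </Limit>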

                  • #10
                    Thanks. I did that, and access to my forums came up with a 500 error.

                    • #11
                      500 is Internal Server Error. You didn't deny yourself access, did you?

                      • #12
                        Originally posted by Ian

                        Code:
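                        # Allow is evaluated first, then Deny; a matching Deny wins
                        <Limit GET>
                        order allow,deny
                        allow from all
                        # single address to block (placeholder, use the bot's real IP)
                        deny from 192.0.2.10
                        # partial address: blocks every IP starting with these octets (placeholder)
                        deny from 198.51.100
                        </Limit>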
                        What IPs should you have listed to deny?

                        • #13
                          Originally posted by filburt1
                          500 is Internal Server Error. You didn't deny yourself access, did you?
                          hehehe - no

                          This is the log entry for FAST-WebCrawler:

                          Code:
                          66.77.73.69 - - [28/Apr/2002:20:23:56 -0400] "GET /forum/showthread.php?goto=lastpost&threadid=21745 HTTP/1.0" 302 0 "-" "FAST-WebCrawler/3.5 (atw-crawler at fast dot no; http://fast.no/support.php?c=faqs/crawler)"
                          So I put 66.77.73 in the htaccess file.

                          I also have robots.txt with the correct deny info as per their FAQ...
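To see how much of the traffic really comes from the crawler, the access log can be tallied per client and user agent. The following is a rough sketch that assumes the log uses the combined format shown in the entry above; `LOG_PATTERN`, `count_hits` and `SAMPLE` are hypothetical names introduced for illustration.

```python
import re
from collections import Counter

# Combined log format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def count_hits(lines):
    """Count requests per (client IP, user-agent product name)."""
    hits = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match is None:
            continue  # skip lines that are not in combined format
        product = match.group("agent").split("/")[0]
        hits[(match.group("ip"), product)] += 1
    return hits


# The log entry from the post above, used as sample input.
SAMPLE = (
    '66.77.73.69 - - [28/Apr/2002:20:23:56 -0400] '
    '"GET /forum/showthread.php?goto=lastpost&threadid=21745 HTTP/1.0" 302 0 "-" '
    '"FAST-WebCrawler/3.5 (atw-crawler at fast dot no; '
    'http://fast.no/support.php?c=faqs/crawler)"'
)

if __name__ == "__main__":
    for (ip, product), count in count_hits([SAMPLE]).most_common():
        print(ip, product, count)
```

Feeding it the lines of the real log file instead of `[SAMPLE]` would list the busiest client and agent pairs first.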

                          • #14
                            This is the htaccess file:

                            Code:
                            <Limit GET>
                            order allow,deny
                            deny from 66.77.73
                            allow from all
                            </Limit>
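A partial address such as `66.77.73` in a `deny from` line matches on the leading octets, so it covers the whole 66.77.73.0/24 range rather than one host. The sketch below models that matching with Python's `ipaddress` module so a list of prefixes can be checked before it goes live; it is an illustration of the idea, not Apache's own code, and `partial_ip_to_network` and `is_denied` are hypothetical helper names.

```python
import ipaddress


def partial_ip_to_network(partial: str) -> ipaddress.IPv4Network:
    """Expand a 1 to 4 octet dotted prefix such as '66.77.73' into a network."""
    octets = [part for part in partial.split(".") if part]
    if not 1 <= len(octets) <= 4:
        raise ValueError(f"expected 1 to 4 octets, got {partial!r}")
    # Pad the missing octets with zeros and use 8 bits per given octet.
    padded = octets + ["0"] * (4 - len(octets))
    return ipaddress.ip_network(f"{'.'.join(padded)}/{8 * len(octets)}")


def is_denied(client_ip: str, denied_prefixes) -> bool:
    """Return True if client_ip falls inside any of the denied prefixes."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in partial_ip_to_network(p) for p in denied_prefixes)


if __name__ == "__main__":
    denied = ["66.77.73"]  # the prefix used in the .htaccess above
    for ip in ("66.77.73.69", "66.77.74.1"):
        print(ip, "denied" if is_denied(ip, denied) else "allowed")
```

Blocking three octets shuts out 256 addresses at once, so it is worth confirming that no regular visitors share that range.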

                            • #15
                              Problem solved! I had a clashing <Directory> entry in my httpd.conf file. I fixed that, and the bot hasn't hit my site for the last 30 minutes (and it was seemingly there 24/7!).

                              Ahh... The Days Of Our Server continues
