Prevent Robots from using bandwidth


  • #31
    I've got my forums on a subdomain (http://forums.devbox.net), but when I FTP to my site, there's just a directory "forums". Do I put the robots.txt in my forums directory because it's like its own domain, or do I put it in my root directory and edit the paths?

    /////edit
    Doh! Never mind, it says right in the first post! Stupid me...
    Last edited by Chroder; Thu 16 Oct '03, 7:17pm.
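
For anyone with the same question: robots request /robots.txt from the root of each hostname, so a subdomain needs its own file. Here is a minimal sketch of where a robot would look for a given page. It assumes the "forums" directory is the document root of the subdomain, and the helper name `robots_url` is made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the robots.txt URL a crawler would request for this page's host."""
    parts = urlsplit(page_url)
    # Keep only scheme and host; the path is always /robots.txt at the root.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# The subdomain is treated as its own site.
print(robots_url("http://forums.devbox.net/index.php"))
# A forum in a subdirectory shares the robots.txt of the main host.
print(robots_url("http://www.exero.net/forum/index.php"))
```

If the subdomain maps to the "forums" directory, the file in that directory is the one served at http://forums.devbox.net/robots.txt, and its paths are written relative to the subdomain root.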



    • #32
      Is the text on the original post the same for vb3 Gamma or would I need to make some modifications? Thank you.
      Marc

      U.S. Politics Online



      • #33
        I used the file on the first page, and it kept Google from hitting the "permission denied" pages most of the time, but not always. I looked through the vb3 files and added a couple more entries, and I haven't seen Google get the "permission denied" page since.

        User-agent: *
        Disallow: /forums/attachment.php
        Disallow: /forums/avatar.php
        Disallow: /forums/editpost.php
        Disallow: /forums/member.php
        Disallow: /forums/member2.php
        Disallow: /forums/misc.php
        Disallow: /forums/moderator.php
        Disallow: /forums/newreply.php
        Disallow: /forums/newthread.php
        Disallow: /forums/online.php
        Disallow: /forums/poll.php
        Disallow: /forums/postings.php
        Disallow: /forums/printthread.php
        Disallow: /forums/private.php
        Disallow: /forums/private2.php
        Disallow: /forums/report.php
        Disallow: /forums/search.php
        Disallow: /forums/sendtofriend.php
        Disallow: /forums/threadrate.php
        Disallow: /forums/usercp.php
        Disallow: /forums/admin/
        Disallow: /forums/images/
        Disallow: /forums/mod/
        Disallow: /forums/sendmessage.php
        Disallow: /forums/register.php
        Disallow: /forums/subscription.php
        Last edited by marcjd; Thu 11 Dec '03, 5:21pm. Reason: adding subscription.php
        Marc

        U.S. Politics Online



        • #34
          Excellent list - will try this out.

          EDIT: admin and mod folders need renaming for vb3 - this should be good:

          Code:
          User-agent: *
          Disallow: /attachment.php
          Disallow: /avatar.php
          Disallow: /editpost.php
          Disallow: /member.php
          Disallow: /member2.php
          Disallow: /misc.php
          Disallow: /moderator.php
          Disallow: /newreply.php
          Disallow: /newthread.php
          Disallow: /online.php
          Disallow: /poll.php
          Disallow: /postings.php
          Disallow: /printthread.php
          Disallow: /private.php
          Disallow: /private2.php
          Disallow: /report.php
          Disallow: /search.php
          Disallow: /sendtofriend.php
          Disallow: /threadrate.php
          Disallow: /usercp.php
          Disallow: /admincp/
          Disallow: /modcp/
          Disallow: /images/
          Disallow: /sendmessage.php
          Disallow: /register.php
          Disallow: /subscription.php 

          Last edited by I, Brian; Thu 1 Jan '04, 2:19am.
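
The two lists above differ mainly in the path prefix: the first is written for a forum under /forums/, while the second appears to assume the forum is served from the site root. Here is a rough sketch of generating either variant from one list of paths. The function name `build_robots` and the shortened path list are made up for illustration.

```python
def build_robots(paths, prefix=""):
    """Build robots.txt text that disallows each path under an optional prefix."""
    cleaned = prefix.strip("/")
    base = f"/{cleaned}" if cleaned else ""
    lines = ["User-agent: *"]
    lines.extend(f"Disallow: {base}{path}" for path in paths)
    return "\n".join(lines) + "\n"


# A shortened list; extend it with the full set of scripts and folders.
PATHS = ["/search.php", "/usercp.php", "/admincp/", "/modcp/"]

print(build_robots(PATHS))            # forum at the site root
print(build_robots(PATHS, "forums"))  # forum under /forums/
```

Keeping the paths in one place makes it harder to forget the prefix on a single line when the forum moves.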



          • #35
            My forum is at www.exero.net/forum/
            so I use this:

            Code:
            User-agent: *
            Disallow: /
            Disallow: /forum/
            Disallow: /forum/admincp/
            Disallow: /forum/modcp/
            Disallow: /forum/images/
            Disallow: /forum/includes/
            Disallow: /forum/archive/
            Disallow: /forum/clientscript/
            Disallow: /forum/cpstyles/
            Disallow: /forum/customavatars/
            Disallow: /forum/install/
            Disallow: /forum/subscriptions/
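
One thing worth checking in the list above: `Disallow: /` matches every path on the site, so the later /forum/... lines add nothing and compliant robots are kept off the whole site. Here is a small sketch using Python's standard `urllib.robotparser` to compare the two cases. The rule sets are shortened, the thread URL is only an example, and real crawlers may interpret rules slightly differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def allowed(rules, url, agent="Googlebot"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


BLOCK_ALL = """\
User-agent: *
Disallow: /
Disallow: /forum/admincp/
"""

SELECTIVE = """\
User-agent: *
Disallow: /forum/admincp/
Disallow: /forum/modcp/
"""

thread = "http://www.exero.net/forum/showthread.php?t=1"
admin = "http://www.exero.net/forum/admincp/index.php"

print(allowed(BLOCK_ALL, thread))  # False: "Disallow: /" matches every path
print(allowed(SELECTIVE, thread))  # True: no rule matches the thread URL
print(allowed(SELECTIVE, admin))   # False: matches /forum/admincp/
```

If the goal is to keep threads in the search engines while saving bandwidth, dropping the `Disallow: /` and `Disallow: /forum/` lines would be the change to try.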



            • #36
              Originally posted by Scott MacVicar
              Robots spider your web pages and gather the documents. They index these so that when you type a word into a search engine, it matches your site if the word was found there.

              Spiders follow links on web pages to other pages, but they also follow images, which is sometimes a bad thing because it uses your bandwidth, especially if you have a big board and it's getting spidered once a day by many search engines.
              Why are so many of them visiting my forums at once? Right now I have about 44 members logged in and about 97 Inktomi spiders...!



              • #37
                Originally posted by Scott MacVicar
                Note: The majority of spiders check for these, but not all do.
                Sadly, it appears that Google is one that does not. I added those lines to my robots.txt and Google is still spidering things like memberlists, reply forms, etc. It has been for days now, usually 10 to 15 sessions at a time for a good portion of the day.



                • #38
                  All search engines fetch it just before they start a spidering session.

                  Google should behave next time it spiders at the end of the month.

                  "In order to save bandwidth Googlebot only downloads the robots.txt file once a day or whenever we have fetched many pages from the server. So, it may take a while for Googlebot to learn of any changes that might have been made to your robots.txt file. Also, Googlebot is distributed on several machines. Each of these keeps its own record of your robots.txt file. Finally, you may want to check that your syntax is correct against the standard at: http://www.robotstxt.org/wc/norobots.html. If there still seems to be a problem, please let us know, and we will correct it."
                  http://www.google.com/intl/en/webmasters/3.html
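
For the syntax check mentioned in the quote, here is a rough sanity check rather than a full validator. It assumes only the two fields used in this thread (`User-agent` and `Disallow`) and treats a blank line as the end of a record, as the original standard describes. The function name `check_robots` is made up, and extensions such as `Allow` are reported as unknown.

```python
def check_robots(text):
    """Return a list of human-readable problems found in robots.txt text."""
    problems = []
    in_record = False
    for number, raw in enumerate(text.splitlines(), start=1):
        if not raw.strip():
            in_record = False  # a blank line ends the current record
            continue
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue  # comment-only line
        field, sep, value = line.partition(":")
        field = field.strip().lower()
        value = value.strip()
        if not sep:
            problems.append(f"line {number}: missing ':'")
        elif field == "user-agent":
            in_record = True
        elif field == "disallow":
            if not in_record:
                problems.append(f"line {number}: Disallow before any User-agent")
            elif value and not value.startswith("/"):
                problems.append(f"line {number}: path should start with '/'")
        else:
            problems.append(f"line {number}: unknown field '{field}'")
    return problems


print(check_robots("User-agent: *\nDisallow: /search.php\n"))
print(check_robots("User-agent: *\nDisallow: search.php\n"))
```

An empty list only means these basic checks passed; it does not tell you how a particular search engine will read the file.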
                  B3
                  Scott MacVicar



                  • #39
                    updated robots.txt list

                    Hello:

                    Could someone tell me which of the robots.txt lists above to use, or show me a list for vB3.0RC4 that reduces bandwidth? I do want my forum spidered. The path to my forum's home page is (.com/index.php?)

                    I do have the archive set up & I do want this searched.

                    I'm only asking because some of these lists are old, and I don't know which one to use with vB3.0RC4 for the best bandwidth savings. Anybody?
                    Last edited by DJ5A; Wed 25 Feb '04, 8:06am.
                    Thank You for Your Time!!!!!
                    DJ5A



                    • #40
                      Hello Everybody:

                      If I want my forum (vB3.0RC4) searched by the Spiders & Save bandwidth, I should...

                      Disallow all files & folders except this folder...
                      /archive/

                      Is this correct?
                      Thank You for Your Time!!!!!
                      DJ5A



                      • #41
                        Could Scott be so nice to update the first post with a .txt file for version 2 and version 3 of vb ? - there are obviously some directory and file changes.



                        • #42
                          13 days & no Reply?

                          Hello Anybody Out Thar?

                          I'm very patient... 13 days & no Reply?

                          If I want my forum (vB3.0 RC4) searched by the Spiders & Save bandwidth, everything is disallowed except the Archive folder, Is this Correct?

                          User-agent: *
                          Disallow: /admincp/
                          Disallow: /cgi-bin/
                          Disallow: /clientscript/
                          Disallow: /cpstyles/
                          Disallow: /customavatars/
                          Disallow: /images/
                          Disallow: /includes/
                          Disallow: /install/
                          Disallow: /modcp/
                          Disallow: /subscriptions/

                          Disallow: /announcement.php
                          Disallow: /attachment.php
                          Disallow: /calendar.php
                          Disallow: /clear.gif
                          Disallow: /cron.php
                          Disallow: /editpost.php
                          Disallow: /external.php
                          Disallow: /faq.php
                          Disallow: /forumdisplay.php
                          Disallow: /global.php
                          Disallow: /image.php
                          Disallow: /joinrequest.php
                          Disallow: /login.php
                          Disallow: /member.php
                          Disallow: /memberlist.php
                          Disallow: /misc.php
                          Disallow: /moderator.php
                          Disallow: /newattachment.php
                          Disallow: /newreply.php
                          Disallow: /newthread.php
                          Disallow: /online.php
                          Disallow: /poll.php
                          Disallow: /postings.php
                          Disallow: /printthread.php
                          Disallow: /private.php
                          Disallow: /profile.php
                          Disallow: /register.php
                          Disallow: /report.php
                          Disallow: /reputation.php
                          Disallow: /search.php
                          Disallow: /sendmessage.php
                          Disallow: /showgroups.php
                          Disallow: /showpost.php
                          Disallow: /showthread.php
                          Disallow: /subscription.php
                          Disallow: /subscriptions.php
                          Disallow: /threadrate.php
                          Disallow: /usercp.php
                          Disallow: /usernote.php
                          Last edited by DJ5A; Sun 7 Mar '04, 2:30pm.
                          Thank You for Your Time!!!!!
                          DJ5A
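
One way to test the "everything except the archive" idea before uploading is to run a few sample URLs through Python's standard `urllib.robotparser`. This is only a sketch: the rules are a shortened version of the list above, www.example.com and the sample paths are placeholders, and real crawlers may be more lenient than this parser. In particular, this parser follows the original standard and treats a blank line as the end of a record, so rules placed after a blank line without a new `User-agent` line are ignored.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow: /admincp/
Disallow: /modcp/
Disallow: /forumdisplay.php
Disallow: /showthread.php
Disallow: /search.php
"""

# Same rules, but with a blank line splitting the folders from the scripts.
RULES_WITH_BLANK = RULES.replace(
    "Disallow: /modcp/\n", "Disallow: /modcp/\n\n"
)


def can_crawl(path, rules=RULES, agent="Googlebot"):
    """Return True if `agent` may fetch `path` on the placeholder host."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, "http://www.example.com" + path)


print(can_crawl("/archive/index.php"))     # True: nothing matches /archive/
print(can_crawl("/showthread.php?t=1"))    # False: blocked by prefix match
print(can_crawl("/showthread.php?t=1", RULES_WITH_BLANK))  # True: rule ignored
```

Under this reading the approach works as long as all the rules sit in one block under `User-agent: *`. Note that anything not listed, such as /index.php, stays crawlable.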



                          • #43
                            People may not have responded because the correct code is already on this page. Look at I, Brian's...which corrected the one I did for the modcp and admincp directories.

                            On a side note, it was working fine for me for quite a while then Google started going to no permission pages again. Maybe they are mad at me...lol.
                            Marc

                            U.S. Politics Online



                            • #44
                              Originally posted by Floris
                              Could Scott be so nice to update the first post with a .txt file for version 2 and version 3 of vb ? - there are obviously some directory and file changes.
                              Yes, I would like to see an updated version as well. Maybe this can be put somewhere as a sticky or an official 'How do I'?

                              I am looking for an updated VB3 robots.txt that really blocks as much as possible.
                              How much do you love XenForo?



                              • #45
                                Originally posted by Grover
                                Yes, I would like to see an updated version as well. Maybe this can be put somewhere as a sticky or an official 'How do I'?

                                I am looking for an updated VB3 robots.txt that really blocks as much as possible.
                                I'll proudly join this bandwagon.

                                I, too, am ever-so-patiently waiting for a list. (Why? ... You may ask. Because I am a moron and couldn't compile one on my own if my life depended on it. )

                                Thank You.
                                Mechanical Mind.
