Large increase in the number of indexed pages

  • [Forum] Large increase in the number of indexed pages

    I have run into a bit of a problem with my forum. I recently checked my Google Webmaster Tools and found that the number of indexed pages increased from 28,000 to 567,000 within 3 months. I traced the problem to the forums that I installed on the site, but that is as far as I got. My forum URL is: http://smallbiztrends.com/members/.

  • #2
    I suggest removing the Archive, it's not needed.


    • #3
      That is a good suggestion. I plan on getting rid of it. It could cause some really bad duplicate content issues and direct search traffic to a place where the user cannot respond to the post without clicking another link.

      I did a little digging and found that the members section is my problem. If you do a site search and drill down to a specific user, they have a very large number of search results. Example: https://www.google.com/search?q=site...350-Garnetjo67. The bad part is that some of these pages are blank. In the example I provided, this user has no friends, no photos, and no activity but he has 143+ pages of blank activity pages indexed and 161+ pages of blank friends activity.

      My solution for all of this is going to have to be blocking the search engines from these two sections until vBulletin handles them a little better.
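
      To see which profiles generate the most indexed pages, one rough approach is to export the indexed URLs and count them per member. The sketch below is only an illustration: the host forum.example, the sample URLs, their query strings, and the /members/<id>-<name> pattern are assumptions, not taken from the actual site.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumed profile URL shape: /members/<numeric id>-<username>
MEMBER_SLUG = re.compile(r"/members/(\d+-[^/]+)")

# Hypothetical sample of indexed URLs (for illustration only).
SAMPLE_URLS = [
    "http://forum.example/members/350-Garnetjo67",
    "http://forum.example/members/350-Garnetjo67?tab=activitystream&page=2",
    "http://forum.example/members/350-Garnetjo67?tab=activitystream&page=3",
    "http://forum.example/members/12-SomeoneElse",
    "http://forum.example/showthread.php?t=99",
]


def pages_per_member(urls):
    """Count how many distinct indexed URLs belong to each member profile."""
    counts = Counter()
    for url in set(urls):  # ignore exact duplicates
        match = MEMBER_SLUG.search(urlsplit(url).path)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Print the members with the most indexed pages first.
    for member, count in pages_per_member(SAMPLE_URLS).most_common():
        print(member, count)
```

      Members with a high count but no real activity are the ones whose empty pages are inflating the index.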


      • #4
        Nice detective work lelandmcfarland.

        On my forum, I don't have features like friends, photos and activity. I suggest turning off these functions. Another thing I do is use the Standard URLs. I think Google has an easier time with them. Also, with Standard URLs you never have the problem of a URL still showing the old thread title after a title change.


        • #5
          This isn't a vBulletin issue. Those are features that can be disabled or, if you want to keep the features, simply create an appropriate robots.txt file. (You can also make the members list invisible to guests.)

          Code:
          User-agent: *
          Disallow: /archive/
          Disallow: /calendar.php
          Disallow: /cron.php
          Disallow: /editpost.php
          Disallow: /joinrequests.php
          Disallow: /login.php
          Disallow: /member.php
          Disallow: /misc.php
          Disallow: /moderator.php
          Disallow: /newreply.php
          Disallow: /newthread.php
          Disallow: /online.php
          Disallow: /printthread.php
          Disallow: /private.php
          Disallow: /profile.php
          Disallow: /register.php
          Disallow: /search.php
          Disallow: /sendmessage.php
          Disallow: /showgroups.php
          Disallow: /subscription.php
          Disallow: /subscriptions.php
          Disallow: /threadrate.php
          Disallow: /usercp.php
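
          Before deploying rules like these, it can help to check a few URLs against them. The following is a minimal sketch using Python's standard urllib.robotparser; the host forum.example and the test URLs are made up for illustration, and only a few of the rules above are copied in.

```python
from urllib.robotparser import RobotFileParser

# A few rules copied from the list above; the full list works the same way.
RULES = """\
User-agent: *
Disallow: /archive/
Disallow: /member.php
Disallow: /search.php
"""


def is_crawlable(url, rules=RULES, agent="Googlebot"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical URLs on a forum installed at the site root.
    for url in (
        "http://forum.example/member.php?u=350",
        "http://forum.example/archive/index.php",
        "http://forum.example/showthread.php?t=1",
        "http://forum.example/members/350-Garnetjo67",
    ):
        print(url, is_crawlable(url))
```

          Rules are matched as prefixes of the path from the site root, so a forum installed in a subdirectory needs that directory in each rule, and a rule for /member.php does not cover profile URLs under /members/.
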
          Last edited by djbaxter; Wed 29th Aug '12, 12:27pm.
          Psychlinks Web Services Affordable Web Design & Site Management
          Specializing in Small Businesses and vBulletin/Xenforo Forums


          • #6
            Don't list folders that won't normally be crawled in your robots.txt.

            There are no public-facing links to the /admincp/, /modcp/, or /backup/ folders in the software as it is. Don't give hackers direct knowledge of these locations.


            • #7
              Don't list folders that won't normally be crawled in your robots.txt.

              There are no public-facing links to the /admincp/, /modcp/, or /backup/ folders in the software as it is. Don't give hackers direct knowledge of these locations.


              • #8
                Good point. Twice.


                • #9
                  @djbaxter thank you for the robots.txt list. It is a great help.


                  • #10
                    See modified version per comments by Zachery above.


                    • #11
                      Originally posted by djbaxter:
                      This isn't a vBulletin issue. Those are features that can be disabled or, if you want to keep the features, simply create an appropriate robots.txt file. (You can also make the members list invisible to guests.)

                      Code:
                      User-agent: *
                      Disallow: /archive/
                      Disallow: /calendar.php
                      Disallow: /cron.php
                      Disallow: /editpost.php
                      Disallow: /joinrequests.php
                      Disallow: /login.php
                      Disallow: /member.php
                      Disallow: /misc.php
                      Disallow: /moderator.php
                      Disallow: /newreply.php
                      Disallow: /newthread.php
                      Disallow: /online.php
                      Disallow: /printthread.php
                      Disallow: /private.php
                      Disallow: /profile.php
                      Disallow: /register.php
                      Disallow: /search.php
                      Disallow: /sendmessage.php
                      Disallow: /showgroups.php
                      Disallow: /subscription.php
                      Disallow: /subscriptions.php
                      Disallow: /threadrate.php
                      Disallow: /usercp.php
                      Almost all of these would not be seen by a spider, so I don't understand why you would have them in your robots.txt file.


                      • #12
                        @Andy

                        From Wayne Luke:

                        https://www.vbulletin.com/forum/show...=1#post2199309

                        See also:

                        https://www.vbulletin.com/forum/show...ices-in-VB-4-0
                        https://www.vbulletin.com/forum/show...n-4-robots-txt

                        - - - Updated - - -

                        Wayne's version is probably an improvement:

                        Code:
                        User-agent: *
                        Disallow: *.js
                        Disallow: /clientscript/
                        Disallow: /cpstyles/
                        Disallow: /customavatars/
                        Disallow: /customprofilepics/
                        Disallow: /images/
                        Disallow: /ajax.php
                        Disallow: /attachment.php
                        Disallow: /calendar.php
                        Disallow: /cron.php
                        Disallow: /editpost.php
                        Disallow: /global.php
                        Disallow: /image.php
                        Disallow: /inlinemod.php
                        Disallow: /joinrequests.php
                        Disallow: /login.php
                        Disallow: /member.php
                        Disallow: /memberlist.php
                        Disallow: /misc.php
                        Disallow: /moderator.php
                        Disallow: /newattachment.php
                        Disallow: /newreply.php
                        Disallow: /newthread.php
                        Disallow: /online.php
                        Disallow: /poll.php
                        Disallow: /post.php
                        Disallow: /postings.php
                        Disallow: /printthread.php
                        Disallow: /private.php
                        Disallow: /profile.php
                        Disallow: /register.php
                        Disallow: /report.php
                        Disallow: /reputation.php
                        Disallow: /search.php
                        Disallow: /sendmessage.php
                        Disallow: /showgroups.php
                        Disallow: /showpost.php
                        Disallow: /subscription.php
                        Disallow: /threadrate.php
                        Disallow: /usercp.php
                        Disallow: /usernote.php
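
                        The first rule in this list uses a wildcard, which not every crawler or parser interprets the same way. As a rough sketch, assuming the wildcard semantics that major search engines document ("*" matches any run of characters, a trailing "$" anchors the end of the URL, and everything else is a prefix match), a matcher could look like this. The function name rule_matches and the test paths are illustrative only.

```python
import re


def rule_matches(pattern, path):
    """Check a robots.txt path pattern against a URL path.

    Assumed semantics: '*' matches any sequence of characters, a trailing
    '$' anchors the match at the end of the path, and otherwise the
    pattern only has to match a prefix of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with '.*' where '*' appeared.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += r"\Z"
    return re.match(regex, path) is not None


if __name__ == "__main__":
    print(rule_matches("*.js", "/clientscript/vbulletin.js"))
    print(rule_matches("/member.php", "/member.php?u=350"))
    print(rule_matches("/member.php", "/members/350-Garnetjo67"))
```

                        A parser that only does plain prefix matching would treat the "*" literally, so wildcard rules are worth testing against the crawlers you care about.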


                        • #13
                          Hi djbaxter,

                          I suppose one should see just how many of those disallows are being indexed.

                          One can enter the following into Google to see, here's an example:

                          site:vbulletin.com/forum/subscription.php

                          or

                          site:vbulletin.com/forum/member.php

                          Some things you would want Google to index, for example the member.php pages. On my forum I have about 7,000 members and Google has indexed about 10,000 member.php pages. I'm not sure where the extra 3,000 results come from.
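
                          To run that check for many scripts at once, the site: queries can be generated rather than typed. This is just a sketch: the helper name site_query_url and the short script list are illustrative, and the forum location is the one from the examples above.

```python
from urllib.parse import urlencode

FORUM = "vbulletin.com/forum"

# A few scripts taken from the disallow lists earlier in the thread.
SCRIPTS = ["member.php", "subscription.php", "search.php"]


def site_query_url(host_and_path):
    """Build a Google search URL for a `site:` query on the given location."""
    return "https://www.google.com/search?" + urlencode(
        {"q": f"site:{host_and_path}"}
    )


if __name__ == "__main__":
    # Open each printed URL in a browser to see what is indexed.
    for script in SCRIPTS:
        print(site_query_url(f"{FORUM}/{script}"))
```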


                          • #14
                            Why would anyone want the member.php pages indexed? What value is that adding to search or to your forum?

                            As for your search suggestion, that won't work - it isn't member.php itself that is indexed. A spider would follow member.php to index pages returned by that executable. Using your own example, there are not 7000 or 10000 copies of member.php on your vBulletin site or any other vBulletin site.


                            • #15
                              Originally posted by djbaxter:
                              Why would anyone want the member.php pages indexed? What value is that adding to search or to your forum?
                              The member.php pages are those like this one:

                              https://www.vbulletin.com/forum/memb...38842-djbaxter

                              Why wouldn't you want Google indexing your members? It's important information. For example, if I wanted to see whether djbaxter is on any other forums or is a spammer, I could get that information if forums allowed member.php to be indexed by search engines.

                              Originally posted by djbaxter:
                              As for your search suggestion, that won't work - it isn't member.php itself that is indexed. A spider would follow member.php to index pages returned by that executable. Using your own example, there are not 7000 or 10000 copies of member.php on your vBulletin site or any other vBulletin site.
                              Each member is indexed separately. So if you have 7,000 members, Google should in theory index 7,000 member.php pages, one per member.
