Do you have a robots.txt file on your VB3 forum?


  • Kwak
    replied
    Originally posted by merk
    Which has to be

    www.domain.com/robots.txt - that's the only place it will be read from
    Thanks for the help.



  • merk
    replied
    Originally posted by Surrix
    No, upload it to your main site directory.
    Which has to be

    www.domain.com/robots.txt - that's the only place it will be read from
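
    (A hedged aside: paths in a robots.txt are resolved from the domain root, so if the forum lives in a subdirectory such as www.domain.com/forum/, the file still goes in the root and each rule needs the prefix. A minimal sketch:)

    Code:
    # robots.txt is still fetched from www.domain.com/robots.txt,
    # but each path carries the forum's subdirectory prefix:
    User-agent: *
    Disallow: /forum/admincp/
    Disallow: /forum/search.php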



  • merk
    replied
    Originally posted by Shining Arcanine
    That is a pretty extensive list; are you sure that all of those spiders are ones that someone wouldn't want?

    By the way, some spiders, such as Blue Yonder, ignore the robots.txt file (maybe I should block them with .htaccess).
    Those robots came from the sticky in this forum.

    And yes, some spiders will ignore robots.txt; there is a mod_rewrite solution in the sticky to solve that problem too.
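
    (Editorial sketch, not reproduced from the sticky: one common mod_rewrite approach is to match the offending user agents and refuse the request with 403 Forbidden. It assumes Apache with mod_rewrite enabled and goes in the forum's .htaccess; the agent names here are just examples.)

    Code:
    # Hedged sketch - assumes Apache with mod_rewrite enabled.
    # The user-agent patterns are illustrative; extend to taste.
    RewriteEngine On
    RewriteCond %{HTTP_USER_AGENT} (EmailCollector|EmailSiphon|WebStripper|WebZip) [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} (SiteSnagger|WebCopier|Teleport) [NC]
    RewriteRule .* - [F]

    Unlike robots.txt, which only asks a bot to stay away, this refuses the request outright.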



  • Surrix
    replied
    No, upload it to your main site directory.



  • Kwak
    replied
    So I create a robots.txt with the exact content merk posted and upload it to the forum index directory?



  • Shining Arcanine
    replied
    That is a pretty extensive list; are you sure that all of those spiders are ones that someone wouldn't want?

    By the way, some spiders, such as Blue Yonder, ignore the robots.txt file (maybe I should block them with .htaccess).



  • merk
    replied
    Code:
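    #BAD BOTS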
    User-agent: TurnitinBot
    Disallow: /
    User-agent: Googlebot-Image
    Disallow: /
    User-agent: Black Hole
    Disallow: /
    User-agent: Titan
    Disallow: /
    User-agent: WebStripper
    Disallow: /
    User-agent: NetMechanic
    Disallow: /
    User-agent: CherryPicker
    Disallow: /
    User-agent: EmailCollector
    Disallow: /
    User-agent: EmailSiphon
    Disallow: /
    User-agent: WebBandit
    Disallow: /
    User-agent: EmailWolf
    Disallow: /
    User-agent: ExtractorPro
    Disallow: /
    User-agent: CopyRightCheck
    Disallow: /
    User-agent: Crescent
    Disallow: /
    User-agent: Wget
    Disallow: /
    User-agent: SiteSnagger
    Disallow: /
    User-agent: ProWebWalker
    Disallow: /
    User-agent: CheeseBot
    Disallow: /
    User-agent: ia_archiver
    Disallow: /
    User-agent: ia_archiver/1.6
    Disallow: /
    User-agent: Alexibot
    Disallow: /
    User-agent: Teleport
    Disallow: /
    User-agent: TeleportPro
    Disallow: /
    User-agent: MIIxpc
    Disallow: /
    User-agent: Telesoft
    Disallow: /
    User-agent: Website Quester
    Disallow: /
    User-agent: WebZip
    Disallow: /
    User-agent: moget/2.1
    Disallow: /
    User-agent: WebZip/4.0
    Disallow: /
    User-agent: WebSauger
    Disallow: /
    User-agent: WebCopier
    Disallow: /
    User-agent: NetAnts
    Disallow: /
    User-agent: Mister PiX
    Disallow: /
    User-agent: WebAuto
    Disallow: /
    User-agent: TheNomad
    Disallow: /
    User-agent: WWW-Collector-E
    Disallow: /
    User-agent: RMA
    Disallow: /
    User-agent: libWeb/clsHTTP
    Disallow: /
    User-agent: asterias
    Disallow: /
    User-agent: httplib
    Disallow: /
    User-agent: turingos
    Disallow: /
    User-agent: spanner
    Disallow: /
    User-agent: InfoNaviRobot
    Disallow: /
    User-agent: Harvest/1.5
    Disallow: /
    User-agent: Bullseye/1.0
    Disallow: /
    User-agent: Crescent Internet ToolPak HTTP OLE Control v.1.0
    Disallow: /
    User-agent: CherryPickerSE/1.0
    Disallow: /
    User-agent: CherryPickerElite/1.0
    Disallow: /
    User-agent: WebBandit/3.50
    Disallow: /
    User-agent: NICErsPRO
    Disallow: /
    User-agent: Microsoft URL Control - 5.01.4511
    Disallow: /
    User-agent: DittoSpyder
    Disallow: /
    User-agent: Foobot
    Disallow: /
    User-agent: WebmasterWorldForumBot
    Disallow: /
    User-agent: SpankBot
    Disallow: /
    User-agent: BotALot
    Disallow: /
    User-agent: lwp-trivial/1.34
    Disallow: /
    User-agent: lwp-trivial
    Disallow: /
    User-agent: Wget/1.6
    Disallow: /
    User-agent: BunnySlippers
    Disallow: /
    User-agent: Microsoft URL Control - 6.00.8169
    Disallow: /
    User-agent: URLy Warning
    Disallow: /
    User-agent: Wget/1.5.3
    Disallow: /
    User-agent: LinkWalker
    Disallow: /
    User-agent: cosmos
    Disallow: /
    User-agent: moget
    Disallow: /
    User-agent: hloader
    Disallow: /
    User-agent: humanlinks
    Disallow: /
    User-agent: LinkextractorPro
    Disallow: /
    User-agent: Offline Explorer
    Disallow: /
    User-agent: Mata Hari
    Disallow: /
    User-agent: LexiBot
    Disallow: /
    User-agent: Web Image Collector
    Disallow: /
    User-agent: The Intraformant
    Disallow: /
    User-agent: True_Robot/1.0
    Disallow: /
    User-agent: True_Robot
    Disallow: /
    User-agent: BlowFish/1.0
    Disallow: /
    User-agent: JennyBot
    Disallow: /
    User-agent: MIIxpc/4.2
    Disallow: /
    User-agent: BuiltBotTough
    Disallow: /
    User-agent: ProPowerBot/2.14
    Disallow: /
    User-agent: BackDoorBot/1.0
    Disallow: /
    User-agent: toCrawl/UrlDispatcher
    Disallow: /
    User-agent: WebEnhancer
    Disallow: /
    User-agent: TightTwatBot
    Disallow: /
    User-agent: suzuran
    Disallow: /
    User-agent: VCI WebViewer VCI WebViewer Win32
    Disallow: /
    User-agent: VCI
    Disallow: /
    User-agent: Szukacz/1.4
    Disallow: /
    User-agent: QueryN Metasearch
    Disallow: /
    User-agent: Openfind data gathere
    Disallow: /
    User-agent: Openfind
    Disallow: /
    User-agent: Xenu's Link Sleuth 1.1c
    Disallow: /
    User-agent: Xenu's
    Disallow: /
    User-agent: Zeus
    Disallow: /
    User-agent: RepoMonkey Bait & Tackle/v1.01
    Disallow: /
    User-agent: RepoMonkey
    Disallow: /
    User-agent: Zeus 32297 Webster Pro V2.9 Win32
    Disallow: /
    User-agent: Webster Pro
    Disallow: /
    User-agent: EroCrawler
    Disallow: /
    User-agent: LinkScan/8.1a Unix
    Disallow: /
    User-agent: Keyword Density/0.9
    Disallow: /
    User-agent: Kenjin Spider
    Disallow: /
    User-agent: Cegbfeieh
    Disallow: /
    
    #ALL BOTS
    User-agent: *
    Disallow: /admincp/
    Disallow: /attachments/
    Disallow: /clientscript/
    Disallow: /cpstyles/
    Disallow: /images/
    Disallow: /modcp/
    Disallow: /subscriptions/
    Disallow: /tmp/
    Disallow: /attachment.php
    Disallow: /avatar.php
    Disallow: /cron.php
    Disallow: /editpost.php
    Disallow: /image.php
    Disallow: /joinrequests.php
    Disallow: /login.php
    Disallow: /member.php
    Disallow: /memberlist.php
    Disallow: /misc.php
    Disallow: /moderator.php
    Disallow: /newattachment.php
    Disallow: /newreply.php
    Disallow: /newthread.php
    Disallow: /online.php
    Disallow: /poll.php
    Disallow: /postings.php
    Disallow: /printthread.php
    Disallow: /private.php
    Disallow: /register.php
    Disallow: /report.php
    Disallow: /search.php
    Disallow: /sendmessage.php
    Disallow: /showgroups.php
    Disallow: /subscription.php
    Disallow: /subscriptions.php
    Disallow: /usercp.php
    Disallow: /threadrate.php
    Disallow: /usernote.php



  • Surrix
    replied
    So could someone who has all of the files except the archive post how they have their robots.txt set up?



  • georgec
    replied
    Originally posted by Icheb
    That thread you mentioned isn't about robots.txt confusing spiders, but about disallowing access to save some traffic.
    Yep, though in that thread the poster mentioned that Google stopped hitting even pages she hadn't disallowed in robots.txt after adding the file. Of course, it could have just been a coincidence that Google was slow to index her pages during that period.



  • Icheb
    replied
    That thread you mentioned isn't about robots.txt confusing spiders, but about disallowing access to save some traffic.



  • Shining Arcanine
    replied
    I have one, and robots aren't confused by it (why would search engine companies let their spiders be confused when they want to index pages?).



  • georgec
    started a topic Do you have a robots.txt file on your VB3 forum?

    Just wondering, how many of you use a robots.txt on your VB3 forum to prevent search engines from spidering non-essential or irrelevant pages? I've shied away from robots.txt in the past out of fear that it may confuse Google or other search engines altogether: http://www.vbulletin.com/forum/showthread.php?t=94521
