Welcome to the JaguarPC Community
JaguarPC

This is a discussion on Robots txt file question in the Shared & Semi-Dedicated forum

  1. #1
    JPC Senior Member
    Join Date
    Dec 2001
    Posts
    83

    Robots txt file question

    I have a question about the robots txt file and multihosted domains.
    If I have a multihosted domain at http://abcd.mymaindomain.com and the servers for www.abcd.com point to http://abcd.mymaindomain.com, how do I use the robots txt file to disallow indexing of that subdomain/directory?

    I know that the following would disallow indexing from the front door of my main website, i.e. www.mymaindomain.com:

    User-agent: *
    Disallow: /abcd/

    thus stopping the robots from visiting the multihosted domain via www.mymaindomain.com/abcd/

    But what if a robot entered the multihosted subdomain/directory from www.abcd.com?

    Should I put a disallow robots txt such as:

    User-agent: *
    Disallow: /

    into the directory www.mymaindomain.com/abcd/ and treat it as if it were a regular non-multihosted domain?

    Do robots read robots txt files in subdirectories or just the main domain directory?

    Hope I made this question understandable.

    Thanks for any advice.

  2. #2
    Loyal Client
    Join Date
    Feb 2003
    Location
    Mission BC, Canada
    Posts
    37
    Before you waste a lot of your time trying to get this to work, you should know that most search engine robots ignore things like the robots txt file and the noindex metatag in an HTML file. I set up a site last year that used all the various techniques I could find to tell robots to not index a file or directory, but a month after I put the site up, EVERY single page in EVERY directory was listed in Google, Yahoo, AltaVista, and ODP.

  3. #3
    O_o CeleronXL
    Join Date
    Dec 2001
    Posts
    585
    Originally posted by Eric Pauker
    Before you waste a lot of your time trying to get this to work, you should know that most search engine robots ignore things like the robots txt file and the noindex metatag in an HTML file. I set up a site last year that used all the various techniques I could find to tell robots to not index a file or directory, but a month after I put the site up, EVERY single page in EVERY directory was listed in Google, Yahoo, AltaVista, and ODP.
    Then you didn't do it correctly.

    Any "official" crawler will abide by robots.txt files.
    "Before you criticize someone, walk a mile in their shoes. That way, when you criticize them, you're a mile away and you have their shoes."
    My Site: StarCraft Sector | My vB Forums: Forum Sector
    E-Mail: celeronxl@cox.net | AIM: CeleronXL | ICQ: 118648739 | MSNM: celeronxl@hotmail.com | YIM: celeronxl

  4. #4
    JPC Addict
    Join Date
    Oct 2002
    Posts
    148
    Originally posted by Eric Pauker
    I set up a site last year that used all the various techniques I could find to tell robots to not index a file or directory, but a month after I put the site up, EVERY single page in EVERY directory was listed in Google, Yahoo, AltaVista, and ODP.
    And to think, people spend 1000's of $$ to get listed.

  5. #5
    Community Leader jason
    Join Date
    Sep 2001
    Location
    Rochester, NY
    Posts
    6,003
    Crawlers look for a robots.txt file in the root of a site, so placing one in the subdirectory that serves as the root of the multihosted domain and/or subdomain should prevent that site from being listed.

    --Jason
    Jason Pitoniak
    Interbrite Communications
    www.interbrite.com www.kodiakskorner.com
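
To illustrate the point about the site root, here is a minimal Python sketch of how a crawler might work out which robots.txt to request for a given page. The helper name `robots_url` and the sample page paths are hypothetical, chosen only for illustration; the host names are the placeholders used in this thread.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL a crawler would request for page_url.

    Hypothetical helper: it keeps only the scheme and host of the page
    and always uses the fixed path /robots.txt at the root of that host.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # Placeholder URLs taken from the thread; the page names are made up.
    for url in (
        "http://www.mymaindomain.com/abcd/index.html",
        "http://abcd.mymaindomain.com/index.html",
        "http://www.abcd.com/index.html",
    ):
        print(url, "->", robots_url(url))
```

Under this sketch, a page reached as www.mymaindomain.com/abcd/... is governed by the robots.txt at the root of www.mymaindomain.com, while the same page reached as www.abcd.com/... is governed by whatever file is served at the root of www.abcd.com, which on a multihosted setup like the one described would be the file placed in the /abcd/ directory.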

  6. #6
    JPC Senior Member
    Join Date
    Dec 2001
    Posts
    83
    Thanks everyone. I'll put a robots file in both the main domain and the multihosted subdirectory.
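
A rough way to sanity-check the two rule sets from the first post before uploading them is Python's standard `urllib.robotparser`. This is only a sketch: the `load_rules` helper, the `ExampleBot` user agent, and the page names are assumptions for illustration, and it shows how a compliant parser reads the rules, not what any particular search engine will do.

```python
from urllib.robotparser import RobotFileParser


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


# File at the root of the main domain: blocks only the /abcd/ directory.
main_rules = load_rules("User-agent: *\nDisallow: /abcd/\n")

# File inside the multihosted directory, which is the root of www.abcd.com:
# blocks everything on that host.
abcd_rules = load_rules("User-agent: *\nDisallow: /\n")

if __name__ == "__main__":
    checks = [
        (main_rules, "http://www.mymaindomain.com/abcd/page.html"),
        (main_rules, "http://www.mymaindomain.com/index.html"),
        (abcd_rules, "http://www.abcd.com/page.html"),
    ]
    for rules, url in checks:
        allowed = rules.can_fetch("ExampleBot", url)
        print(url, "allowed" if allowed else "blocked")
```

Each parser only knows about the one file it was given, which mirrors the per-host lookup: the main-domain file covers visits through www.mymaindomain.com/abcd/, and the file in the subdirectory covers visits through www.abcd.com.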
