Blocking SE Spiders

admin | Published: Feb 11, 2007 | Updated: Jun 16, 2023

I think there are certain pages, files, and folders that every webmaster should block from search engine (SE) spiders.

Dan Crow, Product Manager at Google, just posted this on the official Google Blog:

I’m often asked about how Google and search engines work. One key question is: how does Google know what parts of a website the site owner wants to have show up in search results? Can publishers specify that some parts of the site should be private and non-searchable? The good news is that those who publish on the web have a lot of control over which pages should appear in search results.

The key is a simple file called robots.txt that has been an industry standard for many years.

Dan gives some examples of how to control spiders using robots.txt or the robots meta tag.

Why would you want to limit the SEs? Here is one example: I have a folder that contains confidential customer information, and I do not want the SEs to crawl and publish that information on the Web.
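
To make the folder example concrete, here is a minimal sketch of a robots.txt rule and a quick way to check it with Python's standard urllib.robotparser module. The folder name /customer-data/ and the example.com URLs are assumptions made up for illustration; substitute your own paths.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. "/customer-data/" is an assumed folder
# name used only for illustration, not a path from any real site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /customer-data/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A page inside the blocked folder, and a normal public page.
blocked = is_allowed(
    ROBOTS_TXT, "Googlebot", "https://www.example.com/customer-data/list.html"
)
allowed = is_allowed(ROBOTS_TXT, "Googlebot", "https://www.example.com/index.html")
print(blocked, allowed)
```

Running a check like this before uploading the file is a cheap way to confirm that a rule blocks what you expect and nothing more.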

Most sites need to block their https (secure) pages, something that Jag is not doing at this time. Do not let the spiders crawl your https pages unless your entire site is https.
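
One common way to do this, which the post does not spell out, is to serve a different robots.txt depending on whether the request came in over http or https. The sketch below only shows that decision; the function name robots_txt_for and its whole_site_is_https flag are hypothetical, and how you wire it into your web server is left open.

```python
def robots_txt_for(scheme: str, whole_site_is_https: bool = False) -> str:
    """Return the robots.txt body to serve for a request made over `scheme`.

    Assumption: the server can tell http from https requests and can hand
    back a different robots.txt body for each.
    """
    if scheme.lower() == "https" and not whole_site_is_https:
        # Block every spider from the secure copy of the site.
        return "User-agent: *\nDisallow: /\n"
    # An empty Disallow value means nothing is blocked.
    return "User-agent: *\nDisallow:\n"


print(robots_txt_for("https"))
print(robots_txt_for("http"))
```

If the entire site is served over https, pass whole_site_is_https=True so the spiders are not locked out altogether.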

All the major SEs will obey the robots.txt file or the on-page robots meta tag in the head section of the document.
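
For the meta tag route, the sketch below shows what the tag looks like in a page and how you might confirm it is present using Python's standard html.parser module. The sample page and the RobotsMetaParser class are made up for illustration; noindex and nofollow are the usual directive values.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from any <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma separated, e.g. "noindex, nofollow".
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives found in html."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page that asks spiders not to index it or follow its links.
PAGE = """<html><head>
<title>Order history</title>
<meta name="robots" content="noindex, nofollow">
</head><body>Private page</body></html>"""

directives = robots_directives(PAGE)
print(sorted(directives))
```

The meta tag is handy when you can edit individual pages but not the robots.txt file at the site root.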
