Robots.txt

Robots.txt is a file that webmasters create to tell search engines how to treat their websites. When a search engine crawler reaches your website, it checks for this file and follows the instructions in it.

Why is this important?

If you do not want search engines to crawl your website, or a few pages on it, you state that in this file.

If you do not want search engines to crawl your website at all, use the following code in your robots.txt file.

User-agent: *
Disallow: /
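
To see what this rule does in practice, here is a minimal sketch that uses the robots.txt parser from Python's standard library. The bot name ExampleBot and the address www.example.com are placeholders made up for illustration, not part of the original example.

```python
from urllib.robotparser import RobotFileParser

# The "block everything" rules from above, as they would appear in the file.
rules = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# "ExampleBot" and the URL are hypothetical; any crawler name matches "*".
allowed = parser.can_fetch("ExampleBot", "https://www.example.com/about.php")
print(allowed)  # False: every path starts with "/", so nothing may be crawled
```

Note that this parser treats a blank line as the end of a group of rules, so it is safest to keep the User-agent line and its Disallow lines together without empty lines between them.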

Now, one may ask why we would not want search engines to crawl our website. This can happen in various situations. Suppose you have something that is password protected, so that only a person with access to that particular page can reach it. In that case you need to tell search engines not to crawl that website, directory, or page.

Case Study

Normally, when a website is small, you want search engines to crawl all the information on it. Even so, there could be at least one page that you do not want search engines to crawl.

In our example, we have a very small website and we want search engines to crawl all of it, except for one page called contact.php, which is just a simple email form page. When anyone submits the contact form, contact.php processes that information and emails it to us.

There is no harm if search engines crawl that page too, but it is completely unnecessary. We do not want search engines to show that page to our customers, so we simply disallow it, as in the following example.

User-agent: *
Disallow: /contact.php

We may disallow multiple pages or directories as required, for example:

User-agent: *
Disallow: /mail.php
Disallow: /contact.php
Disallow: /images/
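
Before uploading rules like these, it can help to check which pages they actually block. The sketch below does that with Python's standard-library parser. The names ExampleBot, www.example.com, index.php and logo.png are assumptions added for illustration.

```python
from urllib.robotparser import RobotFileParser

# The three Disallow rules from the example above.
rules = """\
User-agent: *
Disallow: /mail.php
Disallow: /contact.php
Disallow: /images/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Hypothetical site address and page names, used only to exercise the rules.
site = "https://www.example.com"
paths = ["/index.php", "/contact.php", "/mail.php", "/images/logo.png"]

# Map each path to True (may be crawled) or False (disallowed).
results = {path: parser.can_fetch("ExampleBot", site + path) for path in paths}

for path, allowed in results.items():
    print(path, "allowed" if allowed else "blocked")
```

A rule that ends with a slash, such as /images/, covers everything inside that directory, which is why /images/logo.png is reported as blocked while /index.php stays crawlable.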


Remember that robots.txt is an instruction file for search engines. Spam bots and spammers can read it too, and nothing forces them to obey it, so it does nothing for the security of your website.

We must place our robots.txt file in the root directory of our website, the same place where we put our sitemap.
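
Because the file always sits in the root directory, its address can be worked out from any page on the site. Here is a small sketch of that idea. The function name robots_txt_url and the example address are hypothetical, chosen only for this illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the root-level robots.txt address for the site hosting page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page address used only as a demonstration.
location = robots_txt_url("https://www.example.com/shop/item.php?id=3")
print(location)  # https://www.example.com/robots.txt
```

A robots.txt file placed in a subdirectory, for example /shop/robots.txt, would not be found this way, which is why the root directory matters.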

