Robots.txt file

From Joomla! Documentation

Revision as of 10:01, 11 February 2013


Web robots (also called crawlers, web wanderers, or spiders) are programs that traverse the Web automatically. Among many other uses, search engines use them to index web content. The robots.txt file implements the Robots Exclusion Protocol (REP), which lets a site administrator define which parts of the site are off-limits to specific robots, identified by user-agent name. Administrators can Allow access to their web content and Disallow access to areas such as cgi, private, and temporary directories if they do not want pages in those areas indexed.

Where to place my robots.txt file?

A standard robots.txt file is included in the root of your Joomla installation. The robots.txt file must reside in the root of the domain and must be named "robots.txt".

Joomla in a subdirectory

A robots.txt file located in a subdirectory isn't valid, because bots only check for this file in the root of the domain. If the Joomla site is installed in a folder, e.g. www.example.com/joomla/, the robots.txt file MUST be moved to the site root, e.g. www.example.com/robots.txt. Note: The Joomla folder name MUST be prefixed to each disallowed path, e.g. the Disallow rule for the /administrator/ folder MUST be changed to read Disallow: /joomla/administrator/
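
As a minimal sketch of the prefixing described above, assuming the site lives in the example folder /joomla/, the first few rules of the file moved to www.example.com/robots.txt might look like this. Only three rules are shown; every other Disallow path would be prefixed in the same way.

```
User-agent: *
Disallow: /joomla/administrator/
Disallow: /joomla/cache/
Disallow: /joomla/tmp/
```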

Joomla robots.txt contents

These are the contents of a standard Joomla robots.txt file:

User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /cli/
Disallow: /components/
Disallow: /images/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /libraries/
Disallow: /logs/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/

Robot Exclusion

You can exclude directories or block robots from your site by adding Disallow rules to the robots.txt file.
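
The two cases can be sketched as follows. The directory /private/ and the robot name "BadBot" are hypothetical placeholders, not part of a standard Joomla installation. Keep in mind that robots.txt is advisory: well-behaved robots honor it, but it does not technically prevent access.

```
# Keep all robots out of a hypothetical /private/ directory
User-agent: *
Disallow: /private/

# Block one hypothetical robot, "BadBot", from the whole site
User-agent: BadBot
Disallow: /
```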

Syntax checking

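To check the syntax of your robots.txt file, you can use a validator. Try one of these:

* Robots.txt Checker (by Motoricerca): http://tool.motoricerca.info/robots-checker.phtml
* Robots.txt Checker (by Frobee): http://www.frobee.com/robots-txt-check
* robots.txt Checker (by Search Engine Promotion Help): http://www.searchenginepromotionhelp.com/m/robots-text-tester/robots-checker.php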

Further information

For additional information, please read:

Joomla! Documentation

How to: Robots.txt and Joomla

General information

Tools for Webmasters
