User-agent: * applies to all search engine spiders. To target only one search engine's spider, list its name after the key. To target several spiders, repeat the User-agent line once for each.
The file name must be written in lowercase letters: robots.txt.
Pay attention to the following:
Allow: /index.php allows crawling of the site's index.php.
Disallow: /*.jpg prohibits crawling of all JPG files.
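As a rough illustration of how a pattern such as /*.jpg is matched against a URL path, here is a minimal Python sketch. The helper name rule_matches is hypothetical, and the sketch assumes the common convention that * matches any run of characters, that a trailing $ anchors the end of the URL, and that a pattern otherwise matches as a prefix. Real crawlers may differ in details.

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path.

    Hypothetical helper: '*' matches any run of characters, a trailing
    '$' anchors the end of the URL, otherwise the pattern is a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += r"\Z"
    # re.match anchors at the start of the path only.
    return re.match(regex, path) is not None

print(rule_matches("/*.jpg", "/images/photo.jpg"))
print(rule_matches("/*.jpg", "/index.php"))
print(rule_matches("/*.jpg$", "/photo.jpg?size=1"))
```

Under these assumptions, the first call matches, the second does not, and the third does not because the $ requires the URL to end in .jpg.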
The basic syntax of robots.txt
The robots.txt file should be placed in the root directory of the web site.
The basic concept of robots.txt
The website must have a robots.txt file.
The robots.txt file is a file on the website that is written for search engine spiders to read. When a search engine spider crawls our website, it fetches this file first and uses its contents to determine which files on the site it may access. This lets us keep documents we do not want exposed out of the search engine and control the spider's crawl path effectively, which creates the necessary conditions for a webmaster's SEO work. It is especially useful when the website has just been created, some content is still incomplete, and we do not yet want it included in the search engine.
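The check a spider performs can be sketched with Python's standard urllib.robotparser module. The rule set below is hypothetical and is parsed from a string rather than fetched from a real site. This is only a minimal illustration of the idea, not a description of how any particular search engine works.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration; not taken from any real site.
RULES = """\
User-agent: *
Disallow: /admin/
"""

def build_parser(rules: str) -> RobotFileParser:
    """Parse robots.txt text held in memory instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser

parser = build_parser(RULES)

# A polite spider asks before requesting each path.
print(parser.can_fetch("Baiduspider", "/index.php"))
print(parser.can_fetch("Baiduspider", "/admin/login.php"))
```

With these assumed rules, the first path is allowed and the second is refused, because it falls under the disallowed /admin/ prefix.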
1) User-Agent key
In robots.txt, each key is followed by a colon (:), and there should be a space separating the colon from the value.
The Disallow key specifies the URL paths that search engine spiders are not allowed to crawl.
For example, to ban crawling of a file on the website:
2) Disallow key
$ marks the end of the URL.
When you need to block a file completely, combine robots.txt with the robots attribute of the meta tag.
The value after the User-agent key is the name of a specific search engine crawler. For example, Baidu's crawler is Baiduspider and Google's is Googlebot.
Each content item is a key: value pair.
Note: User-agent: must be followed by a space.
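To make the key: value format concrete, here is a small Python sketch that splits one line of robots.txt into its key and value. The function name parse_line is hypothetical, and the sketch assumes that keys are compared case-insensitively and that anything after a # is a comment.

```python
from typing import Optional, Tuple

def parse_line(line: str) -> Optional[Tuple[str, str]]:
    """Split one robots.txt line into (key, value), or None if it has no rule."""
    # Drop any trailing comment and the surrounding whitespace.
    line = line.split("#", 1)[0].strip()
    if ":" not in line:
        return None
    # Split at the first colon only, so values containing ':' stay intact.
    key, _, value = line.partition(":")
    return key.strip().lower(), value.strip()

print(parse_line("User-agent: Baiduspider"))
print(parse_line("Disallow: /index.php"))
```

The key is lowercased here only to simplify later comparisons. The value is returned as written, since paths are case-sensitive.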
* stands for any number of characters.
Disallow: /index.php prohibits crawling of the site's index.php.
2. The basic format of robots.txt
The Allow key specifies the URL paths that search engine spiders are allowed to crawl.
We write this: