Alkaline fully supports the robot directives described at the WebCrawler robots pages
(http://info.webcrawler.com/mak/projects/robots/robots.html).
Alkaline is a registered bot with the user-agent string AlkalineBOT/1.9.
Robot support includes full compliance with the /robots.txt directives, including the
User-agent and Disallow restrictions.
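To illustrate how User-agent and Disallow rules are typically evaluated, here is a minimal sketch using Python's standard `urllib.robotparser`. It is not Alkaline's own implementation; the robots.txt content, the example.com URLs, and the `allowed` helper are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are illustrative only.
ROBOTS_TXT = """\
User-agent: AlkalineBOT
Disallow: /private/

User-agent: *
Disallow: /tmp/
"""


def allowed(url: str, agent: str = "AlkalineBOT/1.9") -> bool:
    """Return True if `agent` may fetch `url` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/index.html",
                "http://example.com/private/a.html"):
        print(url, allowed(url))
```

In this parser, a record that names the bot takes precedence over the `*` record, so only the /private/ rule applies to AlkalineBOT here.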
Alkaline will not follow links if a <meta name="robots" content="nofollow"> tag is found.
Alkaline will not index document contents if a <meta name="robots" content="noindex"> tag is found.
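The two robots meta rules above can be sketched as a small parser built on Python's standard `html.parser`. This is an illustration of the behavior described, not Alkaline's code; the names `RobotsMetaParser` and `robots_meta` are hypothetical, and treating the content value as a comma-separated, case-insensitive list is an assumption based on common practice.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects noindex/nofollow from <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.index = True    # index document contents unless noindex is seen
        self.follow = True   # follow links unless nofollow is seen

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        # Assumption: values are comma-separated and case-insensitive.
        tokens = {t.strip().lower() for t in content.split(",")}
        if "noindex" in tokens:
            self.index = False
        if "nofollow" in tokens:
            self.follow = False


def robots_meta(html: str) -> tuple:
    """Return (index, follow) for the given HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.index, parser.follow


if __name__ == "__main__":
    print(robots_meta('<meta name="robots" content="nofollow">'))
```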
Alkaline robots support can be disabled for individual configurations by
specifying Robots=N in the asearch.cnf file.
Alkaline also looks for Alkaline-specific meta tags in a document. Each such tag has the format
<meta name="alkaline" content="...">.
The content value can contain one or more of the following elements, separated by spaces:
Table 7-3. Alkaline Specific Meta Tags
| skip | skip indexing of the page; it will not be referenced |
| skipmeta | skip indexing of meta tags on the current page |
| skiplinks | do not gather links from the currently indexed page |
| skiptext | do not index free text on the current page |
A <meta name="alkaline" content="skip"> tag instructs Alkaline not only to avoid indexing
the page, but also not to gather links from it. If you do not want the page to be indexed but
still want its links gathered, use <meta name="alkaline" content="skiptext skipmeta">.
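The way these elements combine can be sketched as a small function that maps a content value to the actions taken on a page. This is an illustrative model of the rules described here, not Alkaline's implementation; `PageActions` and `actions_for` are hypothetical names, and ignoring unrecognized elements is an assumption.

```python
from dataclasses import dataclass


@dataclass
class PageActions:
    index_text: bool = True     # index free text on the page
    index_meta: bool = True     # index meta tags on the page
    gather_links: bool = True   # gather links from the page


def actions_for(content: str) -> PageActions:
    """Interpret the content value of a <meta name="alkaline"> tag."""
    tokens = set(content.split())  # elements are separated by spaces
    if "skip" in tokens:
        # skip: the page is not indexed and its links are not gathered
        return PageActions(False, False, False)
    actions = PageActions()
    if "skiptext" in tokens:
        actions.index_text = False
    if "skipmeta" in tokens:
        actions.index_meta = False
    if "skiplinks" in tokens:
        actions.gather_links = False
    return actions


if __name__ == "__main__":
    print(actions_for("skiptext skipmeta"))
```

With this model, `skiptext skipmeta` leaves only link gathering enabled, which matches the recommendation for pages that should not be indexed but whose links should still be gathered.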
If you wish to exclude a pattern of pages from indexing while still gathering their links, use
the UrlIndex and/or the UrlSkip directives.
Example:
<meta name="alkaline" content="skipmeta skiplinks">