Welcome to Vestris Inc.
Internet Interactive Solutions Company



Software Documentation

Chapter 3. Indexing and Configuration

Table of Contents
Is there any way to tell Alkaline to index a directory tree of files rather than using the crawler to just follow links?
What are all the file extensions Alkaline indexes?
How can I set up the search engine to be able to search multiple groups? Users should be able to select what part of my site they want to search.
How can I set up Alkaline to index multiple sites? Users should be able to select which site to search.
I have a site with 100 documents. Only 10 are being indexed. How can I find out why?
Alkaline has been running for quite some time, but no indexes have been written. Why?
ExcludeWords doesn't seem to work: when I search for "with", which has been excluded by a dictionary, I still get results. Why?
The stats say that 40 URLs are reported and that 11 have been indexed. What about the other 29?
Can Alkaline index PDF documents? Results from PDF documents come as garbage binary data starting with %PDF-1.2, why?
Error (52338) couldn't execute 'gzip -d -q \s87.Z when using pdf2text.
Error (0): pdf file is damaged - attempting to reconstruct xref table... when using pdf2text.
When reindexing, virtually every dynamically generated page is coming back as "modified" each time, although no information should be changing in these pages at all, and there's no date/time in the pages or anything else that might be changing. Why?
When reindexing, I see [url][verified] output and it takes a lot of time, what is it and can I avoid it?
I have found nothing about .asp files in the documentation about filters. Can Alkaline index .asp files?
Alkaline is replacing characters like & (used in the URIs) and á (and other accented vowels used in the text) with &amp; and &aacute;. How can I avoid this? How does Alkaline support languages such as Spanish, Russian or French?
The online management shows the asearch.cnf contents, but they do not match the file since I changed asearch.cnf. Why?
My server has statistics generated by a server-side include (SSI) command or a PHP script. Because of the spider, these statistics are wrong. When Alkaline is indexing a page, how do I avoid executing SSI or PHP code?

Is there any way to tell Alkaline to index a directory tree of files rather than using the crawler to just follow links?

No, not specifically. However, if you make your directory listing available via the web, Alkaline will parse it and treat it as a normal HTML document.

This is logical: since you can access those documents via the web, you should be able to point Alkaline to them for indexing purposes.
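
If your web server does not produce directory listings on its own, one possible workaround is to generate a listing page yourself. The following is only a minimal sketch in Python, not part of Alkaline: the function name build_listing, the base URL and the output file name are all hypothetical, and it assumes the files under the chosen directory are already served by your web server at that base URL.

```python
import html
import os
from urllib.parse import quote


def build_listing(root, base_url):
    """Return an HTML page with one link per file found under root.

    Hypothetical helper: base_url is assumed to be the web address at
    which the directory root is already being served.
    """
    links = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            # Path of the file relative to root, e.g. "sub/page.html"
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            # Percent-encode each path segment so the link is a valid URL
            rel_url = "/".join(quote(part) for part in rel.split(os.sep))
            href = base_url.rstrip("/") + "/" + rel_url
            links.append(
                '<li><a href="%s">%s</a></li>'
                % (html.escape(href, quote=True), html.escape(rel))
            )
    return "<html><body><ul>\n%s\n</ul></body></html>\n" % "\n".join(links)
```

Saving the returned text as, for example, listing.html inside the served directory gives the crawler an ordinary HTML page whose links lead to every file in the tree.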