Software Documentation
Alkaline: a UNIX/NT Search Engine
Alkaline 1.7 Frequently Asked Questions
Vestris Inc., Switzerland
Copyright © 1994-2002 by Vestris Inc., Switzerland
Table of Contents
1. General Questions
Is Alkaline just another Perl search engine? Why was it ever written?
What does the Alkaline source code look like? Can I please get the source code of Alkaline?
What is the Cellular Expansion algorithm? Is there any documentation?
Can I use Alkaline to provide a search engine on files distributed on a CD-ROM?
I don't understand anything and I am bombarding Vestris with email. Nobody has answered any of the 25 email messages I sent yesterday. Why?
2. Technical Questions
Can Alkaline index 50 million documents?
Is it possible to customize Alkaline's search results display format, look and feel?
I don't want to index in the background constantly. How do I set up a crontab?
What is the right way of stopping Alkaline? Is it ok to stop Alkaline with a Ctrl-C, kill -SIGTERM or kill -9?
I am not sure how to run the daemon. Once I log off the machine, the daemon stops. Why, and how can I avoid that?
How do I know when my siteidx.* files are corrupt?
What is the memory consumption of Alkaline?
Alkaline consumes 100% of CPU. What can I do to limit the aggressive resource consumption during indexing?
Can I run Alkaline as an NT Service, and if so, how?
How does priority scheduling work under NT, and how do I reduce Alkaline's priority under Windows NT?
Why do some pages seem not to be indexed? The link is still present in the siteidx?.url file!
What exactly is a root password and why should I have one? Why is it not enough to have an administrative password? Is the root password I need the ordinary Unix root password I use on my Linux system? Should I put it in global.cnf?
How do I set up Apache to call Alkaline?
Does Alkaline work with virtual servers?
Why does Alkaline seem to grow in memory?
How much memory does Alkaline really use? I don't understand the output from top, ps or the columns in the Windows NT Task Manager.
When I run Alkaline it suddenly crashes with a Segmentation Fault or a Bus Error message. Then it dumps a core file. What is this? Can I delete the core file?
When I look at netstat output after running Alkaline for some time, why do ports get stuck in a CLOSE_WAIT or TIME_WAIT state?
Why do I get "unable to bind, address already in use" fatal error?
I am moving from an NT server to a Linux box. Can I copy Alkaline databases from one system to another?
When I am running Alkaline, ps or top shows more than one process. Why, and what are those processes doing? Do 4 processes of 10 MB each mean that a total of 40 MB of memory is consumed?
I have installed the search engine and made an index of my domain. But when I use search-demo.html and try to search, a "Method Not Allowed" or "The requested method POST is not allowed for the url /index.htm" error message appears instead of search results. Why?
How can I launch Alkaline each time my server restarts?
How many concurrent search requests (queries) does Alkaline support?
Does Alkaline work with proxies and firewalls?
Does Alkaline support refining results (searching from previous results)?
Do you provide an API to Alkaline so the product can be used in third party implementations?
Can multiple instances of Alkaline run on multiple ports with the same data files?
What is the right way to stop Alkaline?
My server is very loaded and Alkaline crashes every 24 hours. How can I improve server stability?
Apache logs are full of Alkaline's requests. Can I avoid logging these?
Are there any security considerations for Alkaline? What is more secure, a CGI script or Alkaline?
Where is the admin section? I have problems navigating to the admin section: I get a password popup, a file not found, a blank page or a permission denied error.
3. Indexing and Configuration
Is there any way to tell Alkaline to index a directory tree of files rather than using the crawler to just follow links?
What are all the file extensions Alkaline indexes?
How can I set up the search engine to search multiple groups? Users should be able to select which part of my site they want to search.
How can I set up Alkaline to index multiple sites? Users should be able to select which site to search.
I have a site with 100 documents. Only 10 are being indexed. How can I find out why?
Alkaline has been running for quite some time, but no indexes have been written. Why?
ExcludeWords doesn't seem to work. When I search for "with", which has been excluded by a dictionary, I still get results. Why?
The stats say that 40 URLs are reported and that 11 have been indexed. What about the other 29?
Can Alkaline index PDF documents? Results from PDF documents come as garbage binary data starting with %PDF-1.2, why?
Error (52338) couldn't execute 'gzip -d -q \s87.Z when using pdf2text.
Error (0): pdf file is damaged - attempting to reconstruct xref table... when using pdf2text.
When reindexing, virtually every dynamically generated page is coming back as "modified" each time, although no information should be changing in these pages at all, and there's no date/time in the pages or anything else that might be changing. Why?
When reindexing, I see [url][verified] output and it takes a lot of time. What is it and can I avoid it?
I have found nothing about .asp files in the documentation about filters. Can Alkaline index .asp files?
Alkaline is replacing characters like & (used in URIs) and á (and other accented vowels used in the text) with &amp;amp; and &amp;aacute;. How can I avoid this? How does Alkaline support languages such as Spanish, Russian or French?
The online management shows the asearch.cnf contents, but they differ from the file because I have since changed asearch.cnf. Why?
My server has statistics generated by a server-side include (SSI) command or a PHP script. Because of the spider, these statistics are wrong. When Alkaline is indexing a page, how do I avoid executing SSI or PHP code?
4. Templates and Search Results
How can I have multiple templates without loading the same index multiple times?
How do I get a copy of the searched text inside my search box once the search results are displayed?
How can I update the template search page without restarting Alkaline?
How can I keep the case-sensitive box checked in Alkaline results?
How can I implement a Hide Summaries option?
Does Alkaline support banner ad display based on keyword searches?
Does Alkaline support languages other than English? Does Alkaline index Unicode or DBCS documents? Can I get a localized version of Alkaline?
The date shown by $date is wrong. What is the difference between $date and $modif?
My URLs are entered through a set of checkboxes on the search form. When the results template is displayed, those URLs get displayed. How can I avoid that?
Can I return an .asp file instead of a simple HTML page filled with search results?
How does Alkaline rank results? Is it possible to configure and modify the ranking?
I have file names with spaces. When Alkaline outputs results, they contain a space and the link is wrong. How can I urlencode these?
I have really long URLs. When a page doesn't have a title, that really long URL is output. How can I cut the title to N characters at most?
I would like to have a .php template or a dynamic cgi-bin program generate a template. Does Alkaline support CGI?
How do I enable search term highlighting within results?
5. Searching
Is there a way to initiate a search from within a Perl script and get the results back within the same Perl script to be formatted and displayed?
Is there a way to search for a range of values? For example, if there is a date field within a set of files, can I search for all the files dated between 7/1/99 and 7/26/99?
I would like to hide the port value when accessing Alkaline and search from http://search.server.com. How can I use the Apache proxy mechanism?
How can I integrate Alkaline search in a PHP website?
How can I integrate Alkaline search in a Cold Fusion (.cfm) website?
I would like to access the search engine via https://. Does Alkaline support SSL? Can the Apache proxy mechanism be configured to filter requests and forward them to Alkaline?
I often get a Server Busy error or an unresponsive Alkaline. How does the -mt (maximum search threads) command line parameter change this? What are the DoS mechanisms in Alkaline?
6. Certification, Registration and License
I have problems with certification. What is a nag? Where do I get the certification key from?
How do I request a certificate? A step by step tutorial.
I don't really understand the licensing system. Could I set up Alkaline locally on my computer to test the installation and customize the output pages, and then generate a license? Will the license work when I move Alkaline to a different computer?
We are an internet provider and offer our customers the possibility to search their sites. How many licenses should we purchase?
My server IP address has changed, or I have moved the search engine to a new machine. Do I need to purchase a new license, and should I request an unlock key again?
Does Vestris provide any special reseller deals, multiple-license purchase options, source code licenses, etc.?
I have entered the certification unlock key successfully. The nag was removed. When I restart Alkaline or reboot my server, the nag appears again. Why?
I have purchased an Alkaline license. When I go to the certification request page, it asks me for a certificate. Should I install Alkaline first or can I get the certificate beforehand?