The Recoll WebUI offers an alternative, web-based interface for querying a Recoll index.

The koniu repository on GitHub has not been updated for some time, and you should now use the git clone on framagit.org.

The WebUI can be quite useful to extend the use of a shared index to multiple workstations, without the need for a local Recoll installation and shared data storage.

The Recoll WebUI is based on the Bottle Python framework.

The standalone script now relies by default on the waitress Python WSGI server, which can handle several simultaneous requests and will probably perform acceptably in most cases. You need to install the waitress Python 3 module for this to work.

It is still possible to run the WebUI on the Bottle internal HTTP server by editing the startup script. However, the built-in server is restricted to handling one request at a time, which is problematic in multi-user situations, especially because some requests, like extracting a result list into a CSV file, can take a significant amount of time.

In multi-user situations, you may get better performance and ease of use from the Recoll WebUI by running it under Apache rather than as a standalone process. With this approach, a few requests per second can easily be handled even in the presence of long-running ones, and some measure of access control is probably possible.

However, neither Recoll nor the WebUI is optimized for high multi-user load, and it would be very unwise to use them as the search interface to a busy web site.

The instructions about using the WebUI under Apache as given in the repository README are a bit terse, and are missing a few details.

Here follow the synopses of two WebUI installations on initially Apache-less Ubuntu (14.04) and DragonFly BSD systems. The first should extend easily to other Debian-based systems, the second at least to FreeBSD. rpm-based systems are left as an exercise to the reader, at least for now.

I am not checking these instructions very often, and you may have to change some details related to package version numbers.

Caution
THE CONFIGURATIONS DESCRIBED HAVE NO ACCESS CONTROL. ANYONE WITH ACCESS TO THE NETWORK WHERE THE SERVER IS LOCATED CAN RETRIEVE ANY DOCUMENT.

Apache

On a Debian/Ubuntu system

Install recoll

sudo apt-get install recoll python3-recoll

Configure the indexing and check that the normal search works (I spent quite a lot of time trying to understand why the WebUI did not work, when in fact it was the normal recoll configuration which was broken and the regular search did not work either).

Take care to be logged in as the user you want to run the web search as while you do this.
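
Besides the regular GUI or command-line search, the index can also be checked through the Python module installed by the python3-recoll package, which is the same interface the WebUI depends on. The following is a minimal sketch, not part of the WebUI: the function name and its parameters are hypothetical, and it assumes the recoll Python API as described in the Recoll manual (connect, query, execute, fetchmany).

```python
def smoke_test(query_string, confdir=None, max_results=5):
    """Run one query against the index and return the number of results."""
    # Imported here so that a missing python3-recoll package fails at call time
    from recoll import recoll

    # With no confdir, the default configuration of the current user is used
    db = recoll.connect(confdir=confdir) if confdir else recoll.connect()
    query = db.query()
    nres = query.execute(query_string)
    if nres > 0:
        # Print the first few hits as a quick sanity check
        for doc in query.fetchmany(max_results):
            print(doc.url, doc.title)
    return nres
```

Run it as the user who owns the index, for example with smoke_test("some term"). If this returns no results for a term you know is indexed, the problem is in the Recoll configuration rather than in the WebUI.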

Install the WebUI

Clone the repository (see above for its current location), or extract the master tar archive, and move it to '/var/www/recoll-webui-master/'. Make sure that your user has read and execute access to it.

Install Apache and mod-wsgi

sudo apt-get install apache2 libapache2-mod-wsgi-py3

I then got the following message:

AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.1. Set the 'ServerName' directive globally to suppress this message

To clear it, I added a ServerName directive to the Apache configuration; you may not need it. Edit '/etc/apache2/sites-available/000-default.conf' and add the following at the top (globally). Things work without this change anyway: it only suppresses the message. You will probably need to adjust the address or use a real host name:

ServerName 192.168.4.6

Edit '/etc/apache2/mods-enabled/wsgi.conf', add the following at the end of the "IfModule" section.

Change the user ('dockes' in the example), taking care that it is the user who owns the index ('.recoll' is in that user's home directory).

WSGIDaemonProcess recoll user=dockes group=dockes \
    threads=1 processes=5 display-name=%{GROUP} \
    python-path=/var/www/recoll-webui-master
WSGIScriptAlias /recoll /var/www/recoll-webui-master/webui-wsgi.py
<Directory /var/www/recoll-webui-master>
        WSGIProcessGroup recoll
        Require all granted
</Directory>

With Apache 2.2, the Require line would have been replaced by the following:

       Order allow,deny
       allow from all

Again: please do not take any hint about security from this document.

You can use SetEnv directives with RECOLL_CONFDIR and RECOLL_EXTRACONFDIRS variable names inside <Directory> sections to set up multiple indexes on multiple URLs or query additional indexes from a single one. You need webui code from 2022-06-01 or newer for this to work.

For example, a SetEnv directive inside your Directory section sets the configuration directory through RECOLL_CONFDIR:

    <Directory /var/www/recoll-webui-master>
            WSGIProcessGroup recoll
            Require all granted
            SetEnv RECOLL_CONFDIR /path/to/my/configdir
    </Directory>

All the directories in the path must be accessible to the user/group Apache uses, which may not be the case if you are using your own configuration directory ($HOME is usually not browsable by "other").
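
The accessibility requirement above can be checked with a few lines of Python. This is only a sketch: the function name is hypothetical, and it looks only at the "other" search (x) permission bit, so it will flag directories that the Apache user can in fact enter through ownership or group membership.

```python
import stat
from pathlib import Path


def blocked_components(confdir):
    """Return the directories on the way to confdir lacking the 'other' x bit."""
    target = Path(confdir).resolve()
    blocked = []
    # Check the directory itself, then every ancestor up to the root
    for directory in [target, *target.parents]:
        if not directory.stat().st_mode & stat.S_IXOTH:
            blocked.append(str(directory))
    return blocked
```

Calling blocked_components('/path/to/my/configdir') returns an empty list when every directory on the path can be traversed by "other".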

Another possibility, which works with all versions, is to set the corresponding os.environ values by editing webui-wsgi.py (see the comments in there).
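
As an illustration of the os.environ approach, the lines added to webui-wsgi.py could look like the following sketch. The path is a placeholder, the CONFDIR name is hypothetical, and the exact place to add the lines is best taken from the comments in the script itself; presumably they must run before the WebUI code is imported.

```python
import os

# Placeholder: replace with the real configuration directory
CONFDIR = "/path/to/my/configdir"

# setdefault() keeps any value already present in the process environment
os.environ.setdefault("RECOLL_CONFDIR", CONFDIR)
```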

Note
The Recoll WebUI application is mostly single-threaded, so it is of little use (and may actually be counter-productive in some cases) to specify multiple threads on the WSGIDaemonProcess line. Specify multiple processes instead to put multiple CPUs to work on simultaneous requests.

Then run the following to restart Apache:

sudo apachectl restart

The Recoll WebUI should now be accessible at 'http://my.server.com/recoll/'.
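
If nothing comes up, it can help to test the mod_wsgi setup separately from the WebUI. The following minimal WSGI application is a sketch for that purpose and is not part of the WebUI: save it under a name of your choice and temporarily point a WSGIScriptAlias at it. It reports the effective user id and PATH of the daemon process, and the RECOLL_CONFDIR value, which mod_wsgi passes to the application in the per-request WSGI environment when it is set with SetEnv.

```python
import os


def application(environ, start_response):
    """Report how Apache runs the WSGI daemon process."""
    lines = [
        "effective uid: %d" % os.geteuid(),
        "PATH: %s" % os.environ.get("PATH", "(unset)"),
        # Values set with Apache SetEnv arrive in the WSGI environ
        "RECOLL_CONFDIR: %s" % environ.get("RECOLL_CONFDIR", "(unset)"),
    ]
    body = "\n".join(lines).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

The uid shown should be that of the index owner, and the PATH should include the directories where the input handler helper applications are installed.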

Note
You need a '/' at the end of the URL used to access the search (use 'http://my.server.com/recoll/', not 'http://my.server.com/recoll'). Otherwise, files other than the script itself are not found: the page looks weird and the search does not work.
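
The trailing-slash requirement follows from the standard way relative links are resolved against the page URL. The sketch below shows the effect with Python's urljoin; the relative link used is hypothetical and only stands for any file the page refers to.

```python
from urllib.parse import urljoin

RELATIVE_LINK = "static/style.css"  # hypothetical link found in the page

# Without the trailing slash, the last path segment ('recoll') is replaced
without_slash = urljoin("http://my.server.com/recoll", RELATIVE_LINK)
# With the trailing slash, the link is resolved below /recoll/
with_slash = urljoin("http://my.server.com/recoll/", RELATIVE_LINK)

print(without_slash)  # http://my.server.com/static/style.css
print(with_slash)     # http://my.server.com/recoll/static/style.css
```
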
Caution
THERE IS NO ACCESS CONTROL. ANYONE WITH ACCESS TO THE NETWORK WHERE THE SERVER IS LOCATED CAN RETRIEVE ANY DOCUMENT.

Apache Variant for BSD/ports

Packages

As root:

pkg install recoll

Do what you need to do to configure the indexing and check that the normal search works.

Take care to be logged in as the user you want to run the web search as while you do this.

Then install apache. You may have to adjust the version number.

pkg install apache24

Add apache24_enable="YES" in /etc/rc.conf

pkg install www/mod_wsgi
pkg install git

The package may be named ap24-mod_wsgi4 depending on the system.

On FreeBSD, you can also use:

cd /usr/ports/www/mod_wsgi4/ && make install clean

Thanks to D.Gessel for pointing out the errors in the previous version of this document.

Clone the webui repository

cd /usr/local/www/apache24/
git clone https://github.com/koniu/recoll-webui.git recoll-webui-master

Important: most input handler helper applications (e.g. 'pdftotext') are installed in '/usr/local/bin' which is not in the PATH as seen by Apache (at least on DragonFly). The simplest way to fix this is to modify the launcher module for the webui app so that it fixes the PATH.

Edit 'recoll-webui-master/webui-wsgi.py' and add the following line after the 'import os' line:

os.environ['PATH'] = os.environ['PATH'] + ':' + '/usr/local/bin'
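
A slightly more defensive variant of the same fix is sketched below. The helper name is hypothetical; it tolerates a missing PATH variable and avoids adding the directory twice if the module is loaded again.

```python
import os


def append_to_path(directory, environ=os.environ):
    """Append directory to PATH unless it is already listed."""
    # Split the current value, ignoring empty entries and a missing PATH
    parts = [p for p in environ.get("PATH", "").split(os.pathsep) if p]
    if directory not in parts:
        parts.append(directory)
    environ["PATH"] = os.pathsep.join(parts)
    return environ["PATH"]


append_to_path("/usr/local/bin")
```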

Configure Apache

Edit /usr/local/etc/apache24/modules.d/270_mod_wsgi.conf

Uncomment the LoadModule line, and add the directives to alias /recoll/ to the webui script.

Change the user (dockes in the example), taking care that it is the user who owns the index (.recoll is in that user's home directory).

Contents of the file:

## $FreeBSD$
## vim: set filetype=apache:
##
## module file for mod_wsgi
##
## PROVIDE: mod_wsgi
## REQUIRE:
LoadModule wsgi_module        libexec/apache24/mod_wsgi.so
WSGIDaemonProcess recoll user=dockes group=dockes \
    threads=1 processes=5 display-name=%{GROUP} \
    python-path=/usr/local/www/apache24/recoll-webui-master/
WSGIScriptAlias /recoll /usr/local/www/apache24/recoll-webui-master/webui-wsgi.py
<Directory /usr/local/www/apache24/recoll-webui-master>
        WSGIProcessGroup recoll
        Require all granted
</Directory>

Restart Apache

As root:

apachectl restart