Recoll supports defining multiple indexes, each defined by its own configuration directory. A configuration directory contains several files which describe what should be indexed and how.
When recoll or recollindex is first executed, it creates a default configuration directory. This configuration is the one used for indexing and querying when no specific configuration is specified. It is located in $HOME/.recoll/ for Unix-like systems and %LOCALAPPDATA%/Recoll on Windows (typically C:/Users/[me]/Appdata/Local/Recoll).
All configuration parameters have defaults, defined in system-wide files. Without further customisation, the default configuration will process your complete home directory, with a reasonable set of defaults. It can be adjusted to process a different area of the file system, select files in different ways, and many other things.
In some cases, it may be useful to create additional configuration directories, for example, to separate personal and shared indexes, or to take advantage of the organization of your data to improve search precision.
In order to do this, you would create an empty directory in a location of your choice, and then instruct recoll or recollindex to use it by setting either a command line option (-c /some/directory), or an environment variable (RECOLL_CONFDIR=/some/directory). Any modification performed by the commands (e.g. configuration customisation or searches by recoll or index creation by recollindex) would then apply to the new directory and not to the default one.
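The steps above can be sketched as a short shell script. This is only an illustration under stated assumptions: the directory location and the shared-docs path are hypothetical placeholders, and the optional recoll.conf line uses the topdirs parameter to restrict what this index covers. The script skips the indexing step if recollindex is not installed.

```shell
#!/bin/sh
# Hypothetical location for an additional configuration directory.
confdir="${TMPDIR:-/tmp}/recoll-shared-example"
mkdir -p "$confdir"

# Optional: restrict this index to one area of the file system.
# topdirs lists the directories to index; the path is a placeholder.
printf 'topdirs = %s\n' "$HOME/shared-docs" > "$confdir/recoll.conf"

if command -v recollindex >/dev/null 2>&1; then
    # Select the configuration with the command line option...
    recollindex -c "$confdir"
    # ...or with the environment variable, set for this one command only.
    RECOLL_CONFDIR="$confdir" recollindex
else
    echo "recollindex not found; only created $confdir"
fi
```

Both invocations target the same configuration directory, so the second one only performs an incremental update. The default configuration is not touched by either.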
Once multiple indexes are created, you can use each of them separately by setting the -c option or the RECOLL_CONFDIR environment variable when starting a command, to select the desired index.
It is also possible to instruct one configuration to query one or several other indexes in addition to its own, by using the external index function in the recoll GUI, or some equivalent in the command line and programming tools.
A plausible usage scenario for the multiple index feature would be for a system administrator to set up a central index for shared data, which you can choose to search or not in addition to your personal data. Of course, there are other possibilities. For example, there are many cases where you know the subset of files that should be searched, and where narrowing the search can improve the results. You can achieve approximately the same effect by using a directory filter clause in a search, but multiple indexes may perform better and may be worth the trouble in some cases.
A more advanced use case would be to use multiple indexes to improve indexing performance, by updating several indexes in parallel (using multiple CPU cores and disks, or possibly several machines), and then merging them, or querying them in parallel.
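As a rough sketch of the parallel update idea, the script below starts one recollindex process per configuration directory in the background and waits for all of them. The directory names and the topdirs paths are hypothetical placeholders, each configuration is assumed to cover a separate area of the file system, and merging or querying the resulting indexes together is not shown.

```shell
#!/bin/sh
# Hypothetical base directory holding one configuration per index.
base="${TMPDIR:-/tmp}/recoll-parallel-example"
count=0
for name in projects mail archive; do
    confdir="$base/$name"
    mkdir -p "$confdir"
    # Give each index its own area to process (placeholder paths).
    printf 'topdirs = %s\n' "$HOME/$name" > "$confdir/recoll.conf"
    if command -v recollindex >/dev/null 2>&1; then
        # Each indexer runs in the background against its own configuration.
        recollindex -c "$confdir" &
    fi
    count=$((count + 1))
done
# Block until every background indexer has finished.
wait
echo "prepared $count configuration directories"
```

Whether this is faster than a single index depends on how the data is spread across disks and on the number of available CPU cores.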
See the section about configuring multiple indexes for more detail.