Indexing punctuation characters (1.39)

By default, the Recoll indexer uses most non-alphanumeric characters only as separators, treating them as white space, so that the inputs all words and all,words produce the same terms.

It may sometimes be useful to index some of these characters so that they can be used as discriminants for searches. This can be done by setting the indexedpunctuation configuration parameter. The value is a UTF-8 string listing the characters to index. For example, setting:

indexedpunctuation = %€

would make it possible to search specifically for 100% or for 100€, rather than only for 100.

The affected characters are indexed as terms with their own term positions, and they are their own separators, so that 100% and 100 % would be equivalent inputs.
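
The behaviour described above can be illustrated with a small Python sketch. This is a toy model only, not Recoll's actual tokenizer: the function name toy_terms and its indexed_punctuation argument are hypothetical, and real term splitting involves many details that are ignored here. It only shows why all words and all,words yield the same terms by default, and why 100% and 100 % are equivalent once % is indexed.

```python
def toy_terms(text, indexed_punctuation=""):
    """Return (position, term) pairs for text.

    Toy model (an assumption, not Recoll code): alphanumeric runs become
    terms, characters listed in indexed_punctuation become one-character
    terms with their own position, and every other character is a separator.
    """
    terms = []
    current = []

    def flush():
        # Emit the alphanumeric run collected so far, if any.
        if current:
            terms.append("".join(current))
            current.clear()

    for ch in text:
        if ch.isalnum():
            current.append(ch)
        else:
            # Any non-alphanumeric character ends the current term...
            flush()
            # ...and is itself kept as a term only if it is listed.
            if ch in indexed_punctuation:
                terms.append(ch)
    flush()
    return list(enumerate(terms))


# Default behaviour: punctuation only separates.
print(toy_terms("all words"))
print(toy_terms("all,words"))

# With the hypothetical equivalent of indexedpunctuation = %€
print(toy_terms("100%", "%€"))
print(toy_terms("100 %", "%€"))
print(toy_terms("100€", "%€"))
```

In this model the indexed character is emitted as a separate term at the position following the number, whether or not a space precedes it, which is what makes the two spellings interchangeable.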