Some important searchable text elements contain non-alphanumeric characters, for
example, email addresses (jfd@recoll.org), proper names (O'Brien) or internet
addresses (192.168.4.1).
If we treated the special characters as white space in this situation, the only way to
search for these terms with a reasonable degree of precision would be to use phrase
searches ("jfd recoll org").
However, phrase searches must check the positions of the terms in each candidate document,
so they need more computation and are generally slower than single-term searches. This was
especially true with older Xapian versions.
Recoll has special processing for these elements, designated as spans. The corresponding
linkage characters will be designated as span glue in the following.
When indexing a span like jfd@recoll.org, Recoll generates both regular individual terms
(jfd, recoll, org) and multiword terms linked by span glue: jfd@recoll.org, jfd@recoll,
recoll.org.
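
The term generation described above can be illustrated with a minimal Python sketch. It is
not Recoll's actual implementation: the function name span_terms and the GLUE_CHARS set are
hypothetical, and the sketch assumes that every contiguous run of two or more words, joined
by its original glue, becomes a multiword term (which matches the jfd@recoll.org example).
Case folding and other normalization are left out.

```python
import re

# Assumption for this sketch only: these characters act as span glue.
GLUE_CHARS = "@.'"


def span_terms(span):
    """Return the individual words of a span, then its multiword terms."""
    # Split on glue characters, keeping them: words sit at even indexes,
    # glue characters at odd indexes.
    parts = re.split("([" + re.escape(GLUE_CHARS) + "])", span)
    words = parts[0::2]
    glue = parts[1::2]

    terms = list(words)  # regular individual terms
    # Every contiguous run of two or more words, re-joined with its glue.
    for start in range(len(words)):
        for end in range(start + 1, len(words)):
            term = words[start]
            for k in range(start, end):
                term += glue[k] + words[k + 1]
            terms.append(term)
    return terms


print(span_terms("jfd@recoll.org"))
```

Under these assumptions, the call prints the three individual terms followed by jfd@recoll,
jfd@recoll.org and recoll.org.
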
When searching, only the largest term (the complete span: jfd@recoll.org) is used, so that
Xapian executes a regular single-term search instead of a phrase one.
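
The difference between the two kinds of search can be sketched with a toy positional index
in Python. This is only an illustration of the general technique, not Xapian or Recoll
code: build_index, term_search, phrase_search and the sample DOCS are hypothetical, and
storing the span term at the position of its first word is an assumption of the sketch.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to {doc_id: [positions]} (a toy positional index)."""
    index = defaultdict(dict)
    for doc_id, entries in docs.items():
        for term, pos in entries:
            index[term].setdefault(doc_id, []).append(pos)
    return index


def term_search(index, term):
    """Single-term search: one posting-list lookup, no position checks."""
    return set(index.get(term, {}))


def phrase_search(index, words):
    """Phrase search: intersect posting lists, then verify adjacency."""
    postings = [index.get(w, {}) for w in words]
    candidates = set(postings[0]).intersection(*postings[1:])
    hits = set()
    for doc_id in candidates:
        # The words must appear at consecutive positions.
        for start in postings[0][doc_id]:
            if all(start + i in postings[i][doc_id]
                   for i in range(1, len(words))):
                hits.add(doc_id)
                break
    return hits


# Document 1 contains the address; document 2 only has the same words
# in a different order. Entries are (term, position) pairs.
DOCS = {
    1: [("write", 0), ("to", 1), ("jfd", 2), ("recoll", 3), ("org", 4),
        ("jfd@recoll.org", 2)],  # span term, assumed stored at position 2
    2: [("org", 0), ("recoll", 1), ("jfd", 2)],
}
INDEX = build_index(DOCS)

print(term_search(INDEX, "jfd@recoll.org"))
print(phrase_search(INDEX, ["jfd", "recoll", "org"]))
```

Both searches select only document 1, but the phrase search has to intersect three posting
lists and compare positions, while the span search is a single lookup.
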