Unix-like systems and Mac OS-like systems: using extended attributes

User extended attributes are named pieces of information that most modern file systems can attach to any file.

Recoll processes all extended attributes as document fields. Note that most fields are not indexed by default: to make one searchable, you need to activate it by defining a prefix for it in the fields configuration file.

A freedesktop standard defines a few special attributes, which are handled as such by Recoll:

mime_type

If set, this overrides any other determination of the file MIME type.

charset

If set, this defines the file character set (mostly useful for plain text files).
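As a minimal sketch, the two attributes could be set on Linux with setfattr (from the attr package), using the user. namespace that Linux requires for user attributes. The helper name set_recoll_hint and the DRY_RUN variable are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: set one of the two special attributes on a file.
# With DRY_RUN=1 it only prints the command instead of running it.
set_recoll_hint() {
    name=$1
    value=$2
    file=$3
    case $name in
        mime_type|charset) ;;
        *) echo "unsupported attribute: $name" >&2; return 1 ;;
    esac
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "setfattr -n user.$name -v $value $file"
    else
        setfattr -n "user.$name" -v "$value" "$file"
    fi
}

# Usage (needs a file system with user extended attributes enabled):
#   set_recoll_hint mime_type text/plain /some/file
#   set_recoll_hint charset iso-8859-1 /some/file
```

The dry-run mode is only a convenience for checking the generated command before touching any file.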

By default, other attributes are handled as Recoll fields of the same name.

On Linux, the user. namespace prefix is removed from the name (for example, user.foo becomes the field foo).
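The default Linux mapping can be pictured with the sketch below. It is an illustration only, not Recoll's code: the function name xattrs_to_fields is hypothetical, and it assumes the text output format of getfattr -d, where each attribute appears as user.name="value".

```shell
#!/bin/sh
# Illustration only (not Recoll's code): turn `getfattr -d` style lines
# into "field value" pairs by dropping the Linux "user." prefix.
# Values that getfattr prints in encoded form (0x... or 0s...) are ignored.
xattrs_to_fields() {
    sed -n 's/^user\.\([^=]*\)="\(.*\)"$/\1 \2/p'
}

# Usage on a real file (needs the attr package):
#   getfattr -d /some/file | xattrs_to_fields
```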

The name translation can be configured more precisely, inside the fields configuration file.

Setting the document modification/creation date

Some documents have an internal date attribute (e.g. emails), but most get their date from the file modification time. It is possible to set a document date different from the file's by setting a specific extended attribute. For obscure and uninteresting reasons, the hardcoded name of the attribute is modificationdate. Its contents should be the ASCII representation of a decimal integer representing the Unix time (seconds since the epoch). An example Linux command line for setting this particular field follows. The command substitution runs date, which prints the example date in Unix time format (seconds since the epoch).

setfattr -n user.modificationdate -v `date -d '2022-09-30 08:30:00' +%s` /some/file

The date substitution will then be automatic: you do not need to customize the fields file.
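The same operation could be wrapped in a small script that checks the value before storing it and reads it back afterwards. This is a sketch under assumptions: the helper names epoch_for, is_epoch and set_doc_date are hypothetical, date -d relies on GNU date, and the integer check is simplified to reject dates before the epoch (negative values).

```shell
#!/bin/sh
# Hypothetical helper: print the Unix time for a date string (GNU date syntax).
epoch_for() {
    date -d "$1" +%s
}

# Hypothetical helper: succeed only if the argument is a non-empty string
# of decimal digits (simplification: negative values are rejected).
is_epoch() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Hypothetical helper: store the document date on a file, then print the
# stored value so that it can be checked.
set_doc_date() {
    stamp=$(epoch_for "$1") || return 1
    is_epoch "$stamp" || return 1
    setfattr -n user.modificationdate -v "$stamp" "$2" &&
        getfattr --only-values -n user.modificationdate "$2"
}

# Usage: set_doc_date '2022-09-30 08:30:00' /some/file
```

The date string is interpreted in the local time zone unless TZ is set, so the stored number depends on where the command runs.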