Telling Recoll about the handler

There are two elements that link a file to the handler which should process it: the association of file to MIME type and the association of a MIME type with a handler.

The association of files to MIME types is mostly based on name suffixes. The types are defined inside the mimemap file. Example:


.doc = application/msword

If no suffix association is found for the file name, recent Recoll will use libmagic. Older versions or specially built ones may try to execute a system command (typically file -i or xdg-mime).

The second element is the association of MIME types to handlers in the mimeconf file. A sample will probably be better than a long explanation:


          [index]
          application/msword = exec antiword -t -i 1 -m UTF-8;\
          mimetype = text/plain ; charset=utf-8

          application/ogg = exec rclogg

          text/rtf = exec unrtf --nopict --html; charset=iso-8859-1; mimetype=text/html

          application/x-chm = execm rclchm.py
        

The fragment specifies that:

  • application/msword files are processed by executing the antiword program, which outputs text/plain encoded in utf-8.

  • application/ogg files are processed by the rclogg script, with default output type (text/html, with encoding specified in the header, or utf-8 by default).

  • text/rtf is processed by unrtf, which outputs text/html. The iso-8859-1 encoding is specified because it is not the utf-8 default, and not output by unrtf in the HTML header section.

  • application/x-chm is processed by a persistent handler. This is determined by the execm keyword.