The Recoll Python programming interface can be used both for searching and for creating or updating an index from a program run by the Python 3 interpreter. It is available on all platforms (Unix-like systems, MS Windows, macOS).
The search interface is used in a number of active projects: the Recoll Gnome Shell Search Provider, the Recoll Web UI, and the upmpdcli UPnP Media Server, in addition to many small scripts.
The index updating part of the API can be used to create and update Recoll indexes. Before Recoll 1.37, such indexes had to use separate configurations (but could be queried in conjunction with the regular index). As of Recoll 1.37, an external indexer based on the Python extension can update the main index. For example, the Recoll indexer for the Joplin notes application uses this method.
The search API is modeled on the Python database API version 2.0 specification (early versions used the version 1.0 spec).
The recoll package contains two modules:
The recoll module contains functions and classes used to query or update the index.
The rclextract module contains functions and classes used at query time to access document data. This can be used, for example, for extracting embedded documents into standalone files.
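As a rough illustration of the rclextract use case mentioned above, the sketch below writes the document behind one query result to a standalone file. It assumes the Extractor class and its idoctofile(ipath, mimetype) method behave as in the Recoll manual (creating a temporary file and returning its name). The save_embedded name and its extractor_cls parameter are hypothetical, added only so that the function can be tried without Recoll installed.

```python
def save_embedded(doc, extractor_cls=None):
    """Write the document behind a query result to a file and return its path.

    'doc' is expected to be a result object with 'ipath' and 'mimetype'
    attributes, as returned by a Recoll query.
    """
    if extractor_cls is None:
        # Imported lazily, so the function can be exercised with a stand-in
        # class on a machine where the Recoll extension is not installed.
        from recoll import rclextract
        extractor_cls = rclextract.Extractor
    extractor = extractor_cls(doc)
    # Assumption: idoctofile() creates a temporary file and returns its name;
    # the caller is then responsible for deleting it.
    return extractor.idoctofile(doc.ipath, doc.mimetype)
```

With Recoll installed, calling save_embedded(doc) on a result from query.fetchmany() would be the typical use.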
There is a good chance that your system repository has packages for the Recoll Python API, sometimes in a package separate from the main one (possibly named something like python3-recoll). Otherwise, refer to the Building from source chapter.
As an introductory sample, the following small program runs a query and lists the URL and title of each result. The python/samples source directory contains several examples of Python programming with Recoll, which exercise the extension more completely, especially its data extraction features.
#!/usr/bin/python3

from recoll import recoll

db = recoll.connect()
query = db.query()
nres = query.execute("some query")
results = query.fetchmany(20)
for doc in results:
    print("%s %s" % (doc.url, doc.title))
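The same calls can be wrapped in small reusable functions. This is only a sketch. The format_result and run_query names are hypothetical, and the "(no title)" fallback assumes that a result without a title yields an empty or None value for doc.title.

```python
def format_result(doc):
    """Return 'url title' for one result, tolerating a missing title."""
    # Assumption: an absent title shows up as an empty string or None.
    title = getattr(doc, "title", None) or "(no title)"
    return "%s %s" % (doc.url, title)


def run_query(text, limit=20):
    """Run a query against the default index and return formatted lines."""
    # Imported here so that format_result() stays usable without Recoll.
    from recoll import recoll
    db = recoll.connect()
    query = db.query()
    query.execute(text)
    return [format_result(doc) for doc in query.fetchmany(limit)]
```

A script would then print each line returned by run_query("some query"). Keeping the formatting separate from the query makes it easy to check the output layout with stand-in objects.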
You can also take a look at the source for (in order of complexity) the Recoll Gnome Shell Search Provider or WebUI, and the upmpdcli local media server.