Date: 2022-10-29 22:22 -0400
Mod.: 2022-10-31 16:27 -0400
Recently, I finally moved my personal e-mail service to a Debian 11.5 server, from a years-out-of-date version.
For years, I had heard of notmuch being a modern program able to replace mairix to search mail.
In the process of configuring my services, I accidentally discovered that mu (package maildir-utils), which I had installed solely for its mkdir subcommand, supported indexing and searching. This caused a problem: I suddenly had alternatives and would have to decide which one to use. Performance comparisons of the two exist online, but in hindsight that seems strange to me, because:
Both notmuch and mu index and search very quickly on current systems, making any performance difference between the two inconsequential;
the two tools appear to have completely different goals:
As far as I can tell, mu is meant to be mainly a search tool. It expects its input to be e-mails in Maildir format. Its index is just an index; it doesn’t replicate the full contents of each input e-mail, but rather just what is needed to point from search terms to the path of the message on disk.
Conversely, notmuch seems to be mainly an e-mail client: its “index” really includes all that is needed to display and generally interact with e-mails, and whole clients are built that interface solely with that database.
That made the choice easy for me: mu. notmuch’s non-traditional model of tagging felt confusing and scary, I was unsure if it would be possible to serve e-mail via standard protocols such as IMAP, and, finally, it looked like I would have to jump through hoops to be able to keep using my preferred e-mail client.
This compares a relatively recent version of mu [1] with my experience of a years-old version of mairix, so it may not be fair. Still:
I believe it’s possible to query mu while its database is being updated, which mairix did not allow;
However, reindexing takes seconds at most, thanks to optimizations (as far as I can tell, mostly avoiding rereading input that is known not to have changed), so even if writes did block reads, it would not matter in practice. My install of mairix took minutes (I’m not sure how many) to reindex, forcing me to live with a relatively stale index most of the time;
Searching is much faster than I was used to, and I was happy with mairix. For example, searching for a random term that has 661 matches in my corpus of e-mail took less than 200 ms. It usually feels like the search results are there before my finger is off the Return key.
Years of being used to mairix’s interface (mairix <search expression> with a configuration file saying what to do with matches) led me to write this crude wrapper to mu:
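#!/usr/bin/env bash
## Interface to `mu find' that vaguely mimics mairix + configuration
set -euo pipefail
# Maildir that receives the search results as symlinks
DOUT=$MAILDIR/res
# Replace any previous results with links to the messages matching "$@"
mu find \
    --format=links --linksdir="$DOUT" --clearlinks -- \
    "$@"
# Count the symlinks (one byte printed per link) and report the total as JSON
find "$DOUT"/cur -type l -printf 1 | wc -c | \
    jq --compact-output '{matches: .}'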
Pass <search expression> to this script and it will put matches in a mailbox named res within the user’s $MAILDIR directory, where an e-mail client can go list and read them.
The part at the end with jq is to print a count of matches for the user to see so they can decide whether to refine the query or go look at the results. Normally, mu prints nothing at all when matches are found.
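For anyone who would rather not depend on jq for that last step, the count could be sketched as a small shell function instead. This is only a sketch under a few assumptions: count_matches is a hypothetical name that is not part of the wrapper, it relies on GNU find just as the wrapper does (-printf is a GNU extension), and it builds the one-line JSON with printf rather than jq.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the wrapper above): count the symlinks
# in a Maildir's cur/ directory and print the total as compact JSON.
# Assumes GNU find, because -printf is a GNU extension.
count_matches() {
    local dir="$1"
    local n
    # Emit one byte per symlink found, then count the bytes.
    n=$(find "$dir"/cur -type l -printf 1 | wc -c)
    # Arithmetic expansion drops any padding that wc may add.
    printf '{"matches":%d}\n' "$((n))"
}
```

Calling count_matches "$MAILDIR/res" after mu find has run should then print the same kind of one-line summary as the jq version. Regular files in cur are ignored, since only symlinks are counted.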
[1] mu (mail indexer/searcher) version 1.4.15 from 2021-01-23