Update: some of this configuration changed. See mail.

I've been using Notmuch since about 2011, switching away from Mutt to deal with the monstrous amount of emails I was, and still am dealing with on the computer. I have contributed a few patches and configs on the Notmuch mailing list, but basically, I have given up on merging patches, and instead have a custom config in Emacs that extend it the way I want. In the last 5 years, Notmuch has progressed significantly, so I haven't found the need to patch it or make sweeping changes.

  1. The huge INBOX of death
  2. Sieve filtering
    1. Dovecot and Postfix configs
    2. Sieve rules
    3. Spam filtering
    4. Mailing lists
    5. Extension matching
    6. Automated reports filtering
  3. Refiltering emails
    1. Archival
  4. Notmuch configuration
  5. Offlineimap
  6. RSS feeds
  7. Troubleshooting
  8. Further improvements
  9. Security considerations

The huge INBOX of death

The one thing that is problematic with my use of Notmuch is that I end up with a ridiculously large INBOX folder. Before the cleanup I did this morning, I had over 10k emails in there, out of about 200k emails overall.

Since I mostly work from my laptop these days, the Notmuch tags are only on the laptop, and not propagated to the server. This makes accessing the mail spool directly, from webmail or simply through a local client (say Mutt) on the server, really inconvenient, because it has to load a very large spool of mail, which is very slow in Mutt. Even worse, a bunch of mail that was archived in Notmuch shows up in the spool because it's just removed tags in Notmuch: the mails are still in the inbox, even though they are marked as read.

So I was hoping that Notmuch would help me deal with the giant inbox of death problem, but in fact, when I don't use Notmuch, it actually makes the problem worse. Today, I did a bunch of improvements to my setup to fix that.

The first thing I did was to kill procmail, which I was surprised to discover has been dead for over a decade. I switched over to Sieve for filtering, having already switched to Dovecot a while back on the server. I tried to use the procmail2sieve.pl conversion tool but it didn't work very well, so I basically rewrote the whole file. Since I was mostly using Notmuch for filtering, there wasn't much left to convert.

Sieve filtering

But this is where things got interesting: Sieve is so simpler to use and more intuitive that I started doing more interesting stuff in bridging the filtering system (Sieve) with the tagging system (Notmuch). Basically, I use Sieve to split large chunks of emails off my main inbox, to try to remove as much spam, bulk email, notifications and mailing lists as possible from the larger flow of emails. Then Notmuch comes in and does some fine-tuning, assigning tags to specific mailing lists or topics, and being generally the awesome search engine that I use on a daily basis.

Dovecot and Postfix configs

For all of this to work, I had to tweak my mail servers to talk sieve. First, I enabled sieve in Dovecot:

--- a/dovecot/conf.d/15-lda.conf
+++ b/dovecot/conf.d/15-lda.conf
@@ -44,5 +44,5 @@

 protocol lda {
   # Space separated list of plugins to load (default is global mail_plugins).
-  #mail_plugins = $mail_plugins
+  mail_plugins = $mail_plugins sieve

Then I had to switch from procmail to dovecot for local delivery, that was easy, in Postfix's perennial main.cf:

#mailbox_command = /usr/bin/procmail -a "$EXTENSION"
mailbox_command = /usr/lib/dovecot/dovecot-lda -a "$RECIPIENT"

Note that dovecot takes the full recipient as an argument, not just the extension. That's normal. It's clever, it knows that kind of stuff.

One last tweak I did was to enable automatic mailbox creation and subscription, so that the automatic extension filtering (below) can create mailboxes on the fly:

--- a/dovecot/conf.d/15-lda.conf
+++ b/dovecot/conf.d/15-lda.conf
@@ -37,10 +37,10 @@
 #lda_original_recipient_header =

 # Should saving a mail to a nonexistent mailbox automatically create it?
-#lda_mailbox_autocreate = no
+lda_mailbox_autocreate = yes

 # Should automatically created mailboxes be also automatically subscribed?
-#lda_mailbox_autosubscribe = no
+lda_mailbox_autosubscribe = yes

 protocol lda {
   # Space separated list of plugins to load (default is global mail_plugins).

Sieve rules

Then I had to create a Sieve ruleset. That thing lives in ~/.dovecot.sieve, since I'm running Dovecot. Your provider may accept an arbitrary ruleset like this, or you may need to go through a web interface, or who knows. I'm assuming you're running Dovecot and have a shell from now on.

The first part of the file is simply to enable a bunch of extensions, as needed:

# Sieve Filters
# http://wiki.dovecot.org/Pigeonhole/Sieve/Examples
# https://tools.ietf.org/html/rfc5228
require "fileinto";
require "envelope";
require "variables";
require "subaddress";
require "regex";
require "vacation";
require "vnd.dovecot.debug";

Some of those are not used yet, for example I haven't tested the vacation module yet, but I have good hopes that I can use it as a way to announce a special "urgent" mailbox while I'm traveling. The rationale is to have a distinct mailbox for urgent messages that is announced in the autoreply, that hopefully won't be parsable by bots.

Spam filtering

Then I filter spam using this fairly standard expression:

# spam 
# possible improvement, server-side:
# http://wiki.dovecot.org/Pigeonhole/Sieve/Examples#Filtering_using_the_spamtest_and_virustest_extensions
if header :contains "X-Spam-Flag" "YES" {
  fileinto "junk";
} elsif header :contains "X-Spam-Level" "***" {
  fileinto "greyspam";

This puts stuff into the junk or greyspam folder, based on the severity. I am very aggressive with spam: stuff often ends up in the greyspam folder, which I need to check from time to time, but it beats having too much spam in my inbox.

Mailing lists

Mailing lists are generally put into a lists folder, with some mailing lists getting their own folder:

# lists
# converted from procmail
if header :contains "subject" "FreshPorts" {
    fileinto "freshports";
} elsif header :contains "List-Id" "alternc.org" {
    fileinto "alternc";
} elsif header :contains "List-Id" "koumbit.org" {
    fileinto "koumbit";
} elsif header :contains ["to", "cc"] ["lists.debian.org",
                                       "anarcat@debian.org"] {
    fileinto "debian";
# Debian BTS
} elsif exists "X-Debian-PR-Message" {
    fileinto "debian";
# default lists fallback
} elsif exists "List-Id" {
    fileinto "lists";

The idea here is that I can safely subscribe to lists without polluting my mailbox by default. Further processing is done in Notmuch.

Extension matching

I also use the magic +extension tag on emails. If you send email to, say, foo+extension@example.com then the emails end up in the foo folder. This is done with the help of the following recipe:

# wildcard +extension
# http://wiki.dovecot.org/Pigeonhole/Sieve/Examples#Plus_Addressed_mail_filtering
if envelope :matches :detail "to" "*" {
  # Save name in ${name} in all lowercase except for the first letter.
  # Joe, joe, jOe thus all become 'Joe'.
  set :lower "name" "${1}";
  fileinto "${name}";
  #debug_log "filed into mailbox ${name} because of extension";

This is actually very effective: any time I register to a service, I try as much as possible to add a +extension that describe the service. Of course, spammers and marketers (it's the same really) are free to drop the extension and I suspect a lot of them do, but it helps with honest providers and this actually sorts a lot of stuff out of my inbox into topically-defined folders.

It is also a security issue: someone could flood my filesystem with tons of mail folders, which would cripple the IMAP server and eat all the inodes, 4 times faster than just sending emails. But I guess I'll cross that bridge when I get there: anyone can flood my address and I have other mechanisms to deal with this.

The trick is to then assign tags to all folders so that they appear in the Notmuch-emacs welcome view:

echo tagging folders
for folder in $(ls -ad $HOME/Maildir/${PREFIX}*/ | egrep -v "Maildir/${PREFIX}(feeds.*|Sent.*|INBOX/|INBOX/Sent)\$"); do
    tag=$(echo $folder | sed 's#/$##;s#^.*/##')
    notmuch tag +$tag -inbox tag:inbox and not tag:$tag and folder:${PREFIX}$tag

This is part of my notmuch-tag script that includes a lot more fine-tuned filtering, detailed below.

Automated reports filtering

Another thing I get a lot of is machine-generated "spam". Well, it's not commercial spam, but it's a bunch of Nagios, cron jobs, and god knows what software thinks it's important to send me emails every day. I get a lot less of those these days since I'm off work at Koumbit, but still, those can be useful for others as well:

if anyof (exists "X-Cron-Env",
          header :contains ["subject"] ["security run output",
                                        "monthly run output",
                                        "daily run output",
                                        "weekly run output",
                                        "Debian Package Updates",
                                        "Debian package update",
                                        "daily mail stats",
                                        "Anacron job",
                                        "changes report",
                                        "run output",
                                        "Undelivered mail",
                                        "Postfix SMTP server: errors from",
                                        "DenyHosts report",
                                        "Debian security status",
           header :contains "Auto-Submitted" "auto-generated",
           envelope :contains "from" ["nagios@",
    fileinto "rapports";
# imported from procmail
elsif header :comparator "i;octet" :contains "Subject" "Cron" {
  if header :regex :comparator "i;octet"  "From" ".*root@" {
        fileinto "rapports";
elsif header :comparator "i;octet" :contains "To" "root@" {
  if header :regex :comparator "i;octet"  "Subject" "\\*\\*\\* SECURITY" {
        fileinto "rapports";
elsif header :contains "Precedence" "bulk" {
    fileinto "bulk";

Refiltering emails

Of course, after all this I still had thousands of emails in my inbox, because the sieve filters apply only on new emails. The beauty of Sieve support in Dovecot is that there is a neat sieve-filter command that can reprocess an existing mailbox. That was a lifesaver. To run a specific sieve filter on a mailbox, I simply run:

sieve-filter .dovecot.sieve INBOX 2>&1 | less

Well, this doesn't do anything. To really execute the filters, you need the -e flags, and to write to the INBOX for real, you need the -w flag as well, so the real run looks something more like this:

sieve-filter -e -W -v .dovecot.sieve INBOX > refilter.log 2>&1

The funky output redirects are necessary because this outputs a lot of crap. Also note that, unfortunately, the fake run output differs from the real run and is actually more verbose, which makes it really less useful than it could be.


I also usually archive my mails every year, rotating my mailbox into an Archive.YYYY directory. For example, now all mails from 2015 are archived in a Archive.2015 directory. I used to do this with Mutt tagging and it was a little slow and error-prone. Now, i simply have this Sieve script:

require ["variables","date","fileinto","mailbox", "relational"];

# Extract date info
if currentdate :matches "year" "*" { set "year" "${1}"; }

if date :value "lt" :originalzone "date" "year" "${year}" {
  if date :matches "received" "year" "*" {
    # Archive Dovecot mailing list items by year and month.
    # Create folder when it does not exist.
    fileinto :create "Archive.${1}";

I went from 15613 to 1040 emails in my real inbox with this process (including refiltering with the default filters as well).

Notmuch configuration

My Notmuch configuration is a in three parts: I have small settings in ~/.notmuch-config. The gist of it is:


# synchronize_flags=true
# tentative patch that was refused upstream
# http://mid.gmane.org/1310874973-28437-1-git-send-email-anarcat@koumbit.org


I omitted the fairly trivial [user] section for privacy reasons and [database] for declutter.

Then I have a notmuch-tag script symlinked into ~/Maildir/.notmuch/hooks/post-new. It does way too much stuff to describe in details here, but here are a few snippets:

if hostname | grep angela > /dev/null; then

This sets a variable that makes the script work on my laptop (angela), where mailboxes are in Maildir/Anarcat/foo or the server, where mailboxes are in Maildir/.foo.

I also have special rules to tag my RSS feeds, which are generated by feed2imap, which is documented shortly below:

echo tagging feeds
( cd $HOME/Maildir/ && for feed in ${PREFIX}feeds.*; do
    name=$(echo $feed | sed "s#${PREFIX}feeds\\.##")
    notmuch tag +feeds +$name -inbox folder:$feed and not tag:feeds
done )

Another example that would be useful is how to tag mailing lists, for example, this removes the inbox tag and adds the notmuch tags to emails from the notmuch mailing list.

notmuch tag +lists +notmuch      -inbox tag:inbox and "to:notmuch@notmuchmail.org"

Finally, I have a bunch of special keybindings in ~/.emacs.d/notmuch-config.el:

;; autocompletion
(eval-after-load "notmuch-address"

; use fortune for signature, config is in custom
(add-hook 'message-setup-hook 'fortune-to-signature)
; don't remember what that is
(add-hook 'notmuch-show-hook 'visual-line-mode)

;;; keymappings
(define-key notmuch-show-mode-map "S"
  (lambda ()
    "mark message as spam and advance"
    (notmuch-show-tag '("+spam" "-unread"))

(define-key notmuch-search-mode-map "S"
  (lambda (&optional beg end)
    "mark message as spam and advance"
    (interactive (notmuch-search-interactive-region))
    (notmuch-search-tag (list "+spam" "-unread") beg end)

(define-key notmuch-show-mode-map "H"
  (lambda ()
    "mark message as spam and advance"
    (notmuch-show-tag '("-spam"))

(define-key notmuch-search-mode-map "H"
  (lambda (&optional beg end)
    "mark message as spam and advance"
    (interactive (notmuch-search-interactive-region))
    (notmuch-search-tag (list "-spam") beg end)

(define-key notmuch-search-mode-map "l" 
  (lambda (&optional beg end)
    "undelete and advance"
    (interactive (notmuch-search-interactive-region))
    (notmuch-search-tag (list "-unread") beg end)

(define-key notmuch-search-mode-map "u"
  (lambda (&optional beg end)
    "undelete and advance"
    (interactive (notmuch-search-interactive-region))
    (notmuch-search-tag (list "-deleted") beg end)

(define-key notmuch-search-mode-map "d"
  (lambda (&optional beg end)
    "delete and advance"
    (interactive (notmuch-search-interactive-region))
    (notmuch-search-tag (list "+deleted" "-unread") beg end)

(define-key notmuch-show-mode-map "d"
  (lambda ()
    "delete current message and advance"
    (notmuch-show-tag '("+deleted" "-unread"))

;; https://notmuchmail.org/emacstips/#index17h2
(define-key notmuch-show-mode-map "b"
  (lambda (&optional address)
    "Bounce the current message."
    (interactive "sBounce To: ")
    (message-resend address)

;;; my custom notmuch functions
(defun anarcat/notmuch-search-next-thread ()
  "Skip to next message from region or point

This is necessary because notmuch-search-next-thread just starts
from point, whereas it seems to me more logical to start from the
end of the region."
  ;; move line before the end of region if there is one
  (unless (= beg end)
    (goto-char (- end 1)))

;; Linking to notmuch messages from org-mode
;; https://notmuchmail.org/emacstips/#index23h2
(require 'org-notmuch nil t)

(message "anarcat's custom notmuch config loaded")

This is way too long: in my opinion, a bunch of that stuff should be factored in upstream, but some features have been hard to get in. For example, Notmuch is really hesitant in marking emails as deleted. The community is also very strict about having unit tests for everything, which makes writing new patches a significant challenge for a newcomer, which will often need to be familiar with both Elisp and C. So for now I just have those configs that I carry around.

Emails marked as deleted or spam are processed with the following script named notmuch-purge which I symlink to ~/Maildir/.notmuch/hooks/pre-new:


if hostname | grep angela > /dev/null; then

echo moving tagged spam to the junk folder
notmuch search --output=files tag:spam \
        and not folder:${PREFIX}junk \
        and not folder:${PREFIX}greyspam \
        and not folder:Koumbit/INBOX \
        and not path:Koumbit/** \
    | while read file; do
          mv "$file" "$HOME/Maildir/${PREFIX}junk/cur"

echo unconditionnally deleting deleted mails
notmuch search --output=files tag:deleted | xargs -r rm

Oh, and there's also customization for Notmuch:

;; -*- mode: emacs-lisp; auto-recompile: t; -*-
 ;; from https://anarc.at/fortunes.txt
 '(fortune-file "/home/anarcat/.mutt/fortunes.txt")
 '(message-send-hook (quote (notmuch-message-mark-replied)))
 '(notmuch-address-command "notmuch-address")
 '(notmuch-always-prompt-for-sender t)
 '(notmuch-crypto-process-mime t)
    ((".*@koumbit.org" . "Koumbit/INBOX.Sent")
     (".*" . "Anarcat/Sent"))))
 '(notmuch-hello-tag-list-make-query "tag:unread")
 '(notmuch-message-headers (quote ("Subject" "To" "Cc" "Bcc" "Date" "Reply-To")))
    ((:name "inbox" :query "tag:inbox and not tag:koumbit and not tag:rt")
     (:name "unread inbox" :query "tag:inbox and tag:unread")
     (:name "unread" :query "tag:unred")
     (:name "freshports" :query "tag:freshports and tag:unread")
     (:name "rapports" :query "tag:rapports and tag:unread")
     (:name "sent" :query "tag:sent")
     (:name "drafts" :query "tag:draft"))))
    (("deleted" :foreground "red")
     ("unread" :weight bold)
     ("flagged" :foreground "blue"))))/
 '(notmuch-search-oldest-first nil)
 '(notmuch-show-all-multipart/alternative-parts nil)
 '(notmuch-show-all-tags-list t)
    (notmuch-wash-convert-inline-patch-to-part notmuch-wash-tidy-citations notmuch-wash-elide-blank-lines notmuch-wash-excerpt-citations)))

I think that covers it.


So of course the above works well on the server directly, but how do run Notmuch on a remote machine that doesn't have access to the mail spool directly? This is where OfflineIMAP comes in. It allows me to incrementally synchronize a local Maildir folder hierarchy with a a remote IMAP server. I am assuming you already have an IMAP server configured, since you already configured Sieve above.

Note that other synchronization tools exist. The other popular one is isync but I had trouble migrating to it (see courriels for details) so for now I am sticking with OfflineIMAP.

The configuration is fairly simple:

accounts = Anarcat
ui = Blinkenlights
maxsyncaccounts = 3

[Account Anarcat]
localrepository = LocalAnarcat
remoterepository = RemoteAnarcat
# refresh all mailboxes every 10 minutes
autorefresh = 10
# run notmuch after refresh
postsynchook = notmuch new
# sync only mailboxes that changed
quick = -1
## possible optimisation: ignore mails older than a year
#maxage = 365

# local mailbox location
[Repository LocalAnarcat]
type = Maildir
localfolders = ~/Maildir/Anarcat/

# remote IMAP server
[Repository RemoteAnarcat]
type = IMAP
remoteuser = anarcat
remotehost = anarc.at
ssl = yes
# without this, the cert is not verified (!)
sslcacertfile = /etc/ssl/certs/DST_Root_CA_X3.pem
# do not sync archives
folderfilter = lambda foldername: not re.search('(Sent\.20[01][0-9]\..*)', foldername) and not re.search('(Archive.*)', foldername)
# and only subscribed folders
subscribedonly = yes
# don't reconnect all the time
holdconnectionopen = yes
# get mails from INBOX immediately, doesn't trigger postsynchook
idlefolders = ['INBOX']

Critical parts are:

The other settings should be self-explanatory.

RSS feeds

I gave up on RSS readers, or more precisely, I merged RSS feeds and email. The first time I heard of this, it sounded like a horrible idea, because it means yet more emails! But with proper filtering, it's actually a really nice way to process emails, since it leverages the distributed nature of email.

For this I use a fairly standard feed2imap, although I do not deliver to an IMAP server, but straight to a local Maildir. The configuration looks like this:

include-images: true
target-refix: &target "maildir:///home/anarcat/Maildir/.feeds."
- name: Planet Debian
  url: http://planet.debian.org/rss20.xml
  target: [ *target, 'debian-planet' ]

I have obviously more feeds, the above is just and example. This will deliver the feeds as emails in one mailbox per feed, in ~/Maildir/.feeds.debian-planet, in the above example.


You will fail at writing the sieve filters correctly, and mail will (hopefully?) fall through to your regular mailbox. Syslog will tell you things fail, as expected, and details are in your .dovecot.sieve.log file in your home directory.

I also enabled debugging on the Sieve module

--- a/dovecot/conf.d/90-sieve.conf
+++ b/dovecot/conf.d/90-sieve.conf
@@ -51,6 +51,7 @@ plugin {
        # deprecated imapflags extension in addition to all extensions were already
   # enabled by default.
   #sieve_extensions = +notify +imapflags
+  sieve_extensions = +vnd.dovecot.debug

   # Which Sieve language extensions are ONLY available in global scripts. This
   # can be used to restrict the use of certain Sieve extensions to administrator

This allowed me to use debug_log function in the rulesets to output stuff directly to the logfile.

Further improvements

Of course, this is all done on the commandline, but that is somewhat expected if you are already running Notmuch. Of course, it would be much easier to edit those filters through a GUI. Roundcube has a nice Sieve plugin, and Thunderbird also has such a plugin as well. Since Sieve is a standard, there's a bunch of clients available. All those need you to setup some sort of thing on the server, which I didn't bother doing yet.

And of course, a key improvement would be to have Notmuch synchronize its state better with the mailboxes directly, instead of having the notmuch-purge hack above. Dovecot and Maildir formats support up to 26 flags, and there were discussions about using those flags to synchronize with notmuch tags so that multiple notmuch clients can see the same tags on different machines transparently.

This, however, won't make Notmuch work on my phone or webmail or any other more generic client: for that, Sieve rules are still very useful.

I still don't have webmail setup at all: so to read email, I need an actual client, which is currently my phone, which means I need to have Wifi access to read email. "Internet Cafés" or "this guy's computer" won't work as well, although I can always use ssh to login straight to the server and read mails with Mutt.

I am also considering using X509 client certificates to authenticate to the mail server without a passphrase. This involves configuring Postfix, which seems simple enough. Dovecot's configuration seems a little more involved and less well documented. It seems that both [OfflimeIMAP][] and K-9 mail support client-side certs. OfflineIMAP prompts me for the password so it doesn't get leaked anywhere. I am a little concerned about building yet another CA, but I guess it would not be so hard...

The server side of things needs more documenting, particularly the spam filters. This is currently spread around this wiki, mostly in mail.

Security considerations

The whole purpose of this was to make it easier to read my mail on other devices. This introduces a new vulnerability: someone may steal that device or compromise it to read my mail, impersonate me on different services and even get a shell on the remote server.

Thanks to the two-factor authentication I setup on the server, I feel a little more confident that just getting the passphrase to the mail account isn't sufficient anymore in leveraging shell access. It also allows me to login with ssh on the server without trusting the machine too much, although that only goes so far... Of course, sudo is then out of the question and I must assume that everything I see is also seen by the attacker, which can also inject keystrokes and do all sorts of nasty things.

Since I also connected my email account on my phone, someone could steal the phone and start impersonating me. The mitigation here is that there is a PIN for the screen lock, and the phone is encrypted. Encryption isn't so great when the passphrase is a PIN, but I'm working on having a better key that is required on reboot, and the phone shuts down after 5 failed attempts. This is documented in my phone setup.

Client-side X509 certificates further mitigates those kind of compromises, as the X509 certificate won't give shell access.

Basically, if the phone is lost, all hell breaks loose: I need to change the email password (or revoke the certificate), as I assume the account is about to be compromised. I do not trust Android security to give me protection indefinitely. In fact, one could argue that the phone is already compromised and putting the password there already enabled a possible state-sponsored attacker to hijack my email address. This is why I have an OpenPGP key on my laptop to authenticate myself for critical operations like code signatures.

The risk of identity theft from the state is, after all, a tautology: the state is the primary owner of identities, some could say by definition. So if a state-sponsored attacker would like to masquerade as me, they could simply issue a passport under my name and join a OpenPGP key signing party, and we'd have other problems to deal with, namely, proper infiltration counter-measures and counter-snitching.


Thanks for awesome article. I found another way to filter via dovecot https://wiki2.dovecot.org/HowTo/RefilterMail

Comment by wigust
Comments on this page are closed.
Created . Edited .