When it comes to email hosting, I'm quite old school. Instead of using cloud services like Gmail, I like plain IMAP and SMTP. It allows me to take my emails anywhere, without being locked in. This has served me quite well over the last 15 years.
But lately, one of my providers has had frequent network issues, sometimes leaving me without access for hours. As this hoster is a small shop with just a few people running it, I've wondered: What happens if the service goes down and never comes back up? What happens to my archive?
Hosting my own email server is an absolute no-go, as I don't want to constantly fight against waves of spam or other providers discriminating against unknown IPs. Life is too short to fight too many battles at the same time.
I've been researching this topic for quite a while, and in this post I describe the middle ground I settled on: run my own backups, and keep them in a simple, readable format that lets me search and process them using basic tools.
This blog post inspired me to start the project.
You can also find the code for this entire project in this git repository.
First attempt: isync
First I tried isync (its command-line tool is called mbsync), but unfortunately I hit multiple issues:
- While isync is provided by the Debian package repository, the packaged version is really old: at the time of writing, Debian provides v1.3.0 (2017-10-01), while the current version is v1.4.3 (2021-07-29). This required me to compile it myself.
- isync is configured using "channels" for each single IMAP folder. I have tons of folders, and would prefer a solution that doesn't require constant maintenance when adding/removing folders.
After more research, I stumbled upon offlineimap, which fits my requirements: no compilation, an understandable configuration, and a useful output (Maildir format). Unfortunately its documentation is a bit of a mess, but I got it working thanks to the well-commented configuration example.
Setting up offlineimap
As offlineimap is written in Python, I was able to set it up using pip. Because Debian also had quite old versions, I used the latest one from git. Note that this is the version for Python 3, not Python 2.
These instructions also differ quite a bit from the official ones, as I prefer to have packages installed in an isolated virtualenv, not somewhere in my user directory.
After executing the commands, you can run it like:
To update it later, run a git pull and re-execute make and setup.py.
Configuration
It took me quite a bit of time reading the example configuration to find out what I needed. In the end, I settled on the following configuration.
In general, the offlineimap configuration works by defining one or more accounts (more like profiles) that sync between repositories (either IMAP or local Maildirs).
In my example the structure is like:
Account(oldserver): Repository(IMAP) => Repository(LocalBackup)
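To make the account/repository relationship more concrete, here is a minimal sketch that generates such a file with Python's configparser. It is not my actual configuration: the repository name OldServerRemote, the host, the user and the backup path are placeholders, and the option set is just the minimum I would expect a read-only backup to need.

```python
import configparser
import io


def build_offlineimap_config(host, user, backup_dir):
    """Build a minimal one-account offlineimap configuration (illustrative only)."""
    # interpolation=None keeps "%" characters in values from being interpreted.
    config = configparser.ConfigParser(interpolation=None)
    config["general"] = {"accounts": "oldserver"}
    # The account ties one remote repository to one local repository.
    config["Account oldserver"] = {
        "remoterepository": "OldServerRemote",  # placeholder name
        "localrepository": "LocalBackup",
    }
    config["Repository OldServerRemote"] = {
        "type": "IMAP",
        "remotehost": host,
        "remoteuser": user,
        "ssl": "yes",
        "sslcacertfile": "OS-DEFAULT",  # use the system's CA bundle
        "readonly": "True",  # a backup should never modify the server
    }
    config["Repository LocalBackup"] = {
        "type": "Maildir",
        "localfolders": backup_dir,
    }
    # No password is stored here; offlineimap asks for it interactively
    # when none is configured.
    return config


def render(config):
    """Return the configuration as text, ready to be written to a file."""
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()
```

Calling render(build_offlineimap_config("imap.example.org", "me@example.org", "/srv/mail-backup")) returns the file content as a string; all three arguments are made-up values.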
You can either save this file as ~/.offlineimaprc (in your home directory), or at any other place and specify the -c $filename command-line argument. See the docs for a list of all options.
~/offlineimap.conf
Increasing security
Having your account credentials directly inside the file can be insecure, especially when checking it into git (as all of this code is).
To fix this, you can use the configuration options remote{host,port,user,pass}eval to extract the values from a file or a secret manager. These options are Python code that can call your own functions (loaded from the file defined in pythonfile).
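As a rough illustration of what such a helper file could contain, here is a minimal sketch. The function name read_secret and the idea of keeping one secret per file are my assumptions for this example, not something offlineimap prescribes.

```python
import os
import stat


def read_secret(path):
    """Return the first line of a secret file.

    Hypothetical helper: it refuses files that the group or other
    users can access, so a wrong chmod is noticed early.
    """
    path = os.path.expanduser(path)
    mode = os.stat(path).st_mode
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError(f"{path} must only be accessible by its owner")
    with open(path, encoding="utf-8") as handle:
        # Strip the trailing newline that most editors add.
        return handle.readline().rstrip("\n")
```

With pythonfile pointing at this file, the remote repository could then use something like remotepasseval = read_secret("~/.config/mail-backup/password"), where the path is again only a placeholder.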
secrets.py
~/offlineimap.conf
When using this approach, make sure that all of your configuration files are readable only by your own user account:
Execution
To run the backup, execute the offlineimap binary:
Depending on the size of the mailbox, this can take quite a while, as IMAP is a rather slow protocol where each mail is copied one by one. After the initial sync, offlineimap runs a diff, which generally just takes a few seconds.
The result
At the end, the directory defined in localfolders will contain a nested structure of your IMAP folders, with each email stored as a single file containing its raw text content.
Why do I have 3 different Archive root folders? I guess because I gave up syncing folder names between my email clients. Doing things your own way isn't always pain-free...
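Because every message ends up as a plain file in Maildir format, the backup can be processed with very little code. The following is a minimal sketch using Python's standard mailbox module; the function names are made up for this example, and it assumes every backed-up folder is a directory containing the usual cur, new and tmp subdirectories.

```python
import mailbox
import os


def find_maildirs(root):
    """Yield every directory below root that looks like a Maildir."""
    for dirpath, dirnames, _ in os.walk(root):
        if {"cur", "new", "tmp"} <= set(dirnames):
            yield dirpath


def search_subjects(root, needle):
    """Return (folder, subject) pairs whose subject contains needle.

    The comparison is case-insensitive and works on the raw header,
    so MIME-encoded subjects are not decoded here.
    """
    needle = needle.lower()
    matches = []
    for folder in find_maildirs(root):
        box = mailbox.Maildir(folder, factory=None, create=False)
        for message in box:
            subject = str(message.get("Subject", ""))
            if needle in subject.lower():
                matches.append((os.path.relpath(folder, root), subject))
    return matches
```

Calling search_subjects with the localfolders directory and a search term returns the matching folder and subject pairs. For quick lookups, plain grep over the same directory works just as well.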
Automation
Running this by hand is not a viable option for reliable backups, so I:
- Set this up on my home Raspberry Pi 4
- Run a cron job twice a day that syncs the mailboxes to my Synology NAS
- Send a ping to healthchecks.io after it completes, so I get notified if it breaks
Because this approach is quite specific to my workflow and environment, I haven't published the code for it (yet).
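To give an idea of the general shape of such a job, here is a minimal sketch. It is not my actual script: run_backup is a made-up function name, and it assumes the monitoring service signals failure through a /fail suffix on the ping URL, as healthchecks.io does.

```python
import subprocess
import urllib.request


def run_backup(command, ping_url, run=subprocess.run, ping=urllib.request.urlopen):
    """Run the backup command, then report the outcome to the ping URL.

    run and ping are injectable so the function can be tested without
    starting a process or touching the network.
    """
    result = run(command, check=False)
    ok = result.returncode == 0
    # Report a failure explicitly instead of just staying silent.
    target = ping_url if ok else ping_url.rstrip("/") + "/fail"
    ping(target, timeout=10)
    return ok
```

A cron job could call something like run_backup(["offlineimap", "-c", "/home/pi/offlineimap.conf"], "https://hc-ping.com/your-uuid-here"), where both the config path and the ping URL are placeholders.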