Building a self-hosted email assistant

A little over a year ago I began my journey into self-hosting, and today I am off Google services. Initially, I had no issues giving up a few of the more minor features I couldn't replace. Things like automatically identifying events, flight information, and package tracking in emails are each small on their own, but together they make it much easier to parse the flood of emails. Soon I realized I needed to find a way to get these features back.

Google splendidly does all of this in "The Cloud", if you are cool with them sifting through your data. Later I found that Apple devices have some of these features implemented in a way that doesn't require you to send them your data, only that you use an Apple device. This was a big part of my motivation to switch from Android to iOS. Now my phone will identify contacts and calendar events in emails from my personal server, and suggest that I add them to my personal, self-hosted calendar. However, in the Google world, I could also search for "my packages" or "my flights" and see an overview of everything with the latest statuses. As someone who flies a lot and doesn't own a car, these account for a large percentage of my emails.

I did some searching, didn't quite find any software that was doing this, and decided to give it a go. Since Amazon nearly replaces my car, the most frequent category of email for me was shipping emails, so I began by focusing on shipping numbers. After a little initial research, I found a few Ruby packages that parse tracking numbers and make requests to various carrier APIs to get the status. I'm not super comfortable in Ruby and don't really care to reinvent the wheel and maintain my own parsers... so I decided to split my assistant into a collection of micro-services.
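To give a feel for what "parsing shipping numbers" means, here is a minimal Python sketch. It is not the code from the Ruby packages mentioned above. It assumes the commonly seen UPS format of "1Z" followed by 16 letters or digits, and the function name find_ups_tracking_numbers is made up for illustration. A real parser would also cover other carriers and validate check digits.

```python
import re
from typing import List

# Assumption: UPS tracking numbers look like "1Z" followed by 16 letters or digits.
UPS_PATTERN = re.compile(r"\b1Z[0-9A-Z]{16}\b")


def find_ups_tracking_numbers(text: str) -> List[str]:
    """Return unique UPS-style tracking numbers in order of first appearance."""
    found: List[str] = []
    # Upper-case the text first so lower-case copies of a number also match.
    for candidate in UPS_PATTERN.findall(text.upper()):
        if candidate not in found:
            found.append(candidate)
    return found
```

Calling this on the body of a shipping email would return a list of candidate numbers, which a separate service could then look up with the carrier.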

I know, I know! That's crazy! Well, it lets me use whatever language is best suited for whatever feature I'm working on. If I find that there are great package tracking Ruby gems and great flight tracking Python modules, I should be able to use them both. It also has the added advantage of making it easier to add new parsers and viewers in the future.

A week later I had a set of services, run with Docker Compose, that crawl my inbox, extract shipping numbers, and display them with their status on a web page. It’s also easily configured or extended by using environment variables. Adding new parsers doesn’t even require any changes to the crawler or indexer.
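One way a crawler can pick up new parsers without code changes is to read numbered environment variables such as PARSER_1, PARSER_2, and so on. The Python sketch below illustrates that idea only. It is not taken from the project, and the function name parser_urls and the PARSER_<n> discovery rule are assumptions.

```python
import os
import re


def parser_urls(environ=None):
    """Collect parser service URLs from PARSER_<n> variables, ordered by n."""
    environ = os.environ if environ is None else environ
    numbered = []
    for name, value in environ.items():
        match = re.fullmatch(r"PARSER_(\d+)", name)
        # Skip unrelated variables and parsers configured with an empty URL.
        if match and value:
            numbered.append((int(match.group(1)), value))
    # Sort by the numeric suffix so PARSER_2 comes before PARSER_10.
    return [url for _, url in sorted(numbered)]
```

With this approach, registering another parser is just a matter of starting its container and setting one more PARSER_<n> variable on the crawler.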

I also found that, despite being five distinct services in two different languages, it was actually really straightforward using Docker and a compose file. The project can be found at git.iamthefij.com/iamthefij/email-assistant (mirrored to GitHub), and the compose file I use is below:

version: '2'
services:
  crawler:
    build: ./crawler
    links:
      - parser_package_tracking
      - indexer
    environment:
      IMAP_URL: ${IMAP_URL}
      IMAP_USER: ${IMAP_USER}
      IMAP_PASS: ${IMAP_PASS}
      INDEXER: http://indexer:5000
      PARSER_1: http://parser_package_tracking:3000
  indexer:
    build: ./indexer
  parser_package_tracking:
    build: ./parsers/package-tracking
  viewer_main:
    build: ./viewers/main
    links:
      - indexer
      - viewer_package_tracking
    environment:
      INDEXER_URL: http://indexer:5000
    ports:
      - "8000:5000"
  viewer_package_tracking:
    build: ./viewers/package-tracking
    environment:
      UPS_KEY: ${UPS_KEY}
      UPS_USER_ID: ${UPS_USER_ID}
      UPS_PASSWORD: ${UPS_PASSWORD}

I include the user and password information via environment variables and then expose the main viewer behind a proxy with basic auth using Traefik.

Update (Mar 12, 2018)

The final interface is very simple for now. It doesn't have a way to sort yet, or to hide already delivered trackers. I also haven't validated with any providers other than FedEx, so the others are bare bones. At the moment it looks something like this:
screenshot