Simplicity, Sanity and Communication

When did communication get so complicated?

Messaging apps are everywhere. Half of my friends have FB Messenger, the others use WhatsApp, iOS users have iMessage, and Android users have Hangouts. One way or another everyone has SMS, but who wants to be charged to send a picture? Nobody.

Then there are hundreds of other messenger apps like Snapchat, Viber, Line, Kik and Skype, work-related ones like Slack, Rocket.chat and Gitter, and even messengers shoehorned into other apps like Twitter and Instagram.

This wouldn’t be a problem if these apps could talk to each other, but they can’t. I don’t know about you, but I have conversation threads spanning many apps, some even with the same person! 🤯

This got me thinking: what should a messaging app or service do?

  1. It should be able to message people or groups of people back and forth in a conversational style.

  2. It should be able to send links, pictures and videos, with no additional charges (thanks SMS 😒).

  3. It should be able to message anyone, regardless of what app they use to message back, so long as I know their address.

  4. It should be able to do all this asynchronously so both people don’t have to be online at the same time.

Whilst writing these requirements my thoughts instantly went to a 45-year-old technology that meets them all: email! :D

Why don’t we just use email? Surely this is what email was supposed to be? When did email become uncool?

I feel there are a few reasons why this is the case. It would be much easier to use just one app to communicate with everyone, but a few things need fixing first:

Too much noise

The problem with noise is that it’s hard to hear the conversation. This is what’s happened with email. Marketing, notifications and spam are taking over the inbox. Some effort has been made to fix this, and Outlook and Gmail have made a good start, but more can be done.

A nice approach to filtering out potential noise is the one used by Facebook Messenger, which separates messages from known senders (people you’re friends with) from unknown ones. When you receive a message from a company or person not already on your approved list, you’re asked whether you wish to accept messages from them. This approach, along with the sender verification techniques email already has (such as SPF, DKIM and DMARC), would remove almost all spam and annoying marketing emails from your inbox. Much better.
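Here’s a rough sketch of that known/unknown split, just to show how little logic it needs. The contact list and the folder names ("inbox" and "requests") are made up for illustration, and a real client would also check the verification results before trusting the From address.

```python
from email.utils import parseaddr

# Hypothetical approved list; a real client would load this from the address book.
KNOWN_CONTACTS = {"alice@example.com", "bob@example.com"}


def triage(from_header: str, known: set[str] = KNOWN_CONTACTS) -> str:
    """Return 'inbox' for known senders and 'requests' for everyone else."""
    # parseaddr splits '"Alice Example" <alice@example.com>' into name and address.
    _, address = parseaddr(from_header)
    return "inbox" if address.lower() in known else "requests"


print(triage('"Alice Example" <alice@example.com>'))  # inbox
print(triage("Deals <promo@shop.example>"))           # requests
```

Anything that lands in "requests" would sit there until you accept or block the sender.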

User Interface

Email has a distinct feel to it. To, subject, body, signatures: it’s all a bit formal. I understand it’s inspired by traditional letter writing, but it’s 2018 and communication has changed! When I use a messenger app, I select a person and start typing what I want to say. I’m not asked for the subject of the conversation, and I don’t feel as though I should be writing them a formal letter. I just type what I want to say and send.

This feeling could be avoided by changing the way email looks. Hiding the subject field would be a good start. Unless you wanted to create a chat thread for a particular subject, the subject could be auto-generated in the background and hidden out of sight. This would keep compatibility with existing email systems yet keep things clean for newer clients.
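As a minimal sketch of the hidden subject idea, a client could derive the Subject header from the text itself. The rule used here (first line, cut to 40 characters) and the function name are my own assumptions, not anything an existing client does.

```python
from email.message import EmailMessage


def build_chat_message(sender: str, recipient: str, text: str) -> EmailMessage:
    """Build a plain-text email whose Subject is derived from the message text."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    # Hypothetical rule: use the first line of the text, truncated to 40 characters.
    stripped = text.strip()
    first_line = stripped.splitlines()[0] if stripped else "(no subject)"
    msg["Subject"] = first_line[:40]
    msg.set_content(text)
    return msg


msg = build_chat_message(
    "bob@example.com", "alice@example.com", "Fancy lunch on Friday?\nMy treat."
)
print(msg["Subject"])  # Fancy lunch on Friday?
```

Older clients would still see a normal subject line, while a chat-style client could simply never show it.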

Reducing the visual size of the body field would also help, implying that messages should be shorter and faster, like real conversations.

The inbox could also be refocused around conversations. Currently I feel as though I need to clear everything out of my inbox. But if you look at messaging apps, seeing and going back to previous conversations is the main focus. It lowers the barrier to keeping the conversation going. Grouping messages not only by subject but also by people should help achieve this.

One of the worst features of current email apps, I feel, is automatically including the previous thread in the message. If I’m chatting to someone then they will have the previous messages I’ve sent to them, and if they’ve chosen to delete them, then so be it. Not only does it make the message hard to read or compose, it also adds to the overall size of the message, which hurts performance as mentioned below.

Performance

Email is slow. Well, not that slow: I could send a message to someone on the other side of the world in about a second, but we want things in milliseconds!

There are a few factors making it slow, such as:

  1. Lack of push support.

  2. Email size and HTML email.

  3. Multiple providers.

Push support lets the server deliver email to your device as soon as it arrives. Unfortunately it’s not built in and requires support from email providers. Luckily Microsoft Exchange, Gmail and iCloud all have push support, and hopefully more providers will adopt it.

Email is a text-based format, which compared to a binary format is pretty bloated. At 1-4 bytes per character (as in UTF-8), it adds up when you’re sending messages on a high-latency mobile network. So the fewer characters you need to send, the faster the message.
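To make the 1-4 bytes figure concrete, this small check encodes a few characters as UTF-8 and counts the bytes. It assumes the message is sent as UTF-8, which is common but not the only encoding email can use.

```python
def utf8_size(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


# One sample from each UTF-8 width: ASCII letter, accented letter, euro sign, emoji.
for sample in ["a", "\u00e9", "\u20ac", "\U0001F92F"]:
    print(repr(sample), utf8_size(sample))  # sizes are 1, 2, 3 and 4 bytes

print(utf8_size("Hello Alice."))  # 12
```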

Email requires headers, which give the providers metadata about the message, i.e. who it’s going to, who sent it, the time and date it was sent, etc., and a body, which contains the message.

Annoyingly, as mentioned above, when you reply to a message the previous message and all earlier messages in that thread are included underneath your response. This seems to be the default for most email clients, though I’m sure there must be a way to turn it off. It doesn’t only happen in the body, but also in the References header, which contains a list of all the previous message IDs in a thread, and there’s no way to turn that off. As you can imagine, as the conversation goes on this list can grow quite large, making each message several times larger than it should be. Very inefficient. My suggestion is that, based on the Date the message was sent and the In-Reply-To header, which references only the previous message, the client should be able to figure out where the message belongs in the thread.
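The suggestion above can be sketched in a few lines: follow each message’s In-Reply-To back to the start of its thread, then sort each thread by date. The `Msg` class and the example message IDs are invented for illustration; a real client would read these values from the message headers.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Msg:
    message_id: str
    in_reply_to: Optional[str]  # Message-ID of the message this one answers
    date: datetime
    text: str


def thread_root(msg: Msg, by_id: dict[str, Msg]) -> str:
    """Follow In-Reply-To links back to the earliest message we hold."""
    seen = set()
    current = msg
    # The 'seen' set guards against malformed messages that reference each other.
    while current.in_reply_to in by_id and current.message_id not in seen:
        seen.add(current.message_id)
        current = by_id[current.in_reply_to]
    return current.message_id


def build_threads(messages: list[Msg]) -> dict[str, list[Msg]]:
    """Group messages by thread root, each thread ordered by Date."""
    by_id = {m.message_id: m for m in messages}
    threads: dict[str, list[Msg]] = {}
    for m in messages:
        threads.setdefault(thread_root(m, by_id), []).append(m)
    for thread in threads.values():
        thread.sort(key=lambda m: m.date)
    return threads


a = Msg("<a@example.com>", None, datetime(2018, 1, 1, 9, 0), "Lunch?")
b = Msg("<b@example.com>", "<a@example.com>", datetime(2018, 1, 1, 9, 5), "Sure, where?")
c = Msg("<c@example.com>", "<b@example.com>", datetime(2018, 1, 1, 9, 7), "The usual.")
d = Msg("<d@example.com>", None, datetime(2018, 1, 2, 8, 0), "New topic")

threads = build_threads([c, a, d, b])  # arrival order does not matter
for root, thread in threads.items():
    print(root, [m.text for m in thread])
```

One trade-off: if a message in the middle of the chain is missing, the replies after it start a new thread, which is the gap the References header normally covers.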

So now let’s make things even more inefficient by sending HTML email.

HTML email has been around since the late 90s, but only received wide adoption in 2006/2007. HTML email requires extra characters (called markup) to wrap around the message you’re sending, making message sizes balloon from bytes to sometimes hundreds of kilobytes. This must be parsed by the client, and then displayed on screen according to the HTML and CSS style code that explains how things should look. HTML also allows the real target of a link to be hidden from the recipient. For example, a link could say one thing but actually point to a completely different thing (like malware). All the issues just mentioned could be avoided by forcing plain text email, and only allowing media as attachments. That way the client app could display the text and media how it wanted, rather than how the sender wants, similar to other messaging apps today.
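To illustrate the hidden link problem, here’s a small sketch that scans an HTML body and flags links whose visible text names one host while the href points at another. The `LinkChecker` class is my own illustration using Python’s standard HTML parser; it’s nowhere near a complete phishing filter.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkChecker(HTMLParser):
    """Collect links whose visible text names a different host than the href."""

    def __init__(self) -> None:
        super().__init__()
        self.href = None      # href of the <a> tag we are currently inside
        self.text = []        # visible text collected inside that tag
        self.suspicious = []  # (visible text, real target) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.href = dict(attrs).get("href")
            self.text = []

    def handle_data(self, data):
        if self.href is not None:
            self.text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.href is not None:
            shown = "".join(self.text).strip()
            shown_host = urlparse(shown).hostname  # None if the text is not a URL
            real_host = urlparse(self.href).hostname
            if shown_host and real_host and shown_host != real_host:
                self.suspicious.append((shown, self.href))
            self.href = None


checker = LinkChecker()
checker.feed(
    '<p><a href="https://evil.example/x">https://bank.example/login</a> '
    '<a href="https://bank.example/">Home</a></p>'
)
print(checker.suspicious)
```

With plain text email none of this is needed, because the address you see is the address you get.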

Multiple providers is a tough one to solve, and in my opinion it’s a benefit as much as it is a performance problem. Having multiple providers is what makes email truly universal. Let’s say I’m a Gmail user and I want to send a message to someone with a Yahoo address. The message has to make an extra hop across from Gmail to Yahoo before it ends up with the recipient. Most of these providers do a good job of making this transfer speedy, but it’s still an extra hop, and the message goes through extra security checks before it gets to its destination, all adding up to more time taken to send the message. That said, the smaller the message, the faster it will travel, even across extra hops.

Security

When it comes to messaging, security usually covers two things: authenticity and privacy. Authenticity lets me know that the person named as the sender really did send the message, and privacy ensures that the message is only seen by the person it was sent to.

As email is so old, it wasn’t originally designed with security in mind, but over the years techniques have been developed to fix these problems.

If we look at the privacy aspect, there are normally two levels of encryption: transport encryption, which ensures no one between you and the provider, or between the provider and the recipient, can see the contents of the message, and end to end encryption, which ensures only you and the recipient can see the contents of the message, not even the provider.

End to end encryption is by far the best for privacy, however most messaging services and email only use transport encryption.

But how does email stack up? The large public email providers (Gmail, Yahoo, Outlook, iCloud) all support transport encryption when they can, even when passing messages across providers. However, it’s never guaranteed that after you’ve passed a message to your provider, it will continue to be sent with transport encryption. End to end encryption via email is usually provided by S/MIME or PGP, but neither is set up automatically, and with that additional hurdle they don’t tend to be used very often.

A solution to this is to bake in client support for end to end encryption and for providers to integrate something like the Signal Protocol by Open Whisper Systems, which is the default, already-activated encryption built into WhatsApp. There’s a great article here explaining how the Signal Protocol works.

Protocols

Email uses three main protocols: POP and IMAP for downloading messages to the client, and SMTP for sending messages. As the crux of email is sending messages, I will focus on SMTP. Unfortunately SMTP is a chatty protocol, by which I mean there’s a lot of back and forth going on just to send a single message. For example:


Server: 220 smtp.example.com ESMTP Postfix

Client: EHLO client.example.com

Server: 250-smtp.example.com Hello client.example.com
Server: 250-AUTH LOGIN PLAIN CRAM-MD5
Server: 250-STARTTLS
Server: 250 HELP

Client: STARTTLS

Server: 220 TLS go ahead

Client: EHLO client.example.com

Server: 250-smtp.example.com Hello client.example.com
Server: 250-SIZE 1000000
Server: 250-AUTH LOGIN PLAIN CRAM-MD5
Server: 250 HELP

Client: AUTH PLAIN dGVzdAB0ZXN0ADEyMzQ=

Server: 235 2.7.0 Authentication successful

Client: MAIL FROM:<bob@example.com>

Server: 250 Ok

Client: RCPT TO:<alice@example.com>

Server: 250 Ok

Client: DATA

Server: 354 End data with <CR><LF>.<CR><LF>

Client: From: "Bob Example" <bob@example.com>
Client: To: Alice Example <alice@example.com>
Client: Date: Tue, 15 Jan 2008 16:02:43 -0500
Client: Subject: Test message
Client: 
Client: Hello Alice.
Client: This is a test message.
Client: Your friend,
Client: Bob
Client: .

Server: 250 Ok: queued as 12345

Client: QUIT

Server: 221 Bye

You can see there are many steps: logging in, specifying who the message should go to, setting up the other headers, and then sending the body (message + attachments).

This is far more chatty than the protocols used by Facebook Messenger (MQTT) and WhatsApp (XMPP). Even the HTTP REST-based protocol used by Slack is more concise.
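For reference, this is roughly how the same exchange looks from code, using Python’s standard smtplib. The comments map each call to the steps in the transcript; the host, port and credentials are placeholders, and the send function is only defined here, not called, since it needs a real server.

```python
import smtplib
from email.message import EmailMessage


def build_message() -> EmailMessage:
    """Build the test message from the transcript above."""
    msg = EmailMessage()
    msg["From"] = "Bob Example <bob@example.com>"
    msg["To"] = "Alice Example <alice@example.com>"
    msg["Subject"] = "Test message"
    msg.set_content("Hello Alice.\nThis is a test message.\nYour friend,\nBob\n")
    return msg


def send_via_smtp(msg: EmailMessage, host: str, user: str, password: str,
                  port: int = 587) -> None:
    """Send msg through an SMTP server (host and credentials are placeholders)."""
    with smtplib.SMTP(host, port) as smtp:  # connect and read the 220 greeting
        smtp.starttls()                     # EHLO, then STARTTLS and the TLS handshake
        smtp.login(user, password)          # EHLO again over TLS, then AUTH
        smtp.send_message(msg)              # MAIL FROM, RCPT TO, DATA
    # leaving the with-block sends QUIT


message = build_message()
print(message["To"])
# send_via_smtp(message, "smtp.example.com", "test", "1234")  # needs a real server
```

The library hides the chatter, but every one of those round trips still happens on the wire.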

An example of the same message over HTTP would be (the Content-Length of 257 assumes the JSON is indented by two spaces per level, with LF line endings and no trailing newline):


Client:
  POST /send HTTP/1.1
  Host: mail.example.com
  Authorization: Basic dGVzdDoxMjM0
  Date: Tue, 15 Jan 2008 21:02:43 GMT
  Content-Type: application/json
  Content-Length: 257

  {
    "from": "bob@example.com",
    "to": "alice@example.com",
    "date": "Tue, 15 Jan 2008 16:02:43 -0500",
    "subject": "Test message",
    "body": "Hello Alice.\nThis is a test message with 5 header fields and 4 lines in the message body.\nYour friend,\nBob"
  }

Server: HTTP/1.1 200 OK

There’s no reason the method above couldn’t be used to send email instead. Email services like SendGrid and Mailgun already provide APIs using HTTP to instruct them to send email.
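As a sketch of what a client might do, the snippet below builds that kind of request with Python’s standard library. The `/send` endpoint, the bearer token and the JSON field names are all assumptions carried over from the example above, not a real provider’s API, so check your provider’s documentation for the actual format.

```python
import json
import urllib.request


def build_send_request(api_url: str, token: str, message: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a hypothetical send endpoint."""
    body = json.dumps(message).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",  # hypothetical auth scheme
            "Content-Type": "application/json",
            "Content-Length": str(len(body)),    # computed, never typed by hand
        },
    )


request = build_send_request(
    "https://mail.example.com/send",  # placeholder URL
    "example-token",                  # placeholder credential
    {
        "from": "bob@example.com",
        "to": "alice@example.com",
        "subject": "Test message",
        "body": "Hello Alice.\nThis is a test message.\nYour friend,\nBob",
    },
)
print(request.get_method(), request.full_url, len(request.data), "bytes")
# urllib.request.urlopen(request) would send it: one request, one response.
```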

The benefit of this approach is that if you’re sending messages to another address at the same provider, it should be just as fast as other messaging systems, with the added ability to send messages to addresses at other providers using the normal SMTP method.
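The routing decision itself could be as simple as comparing domains. This is only a sketch with made-up route names ("direct" and "smtp"); a real provider would also need to handle things like aliases and hosted custom domains.

```python
def delivery_route(sender: str, recipient: str) -> str:
    """Pick 'direct' when both addresses share a domain, otherwise 'smtp'."""
    sender_domain = sender.rsplit("@", 1)[-1].lower()
    recipient_domain = recipient.rsplit("@", 1)[-1].lower()
    return "direct" if sender_domain == recipient_domain else "smtp"


print(delivery_route("bob@example.com", "alice@example.com"))  # direct
print(delivery_route("bob@example.com", "carol@example.org"))  # smtp
```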


Conclusion

I believe with the right client app and filtering, email could (once again) become the de facto standard universal messaging service.

To learn more about email and how it works, check out the Wikipedia page.

I hope this article has been useful and thought provoking.