Saturday, April 18, 2009

Email Should Die

Here's one example of why email should die, IMHO.

Consider four users A, B, C, and D sending messages in a discussion, maybe to a list they are on.  A sends original message x.  B sends y in reply.  C sends message z and D sends message t.

In a typical scenario, here is what the inboxes and sent folders look like after A sends x.

  • A
    • inbox: 
    • sent: x
  • B
    • inbox: x
    • sent: 
  • C
    • inbox: x
    • sent: 
  • D
    • inbox: x
    • sent: 

Now B sends reply y.  Note that the message will typically have x quoted at the end.

  • A
    • inbox: yx
    • sent: x
  • B
    • inbox: x
    • sent: yx
  • C
    • inbox: x, yx
    • sent: 
  • D
    • inbox: x, yx
    • sent: 

Next, C sends message z, quoting the first two.

  • A
    • inbox: yx, zyx
    • sent: x
  • B
    • inbox: x, zyx
    • sent: yx
  • C
    • inbox: x, yx
    • sent: zyx
  • D
    • inbox: x, yx, zyx
    • sent: 

Finally, D sends message t.

  • A
    • inbox: yx, zyx, tzyx
    • sent: x
  • B
    • inbox: x, zyx, tzyx
    • sent: yx
  • C
    • inbox: x, yx, tzyx
    • sent: zyx
  • D
    • inbox: x, yx, zyx
    • sent: tzyx

Note that message x has been stored on disk 16 times.  This is with just four users, each sending one message.  Imagine a group of 20 to 100 users, and a thread that goes on for a dozen messages or more.
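To check the count, here is a minimal Python sketch of the walkthrough above.  It assumes every message goes to all the other users and that each reply quotes the entire thread so far.  The function names are mine, made up for illustration.

```python
from collections import defaultdict

def simulate_quoted_thread(users, bodies):
    """Each user sends one message, in order, quoting the whole thread so far."""
    inbox = defaultdict(list)
    sent = defaultdict(list)
    thread = ""  # everything written so far, newest text first
    for sender, body in zip(users, bodies):
        message = body + thread  # new text on top, earlier messages quoted below
        sent[sender].append(message)
        for user in users:
            if user != sender:
                inbox[user].append(message)
        thread = message
    return inbox, sent

def count_copies(inbox, sent, body):
    """Count how many times `body` appears across all inboxes and sent folders."""
    folders = list(inbox.values()) + list(sent.values())
    return sum(message.count(body) for folder in folders for message in folder)

users = ["A", "B", "C", "D"]
inbox, sent = simulate_quoted_thread(users, ["x", "y", "z", "t"])
print(inbox["A"])                      # ['yx', 'zyx', 'tzyx']
print(count_copies(inbox, sent, "x"))  # 16
```

Under these assumptions, with N users each sending one message, all N messages contain x and each is stored N times, so x ends up on disk N * N times.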

Even if each person is careful not to quote the previous messages, you get this best-case scenario.

  • A
    • inbox: y, z, t
    • sent: x
  • B
    • inbox: x, z, t
    • sent: y
  • C
    • inbox: x, y, t
    • sent: z
  • D
    • inbox: x, y, z
    • sent: t

There are still N * M copies stored, where N is the number of users and M is the number of messages.  That is 16 copies in this example.  Granted, these may be stored on different email systems.
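The best-case arithmetic is easy to sketch.  This assumes every message goes to the whole group and nobody deletes anything; `email_copies` is just a name I picked for the example.

```python
def email_copies(n_users, n_messages):
    """Stored copies when nobody quotes: each message is kept once per user."""
    # One copy in the sender's sent folder plus one in each other user's inbox.
    copies_per_message = 1 + (n_users - 1)
    return n_messages * copies_per_message

print(email_copies(4, 4))     # 16, the four-user example above
print(email_copies(100, 12))  # 1200
```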

Now, compare a BBS or forum system.  All users post to and read from a particular topic in the forum.

A posts message x.

  • Forum topic: x

B posts message y.

  • Forum topic:  x, y

and so on…

  • Forum topic: x, y, z, t

Previous messages are easily read without quoting.  Each message is stored once, so the topic holds M messages rather than N * M copies.  Consider also the disk I/O bandwidth and even network bandwidth implications.
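As a rough illustration of the forum model, here is a toy Python sketch.  The `ForumTopic` class is hypothetical and ignores everything a real forum needs (accounts, permissions, persistence); it only shows that all users share one stored list of posts.

```python
class ForumTopic:
    """A single shared thread; every reader sees the same stored posts."""

    def __init__(self, title):
        self.title = title
        self.posts = []  # one entry per message, in posting order

    def post(self, author, body):
        self.posts.append((author, body))

    def read(self):
        # Earlier messages are already in the topic, so nobody needs to quote them.
        return [body for _author, body in self.posts]

topic = ForumTopic("example")
for author, body in [("A", "x"), ("B", "y"), ("C", "z"), ("D", "t")]:
    topic.post(author, body)

print(topic.read())      # ['x', 'y', 'z', 't']
print(len(topic.posts))  # 4
```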

When the first scenario above occurs in an enterprise with hundreds or thousands of users, thousands of message threads, and millions of messages, the storage and bandwidth load is massive.

Add to that the fact that the email system is open to the Internet, with its influx of spam and other attacks, and it's no wonder that such systems barely survive.

There are some email systems that implement message storage with one copy and pointers in each user's inbox and folders.  That certainly helps, but I argue that one may as well move to the simpler BBS, forum and topic model in the enterprise.
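For the curious, the one-copy-plus-pointers idea might look something like the Python sketch below.  This is my own toy version, not how any particular mail server does it: the `SingleInstanceStore` class is hypothetical, and using a hash of the body as the message ID is an assumption made to keep the example short.

```python
import hashlib

class SingleInstanceStore:
    """Toy mail store: each body is kept once, folders hold only message IDs."""

    def __init__(self):
        self.bodies = {}   # message id -> body, stored once
        self.folders = {}  # (user, folder name) -> list of message ids

    def deliver(self, sender, recipients, body):
        # Assumption: identical bodies share one ID (a hash of the text).
        msg_id = hashlib.sha256(body.encode("utf-8")).hexdigest()
        self.bodies.setdefault(msg_id, body)
        self.folders.setdefault((sender, "sent"), []).append(msg_id)
        for user in recipients:
            self.folders.setdefault((user, "inbox"), []).append(msg_id)
        return msg_id

store = SingleInstanceStore()
users = ["A", "B", "C", "D"]
for sender, body in zip(users, ["x", "y", "z", "t"]):
    store.deliver(sender, [u for u in users if u != sender], body)

pointer_count = sum(len(ids) for ids in store.folders.values())
print(len(store.bodies))  # 4 bodies stored
print(pointer_count)      # 16 pointers
```

Note that in this sketch a reply that quotes earlier messages counts as a new body, so storing one copy per message does nothing about the quoted text inside it.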