Why I don't have a Dat archive of my newsletter


Every month or two for the past 7 years, I have sent a personal email newsletter to my friends and family. I share the newsletters via email and manually save an archived copy to Google Drive and a local Perkeep instance. While Dat is built to create repositories for easy archiving and access, it has privacy problems that prevent me from using it to share a repository of my newsletter editions.

Why an email newsletter?

Alexandra and I don't share often on social media, so a newsletter is our way of keeping our friends and family up-to-date with our activities. We love how personal an email feels compared to other options.

Email has some nice properties:

Some properties I don't like about email:

Recently, I started using Dat and Beaker Browser to publish my personal website to dat://www.timswast.com/. When I publish to my own site, I feel a similar sense of personal connection with those who read it as I do with email. Eventually, I'd like to migrate to Beaker and Dat for sharing my newsletter. For me, the main benefit of Dat is that my friends and family can save official replicas of the letters. This makes it more feasible to keep the content alive long-term. Despite the benefits, there are a few problems with Dat that prevent me from using it to share my private newsletter.

Problems with Dat

Accidentally making a Dat repository public

If you have the URL for a Dat repository, then you can get to the content. This works well for public repositories, but it's a problem when I want the content to stay private. Someone could easily paste a URL into the wrong place by accident, leaking the whole repository. Once I share a Dat repository with a few people, I as the author have little control over who can get access to it.

Leaking IP addresses

With Dat and Beaker Browser's current implementation of peer discovery, a database entry contains all the IP addresses of peers currently serving each repository, keyed by the hash of the public key. In the case of my private newsletter repository, all the IP addresses belong to either me or my friends and family. This is too much of my family's personal data in one place.
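To make the concern concrete, here is a minimal Python sketch of a discovery table like the one described above. It is not Dat's or Beaker's actual code: the `DiscoveryTable` class, its method names, the placeholder key, and the example addresses are all hypothetical. Deriving the lookup key as a BLAKE2b hash keyed by the public key is my understanding of how Dat computes it, so treat that detail as an assumption too.

```python
import hashlib
from collections import defaultdict


def discovery_key(public_key: bytes) -> str:
    """Derive the table key from a repository's public key.

    Assumption: a BLAKE2b hash keyed by the public key, so the public
    key itself (which grants read access) never appears in the table.
    """
    return hashlib.blake2b(b"hypercore", key=public_key, digest_size=32).hexdigest()


class DiscoveryTable:
    """Hypothetical stand-in for a peer discovery database."""

    def __init__(self):
        self._peers = defaultdict(set)

    def announce(self, public_key: bytes, ip: str) -> None:
        """Record that a peer at `ip` is serving the repository."""
        self._peers[discovery_key(public_key)].add(ip)

    def lookup(self, key_hash: str) -> set:
        """Return every peer address recorded under `key_hash`."""
        return set(self._peers.get(key_hash, set()))


# Placeholder key and documentation-range IP addresses, for illustration only.
newsletter_key = bytes(32)
table = DiscoveryTable()
table.announce(newsletter_key, "203.0.113.5")   # me
table.announce(newsletter_key, "198.51.100.7")  # a family member

# One lookup returns the address of every reader currently online.
readers = table.lookup(discovery_key(newsletter_key))
```

Hashing the key protects the content, since the hash alone does not let anyone read the repository. It does nothing for the readers: whoever can query the table sees all of their addresses in a single entry.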

Requirements for sharing private repositories

Requirement 1 - Repository confidentiality

I want to be confident that only those I share my newsletters with are able to view them. It should be clear that the repository is meant to stay private, and it should be difficult to make the repository public by accident.

Requirement 2 - Reader confidentiality

Since there are very few readers, all of whom I know well, reader privacy becomes extra challenging. It would be quite creepy to call up an ex-girlfriend and say "I see you're reading the newsletter right now from your parents' house". Yet by watching the IP addresses of those connecting to download the content, I could know exactly that, especially because there are so few newsletter recipients.

I choose not to look closely at the peer listing, and I don't log IP addresses. But I'd feel much better if I did not have access to this information at all.

Requirement 3 - Secure delivery

With email spam filters, I'm always a bit uncertain as to whether my friends and family can read my message. With Dat I'm pretty confident that, so long as I or some other peer is hosting the content, my friends and family can get it. Solutions for private shared content should not break this property.

Requirement 4 - Offline and secure repository replicas

With a public Dat repository, the content can remain readable and verifiable well into the future. Solutions for private repositories should retain these archival qualities.

Confidentiality comes at some expense to archivability. For one thing, there are fewer peers with redundant copies, though I hope that I can convince a few family members that my newsletters are worth keeping a copy pinned.

Encryption can also run counter to the goal of archivability. A possible solution for confidentiality is to encrypt the content before sharing it, but this makes it impossible for my family members to decipher the files when they browse them on disk. If I lose the decryption key, then I've forever lost the content, which is the opposite of what I'd want in an archive. I'm willing to accept some risk in confidentiality for better archivability.

Bonus A - Notifications

Email is a push system. My friends and family get an update in realtime after I send it. It would be nice if there was a way that my friends and family could watch for changes on the index file of my family newsletter and see when there have been updates. This feature would be nice to have, but not necessary. I can always share that there has been an update via other channels.
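As a rough illustration of what "watching for changes on the index file" could mean, here is a minimal Python sketch that compares content digests between checks. The `IndexWatcher` class and the `fetch_index` callable are hypothetical names of mine. How the index bytes are actually fetched from a repository is left out, and nothing here is an existing Dat or Beaker API.

```python
import hashlib
from typing import Callable, Optional


class IndexWatcher:
    """Detect changes to an index file by comparing content digests."""

    def __init__(self, fetch_index: Callable[[], bytes]):
        # Hypothetical: any function that returns the current index bytes.
        self._fetch_index = fetch_index
        self._last_digest: Optional[str] = None

    def check(self) -> bool:
        """Return True if the index changed since the previous check.

        The first call only records a baseline and returns False.
        """
        digest = hashlib.sha256(self._fetch_index()).hexdigest()
        changed = self._last_digest is not None and digest != self._last_digest
        self._last_digest = digest
        return changed


# Simulated index contents standing in for a real repository.
editions = ["2018-01"]
watcher = IndexWatcher(lambda: "\n".join(editions).encode("utf-8"))

first = watcher.check()      # baseline, nothing to compare against yet
editions.append("2018-03")   # a new edition is published
second = watcher.check()     # the index content differs now
third = watcher.check()      # no further change
```

A reader's software would still need to call `check` on some schedule, so this is polling rather than a true push like email.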

Bonus B - Retain linkability

One of the benefits of Dat is that it is possible to link between repositories. It'd be good if whatever solution is found for private repositories could allow for links between private repositories. When repository A has a different set of authorized readers from repository B, a link from A to B should be visible to all readers of A but followable only by authorized readers of B.
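The rule for links between private repositories can be stated as a small Python sketch. The `Repository` class, the two helper functions, and the reader names are hypothetical. This models only the desired behavior, not any mechanism that Dat provides today.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    name: str
    readers: set = field(default_factory=set)   # authorized readers
    links: list = field(default_factory=list)   # names of linked repositories


def visible_links(repo: Repository, reader: str) -> list:
    """A reader of `repo` sees all of its links, whatever they point to."""
    return list(repo.links) if reader in repo.readers else []


def can_follow(target: Repository, reader: str) -> bool:
    """A link can be followed only by an authorized reader of its target."""
    return reader in target.readers


repo_a = Repository("A", readers={"alice", "bob"}, links=["B"])
repo_b = Repository("B", readers={"alice"})

bob_sees = visible_links(repo_a, "bob")       # bob can see the link to B
bob_follows = can_follow(repo_b, "bob")       # but cannot follow it
alice_follows = can_follow(repo_b, "alice")   # alice can read both
```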

Bonus C - Private replying

Just as in email, where it's possible to reply privately and see the email in the context of the thread, I'd love it if Dat provided a mechanism for this. The combination of Bonus A (notifications) and Bonus B (linkability) could cover this if folks used the IndieWeb convention of replies being a kind of post type, but I think it's worth considering this separately.

Bonus D - Ability to make a repository public when I'm ready

I'd like my newsletters to eventually be public, but not while I'm actively publishing new editions. It should be possible to make a repository public without having to recreate the repository and risk losing the history tables that Dat creates. If not supported, it should remain possible to add new readers as they request access.

Future work

Currently Beaker is not focussed on anonymous access or publishing, but I'm hopeful that Dat and Beaker will begin to look at these use cases soon. Even for public data, it's dangerous to have a database of the IP addresses of all peers for each content archive. To me, it feels a bit like a panopticon, where you never know who might be watching to see what you're reading.

I think reading content in the chorus should be more like pulling a book from your home bookshelf. In my next post, I'll outline some ideas of how Dat could evolve so that reading articles in a repository gets closer to this cozy feeling.


Syndicated to Twitter:

Dat and Beaker Browser are working great to host and browse public sites, but they have some problems around privacy that need to be resolved before I can use them to host private posts.