Building notes, projects, and occasional rants

On Twitter, decentralization and cost of privacy

The message announcing the kibosh on all third-party collaboration with the Twitter platform triggered several op-eds. You can find a nice summary over at Michael Tsai's place.

Twitter will live on as a mainstream platform, used by countless millions of regular, everyday, non-technical users for whom these new rules are meaningless. They will suffer and question why their previous favorite client no longer works, while downloading the latest official Twitter abortion of an interface. They will complain about it (their $Deity-given Internet right), but will move on, and keep on using it.

But those of us more technically able will probably look for a new partner, while we lick the stab wounds inflicted by our former mistress.

As always, when something goes wrong with a centralized system, the geek gene is aroused and looks for the greener pastures of decentralization. "Would it be good to have a Twitter without the central service?"

Apart from the love/hate relationship with centralized systems, of which the current cloud fad is one mighty engine, I have to say that the only word that comes to mind about all this is déjà vu.

Almost-optimal solutions for this problem have existed for decades. The first one I remember is good old multicast. Assuming unlimited address space, assign each person his own multicast group and let the routers take care of it. It's impractical of course, but the architecture of any solution that emerges will have a lot of common points with this old codger.

But more recently some systems tried to tackle the problem. My pet favorite, XMPP, is one of them. We have per-user publish/subscribe that can be used to implement such a system. And we have a large network of XMPP servers already.

But XMPP has two drawbacks. The first is that the explosion of a message to all its subscribers is done at the source server. This means that when the Bieber opens his mouth and bleats, his own server would have to send his pantomime individually to all members of his flock. So even if a thousand of them were on the Google Talk network, instead of a single message, the Google servers would receive a thousand messages. I would be remiss if I didn't mention that there is a deferred extension to tackle this problem, but few in the community showed interest in it.
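
To put rough numbers on that first drawback, here is a minimal sketch in Python. The addresses, the `gmail.com` domain standing in for the Google Talk network, and both counting functions are invented for illustration; none of this is an actual XMPP API.

```python
from collections import Counter

def source_fanout(subscribers):
    """Source-side explosion: each remote server receives one message
    per subscriber it hosts."""
    return Counter(jid.split("@", 1)[1] for jid in subscribers)

def remote_fanout(subscribers):
    """Remote-side explosion: each remote server receives a single copy
    and delivers it to its own users."""
    return {domain: 1 for domain in source_fanout(subscribers)}

# Hypothetical flock: a thousand fans on one big server, one elsewhere.
subscribers = [f"fan{i}@gmail.com" for i in range(1000)] + ["me@example.org"]

by_source = source_fanout(subscribers)
by_remote = remote_fanout(subscribers)

print(by_source["gmail.com"], by_remote["gmail.com"])  # 1000 1
```

The total leaving the source server drops from one message per subscriber to one message per destination server, which is the saving the multicast idea was after in the first place.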

The second drawback is more perception than technical: over the years I have felt that people don't see XMPP as Webby enough. It's a strange, dark technology that few understand, and the passion for it among those who do makes those who don't stagger.

But the pretty solution of a multicast network to deliver information, even in the XMPP approximation of a network of centralized servers, where each one acts as an aggregator for a part of the community, providing distinct local services for local people, has one glaring drawback: there is no privacy.

If I delegate one instance of my message to a Google aggregator that serves their clients, I can only hope that they will do the right thing and deliver said message to the right people.

If this requirement is soft for public messages, it becomes very strong for protected streams. If you want some assurances that your protected tweets don't end up in the wrong place (and I set aside the discussion about what kind of guarantees we have about that from Twitter itself in its quest to monetize your content) then you can't use this remote-server-side explosion but have to go back to exploding your messages individually to each of your approved subscribers.

There is an extra cost on resources for this kind of privacy.
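
A rough way to see that cost, as a sketch only: the function below and its sample addresses are hypothetical, and it simply counts outgoing messages under the two delivery strategies discussed above.

```python
def delivery_plan(subscribers, protected):
    """Return a list of (destination, recipients) pairs, one per outgoing message.

    Public streams trust each remote server to explode the message locally,
    so one copy per domain is enough. Protected streams trust nobody, so
    every approved subscriber gets an individual copy.
    """
    if protected:
        return [(jid, [jid]) for jid in subscribers]
    by_domain = {}
    for jid in subscribers:
        domain = jid.split("@", 1)[1]
        by_domain.setdefault(domain, []).append(jid)
    return list(by_domain.items())

# Hypothetical approved subscribers.
subscribers = ["ana@gmail.com", "bob@gmail.com", "cat@gmail.com", "dan@example.org"]

public_plan = delivery_plan(subscribers, protected=False)
protected_plan = delivery_plan(subscribers, protected=True)

print(len(public_plan), len(protected_plan))  # 2 4
```

The gap between the two counts grows with the number of subscribers sharing a server, so the price of privacy is highest exactly where aggregation would have saved the most.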

So if you want to try and tackle the distributed Twitter problem, please remember that it is one of the most discussed architectures in our Internet history. It has been done before (or at least tried) several times, some with success, others less so. It doesn't matter if the next real-time micro-blogging tool will be based on Usenet, SMTP, RSS, XMPP, or the latest fad-du-jour. The basic problems are the same ones our forefathers tried to solve with multicast.

As I said, déjà vu.

post scriptum rant, or why I need to support <aside> in my CSS: But is a decentralized system with some central aggregation nodes based on trust workable? History has something to remind us of again: Usenet was the elite messaging tool amongst geeks in the early 90s (oh how I miss thee, trn). Then the AOL horde invaded and it became mainstream. The geek clan left, and later Usenet collapsed under spam, and now it is a strange BitTorrent look-alike.

The clan moved over to blogs and RSS. And those too were colonized, although not as thoroughly because that RSS thing is still a bit on the complicated side.

So the social networks were born, to hide all those pesky RSS feeds, add a dash of real-time, and place a pretty face on top of all of it.

So: do all decentralized systems evolve into spammer havens or centralized silos by sheer force of capitalism? A topic for another day.