
April 24, 2009

Asynchronous extension hooks with Perl

My usual projects involve event-driven asynchronous servers, and as a general guideline I try to build them as a set of loosely coupled components running in the same process space.

To tie them together, I create a set of extension hooks on each component. Those hooks can be used for notification, or for delegation of responsibilities.

For example, in an XMPP component, the Roster controller usually has two hooks: new_buddy and accept_as_budy.

The first can be hooked by other components to be notified of new contacts that were accepted as buddies. For example, if you wanted to send a welcome message, or create some files or directories per buddy, this is the hook to use.

The accept_as_budy hook is different. The Roster component wants to delegate the responsibility of deciding whether a specific JID should be allowed to become your buddy. When we receive a <presence type='subscribe' /> request, the Roster executes this hook, and all other components that want to weigh in on the decision can give their opinion: accepting, refusing, or ignoring the request.

The usual system for this type of extension point is a set of coderefs that you register per hook. Those coderefs are executed in turn, and have some way to signal back their decision, either by returning a specific return code, or by calling a method on an event object that the hook system creates per invocation.
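
As a rough illustration of that synchronous style, here is a minimal sketch in plain Perl. The names register_hook and run_hook, the 'accept'/'refuse'/'ignore' return codes, and the spam.example domain are all invented for the example; no particular CPAN module is being quoted.

```perl
use strict;
use warnings;

# Hypothetical synchronous hook registry: hook name => list of coderefs.
my %handlers;

sub register_hook {
  my ($name, $cb) = @_;
  push @{ $handlers{$name} }, $cb;
  return;
}

# Run every handler in registration order. The first handler that
# returns something other than 'ignore' settles the matter.
sub run_hook {
  my ($name, @args) = @_;

  for my $cb (@{ $handlers{$name} || [] }) {
    my $verdict = $cb->(@args);
    return $verdict if $verdict ne 'ignore';
  }

  return 'ignore';
}

# A handler with no opinion, followed by one that decides.
register_hook('accept_as_budy', sub { return 'ignore' });
register_hook('accept_as_budy', sub {
  my ($jid) = @_;
  return $jid =~ /\@spam\.example$/ ? 'refuse' : 'accept';
});

my $good = run_hook('accept_as_budy', 'alice@example.org');
my $bad  = run_hook('accept_as_budy', 'bot@spam.example');
print "$good $bad\n";
```

Note that the loop in run_hook owns the control flow: each handler has to return its verdict on the spot, so it has no chance to wait for anything without blocking the whole process.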

There are a lot of such systems on CPAN. Some examples, in no particular order, are: Event::Notify, Object::Event, Class::Observable and Class::Publisher.

All of those modules allow you to create hooks, accept registrations from other components, and then invoke the hooks with a set of parameters that all of those handlers will have access to.

The problem starts when you want to use this approach inside an event-driven asynchronous application, and one of the handlers of a hook needs to block. For example, if one of my components needs to do a couple of HTTP or memcached queries to decide if a specific buddy can be accepted, the above classes won't work anymore. The usual implementation of these systems keeps the execution control to itself, with a loop over all the registered handlers, calling them one by one in turn.

In an asynchronous app, we would like to allow each handler to start some asynchronous request and make the decision after it gets the response. We should be able to start an HTTP request to check a specific JID, and then, when we get the response back, decide if we want to accept, reject or ignore this buddy.

The first time I found something like this was inside the DJabberd code base. DJabberd is an XMPP server where everything is a plugin. The server has a lot of hook points that you can register handlers for. Each handler gets the parameters, and a set of functions that it can call to accept, deny or ignore the current event. Each handler can do whatever it wants, including starting other requests, and when it has all the information it needs to decide, call one of the functions.

The problem is that the code was not isolated for reuse by other projects. So I took most of the ideas, and created Async::Hooks.

Async::Hooks allows you to create hooks or extension points in your code. Other interested parties can hook a coderef that will be called whenever the hook is used.

Each handler receives a control object. The control object has access to all arguments passed at the time of the hook invocation, and a set of methods to move to the next element in the chain, or to stop processing altogether. You are required to call one of those methods when you are done with the event. The next handler will only be called if you do that.
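
To make the idea concrete, here is a minimal sketch of how such a control object could drive a chain. It is not the Async::Hooks implementation: the Sketch::Ctl package, its internals and the @pending queue (which stands in for an event loop delivering a reply) are assumptions made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a control object; NOT the Async::Hooks code.
package Sketch::Ctl;

sub new {
  my ($class, $args, $handlers, $cleanup) = @_;
  return bless {
    args     => $args,
    handlers => [@$handlers],    # private copy of the chain
    cleanup  => $cleanup,
  }, $class;
}

sub args { return $_[0]{args} }

# Hand control to the next handler, or finish when the chain is empty.
sub next {
  my ($self) = @_;
  my $cb = shift @{ $self->{handlers} };
  return $cb ? $cb->($self, $self->{args}) : $self->_finish(0);
}

# Stop the chain: skip the remaining handlers.
sub done {
  my ($self) = @_;
  @{ $self->{handlers} } = ();
  return $self->_finish(1);
}

# Call the optional cleanup callback, at most once.
sub _finish {
  my ($self, $is_done) = @_;
  my $cleanup = delete $self->{cleanup};
  $cleanup->($self, $self->{args}, $is_done) if $cleanup;
  return;
}

package main;

my (@trace, @pending);

my @chain = (
  # Pretends to wait for a reply: parks its continuation and returns.
  sub { my ($ctl) = @_; push @trace, 'first'; push @pending, sub { $ctl->next } },
  # Decides immediately.
  sub { my ($ctl) = @_; push @trace, 'second'; $ctl->next },
);

Sketch::Ctl->new(['alice@example.org'], \@chain, sub {
  my ($ctl, $args, $is_done) = @_;
  push @trace, $is_done ? 'stopped' : 'finished';
})->next;

my $before = "@trace";         # only the first handler has run so far
$_->() for splice @pending;    # the "reply" arrives; the chain resumes
my $after = "@trace";

print "$before | $after\n";
```

The point of the design is that nothing loops over the handlers: the chain only advances when a handler calls next(), so a handler is free to return to the event loop and continue later.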

Usage is simple. You start by creating an Async::Hooks instance. This acts as a registry for hooks and their handlers. There is no need to pre-declare hooks. Usually I use a singleton pattern for my registry, and hide everything inside a small class:

package App::Hooks;

use strict;
use warnings;
use base qw( Exporter );
use Async::Hooks;

@App::Hooks::EXPORT = qw( hooks );

{
  # Hide my hooks
  my $hooks;

  sub hooks {
    return $hooks ||= Async::Hooks->new;
  }
}

1;

This gives us a hooks() function that returns the same Async::Hooks instance every time.

Registering a handler is simple:

hooks()->hook('accept_as_budy', \&check_bad_jids);

Handlers will be called in first-come-first-served order. There is no mechanism to alter the order of handlers. I never needed such a feature, but the plan is to include a simple Async::Hooks::WithRank class in the next release that will provide a simple rank-based order. Other mechanisms would be possible after that: you would just need to subclass the main class and implement a sort() method.

To invoke a hook, you simply do:

hooks()->call(
    'accept_as_budy',
    [ @args ],
    \&reply_to_presence_subscribe
);

The reply_to_presence_subscribe coderef is optional. If present it will be called last.

Each handler receives a control object and a copy of the arguments:

sub check_bad_jids {
  my ($ctl, $args) = @_;
  ...
}

The $ctl object provides you with the tools to communicate your decision. You can $ctl->next or $ctl->decline to move forward in the chain, or you can stop processing with $ctl->stop or $ctl->done.

So if you keep a web service for bad JIDs, you could do something like this:

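use AnyEvent::HTTP;    # provides http_get()

sub check_bad_jids {
  my ($ctl, $args) = @_;
  my $jid = $args->[0];

  # Non-blocking: http_get() returns at once, and the callback
  # runs later, when the response arrives.
  http_get("$webservice_endpoint/jid/$jid/is_bad", sub {
    my ($resp) = @_;

    if (defined($resp) && $resp eq 'BAD') {
      $ctl->done;    # bad JID: stop the chain here
    }
    else {
      $ctl->next;    # no objection: let the next handler decide
    }
  });

  # Keep http_get() in void context, so that no cancellation
  # guard is returned to the caller and then dropped.
  return;
}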

The http_get() function is part of the AnyEvent::HTTP module, and performs a non-blocking asynchronous HTTP GET request. The callback is called when the response arrives, with the data in its first parameter. If the call was successful, the response is defined and contains the body of the response.

The call to done() will skip all other possible handlers and call the \&reply_to_presence_subscribe given in the call() invocation. This cleanup callback will receive the same arguments as all other handlers, and a third parameter, a $is_done flag. If the chain ended with a call to done() this flag will be true.
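
A cleanup callback along those lines might look like the sketch below. The three-parameter signature follows the description in the previous paragraph; treating a true $is_done as a refusal is an assumption borrowed from the check_bad_jids example, and returning the stanza as a string (instead of sending it) is only there to keep the sketch self-contained.

```perl
use strict;
use warnings;

# Sketch of a cleanup callback. Mapping $is_done to a refusal is an
# assumption taken from check_bad_jids, where done() means "bad JID".
sub reply_to_presence_subscribe {
  my ($ctl, $args, $is_done) = @_;
  my $jid = $args->[0];

  # XMPP answers a subscription request with a presence stanza of
  # type 'subscribed' (accepted) or 'unsubscribed' (refused).
  my $type = $is_done ? 'unsubscribed' : 'subscribed';

  # A real component would send this stanza; here we just build it.
  return "<presence to='$jid' type='$type'/>";
}

# Called by hand here, with undef standing in for the control object.
my $accepted = reply_to_presence_subscribe(undef, ['alice@example.org'], 0);
my $refused  = reply_to_presence_subscribe(undef, ['bot@spam.example'], 1);
print "$accepted\n$refused\n";
```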

I really like this approach to hooks. Although it puts a lot more responsibility on handler programmers (they must remember to call $ctl->next() or $ctl->done() at some point), it also allows for several different scenarios. In the example above, the hook is for delegation of responsibility, so it makes sense to call next() only after a decision has been made. But you could also have handlers that are only interested in the fact that someone requested to be your buddy, not in the outcome, and those handlers can start a background process and call $ctl->next() immediately.

In fact, you can use Async::Hooks even inside a more synchronous program: you are not required to delay your calls to next() or done().

Currently at version 0.5, it has most of what I need. Next versions will have the possibility of ordered handlers, with a basic rank-based example, and a better way to communicate our decision to stop, or ignore the current event.

Walk like an iron man

Matt Trout and friends came up with an idea of a small game to start people blogging about Perl.

The rules are simple:

  • blog once per week;
  • every week.

The definition of week is a bit lax, allowing you to have at most 10 days between posts.

Mine will appear under the Perl category, and if you want to follow just those, you can subscribe to the Perl category feed.

April 21, 2009

Merging two unrelated repositories

I keep a repository for operations-style stuff. Server configurations, Puppet recipes, old CFEngine stuff, the works.

But for some idiotic reason long lost in time, my DNS setup was in a totally unrelated repository. This didn't make that much sense. Even more stupid: other stuff, unrelated to DNS, was also stored in there.

So I had:

  • an oss/ repo with the good and clean stuff to configure all the servers I manage;
  • a dns/ repo with all the DNS configuration stuff, and a bunch of other unrelated configurations.

And the goal: merge the history of the network/dns/ directory in the dns/ repo to the dnssrv/ directory of the oss/ repo.

Like this:

# create a copy of my dns/ repo
git clone dns dns_work && cd dns_work

# remove everything except the network/dns/ directory
git filter-branch --prune-empty --subdirectory-filter network/dns -- --all

# move stuff back to the dnssrv/ directory
git filter-branch -f --prune-empty --tree-filter '
   mkdir -p .dnssrv;
   mv * .dnssrv;
   mv .dnssrv dnssrv
' -- --all

# make sure we clean all the cruft
git gc --aggressive

# ok, prepare to merge
cd ../oss && git remote add dns ../dns_work && git fetch dns

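# Merge... (Git 2.9 and later refuse to merge unrelated histories
# unless you add --allow-unrelated-histories)
git merge dns/master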

# Remove the cruft
git remote rm dns && git gc --aggressive

And with that, all the history of my DNS work is merged back into the oss/ repo in the proper directory.

As a final step, I need to remove the network/dns commits from the dns/ repo:

# prepare...
cd ../dns

# Remove the old directory and any empty commits lying around
git filter-branch -f --prune-empty --tree-filter 'rm -rf network/dns' HEAD

# Cleanup
git gc --aggressive

Done.

One of the common complaints I hear about git is that it allows you to rewrite history. Although this operation can be very damaging in a repository that is heavily cloned, banning history rewriting, or making it a second-class operation, feels like banning hammers because you can hurt yourself with one. Git is a tool, and it should make complex but occasionally useful operations like this easy, or at least possible.

Moving

The server where this site was hosted is dying a slow death.

I'm moving my personal stuff to a slice in the next few days, so expect some disturbance.

One of the road-blocks after email is DNS. My .com, .org and .net domains are hosted at Joker, and so far I'm happy with the service. But I also have other domains in the .pt, .im, .as, and .tv top level domains, and for those, Joker is not a solution.

I did a bit of shopping around and the prices are ridiculous. Most places ask for $20 per year to host a domain. I really don't like to do this myself, but with the number of domains that I host, and with those prices, it's just cheaper to do it myself.

April 03, 2009

Slow

I'm having disk problems with the server that hosts this blog and my mail.

The I/O wait time is always above 50%, and sometimes it gets to 100%. All this without I/O, at least according to iostat.

I'm moving stuff around, starting with email, but things are a bit slow, and will be for the next day or so. If you really need to contact me, use the alternative methods at the top of the sidebar on my blog homepage.

Contacts

melo@simplicidade.org (XMPP/email)
+351 302 029 050 (voice)
melopt (Skype)

This weblog is licensed under a Creative Commons License.