Asynchronous extension hooks with Perl

Friday, 24 April 2009

My usual projects involve event-driven asynchronous servers, and as a general guideline I try to build them as a set of loosely coupled components running in the same process.

To tie them together, I create a set of extension hooks on each component. Those hooks can be used for notification, or for delegation of responsibilities.

For example, in an XMPP component, the Roster controller usually has two hooks: new_buddy and accept_as_budy.

The first can be hooked by other components that want to be notified of new contacts that were accepted as buddies. For example, if you wanted to send a welcome message, or create some files or directories per buddy, this is the hook to use.

The accept_as_budy hook is different. The Roster component wants to delegate the decision of whether a specific JID should be allowed to become your buddy. When we receive a <presence type='subscribe' /> request, the Roster executes this hook, and all other components that want to weigh in on the decision can give their opinion by accepting, refusing, or ignoring the request.

The usual system for this type of extension point is a set of coderefs that you register per hook. Those coderefs are executed in turn, and have some way to signal back their decision, either by returning a specific return code, or by calling a method on an event object that the hook system creates per invocation.

There are a lot of such systems on CPAN. Some examples, in no particular order, are: Event::Notify, Object::Event, Class::Observable and Class::Publisher.

All of those modules allow you to create hooks, accept registrations from other components, and then invoke the hooks with a set of parameters that all of those handlers will have access to.

The problem starts when you want to use this approach inside an event-driven asynchronous application, and one of the handlers of a hook needs to block. For example, if one of my components needs to do a couple of HTTP or memcached queries to decide if a specific buddy can be accepted, the above classes won't work anymore. The usual implementation of these systems keeps execution control to itself, with a loop over all the registered handlers that calls them one by one in turn.
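To make the problem concrete, here is a minimal sketch of that kind of synchronous dispatcher in plain Perl. It is not taken from any of the modules mentioned above; the names register_handler and run_hook and the return codes are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical return codes, for illustration only.
use constant { ACCEPT => 'accept', REFUSE => 'refuse', IGNORE => 'ignore' };

my %handlers;    # hook name => list of coderefs

sub register_handler {
  my ($hook, $cb) = @_;
  push @{ $handlers{$hook} }, $cb;
}

# The dispatcher owns the loop: each handler must answer before it returns.
sub run_hook {
  my ($hook, @args) = @_;
  for my $cb (@{ $handlers{$hook} || [] }) {
    my $verdict = $cb->(@args);
    return $verdict if $verdict ne IGNORE;
  }
  return IGNORE;
}

register_handler('accept_as_budy', sub { return IGNORE });
register_handler('accept_as_budy', sub {
  my ($jid) = @_;
  # A slow lookup here would block the whole process until it returns.
  return $jid =~ /\@spam\.example$/ ? REFUSE : ACCEPT;
});

my $spam_verdict = run_hook('accept_as_budy', 'bot@spam.example');
my $good_verdict = run_hook('accept_as_budy', 'friend@example.org');
print "$spam_verdict $good_verdict\n";
```

Because run_hook() needs the return value of each handler before it can move on, a handler has no way to say "ask me again when my HTTP response arrives".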

In an asynchronous app, we would like to allow each handler to start some asynchronous request and make the decision after it gets the response. We should be able to start an HTTP request to check a specific JID, and then, when we get the response back, decide if we want to accept, reject or ignore this buddy.

The first time I found something like this was in the DJabberd code base. DJabberd is an XMPP server where everything is a plugin. The server has a lot of hook points that you can register handlers for. Each handler gets the parameters, and a set of functions that it can call to accept, deny or ignore the current event. Each handler can do whatever it wants, including starting other requests, and when it has all the information it needs to decide, it calls one of the functions.

The problem is that the code was not isolated for reuse by other projects. So I took most of the ideas, and created Async::Hooks.

Async::Hooks allows you to create hooks, or extension points, in your code. Other interested parties can hook a coderef that will be called whenever the hook is used.

Each handler receives a control object. The control object has access to all arguments passed at the time of the hook invocation, and a set of methods to move to the next element in the chain, or to stop processing altogether. You are required to call one of those methods when you are done with the event. The next handler will only be called if you do that.
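The mechanics are easier to see in a toy version. The sketch below is not the Async::Hooks implementation, just a minimal control object (Sketch::Ctl is a made-up name) showing how the chain only advances when a handler asks for it.

```perl
use strict;
use warnings;

# Illustrative only: a tiny control object, not the real Async::Hooks code.
package Sketch::Ctl;

sub new {
  my ($class, $args, $handlers) = @_;
  return bless { args => $args, pending => [@$handlers] }, $class;
}

# Hand control to the next handler, if there is one.
sub next {
  my ($self) = @_;
  my $cb = shift @{ $self->{pending} } or return;
  return $cb->($self, $self->{args});
}

# Stop the chain: drop every handler that has not run yet.
sub done {
  my ($self) = @_;
  @{ $self->{pending} } = ();
  return;
}

package main;

my @trace;
my @chain = (
  sub { my ($ctl) = @_; push @trace, 'first';  $ctl->next },
  sub { my ($ctl) = @_; push @trace, 'second'; $ctl->done },
  sub { my ($ctl) = @_; push @trace, 'third';  $ctl->next },
);

Sketch::Ctl->new(['friend@example.org'], \@chain)->next;
print join(',', @trace), "\n";
```

The third handler never runs, because the second one called done(). A handler that called neither method would simply leave the chain parked where it is.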

Usage is simple. You start by creating an Async::Hooks instance. This acts as a registry for hooks and their handlers. There is no need to pre-declare hooks. Usually I use a singleton pattern for my registry, and hide everything inside a small class:

package App::Hooks;

use strict;
use warnings;
use base qw( Exporter );
use Async::Hooks;

@App::Hooks::EXPORT = qw( hooks );

{
  # Hide my hooks
  my $hooks;

  sub hooks {
    return $hooks ||= Async::Hooks->new;
  }
}

1;

This gives us a hooks() function that returns the same Async::Hooks instance every time.

Registering a handler is simple:

hooks()->hook('accept_as_budy', \&check_bad_jids);

Handlers will be called in first-come-first-served order. There are no mechanisms to alter the order of handlers. I never needed such features, but the plan is to include a simple Async::Hooks::WithRank class in the next release, which will provide a simple rank-based order. Other mechanisms would be possible after that: you would just need to subclass the main class and implement a sort() method.

To invoke a hook, you simply do:

hooks()->call(
    'accept_as_budy',
    [ @args ],
    \&reply_to_presence_subscribe
);

The reply_to_presence_subscribe coderef is optional. If present it will be called last.

Each handler receives a control object and a copy of the arguments:

sub check_bad_jids {
  my ($ctl, $args) = @_;
  ...
}

The $ctl object provides you with the tools to communicate your decision. You can $ctl->next or $ctl->decline to move forward in the chain, or you can stop processing with $ctl->stop or $ctl->done.

So if you keep a web service for bad JIDs, you could do something like this:

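use AnyEvent::HTTP;

# $webservice_endpoint is assumed to be defined elsewhere in the component.
sub check_bad_jids {
  my ($ctl, $args) = @_;
  my $jid = $args->[0];

  http_get("$webservice_endpoint/jid/$jid/is_bad", sub {
    my ($resp) = @_;

    # Stop the chain for known-bad JIDs, otherwise let the next handler decide.
    if (defined($resp) && $resp eq 'BAD') {
      $ctl->done;
    }
    else {
      $ctl->next;
    }
  });
}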

The http_get() function is part of the AnyEvent::HTTP module, and performs a non-blocking asynchronous HTTP GET request. The callback is called when the response arrives, with the data in the first parameter. If the call was successful, the response is defined and contains the body of the response.

The call to done() will skip all other possible handlers and call the \&reply_to_presence_subscribe given in the call() invocation. This cleanup callback will receive the same arguments as all other handlers, and a third parameter, a $is_done flag. If the chain ended with a call to done(), this flag will be true.
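Here is a small self-contained sketch of the cleanup callback and its $is_done flag, using only the Async::Hooks calls described in this post. The %bad_jid hash is a made-up stand-in for the web service, and reading done() as "refused" is just the convention of this example. Everything runs synchronously, so no event loop is needed.

```perl
use strict;
use warnings;
use Async::Hooks;

my $hooks = Async::Hooks->new;

# Hypothetical block list standing in for the web service lookup.
my %bad_jid = ('bot@spam.example' => 1);

$hooks->hook('accept_as_budy', sub {
  my ($ctl, $args) = @_;
  return $bad_jid{ $args->[0] } ? $ctl->done : $ctl->next;
});

my %outcome;
for my $jid ('bot@spam.example', 'friend@example.org') {
  $hooks->call('accept_as_budy', [$jid], sub {
    my ($ctl, $args, $is_done) = @_;
    # In this example, done() means some handler refused the request.
    $outcome{ $args->[0] } = $is_done ? 'refused' : 'accepted';
  });
}

print "$_: $outcome{$_}\n" for sort keys %outcome;
```

The cleanup callback runs in both cases. Only the flag tells it whether the chain ran to the end or was cut short.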

I really like this approach to hooks. Although it puts a lot more responsibility on handler programmers (they must remember to call $ctl->next() or $ctl->done() at some point), it also allows for several different scenarios. In the example above, the hook is for delegation of responsibility, so it makes sense to call next() only after a decision has been made. But you could also have handlers that are only interested in the fact that someone requested to be your buddy, not in the outcome, and those handlers can start a background process and call $ctl->next() immediately.
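The two styles can sit side by side on the same hook. The following is only a sketch: instead of a real event loop, the second handler parks its control object in a made-up @waiting list, and the script resumes it by hand to simulate a response arriving later.

```perl
use strict;
use warnings;
use Async::Hooks;

my $hooks = Async::Hooks->new;
my (@log, @waiting);

# Notification only: record the request and move on immediately.
$hooks->hook('accept_as_budy', sub {
  my ($ctl, $args) = @_;
  push @log, "request from $args->[0]";
  $ctl->next;
});

# Delegation: park the control object until an answer is available.
$hooks->hook('accept_as_budy', sub {
  my ($ctl, $args) = @_;
  push @waiting, $ctl;
});

$hooks->call('accept_as_budy', ['friend@example.org'], sub {
  my ($ctl, $args, $is_done) = @_;
  push @log, $is_done ? 'refused' : 'accepted';
});

push @log, 'call returned';

# Later, e.g. from an I/O callback, the parked handler resumes the chain.
$_->next for splice @waiting;
print join("\n", @log), "\n";
```

Note that call() returns before any decision exists. The cleanup callback only fires once the parked handler finally calls next().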

In fact, you can use Async::Hooks even inside a more synchronous program: you are not required to delay your calls to next() or done().

Async::Hooks is currently at version 0.5 and has most of what I need. The next versions will add ordered handlers, with a basic rank-based example, and a better way to communicate our decision to stop or ignore the current event.