
June 30, 2008

Behind the scenes of Tarpipe XMPP

Alex and Adam asked me to elaborate on the tools I used to implement the Tarpipe XMPP gateway.

I used the Net::XMPP2 Perl module, in particular the Net::XMPP2::Component class. It uses the AnyEvent async framework, which in turn supports the EV library, giving you all the love of kernel polling.

Until the Net::XMPP2 author releases a new version, you should use my own copy if you plan on doing external component work (check the component-reply-with-from branch). Net::XMPP2 is widely used for bots, but not for that many components, and some methods need some love to work properly in a component environment.

The HTTP part was done using the excellent AnyEvent::HTTP class. I'm using my own version, which includes a bug fix to the http_post function (on its way to release 1.03 of AnyEvent::HTTP) and adds support for HTTP::Request to http_request. I hope to see this included in the main AnyEvent::HTTP distribution but I still need to update the documentation.

The rest is just glue code.

May 11, 2008

Yada, yada, yada

From the quickies department, the Stubby exception generators (search for stubby) made it to Perl 5.

April 12, 2008

Rethinking CPAN

Andy Lester wrote an article a couple of days back about rethinking the CPAN interface. The key part of the argument is:

We don't want to "make CPAN easier to search." What we're really trying to do is help with the selection process. We want to help the user find and select the best tool for the job

I thought a bit about this and my own CPAN usage over the years. I started with Perl around 91 or 92 so I've used it a bit.

I don't know the answer for this one, but I would start with the Perl module version of the iusethis site: each person could select the list of modules that they use.

This first approach would start shuffling modules to the top.

You could then ask each person to classify why they are using the module. This is the hard part because you would need to come up with a classification scheme, like the one we still have for CPAN modules, actually. It will never be perfect but I prefer to have 10 or 20 common things like "Parsing XML", or "Sending MIME email" than nothing.

An optional improvement would be to have a 4 or 5 level rank, inside your own module list, to allow you to say that DBI is much more important than CPAN::Mini.

Another layer would be some sort of social angle. This is not to make the site look hype and fresh, but to help find new modules. I could watch a couple of people I trust and see which modules they are using and which they added most recently. Instead of looking through 50+ module updates per day, I just see a filtered list of potential targets.

A final twist - an aging digg-like system: I can "This saved my bacon today"-vote on any module, but my vote would only be counted for a month, giving rise to the Bacon Savers list.
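To make the aging vote idea a bit more concrete, here is a minimal sketch. Everything in it is made up for illustration: the vote log layout, the bacon_savers name, and the 30-day window standing in for "a month".

```perl
use strict;
use warnings;

# Hypothetical vote log: [ module name, epoch time of the vote ].
my $now   = time;
my $day   = 24 * 60 * 60;
my @votes = (
    [ 'DBI',        $now -  2 * $day ],
    [ 'DBI',        $now - 10 * $day ],
    [ 'CPAN::Mini', $now -  5 * $day ],
    [ 'DBI',        $now - 45 * $day ],    # too old, no longer counts
);

# Tally only the votes cast within the last $max_age seconds.
sub bacon_savers {
    my ($now, $max_age, @votes) = @_;
    my %tally;
    for my $vote (@votes) {
        my ($module, $when) = @$vote;
        $tally{$module}++ if $now - $when <= $max_age;
    }
    return \%tally;
}

my $tally = bacon_savers($now, 30 * $day, @votes);
for my $module (sort { $tally->{$b} <=> $tally->{$a} || $a cmp $b } keys %$tally) {
    print "$module: $tally->{$module}\n";
}
```

Old votes are never deleted here, they just stop counting, so the list reorders itself as time passes.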

But as always, you need a carrot to make all those perl programmers compile their module list. There are two immediate carrots that I can see:

  • selective CPAN announcements: if a module on your list gets an update, you would get a notification (daily email, weekly email, or personalized RSS feed);
  • an automatic Bundle module, or Task module to install all your modules: I would not upload all these modules to CPAN, but we could create a perl script that downloads my Bundle or Task and calls the classic cpan to do the heavy lifting.

Anyhow, these are my €.02 to the conversation.

March 13, 2008

Link files in Catalyst error messages to Textmate

When you develop with Catalyst, if you have an error condition, you get a pretty interface with access to all the major objects in the request.

At the top, Catalyst will place the classical perl error message like "Caught exception in MODULE, at FILE line LINE."

This hack takes that classical format and links the "FILE line LINE" with a txmt: link. If clicked, it will open directly into your project in TextMate.

The code is simple. Stick this into your main application class:

sub finalize_error {
  my $c = shift;

  $c->NEXT::finalize_error(@_);
  return unless $c->debug;

  my $error_msg = $c->response->output;
  return unless $error_msg;

  $error_msg =~ s{(\s+at\s+)([\/]\S+)\s+line\s+(\d+)}
                 {"$1<a href='"._mk_textmate_link($2, $3)."'>$2 line $3</a>"}ge;
  $c->response->body($error_msg);
}

use Cwd qw( abs_path );
sub _mk_textmate_link {
  my ($file, $line) = @_;

  my $abs_file = abs_path($file);
  return "txmt://open/?url=file://$abs_file&line=$line";
}

It works for me so far. If this breaks anything for you, you get to keep both parts.

Here is a sample of the output with this hack applied:

[Screenshot: Safariscreencapture004]

Update: a new version. The big change is the use of the abs_path method to make sure you get the absolute path. This solves problems that I was having with symbolic links. TextMate was opening a new window, because the project and the file path in the error message had different prefixes.
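If you want to see what the substitution does without starting Catalyst, here is a small sketch that pulls it out into a standalone function. The linkify name, the temporary file and the sample message are only there for the demonstration; the regex and the link builder are the ones from the hack.

```perl
use strict;
use warnings;
use Cwd        qw( abs_path );
use File::Temp qw( tempfile );

# Same link builder as in the post.
sub _mk_textmate_link {
    my ($file, $line) = @_;

    my $abs_file = abs_path($file);
    return "txmt://open/?url=file://$abs_file&line=$line";
}

# The substitution from finalize_error, runnable without Catalyst.
sub linkify {
    my ($error_msg) = @_;

    $error_msg =~ s{(\s+at\s+)([\/]\S+)\s+line\s+(\d+)}
                   {"$1<a href='"._mk_textmate_link($2, $3)."'>$2 line $3</a>"}ge;
    return $error_msg;
}

# A real file, so that abs_path() has something to resolve.
my ($fh, $file) = tempfile( UNLINK => 1 );
my $html = linkify("Caught exception in MyApp::Controller::Root at $file line 42.");
print "$html\n";
```

Note that only absolute paths are linked, because the pattern requires the file name to start with a slash.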

March 07, 2008

perl warnings

You have to learn to ignore the forest.

There are some perl warnings that hide the real problem. My most hated perl warning is this, the first three lines below:

"my" variable @prob masks earlier declaration in same scope at sbin/some_script line 1640.
"my" variable $count masks earlier declaration in same scope at sbin/some_script line 1641.
"my" variable $t masks earlier declaration in same scope at sbin/some_script line 1642.
syntax error at sbin/some_script line 1504, near "next "

Those first three lines are there because the parser had to bail out after hitting the syntax error reported on the fourth line, and so failed to notice where the scopes ended.

This would be less of a problem if the warnings and errors were ordered by line number, but they are not. So learn to look at the line numbers first to decide which message to pay attention to.
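As a sketch of the "look at the line numbers first" advice, this reorders the messages above by line number. The line_of helper is made up for the example, and it assumes every message carries an "at FILE line N" part.

```perl
use strict;
use warnings;

my @diagnostics = (
    q{"my" variable @prob masks earlier declaration in same scope at sbin/some_script line 1640.},
    q{"my" variable $count masks earlier declaration in same scope at sbin/some_script line 1641.},
    q{"my" variable $t masks earlier declaration in same scope at sbin/some_script line 1642.},
    q{syntax error at sbin/some_script line 1504, near "next "},
);

# Pull out the number after "line"; messages without one sort first.
sub line_of {
    my ($message) = @_;
    return $message =~ /\bline (\d+)/ ? $1 : 0;
}

my @by_line = sort { line_of($a) <=> line_of($b) } @diagnostics;
print "$_\n" for @by_line;
```

With the messages sorted, the syntax error at line 1504 comes out on top, which is the one to fix first.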

March 04, 2008

Perl hackers using Git

A question for all Perl hackers who are using Git: do you keep a separate repository for each module, or a single repository for all your modules?

February 04, 2008

You'll be assimilated...

... eventually :).

I actually did use IPC::MorseSignals to try a stupid idea. It worked as advertised, although it did not solve the idiotic nature of the solution.

But it sure was fun.

January 25, 2008

Bug of the day

In a project for a client:

# remove duplicate raw lines
my $dups = $self->dups;
return if $dups->{raw}++;

Should have been:

# remove duplicate raw lines
my $dups = $self->dups;
return if $dups->{$raw}++;

No wonder all lines except the first were dups...
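The difference is easy to see with a handful of sample lines. This is only an illustration: the data is invented, and grep stands in for the early return in the real method.

```perl
use strict;
use warnings;

my @lines = ( 'alpha', 'beta', 'alpha', 'gamma', 'beta' );

my (%buggy, %fixed);

# Constant key 'raw': every line shares one counter, so only the very
# first line is ever seen as new.
my @kept_buggy = grep { !$buggy{raw}++ } @lines;

# Key taken from the line itself: one counter per distinct line.
my @kept_fixed = grep { !$fixed{$_}++ } @lines;

print "buggy: @kept_buggy\n";    # buggy: alpha
print "fixed: @kept_fixed\n";    # fixed: alpha beta gamma
```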

January 11, 2008

Help.pm

Amazing article by Mark Dominus, explaining the code of a simple but very useful class Help.pm.

January 08, 2008

DTrace'ed perl

Cool. Bleadperl runs without delays with dtrace enabled.

I wonder if this will make it to 5.10.1, hope so.

December 20, 2007

cpan tricks

Our beloved cpan command line has some tricks up its sleeve. In case you haven't read the fine CPAN manual in a while, let me point out some features I'm using right now to install all the needed modules for my day-to-day operation.

cpan .

You have a local directory with a module already unpacked, or your own personal module. The usual way to install them is doing the dance:

perl Makefile.PL
(manually deal with missing dependencies here)
make
make test
make install

The second step, the missing dependencies part, is the not so good part of the whole experience. Module::Build authors and users would suggest that the first step is the really not so good part, but I digress.

A better way is to do:

cpan .

This will start a CPAN shell, and run the install process on the local directory, including fetching dependencies from your preferred CPAN site.

failed command

Inside the shell, after you installed a long list of modules, the failed command will list all the modules that failed the tests and did not install.

o conf init /REGEXP/

The command o conf init will go through the entire configuration process. In recent versions it has become a long process and one of the things I tweak from time to time, the URL list of CPAN mirrors, is the last item asked.

To speed up the process you can o conf init /urllist/ and only configuration options matching urllist will be asked.

Don't forget to o conf commit at the end.

The smart CPAN urllist

The urllist parameter lists the CPAN mirror sites that the shell will try to use to fetch the packages.

My favorite site is a local CPAN::Mini mirror, the best 750MB of used space I have on my systems. It allows me to install any module from CPAN even when I'm offline.

But this requires that I keep updating the mirror, and sometimes, I just forget. And I don't want to cron it because I don't want to run it while I'm connected via UMTS.

A trick here is to set urllist to several CPAN mirrors in your country and include your local CPAN::Mini mirror at the end with a file: URL.

CPAN is smart and will fetch the CPAN indexes from one of the sites (see o conf init /random/ for some fun) but will always try to download the actual package from the file: url first.

So you get up-to-date indexes for free.

5.10

Just a quick word pointing out that perl 5.10 was released on the 18th, 20 years after the first perl release. As Grubber says, Perl is the best language in the world. At least to both of us.

The 5.10 release adds a lot of improvements to Perl over the 5.8.x series. I would recommend that you take the time to read through perl5100delta. It should take less than an hour, but it's definitely worth it.

My list of favorite features, in order:

The first will make DBIx::Class, Catalyst, and Moose faster and the second provides a clean path to add new features in future versions of perl.

The third is just too cool for words. You have to see it in action. It's a back-port of a Perl6 operator. As always, Perl has no shame to copy from the [rb]est, even from projects still in development.

The last one brings perl on par with other engines out there, and adds a nice non-recursive implementation and other performance improvements.

Congratulations to all perl-porters out there.

December 04, 2007

Catalyst Advent Calendar

It's back! The new and improved 2007 Catalyst Advent Calendar. (Un)Fortunately it's not a swim-suit edition yet.

The entry for today is about using the Open Flash Chart component, my favorite Flash-based chart component. It also uses the recent OFC::Chart module.

I just wish that the Calendar had a title attribute on each daily link. Finding the correct article can be painful without that.

December 02, 2007

Perl Perforce-to-git conversion

Very cool job by Sam, the current status of the Perl Perforce-to-git conversion.

Perforce was indeed very advanced ten years ago.

November 20, 2007

Wide-Finder and Perl

The Wide-Finder project has been a lot of fun. For those who think that it is not a real-world problem I'll leave you with a quote from the Wide-Finder results page:

Worth Doing · There is a steady drumbeat of commentary along the lines of “WTF? This is a trivial I/O-bound hack you could do on the command line, ridiculously inappropriate for investigating issues of parallelism.” To which I say “Horseshit”. It’s an extremely mainstream, boring, unglamorous file-processing batch job. There’s not the slightest whiff of IT Architecture or Social Networking or Service-Orientation about it. Well, that’s how lots of life is. And the Ruby version runs faster on my Mac laptop than on the mighty T5120. The T5120 is what most of the world’s future computers are going to look like! Houston, we have a problem, and the Wide Finder is a useful lens to examine it.

Anyway, last night I noticed that the top spot is now a Perl program by Sean O'Rourke. It's an extremely simple program, about 60 lines long. Worth a quick read.

Congrats Sean!

November 08, 2007

Test::Harness and Devel::Cover

I've been using the alpha/beta versions of Test::Harness for a couple of weeks now, and I'm very happy with them.

This week, Test::Harness 3.0 was released so it should start to flow naturally to all of you unsuspecting users. There are people more qualified than me to tell you what was changed and why it is a worthy upgrade.

One thing that I noticed during the beta phase was that the HARNESS_PERL_SWITCHES environment variable is no longer supported, so the classic way to run your tests under Devel::Cover:

 HARNESS_PERL_SWITCHES=-MDevel::Cover prove -l t/*.t

no longer works.

Until bug 25559 is fixed (patch is available in the ticket), the last recipe in the Devel::Cover perldoc works:

PERL5OPT=-MDevel::Cover prove -l

(you no longer need t/*.t with the new prove).

Happy testing!

October 10, 2007

MySQL Community server with CentOS 5

In case I need this again. After installing the official RPMs from the MySQL site for Redhat Enterprise Linux 5 on a fully patched CentOS 5, the startup script does not work properly, failing to start the server:

[root@centos5 log]# /sbin/service mysql start
Starting MySQL Couldn't find MySQL manager or server        [FAILED]

One possible solution is this patch. To apply do:

cd /etc/rc.d/init.d
patch -p0 < PATH_TO/mysql.rc.patch

And be done with it.

Also useful for DBI-related work is this. Log in to MySQL server as root and do:

grant all privileges on test.*  to 'melo'@'localhost';

Adjust melo for your local username.

This will make the installation of DBI and DBD::mysql work out of the box with cpan, testing everything in the process.

September 13, 2007

Sane policy for cyclic module dependencies

In medium-to-large Perl projects (case in point has 164 .pm files, far from over), cyclic module dependencies can become an issue.

Usually I use a lot of Module::Pluggable stuff, coupled with registry-style approaches. So each module is dynamically loaded based on its namespace using Module::Pluggable and registers itself into the core application.

The problem usually crops up when some registry class also requires a module that uses its services. For example, a registry module that uses a DBIx::Class schema, and some of the Tables use that same registry.

If your Registry class uses your Schema, to make sure it is loaded, then the definition of the registry method will not be available when the Table classes are loading.

There are several possible solutions to this.

First, you could simply split your Registry class into a simple Registry class and a Registry::DB class. This way, the Registry will not have the Schema dependency and all will just work. I don't like this much because the split is not natural in some situations.

Another approach is to limit the uses on each module to the minimal number of classes you need to load the module. In this case, the Registry doesn't need to use the Schema class if it doesn't require it during the compilation phase. If you only use the Schema in some methods, you only need to make sure it is loaded at the time of the method call. This can be scary if you look at the code and see a module being used without the proper use first.

A third approach is to do the least amount of stuff during compile time and use the INIT or CHECK blocks to do the proper initialization of your modules. This would be my favorite but I'm uncertain of the problems with mod_perl. (on a side note, be sure to check the proper documentation to learn about these Perl blocks: the begincheck program in there is very enlightening).

In the end, I usually use a combination of the first two. I do plan to see what the caveats are with INIT and CHECK in mod_perl, given that it would be a much better approach.
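Here is a minimal sketch of the second approach, loading the Schema only at the time of the method call. The My::Registry and My::Schema modules are hypothetical stand-ins written to a temporary directory so that the example runs on its own; no DBIx::Class is involved.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );

# Write two tiny, hypothetical modules into a temporary lib directory.
my $lib = tempdir( CLEANUP => 1 );
mkdir "$lib/My" or die "mkdir failed: $!";

sub write_module {
    my ($path, $source) = @_;
    open my $fh, '>', $path or die "Cannot write $path: $!";
    print {$fh} $source;
    close $fh or die "Cannot close $path: $!";
}

write_module("$lib/My/Registry.pm", <<'END_REGISTRY');
package My::Registry;
use strict;
use warnings;
# Note: no "use My::Schema" here, that would recreate the cycle.

my %registered;
sub register   { my ($class, $name) = @_; $registered{$name} = 1; return }
sub registered { return sort keys %registered }

sub schema {
    require My::Schema;    # loaded on first call, at run time
    return 'My::Schema';
}
1;
END_REGISTRY

write_module("$lib/My/Schema.pm", <<'END_SCHEMA');
package My::Schema;
use strict;
use warnings;
use My::Registry;

# Runs while My::Schema is loading, so register() must already exist.
My::Registry->register(__PACKAGE__);
1;
END_SCHEMA

unshift @INC, $lib;
require My::Registry;

print "schema class: ", My::Registry->schema, "\n";
print "registered:   ", join(', ', My::Registry->registered), "\n";
```

If My::Registry had a use My::Schema line at the top, My::Schema would run its register() call while My::Registry was still being compiled, before register() was defined, which is the failure described above.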

September 06, 2007

if unless

In Perl, you can do something like VERB if CONDITION or VERB unless CONDITION.

There is no clear rule about which one to use, and I can say that I've preferred one or the other at different parts of my Perl-life.

Recently, I've settled on a simple rule to decide which one to use. The decision is based on the type of CONDITION that follows.

For me, something like 'CONDITION && CONDITION' is much easier to read and understand than 'CONDITION || CONDITION'. The reason is simple: the first, using &&, has fewer positive outcomes, and usually that's the one you are testing for, so my brain doesn't need to keep several possible options in my memory. I can short-circuit and forget what was before the && as soon as I decide that it is true.

So my current rule is simple: use the one that converts your condition to a &&.

So if I have:

return unless $estado eq 'clean' || $estado eq 'dirty';

I rewrite it as:

return if $estado ne 'clean' && $estado ne 'dirty';

update: I need sleep, really. Updated the examples to make some sense. Thanks to Pedro Leite.
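Since I got the examples wrong the first time around, here is a quick sanity check that the two forms really agree. The two wrapper subs are made up for the test; the conditions are the ones above.

```perl
use strict;
use warnings;

# Returns 1 when the statement would return early, 0 otherwise.
sub early_with_unless {
    my ($estado) = @_;
    return 1 unless $estado eq 'clean' || $estado eq 'dirty';
    return 0;
}

sub early_with_if {
    my ($estado) = @_;
    return 1 if $estado ne 'clean' && $estado ne 'dirty';
    return 0;
}

for my $estado (qw( clean dirty unknown )) {
    die "forms disagree for '$estado'"
        if early_with_unless($estado) != early_with_if($estado);
    print "$estado: ", (early_with_if($estado) ? 'return early' : 'keep going'), "\n";
}
```

This is just De Morgan's law: negating an || gives you an && of the negated parts.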

August 22, 2007

XMPP-based notification of incoming emails

A nice tool that you can use to receive XMPP notifications of incoming mail.

Some facts:

  • uses inotify to detect new emails, so Linux only;
  • Maildir-based deliveries only;
  • Perl script.

My personal attempt at this was not as simple as this one. The current idea is based on qpsmtpd and ActiveMQ but it's still in the "back of the brain"-stage.

August 21, 2007

I need some sleep...

I just wrote this:

  $value =~ s/(\s)\s+/$1/g; # Highlander filter for white-space

Yeah, sleep deprivation does that to you...
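For the record, this is what the filter does to a sample string (the string is made up): each run of white-space collapses to its first character.

```perl
use strict;
use warnings;

my $value = "there  can \t be   only\n\none";
$value =~ s/(\s)\s+/$1/g;    # keep the first white-space character of each run

print "$value\n";
```

The two spaces, the space-tab-space and the three spaces each become a single space, and the double newline becomes a single newline.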

August 18, 2007

DBI and Async loops

Most of my time, I program inside async event loops like Danga::Socket or POE.

Accessing DBI inside those loops is not a straightforward thing. Most solutions involve forking worker processes and using pipes to communicate between my script and those workers. There are a couple of components for POE that do most of the work out of the box, like POE::Component::EasyDBI, but still, it feels a lot like a hack.
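The fork-and-pipe pattern looks roughly like this. It is a bare-bones sketch using only core modules: the start_worker name is invented, a sleep stands in for the blocking DBI call, and a plain IO::Select loop stands in for Danga::Socket or POE.

```perl
use strict;
use warnings;
use IO::Select;
use POSIX ();

# Run $job in a forked child; return the child's pid and the read end
# of a pipe on which the child writes one line of result.
sub start_worker {
    my ($job) = @_;

    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: do the blocking work, report back, leave.
        close $reader;
        print {$writer} $job->(), "\n";
        close $writer;
        POSIX::_exit(0);
    }

    close $writer;
    return ($pid, $reader);
}

# Stand-in for a slow, blocking DBI call.
my ($pid, $reader) = start_worker(sub { sleep 1; return 'rows=3' });

my $select = IO::Select->new($reader);
my ($result, $ticks) = (undef, 0);
until (defined $result) {
    $ticks++;    # the loop stays free to do other work here
    for my $fh ($select->can_read(0.1)) {
        $result = <$fh>;
        $result = '' unless defined $result;    # worker died without answering
        chomp $result;
    }
}
waitpid($pid, 0);

print "worker answered '$result' after $ticks loop ticks\n";
```

The parent never blocks for more than a tenth of a second, which is the whole point, and also why it feels like a hack: every query pays for a fork and a round trip through a pipe.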

For Danga::Socket loops, I've been working with two "simple" solutions:

  • split the work between sync and async tasks, using disk-based storage to move work from one side to the other;
  • use HTTP-based REST web services.

There are two more solutions that might work now. The first is the amazing DBD::Gofer. I haven't played with it yet (look over the Tim Bunce presentation at CPAN to get an overview) but it simplifies the client side of things so much that it might just be possible to tweak it into an async DBD driver. The DBI API would have to be slashed a bit, as I don't think it has an async version.

Gofer is nice, but will still require an HTTP server for the Gofer servers. And if I have an HTTP server, I might prefer to have a higher level API that can also group some queries in a single call; some of them could even involve a transaction, which Gofer does not support.

The other solution is to use Gearman. It's fast, and seems to have all the niceties for scale (multiple workers, multiple managers). But it is not reliable, at least not until the client decides to make it so with code.

All in all, I think both solutions are good, and you can even use Gofer for some things, and Gearman for others. Heck, you can even use Gearman as back-end for Gofer.

For now, I think I'll try Gearman, it seems less work, and I'm extremely lazy. But I'll get back to Gofer soon. I would love to see an asynchronous DBI API, and DBD::Gofer might just be the door.

Startup performance of DBIx::Class

In a project I was working on, I had some performance problems starting up a DBIx::Class schema with about 75 sources. It took about 19 seconds to start up.

After a quick thread in the mailing list, the startup time is now 2 seconds.

The two-part solution is this:

  1. move all your load_components() into a common class and use that class as the base for your sources;
  2. use the schema-provided load_classes(), it's very fast. If you need per-source tweaking, do it afterwards, looping over Schema->sources().

Many thanks to the dbix-class mailing list, in particular mst and Hartmaier Alexander for the tips in the right direction.

May 22, 2007

Module of the Day: Text::Unidecode

For all of those moments where you would write:

tr/áéíóúàèìòùãõâêîôûç/aeiouaeiouaoaeiouc/;

Now you can do the right thing with Text::Unidecode.
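For comparison, this is the hand-rolled version that one-liner is reaching for: a character-for-character transliteration with tr///, which only works once Perl knows the source is UTF-8. The sample text is made up. It only covers the characters you remembered to list, which is exactly the problem Text::Unidecode takes off your hands.

```perl
use strict;
use warnings;
use utf8;    # the source below contains UTF-8 literals
binmode STDOUT, ':encoding(UTF-8)';

my $text = 'coração é ótimo';

# Copy the string, then transliterate the copy in place.
(my $ascii = $text) =~ tr/áéíóúàèìòùãõâêîôûç/aeiouaeiouaoaeiouc/;

print "$text => $ascii\n";
```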

April 20, 2007

Module of the day: IPC::DirQueue

There I was cleaning up my "presentation inbox" when I came across the Scary Jifty presentation by Jesse Vincent. He's crazy, in a very sane sort of way.

Anyway in the middle of the presentation he mentions IPC::PubSub which is a nice publish/subscribe system that you can embed into your own applications.

But the prize came in the SEE ALSO section of IPC::PubSub: IPC::DirQueue. I can't remember how many times I needed something like this. Amazing stuff.

It allows you to create a set of worker processes sharing a single job queue. Clients queue jobs, and a worker picks each one up. The system assures you that each job will only be processed by a single worker.

Very simple and nice.

I need to read the code and compare it to Gearman, given that IPC::DirQueue also includes a TCP-based server.
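To show the general idea of a directory as a job queue, here is a toy version using only core modules. This is not IPC::DirQueue's API or implementation, just my own sketch of the technique: the enqueue and claim names and the directory layout are invented, and it relies on rename() being atomic on a local filesystem.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );

my $root = tempdir( CLEANUP => 1 );
for my $dir (qw( queue active )) {
    mkdir "$root/$dir" or die "mkdir $dir failed: $!";
}

my $seq = 0;

# Client side: write to a temporary name, then rename into the queue so
# that workers never see a half-written job.
sub enqueue {
    my ($payload) = @_;
    my $name = sprintf '%d-%d-%06d', time, $$, ++$seq;
    open my $fh, '>', "$root/$name.tmp" or die "Cannot create job: $!";
    print {$fh} $payload;
    close $fh or die "Cannot close job: $!";
    rename "$root/$name.tmp", "$root/queue/$name" or die "Cannot queue job: $!";
    return $name;
}

# Worker side: try to move a job into active/. Only one worker can win
# the rename for any given job; the others just try the next file.
sub claim {
    opendir my $dh, "$root/queue" or die "Cannot read queue: $!";
    my @jobs = sort { $a cmp $b } grep { !/^\./ } readdir $dh;
    closedir $dh;

    for my $name (@jobs) {
        next unless rename "$root/queue/$name", "$root/active/$name";
        open my $fh, '<', "$root/active/$name" or die "Cannot read job: $!";
        local $/;    # slurp the whole file
        my $payload = <$fh>;
        return $payload;
    }
    return;
}

enqueue('resize photo 1');
enqueue('resize photo 2');
while (defined(my $job = claim())) {
    print "working on: $job\n";
}
```

A real implementation also has to deal with workers that die while holding a job, which this sketch ignores completely.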

April 16, 2007

Tip: XML::LibXML and Debian

If even after you apt-get install libxml2 libxml2-dev you still can't install XML::LibXML, look for this error message:

CPAN.pm: Going to build P/PH/PHISH/XML-LibXML-Common-0.13.tar.gz

enable native perl UTF8
running xml2-config... ok
looking for -lxml2... no
looking for -llibxml2... no
libxml2 not found
Try setting LIBS and INC values on the command line
Or get libxml2 from
  http://www.libxml.org/
If you install via RPMs, make sure you also install the -devel
RPMs, as this is where the headers (.h files) are.

Basically, the installation of XML::LibXML::Common failed.

To solve, try this:

apt-get install zlib1g zlib1g-dev

Apparently libxml2 depends on -lz but the .deb doesn't notice or something.

February 23, 2007

Module of the day: encoding::warnings

Today I was tracing a double-encoding error in a web app. The Fun never ends!

Anyway, with encoding::warnings the problem was easily spotted.

So, if you are getting "garbage" output on some of your Perl scripts, do yourself a favor and try it. A good tip from an excellent write up about UTF8 and Perl.

And remember, boys, girls, and mix-ins, $1 after m/(.)/ is a character, no matter how fat he is.
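A short illustration of both points, using only the core Encode module: what m/(.)/ captures on octets versus characters, and what double encoding looks like. The sample string is made up.

```perl
use strict;
use warnings;
use Encode qw( decode encode );

my $bytes = "\xC3\xA9t\xC3\xA9";         # the UTF-8 octets for "été"
my $chars = decode('UTF-8', $bytes);     # three characters

my ($first_octet) = $bytes =~ m/(.)/;    # just the first octet, 0xC3
my ($first_char)  = $chars =~ m/(.)/;    # the whole "é", however many octets it takes

printf "octets: %d, characters: %d\n", length($bytes), length($chars);
printf "first octet: 0x%02X, first character: U+%04X\n",
    ord($first_octet), ord($first_char);

# Double encoding: encoding octets that were already UTF-8.
my $once  = encode('UTF-8', $chars);
my $twice = encode('UTF-8', $once);
printf "encoded once: %d octets, encoded twice: %d octets\n",
    length($once), length($twice);
```

The second encode() treats each of the four high octets as a Latin-1 character and expands it to two octets, which is the "garbage" you end up seeing in the browser.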

February 22, 2007

Module of the day: File::Inplace

The task was simple enough: edit an already existing file, to change something, keeping the name intact, and if possible doing a backup of the original version.

We could start dealing with all sorts of errors in open, rename and friends, dealing with temporary files and all that stuff.

Or we could just jump to CPAN:

cpan File::Inplace
perldoc File::Inplace

In my case, I wanted to remove a line from a file. A regexp matching the line in question was something like this:

qr/src="trivantis-titlemgr.js"/

So the code becomes:

  use File::Inplace;
  use File::Spec::Functions qw( catfile );

  my $editor = File::Inplace->new(file => catfile($path, 'titlemgr.html'), suffix => '-'.time().'.bak' );
  while (my ($line) = $editor->next_line) {
    $editor->replace_line(undef) if $line =~ qr/src="trivantis-titlemgr.js"/;
  }
  $editor->commit;

Nice and sweet.

Update: to clarify some points raised in the comments, yes I know about perl -i.bak, but I needed this inside a Catalyst web application, and thus this module.

February 05, 2007

Tip: given a coderef, show me where the code is

In Perl, sometimes you have a coderef, normally a callback, and you need to know where that code is located in your source.

This will do the trick:

sub _dude_wheres_my_coderef {
    my $code = shift;

    use Devel::Peek;
    $code = \&$code; # guarantee a hard reference
    my $gv = Devel::Peek::CvGV($code) or return;
    return *$gv{PACKAGE} . '::' . *$gv{NAME};
}

Taken from dump_vars.pl. BTW, $name also includes filename and line number.
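If you also want the file and line, the core B module can get them straight from the coderef. This is a different technique from the Devel::Peek one above, sketched under a few assumptions: describe_coderef and My::Callbacks are names I made up, and it expects a pure-Perl sub with at least one statement in its body (an XS sub has no first statement to ask for a line number).

```perl
use strict;
use warnings;
use B ();

# Describe a coderef: fully qualified name plus the file and line where
# its first statement lives.
sub describe_coderef {
    my ($code) = @_;
    return unless ref($code) eq 'CODE';

    my $cv = B::svref_2object($code);
    my $gv = $cv->GV;
    return {
        name => $gv->STASH->NAME . '::' . $gv->NAME,
        file => $cv->FILE,
        line => $cv->START->line,
    };
}

package My::Callbacks;
sub on_message { return "got: $_[0]" }

package main;
my $where = describe_coderef(\&My::Callbacks::on_message);
print "$where->{name} at $where->{file} line $where->{line}\n";
```

For an anonymous sub the name part comes out as __ANON__, but the file and line are still useful.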

January 02, 2007

Tip: deleting all objects in a many-to-many relation with DBIx::Class

This might be obvious to many of you, but it wasn't for me, so here it is.

If you have a many-to-many relation in DBIx::Class, you can remove all the relations with:

$self->set_RELATION_NAME([]);

The set_RELATION_NAME method is created automatically for each many-to-many relation you set up. It can receive a list of objects that will become related to $self.

The obvious

$self->set_RELATION_NAME();

will croak on you.

GraphViz under Mac OS X

In case you are trying to install the Perl module GraphViz under Mac OS X 10.4.x, you might need this script.

First, install a decent version of GraphViz. I use these snapshots of Graphviz by Ryan Schmidt. Be careful not to use this version on MacIntels, the package is PPC-only.

After this is done, add /usr/local/graphviz-2.12/bin to your PATH.

Second, while installing the GraphViz module via CPAN, one of the dependencies is IPC::Run, which might fail on your system. It's a known issue with FreeBSD that seems to also be present on Mac OS X. It hangs while testing t/pty.t.

To skip those tests, do this:

  1. inside cpan shell, type look IPC::Run;
  2. edit t/pty.t, go to line 97 (looks something like this: my $platform_skip = $^O =~ /(?:aix|freebsd|openbsd)/? ...;);
  3. add |darwin after openbsd, save and exit;
  4. type the usual perl Makefile.PL && make && make test;
  5. if all goes well, type sudo make install or the version that works for you.

After that, you should be able to install GraphViz without any problems.

December 31, 2006

MySQL and UTF8 support

I've talked about this before. Basically, if you use UTF8 in your database, the scalars returned by DBD::mysql wouldn't have the utf8 flag turned on for text fields.

Until now.

With the latest release (DBD-mysql-4.0000), things have changed. A patch from Dominic Mitchell was applied (in DBD-mysql-3.0008_1 according to the changelog) that does the right thing regarding UTF8 text fields.

Check the documentation (search for mysql_enable_utf8) to see how it works. You can also look at the utf8.t test script for examples of usage.

I'll post back my findings after I play with it.

May 06, 2006

Tip: reset the TOP format in Perl

Here is something that took me a while to get.

If you are writing scripts in Perl to do some reports and statistical analysis, and you are not using format, stop right now and do a perldoc perlform.

Then, in case you need to switch reports in the middle of the script, for example if your script dumps the data collected in three different layouts, use this code between each report:

$~ = 'REPORT_NAME';
$^= 'REPORT_NAME_TOP';
$= = 100;  # Number of lines on each page
$- = 0;    # Force the print of the new header

You can use English and use the long versions of these variables. See perldoc perlvar for those.

The $- is the one that had me pulling my hair. If you switch to a new layout and start writing data, the system doesn't know that this is a new layout, and will not print a header until it needs to switch the page. Setting this variable to 0 forces a page feed, and forces the new header to be printed.
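Here is a complete little script that switches between two layouts with those four assignments. The format names, the switch_report helper and the data are all invented for the example.

```perl
use strict;
use warnings;

our ($name, $count);

format BY_NAME_TOP =
Name       Count
---------- -----
.

format BY_NAME =
@<<<<<<<<< @>>>>
$name,     $count
.

format BY_COUNT_TOP =
Count Name
----- ----------
.

format BY_COUNT =
@>>>> @<<<<<<<<<
$count, $name
.

# Point the current output handle at a new report and its header.
sub switch_report {
    my ($report) = @_;
    $~ = $report;
    $^ = "${report}_TOP";
    $= = 100;    # Number of lines on each page
    $- = 0;      # Force the print of the new header
    return;
}

my %data = ( apples => 3, pears => 7 );

switch_report('BY_NAME');
for my $key (sort keys %data) {
    ($name, $count) = ($key, $data{$key});
    write;
}

switch_report('BY_COUNT');
for my $key (sort { $data{$b} <=> $data{$a} } keys %data) {
    ($name, $count) = ($key, $data{$key});
    write;
}
```

Comment out the $- = 0 line and the second report comes out under the first report's header, which is the problem described above.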

March 11, 2006

Lisbon.pm: tech meeting and a new course

Last Thursday we had another technical meeting of the Lisbon.pm group. It was a great success, with 29 people attending.

There were three presentations: one by João Gomes about Catalyst, another by me about POE with an example of process control, and the last one by Miguel Duarte about when not to use Perl, which was, as you can expect, a hot topic.

If you are interested in Perl and live or work around Lisboa, please join our mailing list (instructions can be found at the Lisbon.pm website).

The meeting was organized by José Castro, and the space (and first round of drinks afterwards) was sponsored by Log. Kudos to them both.

I've been the Lisbon.pm leader for some time now, and since the reactivation of the group last September, our social meetings have been better each time and our technical meetings have also been great.

Yet most of the work of organizing our events is being done by José, so it's only fair to make him the leader of the group. So after forcing^H^H^H^H^Htalking with him about this, he finally accepted.

I think the group is now in better hands.


December 29, 2005

utf-8 and DBD::mysql

After an afternoon trying to understand why some of my output from a utf8 table in MySQL was coming out garbled, I finally realized that:

  • even if your tables and database are all created with utf8 charset;
  • even if you set your connection charset to utf8 with SET NAMES 'utf8';

your scalar results in perl will not have the utf8 flag set, so any print, concatenation or XML generation further on will result in a mess when finally printed out to an XMPP stream, for example.

So, in all your code, after you retrieve data from MySQL, you must set the utf8 flag on that scalar.

For now I'm using this code. Probably not the best one, but it suffices for now.

if (! utf8::is_utf8($message)) {
  utf8::decode($message);
}
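If you fetch whole rows, the same check can be wrapped in a small helper. A sketch only: decode_row is a made-up name, and the array literal stands in for what a fetchrow_arrayref() call would hand back.

```perl
use strict;
use warnings;

# Hypothetical helper: flag every column of a fetched row as UTF-8,
# using the same check as above.
sub decode_row {
    my ($row) = @_;
    for my $column (@$row) {
        next unless defined $column;    # leave NULL columns alone
        utf8::decode($column) unless utf8::is_utf8($column);
    }
    return $row;
}

# Stand-in for a row coming back from the database: raw octets.
my $row = decode_row([ 42, "Jo\xC3\xA3o", undef ]);
printf "%d characters\n", length($row->[1]);    # 4 characters
```

utf8::decode() returns false and leaves the scalar untouched when the octets are not valid UTF-8, so binary columns that happen to pass through are not mangled.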

There is some discussion about this online. It seems that the DBD::mysql people are waiting for a general solution for the problem to appear in a future version of DBI. There is also a patch floating around that sets the flag on utf8 content.

If you use Class::DBI, you can also look at Class::DBI::utf8 that does the right thing.

Regarding support for this in DBI itself, there is a thread by Tim Bunce that talks about utf8 support in DBI in a future version, in particular bullet 4 of the initial post. But the next bullet points the responsibility of the utf8 flag to the drivers.

This quote in particular should be self-explanatory about Tim's reasoning:

Some features, like charsets, vary greatly in how they're handled by database APIs. For these kind of features the DBI usually lags the drivers. Once a few drivers have implemented their own driver-specific interfaces, and had them proven as practical by users, then I can work with driver authors to see how best to extend the DBI API in a way that'll work well for those drivers and others.

And a more specific one regarding DBD::mysql:

Basically it should be the job of the drivers to set the uft8 flag on data being retrieved if it is utf8. I believe that the new mysql v4.1 protocol does provide information about the characterset of each colum. DBD::mysql can use that.

I would like to see that patch merged into the DBD::mysql mainline. It seems that Tim Bunce is passing on the responsibility of the utf8 flag to the driver author. It makes some sense. If the DBI layer was responsible for setting the flag, it would need to obtain charset information from the DBD driver anyway. In that case, if the driver already knows which charset it is using, why not just set the flag? This would make it easier to work with utf8 in the meantime...

Stay tuned for the next chapters in the utf8+DBD::mysql saga...

Update: another interesting link about MySQL, utf8, and Moveable Type.


December 14, 2005

Ruby on Rails 1.0

Ruby on Rails hits 1.0. Congratulations to all in the core team.

Since discovering Catalyst, I've left Ruby on Rails behind me, but it was RoR that got me hooked on Ruby, and for that I thank them. Ruby really is a lovely and clean language.

I can only hope that Perl6 cleans some of the "issues" with Perl5. And from what I can see from the presentations I've seen in the last year, it will :).


September 30, 2005

The state of the Onion

As usual, a must read. The State of the Onion n. 9 is here.

Favorite quotes:

Wu-Li isn't actually Chinese. He only thinks he's Chinese because when he was young his parents told him that every third child born into the world was Chinese, and he was a third child.

and another:

As I was thinking about the intelligence community and its recent obvious failures, it kinda put a new spin onto the phrase, "Information wants to be free," or my own version of it, which is that "Information wants to be useful."

and this rings some bells:

We often think that intelligence failures are caused by having too little information. But often, in retrospect, we find that the problem is too much information, and that in fact, we had the data available to us, if only it had been analyzed correctly.

So I'm just wondering if we're getting ourselves into a similar situation with open source software. More software is not always better software. Google notwithstanding, I think it's actually getting harder and harder over time to find that nugget you're looking for. This process of re-inventing the wheel makes better wheels, but we're running the risk of getting buried under a lot of half-built wheels.

Perl6 polyglot

Very nice article on O'ReillyNet about Perl6.

Perl6 Polyglot by Geoff Broadwell -- What Perl 6 is learning from other languages -- and why you should learn a few too.

Go! Read!

August 26, 2005

Week summary

No, I'm not going to talk more about Google Talk :).

I'm going to bed now, I've been up since 5am to do an upgrade to our XMPP server platform. It went pretty well, but we still have some glitches with S2S. They should be solved real soon now.

I wanted to talk to you about the stack of Perl books I have here, from Higher-Order Perl to Perl Testing, going through Perl Best Practices, but it will have to wait until next week. I need to do them justice, they deserve it. It's really nice to see very good books about Perl being published.

I've also been reading Programming Ruby and the Rails book. Yeah, I know, some of you think it's just a fad. Indulge me, ok? I'm still recommending that anybody who wants to pick up a language try Ruby. I've collected some presentations about it from OSCon'05 that I'll link next week.

I've also finished my Trac setup, and it is now totally automated to keep me using darcs or, at work, CVS, and have all the niceties of the great Wiki, Tickets and Source browser integration. I'll be moving my lost projects to it in the coming weeks.

Finally, I updated my personal Jabber server to the latest Jive Messenger. I had been using ejabberd for some time now, but I wanted something that I could just install, run a script and be done with it. I can only say that I was able to do just that with Jive Messenger: I think it took me 15 to 20 minutes to have a fully featured XMPP server. The newest version, 2.2, is the first one I can use since it's the first to have S2S. They don't have TLS, but it's not important to me right now: my client of choice, Psi (which incidentally also had a new release today, check the new Psi wiki, btw), does not support TLS for now. But if TLS is important to you, don't despair: adding TLS to Jive Messenger is a Google Summer of Code project.

Also, Movable Type 3.2 is out, looking good. For those of us who still like to run and waste our time tinkering with our blog software, it seems a worthy upgrade. I was very impressed with the upgrade demo. I was thinking of upgrading my software (lots of bugs, and more trackback spam than I can read in a lifetime), and I was shopping around for alternatives. MT 3.2 seems to be crawling back to the top of my list of choices, followed by typo and wordpress, in that order.

With all the Google Talk, I've been a bit away from Apple news. The IDF conference this week had a lot of cool stuff (my favorite: the end of the north bridge on Intel chipsets); you should check the AnandTech site for all the goodies. I didn't have the time to read too much into it regarding Apple's future hardware offerings, but it looks promising.

Yet, my interest regarding Apple news will grow in the coming weeks. The Apple Paris Expo is coming up in late September, and I would like to see a last rev of G4 Powerbooks coming out of Mr. Jobs keynote. I would be a happy camper to see a dual-core G4 17" or bigger Powerbook... But I'll try to keep my expectations low.

All in all, a fun week. My only regret is that my coding time is getting to be a lot less than it usually was, just when it was getting to be fun again. I've been making an effort to do more developer testing, and so far I'm loving it. It can be hard sometimes, with the pressure of every day, but it's a great stress reliever for me. I should join Phalanx some day, I really should.

Have a nice weekend.


August 23, 2005

The future is here

If you don't have the time to keep up with all that's going on in the perl6 world, you can get a taste of the language now. Rafael Garcia Suarez has written an IRC bot named Shakti.

July 02, 2005

Weaken references in Perl

In case you need circular structures in Perl, you should know about Scalar::Util and its weaken function.

See this example:

package T;

sub DESTROY {
  my $self = shift;
  print STDERR "Bye bye: $self->{name}\n";
}

package main;
use Scalar::Util qw( weaken );

{
  my $a = bless { name => 'a' }, 'T';
  my $b = bless { name => 'b' }, 'T';

  $a->{b} = $b;
  $b->{a} = $a;
}
print "Should see a destroy 'a' and 'b', but you wont...\n";

{
  my $c = bless { name => 'c' }, 'T';
  my $d = bless { name => 'd' }, 'T';

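  # weaken() works in place and returns nothing useful:
  # store the reference first, then weaken the stored copy.
  $c->{d} = $d;
  weaken($c->{d});
  $d->{c} = $c;
  weaken($d->{c});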
}
print "Should see a destroy 'c' and 'd'!\n";
print "Now you'll see a destroy 'a' and 'b'\n";

The output is this:

Should see a destroy 'a' and 'b', but you wont...
Bye bye: d
Bye bye: c
Should see a destroy 'c' and 'd'!
Now you'll see a destroy 'a' and 'b'
Bye bye: b
Bye bye: a

The problem is that in the first block, although the $a and $b are no longer in scope, each one holds a reference to the other, and that prevents them both from being destroyed.

In the second block, each one of $c and $d takes a weak reference to each other. Weak references don't increment the reference count internal to all Perl variables, so at the end of the block, they are correctly destroyed.
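The same idea shows up in the common parent/child case, where only the back-reference needs to be weak. What follows is a minimal sketch, not taken from the original post: the Node class, its add_child method and the @destroyed array are hypothetical names made up for illustration. The only library calls used are weaken and isweak from Scalar::Util.

```perl
use strict;
use warnings;

package Node;   # hypothetical class, for illustration only
use Scalar::Util qw( weaken );

our @destroyed;  # records names as objects are freed

sub new {
  my ($class, $name) = @_;
  return bless { name => $name, children => [] }, $class;
}

sub add_child {
  my ($self, $child) = @_;
  push @{ $self->{children} }, $child;  # strong: the parent owns the child
  $child->{parent} = $self;             # copy the reference first...
  weaken( $child->{parent} );           # ...then weaken the stored copy
  return $child;
}

sub DESTROY { push @destroyed, $_[0]{name} }

package main;
use Scalar::Util qw( isweak );

{
  my $root = Node->new('root');
  my $leaf = $root->add_child( Node->new('leaf') );
  print "back-reference is weak\n" if isweak( $leaf->{parent} );
}
# Both objects are gone here, despite the parent <-> child cycle.
print "freed: @Node::destroyed\n";
```

The design choice is to keep one direction strong (the parent owns its children) and weaken only the other, so a child never keeps its parent alive.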

You've been warned.

June 04, 2005

Ruby on Rails, and Catalyst

So, after having learned a bit about Ruby, the next logical step for me was to try Ruby-On-Rails.

A bit of background: before going to work at Sapo on the XMPP Messenger product, I was a lead developer at Novis. My main task there was to build websites and provisioning systems, and the glue that ties billing systems to technical databases. We used a framework that had been developed internally since '99, and in its latest generation was called Apache::WAF. It was an MVC-style framework, with a strict separation between code and layout. The View was Template Toolkit, but it was extensible to other technologies (we also used Mason a lot). The Model was our own provisioning system libraries and other stuff. Apache::WAF allowed us to build the Controller very quickly. Also, it was pretty fast, using mod_perl.

So using MVC-style frameworks in web development is nothing new to me.

Except that Ruby-On-Rails is much more than that, and in an extremely clean package. You can get up to speed in no time, thanks in part to the code generation that is the base of RoR, and also to the screencasts available at the Ruby-On-Rails website.

Having a clean and powerful language behind it sure helps, too.

It's not perfect. The part of mapping URLs to controllers could be better. RoR uses the same setup that Apache::WAF uses: you register URL namespaces on a special config file and associate them with controllers. The main advantage I see with this approach is that you can look at this single config file and know all the URLs of your app, so clashes are less likely.

The Model part of RoR is just excellent. As far as I can tell, there is no Perl equivalent to ActiveRecord. Yes, yes, I know about Class::DBI and Alzabo, and the others. But those solutions solve the CRUD problem: creating, retrieving, updating and deleting one object at a time. They all have solutions for forging relations between tables and objects. But ActiveRecord does all that and more. It's the first framework I've seen that solves the 1+100 queries problem efficiently (and I really expect, and would love, to see some perl guru jump on me on this one and say "Have you looked at X on CPAN?"...).

The 1+100 queries problem is simple to understand: suppose you have a Books table and an Authors table (for simplicity, a book can only have one author), and you want to list all the books with their respective authors.

With a basic CRUD framework, you do a Books->find_all or Books->search to get all the books and for each one, you do something like $book->author->name to get the author name. If the first query for books returns 100 books, you'll then do 100 queries for authors. Yes, yes, you could cache ID->name of authors, but that really is not the point. The point is that something that should be done with a simple join at the database level, in a single query, is now being done with multiple queries (and if you write to me saying that "Yeah, but my MySQL is fast enough you won't even notice", I'll warn you right now that I'll verbally abuse you. The real world is a lot different).

ActiveRecord is able to solve this. You can say "Please give me all the books, and also their respective authors" and with a single query to the database, you'll get a list of Book objects with the author method returning a pre-created object. You can even specify which fields you need from the author.
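To make the difference concrete, here is a rough sketch in plain DBI. It is not how ActiveRecord works internally, just the two query patterns side by side. It assumes DBD::SQLite is installed (an in-memory database keeps it self-contained); the schema, the sample rows and the list_naive and list_joined subs are all made up for illustration.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available; schema and data are invented.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '', { RaiseError => 1 });
$dbh->do('CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)');
$dbh->do('CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER)');
$dbh->do('INSERT INTO authors (id, name) VALUES (?, ?)', undef, @$_)
  for [1, 'Author One'], [2, 'Author Two'];
$dbh->do('INSERT INTO books (title, author_id) VALUES (?, ?)', undef, @$_)
  for ['Book A', 1], ['Book B', 2], ['Book C', 1];

# Naive approach: one query for the books, then one more per book.
sub list_naive {
  my ($dbh) = @_;
  my $queries = 1;
  my $books = $dbh->selectall_arrayref(
    'SELECT title, author_id FROM books ORDER BY id');
  my @rows;
  for my $book (@$books) {
    my ($name) = $dbh->selectrow_array(
      'SELECT name FROM authors WHERE id = ?', undef, $book->[1]);
    $queries++;
    push @rows, "$book->[0] by $name";
  }
  return ($queries, @rows);
}

# Join approach: the database pairs books and authors in a single query.
sub list_joined {
  my ($dbh) = @_;
  my $pairs = $dbh->selectall_arrayref(
    'SELECT b.title, a.name FROM books b
       JOIN authors a ON a.id = b.author_id ORDER BY b.id');
  return (1, map { "$_->[0] by $_->[1]" } @$pairs);
}

my ($n_naive,  @naive)  = list_naive($dbh);
my ($n_joined, @joined) = list_joined($dbh);
print "naive: $n_naive queries, joined: $n_joined query\n";
```

With the three sample books, the naive version issues 1+3 queries and the join issues one, and both return the same listing. Scale the books table up to 100 rows and you get the 1+100 case.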

It's very very good! The Model in Ruby-On-Rails is for me the part that makes the most difference to the other MVCs I've used in the past.

Ruby and Perl

I've been learning Ruby for the last couple of weeks. I bought the book some time ago, and I've been reading and rereading several sections of it.

Ruby is a very very good language. Blocks are one of the most productive features I've ever used in a programming language.

Yet, I use Perl every day. I've been a Perl user since 4.something patch level 35 or 36, and I really like perl5. Also, at work, I need Perl and POE to build XMPP components. It's a very productive setup. Also, after more than 10 years in Perl-land, you get to know the language, the modules and the community very well, and that helps.

But I can't stop thinking that Ruby would be useful in a lot of situations at work. Right now, I think there are 4 or 5 people there looking at Ruby, but all with a "Wouldn't it be great to use this?" type of thinking, and no "Let's use this right now". I understand exactly why we think like that, it's a very big company, a long-lived project, with a lot of legacy, and frankly, the right project for Ruby doesn't seem to have emerged yet.

I've been looking for XMPP stuff in Ruby. There is Jabber4R, which is very good for doing a bot in a hurry, but I lack an event-driven framework like POE in Ruby so that I can write my external components.

So I'll keep using it for small things, to keep the language alive in my brain. I'll try to figure out how to achieve critical mass around me to start using it in a big way.

Of course, I must also think about perl6. But I'll confess: for a long-time perl5 programmer like me, perl6 is scarier than Ruby. It seems (I haven't read that much about it, I just keep up with the mailing list summaries) very powerful, but I think that the language is getting more cryptic. I don't know if that's a language feature, or the current perl6 gurus showing off the power and expressiveness of the language. Anyway, for me, it's scary. I've been fighting for the last 4 or 5 years to write, and teach other people to write, good maintainable perl5 code, and I haven't seen examples of maintainable code in perl6.

This year I'll be attending YAPC::EU and probably YAPC::NA later this month. And my main goal is to see what perl6 is all about, and if it is as scary as it seems.

May 28, 2005

Perl6 example

Perl6 JAPH:

say [~] (-> @c is copy {gather { while @c[0] { for @c -> {take(.shift)} } } }(['Joec','utrk','shle','te6r',' r .','a h.','nPa.'].map:{[split "",$_]}));

Something tells me that learning Perl6 is going to be an adventure. And totally different from my perl4 to perl5 conversion.

Either that, or I'll be sticking with perl5/ponie for a looong time.

April 30, 2005

XMPP and Perl

I'm writing an external component in POE::Component::Jabber. It's a custom development that will link to an XMPP server.

POE::Component::Jabber is nice, but it seems to me that I'm investing in something that I won't be able to reuse.

There is no clear way to create a standalone module or plugin to hide the details of a certain protocol (like the jabber:iq:register).

I have some old code, XMPP::Session, that should solve this problem. The module is network agnostic: you can use STDIN/STDOUT, POE::Component::Client::TCP or even a Socket. I really need to clean it up and share it to see if there is any interest in it.

Tiger notes

As I said before, I'm not upgrading to Tiger until sometime next month. Yet, I need to keep some links to things that I need to test or fix in my Tiger setup.

The good news in Tiger for scripting fans like me is that Perl and Ruby have decent and recent versions. The bad news is that at least Ruby has a bad configuration file. From what I could understand, it seems that they left a CFLAGS=-arch i386 in a config file... I can't wait for the rumor sites to get this one... A fix is available.

One thing I want to test is the new CoreData framework. Cocoa Dev Central has two articles entitled "Core Data Class Overview" and "Build a Core Data App" that seem very nice.

I'll keep this post updated with all my things-to-see in Tiger-land.

April 24, 2005

Clarification about last post

The fact that I had to write so little code is a tribute to two things: the design of Apple APIs is very clean; and Ruby integration with ObjectiveC runtime is almost perfect.

Doing a web browser with XCode is equivalent to the early 90's 'Hello world'.

One interesting thing. I'm almost sure that I can reuse the same NIB file with a CamelBones project. I'll try it next week and write the same app with it.

April 01, 2005

POE and Rendezvous

You can find on CPAN my recent itch. I wanted to add discovery to some POE servers I have running on my laptop and on local servers at the office.

Enter