September 30, 2008

The Complete incomplete boxed set

Amazon UK is selling a "Complete" boxed set of Battlestar Galactica, seasons 1 through 4.

Given that the second part of the fourth season will only air next year, how come this isn't plainly false advertising?

Amazon should either correct the description or pull the item.

September 23, 2008

Cheap 1GB MicroSD cards

Apparently soon you'll have a supply of cheap 1GB MicroSD cards. You just need to format them.

Some people will never learn...

September 20, 2008

Push to multiple repositories

This I did not know. You can have several URLs per remote in Git, and git-push will update them all.

On a personal project I have this:

[remote "backup"]
    url = git@github.com:melo/perl-sapo-broker.git
    url = git@git:melo/perl-sapo-broker.git

A simple git push backup master and all is well.
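
The same setup can be created from the command line with git remote set-url --add instead of editing .git/config by hand. A minimal sketch, assuming two throwaway bare repositories in a temporary directory stand in for the GitHub and private server URLs above:

```shell
#!/bin/sh
# Sketch: one remote with two URLs. The local bare repositories are
# stand-ins (assumptions) for the real GitHub and private server URLs.
set -e
work=$(mktemp -d)
git init -q --bare "$work/mirror-a.git"
git init -q --bare "$work/mirror-b.git"

# A small project with a single commit to push.
git init -q "$work/project"
cd "$work/project"
git config user.name "Example"
git config user.email "example@example.invalid"
echo hello > README
git add README
git commit -q -m "first commit"

# The first URL creates the remote; --add appends a second url entry.
git remote add backup "$work/mirror-a.git"
git remote set-url --add backup "$work/mirror-b.git"

# A single push updates every URL listed for the remote.
branch=$(git symbolic-ref --short HEAD)
git push -q backup "$branch"
```

Fetches from such a remote only use the first URL, so this arrangement is best suited to push-only backup remotes.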

Installing gitosis

Gitosis is a wonderful little system to manage Git repositories, providing access over SSH, with tight access control and using only one shell account.

The installation instructions in the README.rst and the Hosting Git article by Garry Dolley give you most of what you need to install it. But they only cover the most basic installation, where everything is in your system PATH.

My setup is not standard at all, so the process needs to be tweaked a bit.

Although not mentioned, Gitosis requires a recent version of Python (at least more recent than my system 2.3.4) and setuptools (also missing from my system).

I chose to compile all the dependencies. To isolate this as much as possible, I created an account, gitdeps, to hold all the stuff I need to run Gitosis.

I logged in as gitdeps and did:

# make sure other users can use these commands
chmod 711 $HOME
mkdir src && cd src

# Install Python
wget http://www.python.org/ftp/python/2.5.2/Python-2.5.2.tgz
tar zxf Python-2.5.2.tgz
cd Python-2.5.2
./configure --prefix=$HOME
make
make install
cd ..
export PATH=$HOME/bin:$PATH

# Install setuptools
wget http://peak.telecommunity.com/dist/ez_setup.py
python ez_setup.py

# Install Git
wget http://kernel.org/pub/software/scm/git/git-1.6.0.2.tar.gz
tar zxf git-1.6.0.2.tar.gz
cd git-1.6.0.2
./configure --prefix=$HOME
make
make install
cd ..

# Install Gitosis
git clone git://eagain.net/gitosis.git
cd gitosis
python setup.py install

You should have all the software needed to run Gitosis now.

The rest of the installation is pretty simple. You need a couple of things:

  • choose a directory to hold all the files: we will assume /home/git but you can use whatever you want;
  • a user account for the system: usually this user is git. You can have several Gitosis installations in the same server, each one using a different user;
  • the SSH public key of the user that will be the initial administrator of Gitosis.

To create the git user, you should use the proper tool for your operating system. The README.rst provides the command to run on a Debian-like system. I'm using CentOS so the command is this:

# As root
useradd \
      -s /bin/sh \
      -c 'git version control' \
      -r \
      -d /home/git \
      git
mkdir -p /home/git
chown git:git /home/git

After this, you just need to initialize the Gitosis system. Do:

# As root
PATH=/home/gitdeps/bin:$PATH
export PATH
sudo -H -u git gitosis-init < /path/to/gitosis_admin_ssh_public_key.pub

You should see two lines of output:

Initialized empty Git repository in /home/git/repositories/gitosis-admin.git/
Reinitialized existing Git repository in /home/git/repositories/gitosis-admin.git/

On a standard system, that would be it. But we have all the binaries in a non-standard directory, /home/gitdeps/bin. To make sure that they are found, we need to tweak the SSH installation.

First, you need to create an SSH environment file with the proper PATH to use:

# as root
echo "PATH=/home/gitdeps/bin:/bin:/usr/bin:/usr/local/bin" > ~git/.ssh/environment
chown git:git ~git/.ssh/environment
chmod 400 ~git/.ssh/environment

Then you need to make sure that your sshd is configured to read the file. Edit the /etc/ssh/sshd_config file. There are two settings you must check:

  • PermitUserEnvironment: must be yes;
  • UseLogin: must be no.

If UseLogin is yes, proceed with caution. You might break ssh service for other users. One alternative (left as an exercise to the reader) is to use a separate sshd just for the git user.
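
Before restarting, a quick check of the two settings can save a lockout. A rough sketch, assuming a plain sshd_config without Match blocks or Include files; the function names are made up for this example:

```shell
#!/bin/sh
# Sketch: report the value sshd would use for a keyword.
# sshd takes the first value it finds for each keyword, and keywords are
# case-insensitive. Match blocks and Include files are ignored here.
sshd_setting() {
    # $1 = config file, $2 = keyword, $3 = default when the keyword is absent
    value=$(awk -v key="$2" '
        tolower($1) == tolower(key) { print tolower($2); exit }
    ' "$1")
    echo "${value:-$3}"
}

# Succeeds only when both settings have the values this setup needs.
check_gitosis_sshd() {
    # $1 = path to sshd_config
    [ "$(sshd_setting "$1" PermitUserEnvironment no)" = "yes" ] &&
    [ "$(sshd_setting "$1" UseLogin no)" = "no" ]
}
```

Run it as check_gitosis_sshd /etc/ssh/sshd_config && echo "looks fine". Commented-out lines are skipped because their first field starts with #.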

Restart your sshd. And we are done.

To manage Gitosis, you clone the gitosis-admin.git repository. Inside your local copy, you'll find a gitosis.conf and a keydir/ directory with the public keys of all the users, in the format USER_ID.pub.

# on your laptop/desktop
git clone git@server.domain.here:gitosis-admin.git
cd gitosis-admin
ls -la *
-rw-rw-r--  1 melo  staff  91 Sep 20 15:44 gitosis.conf

keydir:
total 8
-rw-rw-r--  1 melo  staff  666 Sep 20 15:44 melo@simplicidade.org.pub

Have the appropriate amount of fun.

September 19, 2008

Dropbox is open

I missed the announcement a couple of days ago, but Dropbox is now open. They also released their Linux client.

Dropbox is what iDisk should have been: a simple, just-works way to share files between computers. Right now I have a couple of folders shared with other Macs and Windows boxes, without any problems.

Recommended.

TextMate Scratches

Scratches is a new TextMate bundle that I'm finding very useful.

Basically it allows you to take snippets of code, "scratch" them, and then reuse them in other places. A sort-of glorified multi-buffer copy&paste. I believe the inspiration came from a BBEdit feature.

The current version (see below for instructions to install) has a very nice scratch viewer. You can see how it looks with the small screencast (authored by Hans-Jörg Bibiko).

The Scratch bundle is still in the review section of the Macromates SVN. To install do:

export LC_CTYPE=en_US.UTF-8
export LC_ALL=
cd ~/Library/Application\ Support/TextMate/Bundles
svn co http://macromates.com/svn/Bundles/trunk/Review/Bundles/Scratch.tmbundle

Tell TextMate to reload its bundles (menu Bundles > Bundle Editor > Reload Bundles) and you should be ready to use it.

The cool part is that this bundle was created in a couple of days on the TextMate mailing list. If you are curious, the thread starts here.

Cisco to acquire Jabber Inc.

Wow... Cisco will own the Jabber trademark.

(via IMtrends)

Perl and database access

The base of all database work with Perl is the DBI module. There is no possible argument about this.

If you need something more high-level, things get dicey.

For the last 2 years I've been using the most excellent DBIx::Class. Of all the ORMs that I've used so far, it is the best one out there, and I still recommend it if you want to get up and running fast (quick tip: start with DBIx::Class::Schema and the load_namespaces() API, do not use load_classes()).

DBIx::Class has some great features:

  • comprehensive test suite: self-explanatory;
  • extensible: you can add pretty complex things on top of DBIx::Class. You can override every method using the magical Class::C3 foundation;
  • manage schema versions: you can use your DBIx::Class::Schema to manage your SQL schema, including versioning and updates. Some manual tweaking is required, no out-of-the-box tool to do this for you, but that's expected anyway;
  • nested transactions: this is a killer feature for me. In the past (2001/2003 timeframe) I worked on a system (IPGng::SQL for those who know what that is) that also had this, and it makes your code much more robust and simple.

Not everything is a good fit for me, though. There are two main sticky points.

The first is the use of SQL::Abstract as the query representation language.

Learning a new language that I can only use in the context of Perl is a personal waste of time. I have been using SQL with Oracle, MySQL, Postgres and SQLite for quite some time, and I can pretty much do whatever I need with it. Even when I hit a wall, I have a couple of Celko books behind me that usually help me jump over them.

So for me, the SQL::Abstract advantages aren't actually advantages.

The second sticky point is a mismatch between DBIx::Class and my needs in terms of Object-oriented modeling.

Usually I have one entity in my object model (say for example Members) that is mapped to multiple tables (personal data, login information).

With DBIx::Class, I have the business logic dispersed into Result and ResultSet classes, per table, not per entity. This sort of works, but I find it messy for more complex projects.

So for a new project that I'm starting, I'm not using DBIx::Class. I don't have a full replacement, just parts that I'm enjoying putting together.

The first two modules that will be part of the final solution are DBIx::Simple and SQL::Interp.

They provide the minimal set of tools that such a lower layer over DBI should have.

Above that, I'm still working on it. So far I have a wish list:

  • transactions: nested transactions with commit/rollback hooks;
  • DSL for schema versions: must have data dictionary features.

The first part will probably be a small Transaction module. I'll most likely use code like the txn_do of DBIx::Class::Storage; I like the syntax very, very much.

The commit/rollback hooks are required because I need an "almost" two-phase commit protocol.

Picture this: you have two systems, a transactional database and a non-transactional messaging/pubsub system. Inside a DB transaction, you publish some events. If those events reach subscribers before the DB transaction is committed, they will not find the DB up-to-date.

What I need is to delay the actual publishing of the events until the final commit of the DB.
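
The idea can be sketched in a few lines of shell. This is only an illustration of the queue-until-commit behaviour, not the planned Perl module: every function and variable name is hypothetical, the "messaging system" is just a string, and it assumes any rollback aborts the whole nested transaction.

```shell
#!/bin/sh
# Sketch: events published inside a transaction are queued and only
# delivered when the outermost transaction commits.
nl='
'
txn_depth=0       # current nesting level
txn_pending=""    # queued events, one per line
delivered=""      # stand-in for the real pub/sub system

deliver() { delivered="${delivered}$1${nl}"; }

txn_begin() { txn_depth=$((txn_depth + 1)); }

publish() {
    if [ "$txn_depth" -gt 0 ]; then
        txn_pending="${txn_pending}$1${nl}"   # hold back until commit
    else
        deliver "$1"                          # no transaction open
    fi
}

txn_commit() {
    txn_depth=$((txn_depth - 1))
    if [ "$txn_depth" -gt 0 ]; then
        return 0                              # inner commit: keep waiting
    fi
    # Outermost commit: the DB is up to date, so flush the queue.
    delivered="${delivered}${txn_pending}"
    txn_pending=""
}

txn_rollback() {
    # Assumption: a rollback at any level aborts everything.
    txn_depth=0
    txn_pending=""
}
```

Subscribers never see an event from a transaction that was rolled back, and never see one before the data it refers to is committed.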

The second wish is a DSL that I can use to generate the SQL, with tools to manage upgrades between versions. This will be a SQL::Translator-based project, with a Parser class for my DSL.

This allows me to use all the Producers that the SQL::Translator project already has, and also the tools to diff SQL schemas.

The data dictionary part is the important feature. I want to declare types, including Perl code to validate and format values, in a central place, and then use them in the schema.

This DSL must also generate a base set of classes that you can extend with custom behaviors. It also provides an introspection interface that can be used to generate HTML forms and validation profiles.

It seems like a lot of work, and it is. But it's also an incremental process going back some years, and I don't mind the wait.

I see you, and raise an Extreme

It seems that the WebKit team just raised their bet in the JS Engine poker game.

Without comparing with other engines, which is a game in itself, the speed up from previous versions of Safari to the current Webkit is nothing short of amazing.

I was looking around for a Safari version 2.x to see how it compares, but I don't have one anymore. Still, a ten-fold increase between the 3.0 release and this one is impressive.

Update: By the way, I'm a Safari user, so comparing to other engines is not something that I really care about. I'm much more interested in comparisons against previous Safari versions. Still, you can find another comparison between engines here.

Although others will be more qualified to answer this, I wonder how fair it is to compare Squirrelfish Extreme with Google V8 on a Mac. The reason is this: I assume that the V8 running on Windows uses a lot of optimizations that might or might not be in the Mac version.

Ruby svn has some support for Unicode codepoints

A couple of years ago I looked at Ruby to see what the fuss was all about.

I found a clean language with a couple of features that I really liked (the blocks with yield stuff) but also two major annoyances:

  • their version of CPAN was very limited at the time in the areas that I required (namely asynchronous network programming and XMPP libs);
  • lousy support for Unicode handling.

The library has been growing, and although you can find good XMPP libraries now, I still haven't seen anything like the AnyEvent Perl framework. I saw someone (forgot where) mentioning something called rev, but so far I could not find any solid references to it.

Unicode support was the stickiest point at the time, and the reason why I didn't keep learning Ruby. I thought I would wait for a Ruby implementation with full Unicode support before looking at it again.

Apparently, 1.9 might be said release. The latest SVN version has something called String#each_codepoint that seems a step in the right direction.

I'll wait for the final release to look at it again, but it's nice to know that something is coming.

By the way, I'm not thinking about stopping using Perl, not in a million years. I just use a lot of Ruby-based projects and I want to add some features on some of them, that's all.

September 18, 2008

AWS newest offer: instant S3-based CDN

The process is simple: upload your stuff to an S3 bucket, call an API, receive a DNS name that you can use.

Done. The bucket is now CDN-ized.

I don't think you could make this any simpler. You can read more about it at the Amazon Web Services blog or at Amazon CTO Werner Vogels' blog, where I found out about it.

I wonder how long it will take AWS to become the horse pulling the wagon at Amazon. What I mean is this: AWS was created to open Amazon's cloud data center to other users, but all the solutions made available so far are things that Amazon itself required. How long will it be until we start to see features that AWS customers request, and that eventually Amazon itself will use? I wonder if the elastic IP features were such an item.

September 17, 2008

DNS prefetching in Chrome

Interesting tidbit:

A major goal of Google Chrome was to improve user enjoyment and value in web surfing. Critical to that is increasing the responsiveness of the browser to user input, or reducing user perceived latency. Measurements in the browser have shown that a significant amount of time is traditionally spent waiting for DNS to resolve domain names. To speed up browsing, Google Chrome resolves domain names before the user navigates, typically while the user is viewing a web page. This is done using your computer's normal DNS resolution mechanism; no connection to Google is used. As a result, user navigation time in Google Chrome when first visiting a domain is on average about 250ms faster than traditional browsing, and the occasional but painful 1-second-plus delays are almost never experienced.

I don't expect this to add much load to the DNS infrastructure and it should be a visible performance improvement for end users.

Dirac hits 1.0 milestone

Dirac, a GPL'ed wavelet-based video compression algorithm, reached the 1.0 milestone.

I'm not expecting an Apple-official QuickTime component (there is a non-Apple project to create a QT component for Dirac, including an encoder), but I'm interested in comparisons with H.264. Anyone?

EFI-x is shipping

I'm thinking of ordering one of these EFI-x dongles, to play around. I wonder how well it works.

My motherboard is supported... hmms...

September 16, 2008

Updated x-git-update-to-latest-version: now with man pages

I don't contribute code to the git project, so the least I can do is use the master version daily.

As I explained previously, I have an automatic process to do that.

To keep my git up-to-date, I use my x-git-update-to-latest-version script.

This script updates my local clone of the git repo (locally at ~/work/track/git), then configures and installs it (at /usr/local/git-VERSION, where VERSION is the output of git describe), and updates the /usr/local/git symlink.

This way, I can have /usr/local/git/bin in my PATH and I'm always using the latest version.

The latest version of this script also installs the man pages. You need to tweak your MANPATH to include the /usr/local/git/share/man directory.

For extra points, you can also run this script as a cronjob. Mine runs at 10:30 in the morning.
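
The PATH and MANPATH tweaks, plus the cronjob, could look like the lines below. This is only a sketch: the location of the script in the crontab comment is an assumption, and it relies on the trailing colon that an empty MANPATH leaves behind to keep the system default man directories in the search path (most man implementations treat it that way, but check yours).

```shell
# In ~/.profile or similar: put the symlinked git first.
export PATH="/usr/local/git/bin:${PATH}"
export MANPATH="/usr/local/git/share/man:${MANPATH:-}"

# Crontab entry (add with `crontab -e`) to run the update at 10:30 every day.
# The script path is hypothetical:
# 30 10 * * * $HOME/bin/x-git-update-to-latest-version
```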

Update: Junio is on vacation so the master branch is not being updated. In the meantime you are encouraged to use the Shawn O. Pearce branch of git. I tweaked the x-git-update-to-latest-version to deal with the fact that Shawn's tree does not have tags.

XMPP presentation at Barcamp

The first weekend of September, I went to Barcamp in Coimbra. I was only there for the first day, but I got to know and talk to a lot of people that I usually only read online, which was pretty cool.

I gave a presentation entitled WTF is XMPP?, mostly centered around non-instant messaging applications of XMPP. I think it went well: the room was pretty full, and after the event I talked a lot with people who plan to use XMPP in the short term.

The organizers did a wonderful job, and I want to thank Alcides for pushing me to write the presentation.

Update: cool, the presentation was featured on the SlideShare homepage.

Project baseline

For quite some years now, I have been using some compile.sh scripts to set up my baseline system for each project. If you happen to work in Portugal, at one of my previous employers, you might find them somewhere in /servers.

(Historical note: the /servers nomenclature was devised between '95 and '97, either at Telenet or IP Global. It later was refined at other companies to include /servers/etc, /servers/logs, /servers/data, and /servers/workspace for specific purposes. You can still find it today at several places in Portugal, given that the *NIX/ISP environment community was pretty small at that time.)

Each compile script (I lost count of the ones I have, from Perl, Apache, mod_perl, libraries, djb-tools) downloads the version of the package that I want to use, then configures, builds and installs it in a local directory. Each project has its own version of the baseline.

In the past I used to create two user accounts for each project, named P and Pbase. The first is used to keep and run all the code for said project, and the second is used to install all the dependencies.

The system has worked pretty well but I wanted to clean it up, so I coded a basic system to replace my compile scripts.

It's pretty raw, but I plan to move all my compile.sh scripts to it.

I'm sure that this was already done. It has a lot of similarities with RPM or .deb packages, but this one is my own, and I can take it to whatever level I want.

If you are curious about it, you can find the code and rule sets (minimal for now) in the official repository.

September 05, 2008

Auto-save

Jack Moffitt was bitten by auto-save.

My auto-save setup is "Save when TextMate loses focus" but yesterday I was scripting something better that will be a great auto-save post-script.

When I start to work on something experimental, I would like to have a snapshot of every path I take and undo. Sometimes I write some code, and then say "naahhh, wont work ok", and undo it, without any record. And this is bad, because some of those actually were a good path after all.

Right now, the script does some git add voodoo and a git commit, so that I have a commit of the entire workspace every 2 minutes.

I'm still tweaking some details: I would like these commits to exist in a parallel repository, not my main one, so I'm playing with having a second .git control directory inside the main one.

Anyway, if your auto-save could also commit the file to a changes repository, then you would have all the changes since you started.
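
One way to get such a parallel repository is to point git at a second control directory with GIT_DIR and GIT_WORK_TREE. A minimal sketch, not the actual script: the function name, the commit identity and the shadow repository location are all assumptions.

```shell
#!/bin/sh
# Sketch: snapshot a work tree into a separate "shadow" repository, so
# the main repository history is left untouched.
snapshot() {
    # $1 = work tree to snapshot, $2 = shadow repository (created on first use)
    [ -d "$2" ] || git init -q --bare "$2"
    shadow=$(cd "$2" && pwd)
    (
        cd "$1" || exit 1
        # Point git at the shadow repository instead of ./.git
        GIT_DIR=$shadow
        GIT_WORK_TREE=$(pwd)
        export GIT_DIR GIT_WORK_TREE
        git add -A .
        # Commit only when something actually changed.
        if [ -n "$(git status --porcelain)" ]; then
            git -c user.name=autosave -c user.email=autosave@localhost \
                commit -q -m "autosave $(date '+%Y-%m-%d %H:%M:%S')"
        fi
    )
}
```

Something like snapshot ~/work/project ~/work/project/.git/autosave.git, run from the editor or from cron, keeps the shadow repository inside the main control directory, where git already ignores it.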

September 04, 2008

Memory tricks

Interesting read to catch up on current PC architecture. Favorite quote:

One developer we consulted about the issue noted, "consumers are being scammed by [PC] OEMs on a large scale. OEMs will encourage customers to upgrade a 2GB machine to 4GB, even though the usable RAM might be limited to 2.3GB. This is especially a problem on high-end gaming machines that have huge graphics cards as well as lots of RAM."

"Microsoft even changed the way the OS reports the amount of RAM available; rumor is, due to pressure from OEMs," the developer told us. "In Vista and prior, it reported usable RAM, while in SP1 they changed it to report installed RAM ignoring the fact that much of the RAM was unusable due to overlap with video memory." And so many PC users are installing 4GB of RAM in their PCs and thinking that it is being used by the system, when in fact it is no more beneficial than if the RAM were simply poked halfway into the CD slot.

How long until some group of people sues their asses off?

Need script: extract common .pm files from multiple directories

I'm re-factoring an old site where the art of source control went out the window somewhere in the past.

The current problem I'm trying to solve is multiple versions of the lib/ directory, each one with their own copies of the same .pm files, but some of them with local modifications.

As a first step I want to create a single central lib/ that will take files that are the same on all the other directories.

We can automate the process by using Digest::SHA1 on each file. If their signatures match, then they are the same file, and can be moved to the central lib/.

Before I write this, does anybody have this script already written? Thanks!
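
A rough shell sketch of the idea, using the sha1sum or shasum command-line tools in place of Digest::SHA1. It only lists the .pm files whose digest is identical in every directory and leaves the actual move to the central lib/ to the reader; the function names are made up for this example.

```shell
#!/bin/sh
# Sketch: list the .pm files that are byte-identical in every lib/ directory.

# Print the SHA-1 digest of a file, using whichever tool is installed.
sha1_of() {
    if command -v sha1sum >/dev/null 2>&1; then
        sha1sum < "$1" | cut -d' ' -f1
    else
        shasum -a 1 < "$1" | cut -d' ' -f1
    fi
}

common_pm() {
    # $1 = reference lib directory, remaining arguments = the other copies
    ref=$1
    shift
    ( cd "$ref" && find . -type f -name '*.pm' ) | sort |
    while IFS= read -r rel; do
        want=$(sha1_of "$ref/$rel")
        same=yes
        for dir in "$@"; do
            # A missing file or a different digest disqualifies the module.
            if [ ! -f "$dir/$rel" ] || [ "$(sha1_of "$dir/$rel")" != "$want" ]; then
                same=no
                break
            fi
        done
        if [ "$same" = yes ]; then
            echo "${rel#./}"
        fi
    done
}
```

For example, common_pm site1/lib site2/lib site3/lib prints one relative path per line, ready to feed to a copy or move loop.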

September 02, 2008

Pretty and useful

If you strive for a stress-free life and keep programming at the same time, I assume you know that automated testing and test-driven development are essential tools.

I've been using them for most (not all) of what I do in the last year or so. Basic stuff, using Test::More and friends, and more recently Test::Most and using the basic prove tool and Devel::Cover for extra peace of mind.

I was reading the latest edition of the Test Automation Tips and they mentioned Smolder and another Perl module. I knew Smolder already. It is one of those tools that I always keep on my list to install someday, but never get to do it because I expect it to be hard (without any reason to think that, mind you).

The other module is TAP::Formatter::HTML and it just blew my mind. Simple installation with cpan TAP::Formatter::HTML, a single tweak to my run_all_tests.sh script and gorgeous looking HTML reports. I don't have them online but you can look at a sample report (be sure to click around).

Strongly recommended.

Google and WebKit: a love story?

With a 38 page comic, you can get to know a bit about Google Chrome, the Google browser. Highlights:

  • uses WebKit as renderer;
  • it has its own JS engine, written by Team V8, and it includes a JIT;
  • each tab is a separate "process" running inside a jail or sandbox;
  • Gears is built-in;
  • allows you to run Web-based apps in a chrome-less window;
  • project completely open-source: I think they mean source available, but maybe I'm a pessimist regarding Google openness.

Questions for the next days:

  • When they talk about processes, is it really a new process in the UNIX sense? It seems so, and if yes, it is a good idea;
  • When they talk about process jail, are we talking in the BSD sense, or chroot sense, or something softer?
  • The plug-ins are made to be (and probably with a grain of truth) the bad guys. Adobe and the Flash team must be ecstatic.

The big winner seems to be the WebKit project: this is the second time Google chooses WebKit over the competition.

More than a browser, Chrome seems tailor made to run WebApps, like the ones Google has been building in the last years, and therefore another step in their quest of operating system independence.

Update: a blog post about Chrome at the official Google Blog. First beta will be Windows-only unfortunately.

September 01, 2008

I don't write academic papers

Bittorrent RAID, cool.

He should write academic papers, though.