July 21, 2008

Tweaks to MySQL installation

After you install one of the MySQL packages available for the Mac, there are some steps that you should take.

First, make sure your MySQL installation knows about time zones. This is important if you want to run your MySQL in the UTC time zone.

To update the mysql database time zone tables, do:

mysql_tzinfo_to_sql /usr/share/zoneinfo | mysql -u root -p mysql

Type the password at the prompt (hit enter if you don't have one yet).

And then make sure all your date/datetime fields use the Highlander-timezone. Edit my.cnf and add:

[mysqld]
default-time-zone=utc

Second, make sure your server is using UTF-8 everywhere. Add to my.cnf:

[mysql]
default-character-set=utf8

[mysqld]
character-set-server=utf8

Third, set mysql to strict SQL:

[mysqld]
sql-mode="TRADITIONAL,NO_ENGINE_SUBSTITUTION,ONLY_FULL_GROUP_BY"

Finally, make sure you use InnoDB by default:

[mysqld]
default-storage-engine=InnoDB

There are probably more tweaks to make your MySQL saner. These are the ones I feel comfortable recommending.
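Taken together, the four fragments above can sit under a single [mysqld] heading. The sketch below only prints the merged fragment; the function name print_mysql_tweaks is made up for illustration, and the option values are exactly the ones listed above.

```shell
#!/bin/sh
# print_mysql_tweaks is a hypothetical helper name, not a MySQL tool.
# It prints the settings from this post merged into one fragment.
print_mysql_tweaks() {
    cat <<'EOF'
[mysql]
default-character-set=utf8

[mysqld]
default-time-zone=utc
character-set-server=utf8
sql-mode="TRADITIONAL,NO_ENGINE_SUBSTITUTION,ONLY_FULL_GROUP_BY"
default-storage-engine=InnoDB
EOF
}

print_mysql_tweaks
```

Review the output before pasting it into your my.cnf, and restart the server afterwards so the [mysqld] options take effect.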

DBD::mysql and db_imp errors

I'm installing all my MySQL stuff in a Leopard 10.5.4 desktop and I made some mistakes along the way that I thought about documenting here for future reference.

First, although the hardware and OS are 64-bit in a lot of places, the standard perl installed is not one of those. So stick with the i386 MySQL package (or try a 64bit server, but use the 32bit client...). I'm using the Proven Scaling MySQL packages mentioned earlier, and I'm happy so far.

Second, make sure your regular user has all privileges to the test databases. I just do:

melo@DogsHouse:lib $ mysql -uroot -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 108
Server version: 5.0.62-enterprise-gpl MySQL Enterprise Server (GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> grant all privileges on test.* to 'melo'@'localhost';
Query OK, 0 rows affected (0.00 sec)

mysql> quit
Bye

This will make your DBD::mysql tests much happier.

Third, in case you see failing DBD::mysql tests with:

Can't use dbi_imp_data of wrong size (127 not 124) at ...

Upgrade your DBI. I'm now with 1.605 and no dbi_imp_data errors anymore. Clean DBD::mysql install.

July 04, 2008

No more bazillion git-* commands

In case you use the script from the last post, be advised that the current master branch of git.git no longer installs all those git-* on your PATH.

The current git/bin/ contents are:

melo@MrTray:melo $ ls -l /usr/local/git/bin
total 14664
-rwxr-xr-x   89 root  wheel  2826820 Jul  4 10:05 git
-rwxr-xr-x    1 root  wheel   573476 Jul  4 10:05 git-receive-pack
-rwxr-xr-x    1 root  wheel  2826820 Jul  4 10:05 git-upload-archive
-rwxr-xr-x    1 root  wheel   994596 Jul  4 10:05 git-upload-pack
-rwxr-xr-x    1 root  wheel   273636 Jul  4 10:05 gitk

I'm using version v1.5.6.1-204-g6991357. This is not the final 1.6 release (the next one), so you might see further commands added (git-shell might join this list).

At least for me, this required some training because I liked to git-TAB to complete...

x-git-update-to-latest-version

This morning in the git mailing list, I wrote a small shell recipe to update your git to the latest version but keeping the previous ones around, in case something goes wrong.

I noticed that what I wrote was a lot better than the hack-and-slash script I was using, so I promoted a cleaned up version of that to my scripts stash. Hence, you can now download x-git-update-to-latest-version and enjoy painless git updates.

You need to tweak two things at the top of the script:

  • the location on your hard drive of the git.git clone (you need to create that with git clone git://git.kernel.org/pub/scm/git/git.git first);
  • the base directory where all the git versions will live.

You'll end up with something like this (I use /usr/local as my base directory):

melo@MrTray:melo $ ls -dla /usr/local/git*
lrwxr-xr-x   1 root  wheel   25 Jul  4 10:05 /usr/local/git -> git-v1.5.6.1-204-g6991357
drwxr-xr-x   7 root  wheel  238 Jul  4 10:05 /usr/local/git-v1.5.6.1-204-g6991357

Each version will be named git-VERSION where VERSION is the output of git-describe. A symbolic link named git points to the latest version.

Just add BASEDIR/git/bin to your PATH and you are done.
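The link switch behind this layout can be sketched in a few lines. This is not the downloadable script, just a minimal illustration under my own naming: switch_git_version and its two arguments are hypothetical.

```shell
#!/bin/sh
# switch_git_version BASEDIR VERSION
# Hypothetical helper: points BASEDIR/git at BASEDIR/git-VERSION.
switch_git_version() {
    basedir=$1
    version=$2
    target="git-$version"

    # Refuse to create a dangling link.
    if [ ! -d "$basedir/$target" ]; then
        echo "no such install: $basedir/$target" >&2
        return 1
    fi

    # Relative link target, as in the listing above. The -n flag replaces
    # an existing link instead of descending into the directory behind it.
    ln -sfn "$target" "$basedir/git"
}

# Typical use, run from inside the git.git clone (not executed here):
#   version=$(git describe)
#   make prefix="/usr/local/git-$version" && sudo make prefix="/usr/local/git-$version" install
#   sudo sh -c '. ./this-file.sh; switch_git_version /usr/local "$0"' "$version"
```

Because older git-VERSION directories are left in place, rolling back is just a matter of calling the function again with the previous version string.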

Small tweak to my mail setup

I try to keep my inbox empty, but due to a lack of a task manager that I can feel good about, I don't have a place to put pending tasks.

So sometimes I leave them on my inbox. Not good.

Until I find a good system that I like to use, to keep my tasks and projects, I made a small adjustment to my mail habits. I already have an extensive list of rules to filter mailing lists and other regular emails to proper folders.

What I did now was this:

  • created an Interesting Folders smart folder: basically, it joins together the folders (either mailing lists, or my regular inbox) that I want to keep a close eye on;
  • created a Quick Read smart folder: simple condition - show everything from Interesting Folders that is not read.

So each time I open up my email client, I just look at the Quick Read folder. It shows only new mail messages that I haven't read yet. The thing I like the most is that if I need to do something later, I can just leave it there and it will disappear on my next check, but still be safely stored on the original folder.

It's not perfect by any means, but it's working very well for me.

July 02, 2008

MySQL advice

When people ask me what MySQL to use, I used to respond "Go to http://dev.mysql.com/ and download the community edition". I recommend it over any version of MySQL that is bundled with your OS.

But I also listen to people who know more than me when it comes to MySQL, and one of those just asked (and presented facts) whether those binaries are in fact dead.

So right now, my new advice on MySQL choices is this: read the latest MySQL Performance blog article and decide what you want.

I'm going to test Jeremy Cole's releases of MySQL to see if they work correctly with my environment, and I'll keep you posted.

One thing will still send me to dev.mysql.com, though: Mac OS X binaries. Jeremy only has RedHat Enterprise Linux RPMs. Update: I was wrong. At least the Enterprise version has Mac OS X binaries. Thanks to Joaquim Carvalho for pointing them out to me.

June 30, 2008

Behind the scenes of Tarpipe XMPP

Alex and Adam asked me to elaborate on the tools I used to implement the Tarpipe XMPP gateway.

I used the Net::XMPP2 Perl module, in particular the Net::XMPP2::Component class. It uses the AnyEvent async framework, which in turn supports the EV library, giving you all the love of kernel polling.

Until the Net::XMPP2 author releases a new version, you should use my own copy if you plan on doing external component work (check the component-reply-with-from branch). Net::XMPP2 is widely used for bots, but not that many components, and some methods need some love to work properly in a component environment.

The HTTP part was done using the excellent AnyEvent::HTTP class. I'm using my own version, which includes a bug fix to the http_post function (on its way to release 1.03 of AnyEvent::HTTP) and adds support for HTTP::Request to http_request. I hope to see this included in the main AnyEvent::HTTP distribution but I still need to update the documentation.

The rest is just glue code.

June 28, 2008

*jour

On the topic of Bonjour goodies, take a moment to read about, and install, some of the *jour tools.

Very cool stuff.

Bonjour, CPAN!

For a Perl programmer, a local (on your laptop) CPAN mirror is a worthy investment. The problem is that a full mirror is 5.8GB of disk space. Fortunately we have CPAN::Mini, which creates a mirror of the most important stuff using only 830MB.

So now you have your local mirror, and after you add the path to your cpan urllist configuration, all your module installations will use this faster mirror.

But you shouldn't stop there. If you are using a Mac with 10.4.x or above, you can share your CPAN mirror with the others on your local LAN, and announce it proudly using Bonjour.

To do that, just follow these steps. First create an Apache configuration file at /etc/httpd/users/. I called mine cpan.conf and it looks like this:

#
# My local mini CPAN mirror
#

Alias /cpan/ /Users/melo/Documents/cpan/
<Directory "/Users/melo/Documents/cpan/">
    Options none
    AllowOverride none
    Order allow,deny
    Allow from all
</Directory>

RegisterResource "Local CPAN Mirror" /cpan/ 80

This will share your CPAN mirror (change /Users/melo/Documents/cpan/ to the path of your local mirror) and announce it via Bonjour with the name "Local CPAN Mirror".

Make sure that Web Sharing is started, under System Preferences, Sharing. If it is already running, stop it and start it again to load the new configuration file.

To use this CPAN mirror from other computers, you just start cpan, and then type o conf urllist URL where URL is the URL advertised.

Reconnoiter

Something to pay attention to, Reconnoiter.

It's still a work in progress, but in the last status report Theo Schlossnagle mentions that he is already using this software to monitor about 2.9k services with less than 0.10 load at peak.

That's impressive.

June 27, 2008

MySQL master/master how-to

New location: the How-to is now live at http://www.simplicidade.org/how-to/mysql-master-master/.

In the past, I needed to use a MySQL master/master setup for a client. At the time, I got it to work, but forgot to take notes about the process.

In the last few days, a friend asked me for help to do a setup like that again, to use with a Tigase XMPP server.

So I wrote this how-to, complete with configuration files, sample data and schemas, that walks you step-by-step through the process of setting up a pair of MySQL servers in a master/master configuration.

You can fetch the entire how-to with git:

git clone git://github.com/melo/mysql-master-master-how-to.git

The how-to is written in MultiMarkdown, but a simple Markdown should also format it correctly.

A tarball with everything is also available.

Please let me know if you find any mistakes. Pull requests are the preferred form of collaboration :).

Update: included some tweaks from dbr, and added an HTML version with proper CSS. Download again, and look at how-to.html.

June 25, 2008

Git starters

In the last two or three weeks, I had about five or six people IM'ing me "What's the best way to start with git?". Last time I wrote about this was almost 7 months ago, so I think it's time for an update.

First, I personally compile git from source. I do not trust package maintainers yet. Some parts of git use ssh (for example), so they mark ssh as a dependency, and this brings their own version of ssh. More often than not, I ended up with a broken ssh on my system. So, compiling from source is my personal recommendation for now, until package managers understand system-ssh.

Second, git is still being developed at a fast pace. I recompile it once a month at least, and I follow the master branch. That is, after I compiled the latest stable release from a tarball, I clone the git repo and I install the master branch version. Usually a make distclean && autoconf && make && sudo make install does it for me.

You will be happy with the official releases. There was a time that you had to wait a lot between releases, but now, they are coming out at a fairly decent pace. So sticking with the stable tarballs is a good enough option.

Regarding actually using git, I have three recommendations:

If you like dead-tree-stuff like me, watch out for the Pragmatic Version Control Using Git book (now in beta, PDF available), by Travis Swicegood.

For hosting your git repos, you can't go wrong with either Github or Gitorious. I'm using the first one because I like all the user interface niceties, but the second is also available as a GPL project that you can use on your own site.

June 16, 2008

How projects use git: Buildr

For me, the best articles about distributed version control systems are not those that explain the internals and the user interface, but those that show how specific projects use those tools to get things done.

I've come across posts about several projects and how they use git and mercurial, and I'm going to start publishing them here.

The first one is a couple of months old but still good: Buildr developers and how they were using a git-svn mirror.

Have fun.

June 04, 2008

Simple Bonjour-enabled command line copy/paste service

Sometimes I need to send someone a bit of code in the local LAN, or even to myself on the second Mac.

It would be nice to:

pbpaste | publish_local_lan

And on the other side:

receive_local_lan from_melo | pbcopy

There are a lot of tools like that which run under Mac OS X, but today I found one that can be run from the command-line: pastejour.

Installation is trivial:

sudo gem install dnssd
sudo gem install jbarnette-pastejour --source=http://gems.github.com

And after that, just use it. It will use your short name as the broadcast key. A sample session:

melo@MrTray:melo $ date | pastejour 
Pasting melo...

# in another computer

melo@Screen:melo $ pastejour melo
(melo from MrTray.local.:42424)
Wed Jun  4 10:40:54 WEST 2008

Simple and effective.

June 03, 2008

RailsConf 2008 Git Tech Talk

Scott Chacon did a huge presentation (523 slides...), Getting Git, at RailsConf 2008. Don't be scared by the number of slides: the presentation is excellent and you'll end up with a huge knowledge of git.

Extremely recommended.

June 02, 2008

Easy Git collaboration in local LANs

Doing quick hack sessions in a local LAN with friends using git just got a lot easier.

Evan announced gitjour, a Bonjour-enabled Git server. You can start a server for any repository on your laptop/workstation and others can browse the available repositories and easily clone them.

There is already a lot of work going around gitjour by a couple of developers, so it has a nice future ahead.

Nice...

May 30, 2008

Making better use of the XMPP presence in Bots

The current crop of the XMPP-based bots is pretty basic. They provide an online presence in the XMPP network, and you interact with them by a command-line-style interface.

These are some suggestions that bot authors can use to increase your XMPP presence easily. I'm going to stick to easy stuff only, most of it widely deployed.

The first thing that you need to understand is that you can have different per-contact presence information. You can switch the status message and the avatar for each of your contacts. It's called directed presence (basically include the contact JID in the to attribute, but you can always check the spec for directed presence).

You can use this to provide a personalized status message for each person on your roster. If your service has a notion of context, you can use the status bar to show the current context. For example, imagine that you have a ticketing system bot. You type "working on TICKETID", and your bot can change the status message to "Working on: TICKETID - TICKET_TITLE (URL of ticket)".
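As a rough illustration of what such a directed presence stanza looks like on the wire, here is a small sketch that prints one. The helper names xml_escape and directed_presence, the JID and the ticket details are all made up for the example; a real bot would hand the stanza to its XMPP library instead of printing it.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Escape the characters that are unsafe in XML text and attribute values.
xml_escape() {
    printf '%s' "$1" |
        sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' -e "s/'/\&apos;/g"
}

# directed_presence JID STATUS
# Prints a presence stanza addressed to a single contact.
directed_presence() {
    to=$(xml_escape "$1")
    status=$(xml_escape "$2")
    printf "<presence to='%s'><status>%s</status></presence>\n" "$to" "$status"
}

directed_presence 'melo@example.com' 'Working on: 42 - Fix login (http://example.com/t/42)'
```

The only thing that makes the presence "directed" is the to attribute; everything else is an ordinary presence stanza.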

Optionally you could update the status every five minutes with the elapsed time since you started working on it.

You can also switch the avatar. Maybe switch to a red background, meaning that you are busy doing something, or putting it in another way, the bot is telling you that he thinks you are working on something.

Another thing you should do is provide an alternative representation of the information you send textually. Twitter (when it used to work...) was a great example of this: every tweet you received by IM included an Atom entry with the most important information of the tweet. This allows clients to improve their interaction with such services, by using a more appealing interface.

But Atom entries are not yet standardized by the XMPP Foundation (there is some effort on Atom over pubsub that could be used as an example though). On the other hand, Data Forms are a standard, and more, they provide interaction possibilities.

You could send a "New ticket" notification, including a data form with the structured information, and a select-box with the possible next steps. The user would see the form, and submit back their decision.

Still in the subject of messages, you should know that there are several types of messages. You have chat, headline and normal (strangely enough the default). There are two more but not important in this context. You should use the correct one depending on the type of message you are sending.

If your system is just sending notifications, please use the headline type. It makes it clear to clients that they are not chatting with you, just being notified of something that happened. For example, a smart client can react to those messages differently based on your status: in away, xa (extended away) or dnd, a client can keep a log of new headlines and show it like an email client when you return, without interrupting your workflow, probably using a badge with "You have X pending headlines".
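A notification sent this way differs from a chat message only in its type attribute. The sketch below prints such a stanza; headline_message and xml_escape are hypothetical helper names, and the JID and ticket text are invented for the example.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Escape the characters that are unsafe in XML text and attribute values.
xml_escape() {
    printf '%s' "$1" |
        sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' -e "s/'/\&apos;/g"
}

# headline_message JID SUBJECT BODY
# Prints a message stanza of type headline.
headline_message() {
    to=$(xml_escape "$1")
    subject=$(xml_escape "$2")
    body=$(xml_escape "$3")
    printf "<message to='%s' type='headline'><subject>%s</subject><body>%s</body></message>\n" \
        "$to" "$subject" "$body"
}

headline_message 'melo@example.com' 'New ticket' 'Ticket 42 was opened: Fix login'
```

Leaving the type attribute out entirely gives you a normal message, which is why notifications so often end up with the wrong type by accident.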

The final recommendation: respect the user status. If the user is dnd, are you sure you should send him anything? Why not change the status message to "You have 5 messages pending. Send 'pending' or go to URL to see them"? Maybe even change the avatar color. You should not disturb someone in dnd. I would also say that away and xa are off limits but that's just me.

On the other hand, you should send "Hey there, welcome back! You have 5 pending messages. Send 'pending' to see them or hop on to 'URL'." when the presence changes to chat or available (not a real <show> value, just lack of <show> value. See the valid values of <show> and their meanings in the spec).
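The policy described in the last two paragraphs boils down to a small decision on the <show> value of the user's latest presence. A minimal sketch, with a made-up function name and made-up action names (queue, send):

```shell
#!/bin/sh
# delivery_action SHOW
# Hypothetical helper: decides what a bot should do with a pending
# notification, given the <show> value of the user's last presence.
# An empty argument stands for "no <show> element", i.e. plain available.
delivery_action() {
    case "$1" in
        dnd)      echo queue ;;  # never disturb someone in dnd
        away|xa)  echo queue ;;  # stricter choice from the text; some bots may prefer to send
        ''|chat)  echo send ;;   # available or explicitly willing to chat
        *)        echo queue ;;  # unknown value: assumption, err on the quiet side
    esac
}

delivery_action dnd
delivery_action ''
```

Queued messages would then be announced through the status message, and delivered (or summarized) when the presence changes back to chat or available.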

Have fun implementing your bots.

May 28, 2008

Fluid.app + Google Gears = Sweet!

This is great news: Todd Ditchendorf just released a nightly build (link removed) of Fluid.app that includes Google Gears.

I'm a big fan of Google Gears, and Fluid is getting a lot of love regarding integration with the desktop.

BTW, this is a good reason for me to upgrade to Leopard.

update: Todd giveth, Todd taketh away.

TextMate: a new take on Search with Ack

Apparently Trevor Squires found my Search in Project With ack TextMate command at the time it was 404, so he wrote his own.

I haven't tried it yet, but from his description it's a lot better than mine.

May 27, 2008

Dan Geer at Source Boston 2008

Watch the video of Dan Geer at Source Boston 2008. Worth every minute of it.

May 26, 2008

Textmate: Search in Project with ack

Just a quick note to point out that my "Search in Project with ack" command for Textmate was updated.

May 14, 2008

My keyboard shortcuts

Inside the "Keyboard & Mouse" preference pane of Mac OS X, you'll find a tab named "Keyboard Shortcuts". This is one of my first stops after any nuke and pave setup of my Macs.

I don't have many shortcuts:

My Keyboard Shortcuts

The Take Rich Note and Append Rich Note integrate all applications with DevonThink. I select what I want to keep, and hit the proper sequence to save it in my default database.

The Quit Safari shortcut prevents me from closing Safari by accident. It's especially useful if you have a lot of tabs open. Recently Safari gained options, like Reopen All Windows From Last Session, that make this less useful, but I still use it.

The Select Next Tab and Select Previous Tab work around the fact that the default shortcuts in Safari for those options just don't work with some international keyboards (like my own, PT-layout).

The final shortcut is something new, that I'm trying out. It gives you a global shortcut to Zoom any window. Not sure if it's a keeper.

April 30, 2008

$science++

Got to love science:

Men who said they had sex twice a week had a risk of dying half that of the less passionate participants who said they had sex once a month, Dr. Davey-Smith’s team said.

via Justin Mason.

So, print a couple of copies of the study before you leave for the pub/bar tonight. You got science on your side now.

April 24, 2008

MySQL optimization quick tips

I'm not an expert on MySQL, but I spent the better part of this past night optimizing a server, and I've collected some notes. This is mostly targeted at InnoDB-based tables.

To change these settings, you should edit your my.cnf and update them in the mysqld section. See the MySQL System Variables manual page for more information. Please remember to keep a good working backup of your previous my.cnf. Better yet, include your my.cnf in your source control tree. I assume that you already have tested backups of your data...

So, follow these steps:

These are the basic settings that I pay attention to. Of course there are many more settings that you can tune (transaction-isolation and innodb_flush_log_at_trx_commit come to mind), but the things above will cover most of what you need.

After this, you should also use the mysqlreport tool. It requires more investment on your part, understanding the data, but it's very thorough.

Finally, install Maatkit and get used to the tools it provides. They are essential if you are using replication.

April 17, 2008

Safari 3.1.1

In case you missed it, check your Software Update: there is an important Safari upgrade in there.

I classify this as "important" not because of the security fixes included, but because Mail.app is now working properly again.

With the last Safari 3.1, I stopped being able to paste plain text inside a mail message. I would copy something from TextMate or a Terminal window, and when I pasted it into a Mail.app message window, the line breaks would be lost.

This is most likely related to the same bug that Gruber complained about.

Anyway, Safari 3.1.1 fixes it, so I would strongly recommend that you upgrade to it.

Network throughput "problems"

A friend of mine was complaining that he could only upload a file to my server at 2Mbytes/sec with his FIOS link at home. Some people have interesting "problems".

Anyway, I sent him the /etc/sysctl.conf that I have had on my Mac for quite some time without any problems, and with significant gains in network performance on my local LAN (especially for other computers with Gigabit ethernet).

net.inet.tcp.mssdflt=1440
kern.ipc.maxsockbuf=800000
net.inet.tcp.sendspace=400000
net.inet.tcp.recvspace=400000

Stick those lines in your /etc/sysctl.conf and they will be active every time your Mac reboots. Make sure you place them on all your Macs. To activate them right now, run this:

sudo sysctl -w net.inet.tcp.mssdflt=1440
sudo sysctl -w kern.ipc.maxsockbuf=800000
sudo sysctl -w net.inet.tcp.sendspace=400000
sudo sysctl -w net.inet.tcp.recvspace=400000

You should see better network throughput immediately.
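If you would rather not keep the two lists in sync by hand, the commands can be derived from the file itself. This is a small sketch of that idea; print_sysctl_commands is a made-up helper name, and it only prints the commands so you can review them before running anything.

```shell
#!/bin/sh
# print_sysctl_commands [FILE]
# Hypothetical helper: prints one "sudo sysctl -w key=value" line for each
# setting in FILE (default /etc/sysctl.conf). Nothing is changed on the system.
print_sysctl_commands() {
    conf=${1:-/etc/sysctl.conf}

    # Drop comments and blank lines, keep the key=value lines as they are.
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$conf" |
    while IFS= read -r setting; do
        printf 'sudo sysctl -w %s\n' "$setting"
    done
}

# Review first, then run for real:
#   print_sysctl_commands
#   print_sysctl_commands | sh
```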

There is a fifth setting:

net.inet.tcp.sockthreshold=0

but I admit I don't fully understand the implications of its use. Apparently without it, the sendspace/recvspace settings would not work, but my tests tell me the opposite. I'll update this when I know if and when to use it.

For reference, a local FTP transfer to a 100Mbit server was not using all of the bandwidth. With just the first line, I was able to saturate the server link. With all the lines, over a gigabit ethernet link, I was able to reach 35Mbytes/sec between two Macs. Without: a meager 6Mbyte/sec.

All the gory details are available in a study by the Pittsburgh Supercomputing Center.

Update: sorry, mistyped the values, remove the extra zero.

April 11, 2008

GitHub network view

Speaking of GitHub, they are now open for business.

With the official launch they enabled a couple of features that had been discussed in the past: commit comments (I still don't know if I like them or not), and integration with Campfire and Lighthouse (although you can also roll-your-own integration using their hooks).

One feature, though, was not on the roadmap, and it's a beauty: a network graph visualizer.

It's a great view across several repos, and it works very well. As far as I know there is no tool on the desktop that gives the same view. Not even gitk.

Most excellent.

Cairo with Quartz support

The 1.6.0 release of Cairo moved the Quartz back-end from experimental to supported.

This is very good news for me because some nice Perl modules for charts use Cairo.

If you are using FF3 betas on a Mac, you're already using Cairo by the way.

Update: an article from one of the authors.

April 01, 2008

Binary XMPP

Last February, Google's Android Team shocked the XMPP world by having the gall to say that they were moving to a Binary encoding of their GTalk API in the M5 release.

Of course this sent shock waves through the XMPP community, not because they were totally wrong, but because they didn't have the vision to solve it.

We of course understand that they have their hands full (of vapor, some bad mouths would say...) getting the platform in shape for the late 2008 release, and the XMPP.. err, sorry, GTalk protocol is not a top priority.

Well, today Android users around the world can rejoice because once again, the XMPP Standards Foundation has risen to the challenge and unveils to the world the solution to the "verboseness" problem, by introducing the breakthrough Binary XMPP protocol.

This has been in development for quite some time, taking long hours of (sometimes) heated discussion. The goal is to have something that nobody could ever accuse of being verbose (the symbol count is extremely low, and the meaning is clear from the start), but at the same time remain compressible for those extreme cases where bandwidth is at a premium (we tested, and we have code to prove it: a Binary XMPP stream can be compressed to 2% of its original size). We know of clients that live in five-sides-shaped-buildings that, with their hard 9600 baud limits, will be the first to use this.

It's been a great ride, and it fills me with immense joy to be able to put my small signature as a co-author of this spec. Of course, my part is extremely small when you have the documentation genius of Peter Saint-Andre, and Fabio Forno's strict guidelines and requirements, as co-authors. It's been a pleasure to work with both on this. And I should also thank Kevin Smith for his important contributions.

And now time to rest. SRV records are up at simplicidade.org domain, and Fabio's CM is running also.

PS: and yes, now you know why the prolonged silence around here.
PPS: for those who have moved on, the code is also available on GitHub.

March 26, 2008

MySQL errno 150 - ERROR 1025

I get the 150 errno a lot when upgrading schemas. Usually this is an indication that the action I was trying to perform is not possible given the current foreign key constraints in place.

Last night I got a variation on the theme:

ERROR 1025 (HY000): Error on rename of './e3/#sql-17f3_f894' to './e3/cls_ev_items' (errno: 150)

The table was this:

CREATE TABLE `cls_ev_items` (
  `id` int(11) NOT NULL auto_increment,
  `evaluation_id` int(11) NOT NULL,
  `criteria_id` int(11) NOT NULL,
  `value` varchar(50) default NULL,
  `modified_at` datetime NOT NULL,
  `rank` int(11) NOT NULL default '0',

  PRIMARY KEY  (`id`),
  UNIQUE KEY `report` (`evaluation_id`,`criteria_id`),
  KEY `criteria_id` (`criteria_id`),
  CONSTRAINT `cls_ev_items_fk_evaluation_id` FOREIGN KEY (`evaluation_id`)
      REFERENCES `cls_evaluations` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
  CONSTRAINT `cls_ev_items_fk_criteria_id` FOREIGN KEY (`criteria_id`)
      REFERENCES `cls_ev_criteria` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8

I was trying to drop the report index to replace it with a new one.

ALTER TABLE cls_ev_items DROP INDEX `report`

The error message is not clear enough, but in this case, the problem is that each CONSTRAINT requires an index on the foreign key field. If MySQL allowed the removal of the report index, it would have no way to efficiently check the cls_ev_items_fk_evaluation_id constraint.

The best solution I could come up with is a temporary index on evaluation_id. You are then free to mess with the report index.

After you finish, if the new report index begins with the evaluation_id field, you can drop the temporary index. And everything is back to normal.
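Spelled out, the sequence looks roughly like this. The helper name temp_index_sql, the temporary index name tmp_evaluation_id and, above all, the column list of the replacement report index are assumptions made for the example; the database name e3 is taken from the error message above.

```shell
#!/bin/sh
# temp_index_sql is a hypothetical helper: it only prints the statements.
temp_index_sql() {
    cat <<'SQL'
-- 1. Temporary index so the evaluation_id foreign key stays covered.
ALTER TABLE cls_ev_items ADD INDEX tmp_evaluation_id (evaluation_id);
-- 2. The old unique index can now be dropped without errno 150.
ALTER TABLE cls_ev_items DROP INDEX report;
-- 3. Replacement index. This column list is an assumption for illustration.
ALTER TABLE cls_ev_items ADD UNIQUE INDEX report (evaluation_id, criteria_id, `rank`);
-- 4. The new index begins with evaluation_id, so the temporary one can go.
ALTER TABLE cls_ev_items DROP INDEX tmp_evaluation_id;
SQL
}

# Review the output, then feed it to the server:
#   temp_index_sql | mysql -u root -p e3
temp_index_sql
```

If the replacement index does not begin with evaluation_id, skip step 4 and keep the temporary index (probably under a better name).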

It is clear that MySQL checks this chaining of indexes and constraints, so I hope to see better error messages in the future.

Update: the error log gives a bit more info, btw:

080326 10:51:32  InnoDB: Error: in ALTER TABLE `e3/cls_ev_items`
InnoDB: has or is referenced in foreign key constraints
InnoDB: which are not compatible with the new table definition.

So in case of a 150 error, check the error log for better pointers.

Update 2: as pointed out by NiN in the comments, run SHOW INNODB STATUS\G (the \G will make the report easier to read) and look for a section labeled LATEST FOREIGN KEY ERROR. The message is pretty good there.

March 20, 2008

Dear lazyweb: CSS question

This is something that I have wanted to do for quite some time, but I haven't figured out how.

If I have an HTML table styled with width: 80% but the total size of the content on each <td> is smaller than the 80% size, then the browser will try to make all the columns the same width, and add extra space at the right side of the cell.

What I wanted is to specify that some columns should shrink to the smallest possible size without causing line breaks in the content.

For example in an invoice with 4 fields, item description, quantity, value and sub-total, the default layout would be something like this:

| item        |         10 |         100 |        1000 |

and I wanted this:

| item                               | 10 | 100 | 1000 |

I wonder if there is some CSS combination that does this, that I can apply to a <colgroup> to say "give me the minimal possible width for these columns".

Any tips?

March 13, 2008

Link files in Catalyst error messages to Textmate

When you develop with Catalyst, if you have an error condition, you get a pretty interface with access to all the major objects in the request.

At the top, Catalyst will place the classical perl error message like "Caught exception in MODULE, at FILE line LINE."

This hack takes that classical format and links the "FILE line LINE" with a txmt: link. If clicked, it will open directly into your project in TextMate.

The code is simple. Stick this into your main application class:

sub finalize_error {
  my $c = shift;

  $c->NEXT::finalize_error(@_);
  return unless $c->debug;

  my $error_msg = $c->response->output;
  return unless $error_msg;

  $error_msg =~ s{(\s+at\s+)([\/]\S+)\s+line\s+(\d+)}
                 {"$1<a href='"._mk_textmate_link($2, $3)."'>$2 line $3</a>"}ge;
  $c->response->body($error_msg);
}

use Cwd qw( abs_path );
sub _mk_textmate_link {
  my ($file, $line) = @_;

  my $abs_file = abs_path($file);
  return "txmt://open/?url=file://$abs_file&line=$line";
}

It works for me so far. If this breaks anything for you, you get to keep both parts.

Here is a sample of the output with this hack applied (click for bigger version):

[Image: Safariscreencapture004]

Update: a new version. The big change is the use of the abs_path method to make sure you get the absolute path. This solves problems that I was having with symbolic links. TextMate was opening a new window, because the project and the file path in the error message had different prefixes.

March 07, 2008

perl warnings

You have to learn to ignore the forest.

There are some perl warnings that hide the real problem. My most hated perl warning is this, the first three lines below:

"my" variable @prob masks earlier declaration in same scope at sbin/some_script line 1640.
"my" variable $count masks earlier declaration in same scope at sbin/some_script line 1641.
"my" variable $t masks earlier declaration in same scope at sbin/some_script line 1642.
syntax error at sbin/some_script line 1504, near "next "

Those lines are there because the parser had to bail out after detecting the syntax error reported on the fourth line (line 1504 of the script), and so failed to notice the end of the enclosing scopes.

This could be less of a problem if the warnings and errors were ordered by line number, but they are not. So learn to look at the line numbers first to decide which warning to pay attention to.
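If you would rather let the computer do the looking, a small filter can reorder the messages by line number. This is only a sketch: the name sort_by_line is made up, and it assumes every message carries the usual "line N" text (anything without it sorts first).

```shell
# Hypothetical filter: reads perl messages on stdin and prints them
# ordered by the line number each one reports.
sort_by_line() {
  awk '{
    n = 0
    # " line " is 6 characters long, the digits follow it
    if (match($0, / line [0-9]+/))
      n = substr($0, RSTART + 6, RLENGTH - 6)
    printf "%d\t%s\n", n, $0
  }' | sort -s -n -k1,1 | cut -f2-
}
```

Use it as perl -c sbin/some_script 2>&1 | sort_by_line, and the message with the lowest line number, usually the real culprit, comes out on top.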

Ultimate Game

A most excellent Ultimate Game strip at xkcd.

By the way, in case you haven't figured it out yet, half the fun of xkcd is usually buried in an alt/title attribute on the image, so always hover over the image to see it.

March 06, 2008

Search in Project with Ack command for Textmate

One of the biggest problems I have with TextMate and large projects is "Find in Project":

  • it is a bit slow;
  • it searches everywhere, even files it shouldn't.

There is a nice alternative that uses grep by Henrik Nyh. But I'm a big ack fan, so I hacked Henrik's command to use ack, and the result is Search in Project with ack-command for TextMate.

A future version might move to tm_dialog. Maybe I'll copy the GrepInProject++ as the basis for the next version.

Update: ok, this has been working better than expected. The difference in speed makes me use it without thinking about it. Also, it does not block TextMate. So for really huge projects, you can start it and keep working on your code.

Update 2: oops, a disk problem and the link went 404 on me. I've updated the code to the most recent version I'm using. The uuid of the command is changed so you might need to remove the old version before updating.

March 04, 2008

git hosting sites

A couple of weeks back I started looking for Git hosting sites. My two picks were Gitorious and Github.

At the time, and in the weeks that followed, I was too busy to start comparing the sites. But this week I want to start pushing my code out to these sites to see how they work.

In terms of feature list, Github received a lot of love in the past weeks, as you can read from their new blog. But Gitorious is not standing still either. Both are working to provide an easier collaboration feature around git pull which is very cool.

Right now, based solely on the feature list, I think Github has a slight advantage. If I had to point to one Github feature that stands out, I would have to pick the Web Hooks-powered Post-Receive hook.

update: arrgh, I'm a moron. I forgot to add my Github password to my Keychain, so I'm locked out of my account. There is no "recover password" functionality on Github yet (it's on the Feature Requests page though) so I sent an email to their support account, and I'm hoping for the best.

update 2: Ok, access to Github restored, big thanks to Tom Werner. My profile page is at http://github.com/melo/. I have 5 invites now, so if you want one, send me an email or ping me via XMPP/Jabber. My address for both is melo @ simplicidade.org.

February 22, 2008

todo.pl

I'm trying Hiveminder again.

My first attempt some months ago was not successful, and I must admit that although the new interface (especially the IM-based one) is very easy and fast to use, I still see the lack of an offline mode as the biggest barrier to using this type of service.

Be sure to check pjf's presentation Effective Procrastination with HiveMinder, it's really good.

One thing I was having problems with just now was the todo.pl command line interface. Apparently, it doesn't like characters with accents. I looked at the code and it seemed to deal with them fine, figuring out the encoding of the terminal and using binmode to set the encoding filter of STDOUT to the proper charset.

But still, I was getting this error: "\x{00e3}" does not map to ascii at /Users/melo/bin/todo.pl line 276.

Part of the problem is that terminal encoding detection fails on the Mac OS X terminal. The problem seems to be inside I18N::Langinfo, because it reports us-ascii, but that's an XS module, so it might take a while for me to figure it out.
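A quick way to see what is being reported is to ask both the C library and I18N::Langinfo directly. This is just a diagnostic sketch (the function name show_codeset is made up); it does not fix anything. If it prints an ASCII codeset while the terminal really speaks UTF-8, one thing worth checking is whether LANG or LC_CTYPE is set in your environment, though I can't promise that is the cause here.

```shell
# Hypothetical helper: print the character set the current locale reports.
show_codeset() {
  # What the C library reports for the current locale
  locale charmap
  # What Perl's I18N::Langinfo reports (skipped if perl is not installed)
  if command -v perl >/dev/null 2>&1; then
    perl -MI18N::Langinfo=langinfo,CODESET -le 'print langinfo(CODESET)'
  fi
}
```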

In the meantime I added an encoding setting to the todo.pl program, to force a specific encoding. Apply this short patch to todo.pl, edit ~/.hiveminder and add a line encoding: utf-8 and all should be well.

I don't think this patch is worth pushing upstream right now. The correct fix is in I18N::Langinfo, but I needed to get some work done today.

Update: fixed big 404. Sorry bout that.

February 19, 2008

Working with Wikipedia data

The good people at Freebase launched WEX, a Wikipedia Extraction tool, and made their exported data downloadable.

This allows you to use your usual text, XML or even SQL processing tools to analyze Wikipedia entries.

I mention this because the FreeBase project is one of the most interesting things happening in the online knowledge space. If you haven't noticed, this project is building a huge structured data repository, open and extremely linked, with strong meta-data.

You should check it out.

February 16, 2008

ezmlm survival guide

Every time I have to do something with ezmlm-idx, I end up reading a lot of documentation.

It really tells you something about the quality of ezmlm. The need to tweak it is rare, so I forget how it all works.

Anyway, this time I wrote a couple of notes, a small survival guide if you will.

Continue reading "ezmlm survival guide" »

February 15, 2008

Apple TV as an AirTunes destination

A quick tip that I just read, and I didn't see this mentioned anywhere else: the Apple TV, with the latest software update, is now an AirTunes destination.

This is very cool. I can send the music from my iTunes to the big speakers easily now.

February 14, 2008

High-performance SSH

A very interesting patch to the standard OpenSSH code base, to improve SSH throughput.

Something to try in a week or so.

There is also a TCP tuning article at the same site worth a read.

Operations Mantras

A big Operations Mantras article by Alan Kasindorf.

Loads of interesting stuff, but it does take a bit to digest.

February 13, 2008

git hosting

If you need to host your project somewhere using git, I suggest you try gitorious.org.

Setup was a breeze, painless and just worked. I was up and running in less than 5 or 10 minutes.

I tried repo.or.cz a couple of days ago, but I still can't push to a project I created there...

February 09, 2008

github.com

There is a new service starting up, github.com, that provides Git hosting including a very nice Web interface with some cool twists like LightHouse integration.

Another cool feature is the read-only Subversion repository of the git master branch.

Unfortunately the service is still in beta. I've subscribed for an account, I hope they open up more slots soon.

Sniffing browser history

Came across this article about using JavaScript to sniff your browser history.

Very interesting stuff. I wonder how long it will take to see a jQuery plugin for it?

(via 43 folders)

February 08, 2008

.bash_completion.d

André Cruz asked me why his ~/.bash_completion.d/ wasn't being used by default.

Well, it's a feature of the bash_completion system that you must activate by hand. In my ~/.bashrc I have the following code:

bash=${BASH_VERSION%.*}; bmajor=${bash%.*}; bminor=${bash#*.}
# Interactive shell running bash 2.05 or later; the -gt 2 test lets bash 3.x and newer through
if [ "$PS1" ] && { [ "$bmajor" -gt 2 ] || { [ "$bmajor" -eq 2 ] && [ "$bminor" '>' 04 ]; }; }; then
  if [ -f ~/bin/bash_completion ] ; then
    BASH_COMPLETION=~/bin/bash_completion
    BASH_COMPLETION_DIR=~/.bash_completion.d
    export BASH_COMPLETION BASH_COMPLETION_DIR
    . ~/bin/bash_completion
  fi
fi
unset bash bmajor bminor

The trick is the BASH_COMPLETION_DIR setting before you source the bash_completion script. The BASH_COMPLETION environment variable is required because I keep a local copy of the script.

I'm using the latest version of bash_completion, 20060301. If you download the 20060301 tarball, check the README file. There is a FAQ at the end that mentions this and other cool tricks.

February 07, 2008

git bash completion

If you download the git tarball, you'll find a git-completion.bash script that adds shell completion to your git day-to-day usage. Look for it in the contrib/completion/ directory. (update: if you can't find it there, look for it in the $sharedir/git-completion/ directory. A recent patch by Johannes Schindelin promoted git-completion.bash to the big leagues.)

When I started using git, I just stuck the script into my ~/.bash_completion.d/ and bash picked it up after an exec bash -login.

One thing I didn't do was read the file, and I missed some cool stuff. One of them is a PS1 hack to show the branch you are on:

4) Consider changing your PS1 to also show the current branch: PS1='[\u@\h \W$(__git_ps1 " (%s)")]\$ '

The argument to __git_ps1 will be displayed only if you are currently in a git repository. The %s token will be the name of the current branch.

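Be careful to use single quotes ('') and not double quotes ("") in the PS1= statement. With double quotes, $(__git_ps1 ...) would run only once, when PS1 is assigned, instead of every time the prompt is drawn.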

If you have several branches active at the same time, this really helps.

February 05, 2008

git gc --auto

For normal usage, git is written to be as fast as possible. That means that certain house-cleaning tasks are not done on every command and from time to time, you have to let the git gremlins clean up the place.

For the last couple of git releases, the most common way to do that is git gc --auto. This will check the repository, see if it needs cleaning, and perform the necessary operations. It's a safe command: you can even start it, and keep on working on the same repository.

Personally, I run it when I remember or when git gui reminds me that I have too many loose objects.

In case you might be wondering the sort of impact this can have on your repo, I'll quote an extreme case. Notice that the user didn't use git gc --auto but chose a different, more aggressive, set of options. The tradeoff is execution time, and you cannot use the repo while git gc runs.

I used git-svn to import a repository with 33000 revisions and about 7500 files. It took about 18 hours to import. When it was done, my .git folder had 242001 files that comprised 2.0GB. I ran git gc --aggressive --prune and let that sit overnight (I wish it was more verbose, it went for over an hour without printing anything), and that managed to compress the repo down to 334 files and 64MB.

So from time to time, don't forget to git gc --auto.
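If you are curious whether a repository needs it, git count-objects -v shows how many loose objects are lying around. The small wrapper below is a sketch (the name gc_if_needed is made up) that prints those numbers and then lets git gc --auto decide for itself whether there is anything to do.

```shell
# Hypothetical helper: gc_if_needed [REPO_DIR]
# Shows the loose object count, then runs the automatic garbage collection.
gc_if_needed() {
  repo=${1:-.}
  (
    cd "$repo" || exit 1
    # "count" is the number of loose objects, "in-pack" the packed ones
    git count-objects -v
    # Does nothing unless git thinks the repository needs cleaning
    git gc --auto
  )
}
```

The threshold that git gc --auto uses comes from the gc.auto configuration setting, so you can tune it per repository if the default does not suit you.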

January 11, 2008

PHP bashing and PHP praise

For my PHP friends out there:

Even if you are not a PHP user (I left that wagon around PHP4), the setup wizard bundled with the Simplicity framework is very cool: an extjs-based application to create the application, including designing your database schema.

January 10, 2008

Network Mafia

In a stunning move, Network Solutions renamed itself as Network Mafia.

I wonder if they also do this if I query the whois database from another registrar website.

January 05, 2008

Regarding reverse proxies

Right now I use both Perlbal and Lighttpd as reverse proxies.

Lighttpd does a lot more than reverse proxying: I also use it as a web server for static content, with mod_secdownload for certain content. But the current stable 1.4.x version is not that good a reverse proxy. For example, it does not keep persistent connections to back-end servers, and sometimes it thinks all my application servers are dead when they are in perfect health.

Perlbal is a much, much better reverse proxy. It keeps persistent connections up to back-end servers, and before using them for client requests, it makes sure the server is really ready to answer requests. It also caches file uploads to disk before allocating a back-end server to process the request. All these features keep the back-ends pretty busy and without stalls. But as a web server for static content, it's just not as good as Lighttpd.

I've been working with three upgrade paths.

The first is to use Perlbal as the reverse proxy for the application servers and remap all static content to a different site. This also makes sense because if you move your static content to a CDN, your site will load a lot faster.

The second is using Lighttpd 1.5.x. The last time I tested it, it was still a bit unstable with my setup, so it's on the back burner for now.

The third is to use Varnish. It's wicked fast, extremely configurable, and it does caching. But it takes a bit to get into, so it's a bigger time investment than the other two.

Right now, I think that Varnish would be the best bet of them all, but I don't have the time to get up to speed with it. So I'm going to use the first option, using Lighttpd 1.4.x to power my CDN server, just because I need mod_secdownload.

Easy staging server setup

Before a new release of a web site, I like to give it a whirl using the production environment, including servers and database. To make the experience of the staging server as close as possible to the production server, I want to use the same hardware and server name for both.

At first I used different IPs, aliased to the same server, one for production and another for staging, and I used my local /etc/hosts file to switch between the two environments. This works but it's a pain, because I have to restart the browser to pick up the new address, and I have to do this on every computer that wants to check the new version.

The new setup is much better.

It uses two application servers, production and staging, running on the same server, on different ports, and then uses the front-end Lighttpd reverse proxy to select between the two instances based on a cookie.

The relevant part of the Lighttpd configuration is this:

# Staging server
$HTTP["cookie"] =~ "(; )?app_version=staging:" {
  proxy.server  = (
    "" => (
      ( "host" => "127.0.0.1", "port" => 7005 )
    )
  )
}

# Production server
$HTTP["cookie"] !~ "(; )?app_version=staging:" {
  proxy.server  = (
    "" => (
      ( "host" => "127.0.0.1", "port" => 7000 )
    )
  )
}

This setup uses a cookie named app_version. If app_version is set to staging, Lighttpd will use the application server at port 7005. If not, it will fall back to port 7000. This works even if we don't have any cookie.

Currently, only internal users use the staging server, so I have an option in the back-office site to select the version they want to use.

You should notice that I don't do any security checks on the cookie in the proxy, so anybody could set the cookie and reach the staging version. To prevent this, my app_version cookie includes a SHA1 checksum of the content using a secret string (so it's really app_version=staging:40-char-hex-sha1-checksum). Yet, I don't check the SHA1 in the front-end proxy, because that would waste CPU for the vast majority of users, who are on the production server. Instead, I have optional code to validate the checksum in the main application server. That code is only enabled on the staging version and displays an error message when the checksum is wrong.
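To make the cookie format concrete, here is a shell sketch of signing and checking such a value. It is not the code running on my servers: the function names are made up, it assumes the openssl command line tool is available, and it uses an HMAC-SHA1 of the version string as one reasonable way of doing a "SHA1 checksum with a secret".

```shell
# Hypothetical helper: sign_app_version VERSION SECRET
# Prints VERSION:40-char-hex-checksum, the value of the app_version cookie.
sign_app_version() {
  version=$1
  secret=$2
  # openssl prints "...= <hex>", so keep only the last field
  digest=$(printf '%s' "$version" |
    openssl dgst -sha1 -hmac "$secret" | awk '{ print $NF }')
  printf '%s:%s\n' "$version" "$digest"
}

# Hypothetical helper: check_app_version VALUE SECRET
# Succeeds only if the checksum in VALUE matches a freshly computed one.
check_app_version() {
  value=$1
  secret=$2
  version=${value%%:*}
  [ "$value" = "$(sign_app_version "$version" "$secret")" ]
}
```

The back-office tool would set the cookie to the output of sign_app_version staging "$SECRET", and the staging application would run the equivalent of check_app_version on every request. A real implementation should compare the checksums in constant time, which this sketch does not do.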

The final problem I needed to solve is identification of the current server. People got confused: they didn't have a clear indication of whether they were using the staging or production server. To solve that, I just enable a small HTML '<div />' and appropriate CSS in the staging server, and this will float a nice "You are using the staging server. Switch to Production"-message. The "Switch to Production" part is a link to the back-office tool that removes the cookie.

With these things in place, I can push release candidates to staging without worrying too much about it, have the other people at the office try it out, and then if all is ok, run a script on the main server and deploy the staging version to production.

Update: this also helps for rolling updates. For example, you could use this setup to keep two production versions available, and use the login process to move users from the old version to the new one.

December 21, 2007

Compiling git on Mac OS X

I had to install git on my temporary G4, and I didn't keep proper notes before, so for future reference, here is my way of compiling git.

This was tested on 10.4.11. I haven't upgraded to 10.5.x yet, maybe when 10.5.2 comes out.

I don't use MacPorts or fink, I don't like them. It's a personal thing, and if you do like them, you might as well use them instead.

First, let me tell you that I keep several git versions around. I install them in /usr/local/git-VERSION and switch a symbolic link at /usr/local/git to point to the version I want. Also, when I compile directly from a git checkout, I use the SHA1 of the HEAD as my VERSION. My PATH includes /usr/local/git/bin to find the latest version.

Second, I haven't bothered yet to compile asciidoc, so I don't compile/install the documentation.

Compiling the released version

Download the latest version of git from the git home-page.

mkdir ~/src
cd ~/src
GIT_VERSION=1.5.3.7
curl -O http://kernel.org/pub/software/scm/git/git-${GIT_VERSION}.tar.gz
tar zxf git-${GIT_VERSION}.tar.gz
cd git-${GIT_VERSION}
./configure --prefix=/usr/local/git-${GIT_VERSION}
make all

If you get a compile error, something like:

    SUBDIR git-gui
    MSGFMT    po/de.msg make[1]: *** [po/de.msg] Error 127
make: *** [all] Error 2

It means that you would need to compile GNU gettext to get msgfmt. Just do:

export NO_MSGFMT=1
make all

and it will use an alternative Tcl script instead. According to pfig, 10.5.x does not need this export at all.

To install:

sudo make install

Now, create a symlink:

sudo ln -s /usr/local/git-${GIT_VERSION} /usr/local/git

And add /usr/local/git/bin to your PATH:

echo 'export PATH=$PATH:/usr/local/git/bin' >> ~/.bashrc

Make sure you have the new PATH:

exec bash --login
git --version

Now go play with it.

Living on the edge

But I don't stop here. The stock version of git is fine, but I usually use the master branch. So after you've gone through all that, you do it again.

I keep a clone of the git master branch and update it from time to time. To clone the first time, do:

mkdir ~/projects
cd ~/projects
git clone git://git.kernel.org/pub/scm/git/git.git

If your firewall doesn't allow outgoing TCP on port 9418, try:

git clone http://www.kernel.org/pub/scm/git/git.git

You only need to do these steps once. From then on, whenever you want to update to the latest git, do:

cd ~/projects/git
git pull
make distclean
# Use the SHA1 of HEAD as the version
GIT_VERSION=$(git rev-parse HEAD)
autoconf
./configure --prefix=/usr/local/git-${GIT_VERSION}
export NO_MSGFMT=1
# See above why export NO_MSGFMT=1
# You can try without it and use it only if it fails...
make all
sudo make install
sudo rm -f /usr/local/git
sudo ln -s /usr/local/git-${GIT_VERSION} /usr/local/git
git --version

That's it. You should be using the latest git available now.
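Since every version lives in its own /usr/local/git-VERSION directory, going back to an older one is just a matter of repointing the symlink. A small sketch of that, with a made-up function name and an optional second argument for the prefix so it can be tried outside /usr/local:

```shell
# Hypothetical helper: switch_git VERSION [PREFIX]
# Points PREFIX/git at PREFIX/git-VERSION (PREFIX defaults to /usr/local).
switch_git() {
  version=$1
  prefix=${2:-/usr/local}
  target="$prefix/git-$version"
  if [ ! -d "$target" ]; then
    echo "no such install: $target" >&2
    return 1
  fi
  # -n replaces the existing link instead of creating one inside its target
  ln -sfn "$target" "$prefix/git"
}
```

For the real /usr/local you need root, so run the ln step with sudo (or the whole function from a root shell), then check with git --version.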

December 20, 2007

cpan tricks

Our beloved cpan command line has some tricks up its sleeve. In case you haven't read the fine CPAN manual in a while, let me point out some features I'm using right now to install all the needed modules for my day-to-day operation.

cpan .

You have a local directory with a module already unpacked, or your own personal module. The usual way to install them is doing the dance:

perl Makefile.PL
(manually deal with missing dependencies here)
make
make test
make install

The second step, the missing dependencies part, is the not so good part of the whole experience. Module::Build authors and users would suggest that the first step is the really not so good part, but I digress.

A better way is to do:

cpan .

This will start a CPAN shell, and run the install process on the local directory, including fetching dependencies from your preferred CPAN site.

failed command

Inside the shell, after you installed a long list of modules, the failed command will list all the modules that failed the tests and did not install.

o conf init /REGEXP/

The command o conf init will go through the entire configuration process. In recent versions it has become a long process and one of the things I tweak from time to time, the URL list of CPAN mirrors, is the last item asked.

To speed up the process you can o conf init /urllist/ and only configuration options matching urllist will be asked.

Don't forget to o conf commit at the end.

The smart CPAN urllist

The urllist parameter lists the CPAN mirror sites that the shell will try to use to fetch the packages.

My favorite site is a local CPAN::Mini mirror, the best 750 MB of disk space I have used on my systems. It allows me to install any module from CPAN even when I'm offline.

But this requires that I keep updating the mirror, and sometimes, I just forget. And I don't want to cron it because I don't want to run it while I'm connected via UMTS.

A trick here is to set urllist to several CPAN mirrors in your country and include your local CPAN::Mini mirror at the end with a file: URL.

CPAN is smart and will fetch the CPAN indexes from one of the sites (see o conf init /random/