July 31, 2006

Building Scalable Web Sites

I've been reading one of O'Reilly's latest books, Building Scalable Web Sites, by Cal Henderson, of Flickr fame.

If you do websites for a living, and better yet, if you do medium-sized ones, this is a book to read. It packs an amazing amount of info, covering topics from development environments, i18n, L10n, and data integrity, up to bottlenecks, scaling web applications and statistics, monitoring and alerting.

I'm picking up a lot of good tidbits and validation. At work, we were naturally moving to a deploy process very similar to the one described in this book. The use of SCM systems, and the integration of those with a ticketing system and Wiki, are also gaining ground there.

I'm at the end of chapter 6, about Email, but I did a fast read over the entire book first.

Definitely, highly recommended.

July 27, 2006

Not fast enough

Sometimes, you're just not fast enough.

Last week I was talking with Joel Bernstein about the best way to host web applications written in Perl, for example using Catalyst.

My current setup uses Perlbal as a front-end, and then Apache+mod_perl in the back-end. I've also tried lighttpd with FastCGI, but I was not at all happy with the overall results.

Our first idea was to strip down Perlbal, plug in a FastCGI server, and let it talk to a bunch of FastCGI instances started using FastCGI::ProcessManager.

I looked at the code of Perlbal last week, and it has all the HTTP bits pretty much glued into the core, so supporting a new protocol seemed a major re-factoring job.

With that I turned to a second plan: use Danga::Socket, the fast socket framework built on epoll/kqueue, to implement a small HTTP server that powers our application, and keep using Perlbal for balancing between back-ends. That was my plan at the end of that night, and I started pulling stuff from Perlbal to do it.

It's not done yet, and now I think it might never be, because Matts just beat me to it.

There is no code yet, and while my effort was more about a simple HTTP server just for running Web applications, his seems to be a more fully featured Perl-based HTTP server.

I'll try to hop on #axkit-dahut to find out when the code will be available, and to see if I can write plugins to do what I want.

Metaverse continued

Of all the Metaverse-style worlds out there, SecondLife is my favorite. I tested it once, and now I just try to stay away: it's addictive.

Today, I came across a SecondLife Amazon integration that's just too cool not to talk about. There is a store inside SecondLife, called Life2Life, that uses the Amazon Web Services to show you books and magazines, and even perform searches.

With every day that goes by, Snow Crash feels more real.

OSCON 2006

OSCON is well underway and the cool presentations are starting to appear online. I like to read them and collect the ones I find most important to me. As a side effect, I'll keep this article updated with all the links collected.

  • RHOX, by Audrey Tang: if you don't like Perl because the syntax scares you, you haven't seen nothing yet. If you like Perl because the syntax fits your brain, prepare for brain surgery;
  • Plagger, by Tatsuhiko Miyagawa: Plagger rocks, I've been using it, but Notify::Pizza is too cool - whenever Miyagawa writes "I'm hungry" on his blog, Plagger orders him pizza using a WebService...;
  • ppencode, by TAKESAKO Yoshinori: Fun, Fun, Fun! It will be a lightning talk, presented by Audrey Tang;
  • Big Bad PostgreSQL by Theo Schlossnagle: I'm more MySQL oriented, but I like to be aware of big PostgreSQL implementations like the one described in these slides (the largest table has over 1,700,000,000 rows);
  • JavaScript Bootcamp by Amy Hoy: very nice get-up-to-speed-with-JavaScript tutorial.

July 15, 2006

todo.sh

I'm trying to use the todo.sh organizer to sort some of my work. Usually I use small 3 by 4 cards, and they are great to collect stuff to do, but they are not that great when I need to organize them into projects and define priorities.

First impressions are good. It's a simple script with a simple command line interface.

One of the first things I did was adding a short alias, t, as suggested at their site. This cuts down a lot of typing but being a lazy bastard myself, I wanted more.

So I wrote this:

# todo.sh completion by Pedro Melo <melo@simplicidade.org>
# 
# for updates see: http://www.simplicidade.org/notes/archives/2006/07/todosh.html

_todo_sh()
{
  local cur prev commands options command

  TODOSHRC=${TODOSHRC:-${HOME}/.todo}
  if [ -r $TODOSHRC ] ; then
    . $TODOSHRC
  fi

  if [ ! -r $TODO_FILE ] ; then
    echo "ERROR: cannot read todo.txt file."
    echo "Make sure TODOSHRC is set with the correct .todo config file";
    return 0
  fi

  COMPREPLY=()
  cur=${COMP_WORDS[COMP_CWORD]}
  prev=${COMP_WORDS[COMP_CWORD-1]}

  # Double quotes, so that the backslash-newline is a real line continuation
  commands="add append archive contexts del do list listpri \
            prepend pri projects replace remdup report"
  options="-d -p -q -v"

  if [[ $COMP_CWORD -eq 1 ]] ; then
    if [[ ${cur} == -* ]] ; then
     COMPREPLY=( $( compgen -W "$options" -- $cur ) )
    else
      COMPREPLY=( $( compgen -W "$commands" -- $cur ) )
    fi

    return 0
  fi

  case "${prev}" in
    @(add|list))
      local projects=$(egrep -o 'p:\w+' $TODO_FILE | sort | uniq -c | sort -rn | awk '{ print $2 }')
      local contexts=$(egrep -o '@\w+' $TODO_FILE | sort | uniq -c | sort -rn | awk '{ print $2 }')
      COMPREPLY=( $(compgen -W "${projects} ${contexts}" -- ${cur}) )
      return 0
      ;;

    @(append|del|do|prepend|pri|replace))
      # Read from stdin so that wc prints only the count, not the file name
      local n_todos=$(wc -l < "$TODO_FILE")
      local tasks=$(seq 1 $n_todos)
      COMPREPLY=( $(compgen -W "${tasks}" -- ${cur}) )
      return 0
      ;;

    listpri)
      local pris=$(egrep -o '\(\w+\)' "$TODO_FILE" | cut -c2 | sort | uniq -c | sort -rn | awk '{ print $2 }')
      COMPREPLY=( $(compgen -W "${pris}" -- ${cur}) )
      return 0
      ;;

    *)
      ;;
  esac

  return 0
}
complete -F _todo_sh -o default todo.sh

It's a bash_completion script. Copy & paste it into your own ~/.bash_completion.d/todo or the system /etc/bash_completion.d/todo.

It completes all the options and commands. It also completes @contexts and p:projects in list and add. It completes task numbers on all the commands that use them, and completes priorities in listpri.

It assumes that the todo.sh configuration file is at ~/.todo. If in your case you have it in a different place, you can set the environment variable TODOSHRC.

If you, like me, added your own alias shortcut to todo.sh, add this line after the alias definition:

complete -F _todo_sh -o default t

In my case, I was using the alias t for todo.sh. If you use another alias, replace the last t with your chosen alias.

The completion can be made smarter. Some commands need to expand to a task number and then free text, and inside the free text, expanding to @contexts or p:projects would be useful.
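As a rough sketch of that idea (not part of the script above): the free-text case could be handled by a helper that only returns candidates when the word being completed starts with @ or p:. The function names _todo_tags and _todo_tags_matching are made up for this illustration, and the tag pattern is an assumption based on the one the script already uses.

```shell
# Hypothetical helper: print every @context and p:project found in a
# todo file, most frequently used first (same pipeline as the script above).
_todo_tags()
{
  grep -Eo '(@|p:)[[:alnum:]_]+' "$1" | sort | uniq -c | sort -rn | awk '{ print $2 }'
}

# Hypothetical helper: print only the tags that start with the word being
# completed, and only when that word looks like a @context or p:project.
# $1 = current word, $2 = todo file
_todo_tags_matching()
{
  case "$1" in
    @*|p:*)
      _todo_tags "$2" | awk -v prefix="$1" 'index($0, prefix) == 1'
      ;;
  esac
}

# Inside _todo_sh, a fallback branch could then do something like:
#   COMPREPLY=( $(_todo_tags_matching "$cur" "$TODO_FILE") )
```

One caveat: bash treats ":" as a word break by default (see COMP_WORDBREAKS), so completing p:projects in the middle of free text may need extra care.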

July 14, 2006

Folder synchronization

Status: I'm now using a recent version of iFolder for Mac OS X Intel by Boyd Timothy. Works perfectly.

Recently my wife needed a solution for folder synchronization between her laptop (mostly at home) and two other PCs at the office.

I searched a bit for a solution and I noticed two things: GDrive speculation and iFolder.

A GDrive sighting appeared in some blogs recently, in the form of a project named Platypus. The comments suggest that this might be just an internal site. The feature set looks nice, and would probably cover our needs, but only time will tell if it sees the light of day.

The second one is an old friend. I've been playing with iFolder on and off since version 1.0 or something like that. I never really used it seriously because it was pretty slow, and at the time, Linux only.

But the iFolder server was open-sourced recently (some screenshots of the iFolder web interface), so I decided to try it and see what's new.

Installing the software was very easy on a CentOS 4 server. Just follow the CentOS iFolder How-to. In my case, I already had lighttpd running on that server, so I had to add another alias to the eth0 interface, and bind the Apache web server that powers the iFolder server to the new alias. It works, but you need one trick: the Apache web server must listen on both the new IP and the localhost address. Apparently some parts of the server communicate via 127.0.0.1. After that little tweak, it worked fine.

The client side of things is more or less straightforward, with only a small warning: right now, to use the open-source iFolder server, you must use the 3.4 clients and not the latest 3.5. Apart from that, just grab the Windows installer and run it.

On my MacBook, things are not that easy. First, Mono for Mac OS X Intel was not available at the time I tested (yes, I know that it's available now), so there was no native iFolder client for me. I also tried the PowerPC iFolder client, but it does not work either. Apparently there are some problems with the Mono JIT and Rosetta (lost the relevant links...).

Enter Parallels. I grabbed a DVD ISO of Fedora Core 5, created a new virtual machine and asked for a Workstation install. 10 minutes later, I was logging on to my new Linux workstation, and downloading the Mono framework and the iFolder client for Linux. After all the components were installed, I ran it and started synchronizing my folders of baby pictures with my wife's laptop. Perfect!

Just today, I found (via Planet iFolder) a non-supported "if it burns your macbook don't blame me"-kind-of-thing iFolder client that should work on Intel MacBooks. I tried it, but it doesn't work yet:

dyld: Library not loaded: /Users/boyd/stage/ifolder-3.4/lib/libsimias.0.dylib
  Referenced from: /Applications/iFolder 3.app/Contents/MacOS/iFolder 3
  Reason: image not found
 Jul 14 03:25:13 Mr-Tray crashdump[21376]: iFolder 3 crashed

We have to wait a bit longer.

After all this fun, I played a bit with iFolder inside Parallels. It is very nice and fast enough for my day-to-day use. There are still some things to improve, like the lack of SSL support, which should be available soon, so use this for non-sensitive data. Apart from that, iFolder worked pretty much as advertised.

I'll keep using it, to see how it goes, especially with large folders (next week, a 3 GB folder of baby pictures). The idea is great, the implementation seems fine, I only need a native Intel Mac OS X client, and I'm sold.

Update: Boyd Timothy has an updated iFolder.app for Intel Macs available. I was able to install and run it fine, but adding my account would always fail during login with this message in Console.app:

iFolder 3[29089] Exception in ConnectToDomain: TCP Error -1 TCP error in SimiasService.ConnectToDomain

I'll double check my installation and the Mono Framework I have, but there seem to be some bits missing from my copy of the puzzle.

Update 2: Ok, got it to work, it was just a problem with the password for my account. I reset the password on the server and I was able to log in. Many thanks, Boyd!

July 12, 2006

Movie Stores

I'm a big fan of the iTunes Music Store. My main reason is that I hate waste, and I'm not a big fan of owning the jewel case in which CDs are sold. In the last few years, for every CD I bought, I just rip it into iTunes, and store the jewel case with the CD somewhere.

So iTMS is a big plus for me: I can just purchase the music I want, download it and be done with it. It's simple and it works.

But if you think ahead, with movies and TV series, the story changes a bit. With iTMS, you download a 3 to 4 MB file. With a movie or episode, you are talking about 300 MB minimum. It won't be possible to use the same download-to-own model as with music, unless laptops start coming with terabyte-sized disk drives.

I have a lot of DVDs, and it would take between 2 and 3 TB of storage to keep all of them in digital format on my laptop. A single season of a typical 22-episode TV series is around 8 GB in size.

So I do hope that when the online movie stores show up for real, they have other business models available. In fact, I hope to see something like this:

  • You can buy the rights to see a specific show or season of shows or movie online;
  • The right is perpetual, so you buy it to own;
  • The price includes X number of downloads of said movie/episode;
  • The price might include N number of streams;
  • You can pay a recurring fee to have more streams available to you;
  • You can watch any film/episode you own and have downloaded as many times as you want;
  • You can stream any film/episode you own up to the number of streams you still have in credit.

With this model, you still buy-to-own your stuff, you still can download to own your stuff, but you have another option, to stream your content. No need to have a monster disk at home.

Streaming a movie is becoming perfectly possible nowadays. Here in Portugal, the basic ADSL package is moving to 4 Mbit/s at €22, and you can have 20 Mbit/s for €35 in selected COs. Streaming a 300 MB episode needs between 1.5 and 2.5 Mbit/s, so you should be fine doing it live as well.

I don't know. We see a lot of buzz about an Apple Movie store, deals between Warner Brothers and BitTorrent to distribute content, but I haven't seen anybody with a good solution for the storage problem that an online movie store creates on the clients.
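A back-of-the-envelope check of that streaming figure, as a small shell sketch. The function name stream_mbits is made up for this example, and the episode lengths are assumptions (the post only gives the file size); it treats 1 MB as 8 megabits and ignores protocol overhead.

```shell
# Average bitrate, in Mbit/s, needed to stream a file in real time.
# $1 = file size in megabytes, $2 = running time in minutes
stream_mbits()
{
  awk -v mb="$1" -v min="$2" 'BEGIN { printf "%.1f\n", (mb * 8) / (min * 60) }'
}

stream_mbits 300 22   # assumed 22-minute episode: prints 1.8
stream_mbits 300 45   # assumed 45-minute episode: prints 0.9
```

Under these assumptions, a short episode fits comfortably inside a 4 Mbit/s line, and a longer episode at the same file size needs even less.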

Seagate and Western Digital are probably very happy now.

Blogging in an Intel Mac world

After a couple of posts, I've decided that for now I'll keep using TextMate for my blogging needs. The last time I wrote about this, Mark Papadakis suggested MarsEdit.

I know about MarsEdit. I think I'm entitled to a license because I bought a previous version of NetNewsWire that included the blog editor that eventually led to MarsEdit.

Anyway, I think that for my personal needs TextMate is enough, with no need for anything else, and given that TextMate is almost always running, a new post is a shortcut away.

In case you are feeling curious about the blogging bundle, you should check the screencast. Even if you are not curious, I suggest you fast-forward to 6:13 and see how easy it is to include a picture in your posts. Very cool stuff that you can do with TextMate bundles.

Anyway, I'll be looking out for a Universal version of ecto, but until then, TextMate it is.

(BTW, if you know how to write a URL to a QuickTime movie with a specific start time, please let me know. I know it is possible with a local, specially crafted <embed> tag, but I was looking for something like a http://your.site/quicktime.mov?start=5:32-style URL.)

July 11, 2006

Apple Dashboard E.T.-style hobby

Apparently E.T. is not the only one that is trying to phone home.

Although it appears that no personal information is being sent, I don't believe that is the point.

The main point is that the release notes of 10.4.7 suggest this feature is an option at install time, not a recurring event.

Personally, I would prefer a better explanation in the release notes and a documented way to turn it off. It is a bit too late for the first one, but the second one is already documented at the Mac OS X Hints website.

On-line status is up

Wildfire 3.0 is out, and the upgrade of simplicidade.org went very very well.

The new plugin interface is very nice and I installed the on-line presence plugin. After a bit of lighttpd-config magic, I added a new domain, presence.simplicidade.org, as a public gateway to it.

The relevant lighttpd.conf setup is this:

$HTTP["host"] == "presence.simplicidade.org" {
  server.document-root = "...."
  accesslog.filename   = "...."

  server.indexfiles    = ( "index.html" )

  url.rewrite-once = ( "^/(.*)" => "/plugins/presence/status?jid=$1@simplicidade.org" )

  proxy.server = ( "" => ( ( "host" => "127.0.0.1", "port" => 9090 ) ) )
}

So to see my presence, you can go to http://presence.simplicidade.org/melo, and you'll get a small image with my status. To put it in an HTML page, you would just use:

<img src="http://presence.simplicidade.org/melo" />

Nice and simple. If you have an account at simplicidade.org, feel free to use it.
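To see what the url.rewrite-once rule does to a request path, here is a small shell sketch that mimics it with sed. This is only an illustration: lighttpd applies the rule itself using PCRE, and the function name rewrite_presence_path is made up for this example.

```shell
# Apply the same pattern and replacement as the url.rewrite-once rule
# above: everything after the leading "/" becomes the user part of the JID.
rewrite_presence_path()
{
  printf '%s\n' "$1" | sed -E 's#^/(.*)#/plugins/presence/status?jid=\1@simplicidade.org#'
}

rewrite_presence_path /melo
# prints /plugins/presence/status?jid=melo@simplicidade.org
```

The rewritten path is what proxy.server then forwards to the Wildfire instance on 127.0.0.1:9090.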

Back

I'm back, after 5 weeks of vacation/new baby stuff.

I'm mostly up-to-date at work, but I have a lot of little items marked in my NetNewsWire to write about.

Having kids is great, but it channels most of your free time.