Building notes, projects, and occasional rants

Fixing the POD synopsis in OSX – take 3 (take your groff and run)

Marcus started it, Tim teased me. I was bitten so many times by this that I had to take a stab at it.

Following Tim's leads, I checked that the pod2man was producing proper nroff, with a \- for each -. It was. I tried to understand the groff tmac files, but I think we have here the first real proof that there are aliens out there, and they speak wonderful languages...
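If you want to repeat that first check yourself, here is a rough sketch of one way to do it. The helper name option_is_escaped is mine, not part of any tool; it only looks for one option name in the roff source and tells you whether its hyphens were written as \-.

```shell
# Hypothetical helper: succeed if the roff source on stdin spells the given
# option with escaped hyphens (e.g. --bootstrap written as \-\-bootstrap).
option_is_escaped() {
    # Turn every "-" in the option name into the roff escape "\-".
    escaped=$(printf '%s' "$1" | sed 's/-/\\-/g')
    # -F: match the escaped name as a fixed string, not as a regex.
    grep -q -F -e "$escaped"
}
```

Assuming local::lib is installed, something like `pod2man "$(perldoc -l local::lib)" | option_is_escaped --bootstrap && echo escaped` should report whether pod2man did its part.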

Anyway, we turned to Google and after a bit of digging we ended up at the groff CVS and a recent change (rev 1.39), just 4 months old and therefore too recent to be included with Mac OS X. The description was promising:

tmac/an-old.tmac, tmac/doc.tmac: For -Tutf8, map -, \-, ', and ` conservatively to ASCII for the sake of easy cut and paste.

The doc.tmac part is important. When pod2man calls nroff, it asks for the an package. The an.tmac file basically includes andoc.tmac, and that one includes the doc.tmac package.

At first I took the diff and tried to blindly apply it to the local doc.tmac file. I don't speak alien, so although nroff didn't complain after my changes, it also kept on using the Unicode hyphen symbol.

So I downloaded the latest groff package (1.20.1) and ran:

tar zxf groff-1.20.1.tar.gz
cd groff-1.20.1
./configure --prefix=$HOME/bin/groff-1.20.1
make -j 4
make install

You now have your own local groff install, including a brand new nroff.

To test it, I ran:

perldoc -n ~/bin/groff-1.20.1/bin/nroff local::lib

Copy and paste something in there with hyphens, like my nemesis --bootstrap, and you should see that your hyphens stay in glorious ASCII, no more of that Unicode mumbo-jumbo.
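If you would rather not trust your eyes, a small sketch like the one below can check the formatted output for the two troublemakers. The helper name has_unicode_dashes is my own invention, and it assumes the output is UTF-8; it simply looks for the byte sequences of U+2010 and U+2212.

```shell
# Hypothetical helper: succeed if stdin contains U+2010 (hyphen) or
# U+2212 (minus sign) encoded as UTF-8.
has_unicode_dashes() {
    hyphen=$(printf '\342\200\220')   # U+2010 as UTF-8 bytes (E2 80 90)
    minus=$(printf '\342\210\222')    # U+2212 as UTF-8 bytes (E2 88 92)
    # LC_ALL=C makes grep compare raw bytes, whatever the current locale is.
    LC_ALL=C grep -q -e "$hyphen" -e "$minus"
}
```

For example, `perldoc -n ~/bin/groff-1.20.1/bin/nroff local::lib | has_unicode_dashes && echo "still Unicode"` should stay quiet with the new nroff. Whether nroff emits UTF-8 at all when piped depends on your locale settings, so treat a silent result with some care.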

So stick this into your .bashrc:

alias perldoc='/usr/bin/perldoc -n ~/bin/groff-1.20.1/bin/nroff'
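If you share your .bashrc between machines, a slightly more defensive variant might be handy. This is only a sketch: it swaps the alias for a shell function, and pick_nroff is a hypothetical helper of mine that falls back to whatever nroff is on the PATH when the private install is missing.

```shell
# Hypothetical helper: print the given nroff if it is executable,
# otherwise fall back to plain "nroff" from the PATH.
pick_nroff() {
    if [ -x "$1" ]; then
        printf '%s\n' "$1"
    else
        printf '%s\n' nroff
    fi
}

# A function instead of the alias, so the check runs on every call.
# "command" skips this function and runs the real perldoc.
perldoc() {
    command perldoc -n "$(pick_nroff "$HOME/bin/groff-1.20.1/bin/nroff")" "$@"
}
```

Use this instead of the alias, not together with it. The path assumes the --prefix used in the build above.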

And live long and prosper.

Maybe it is possible to take the changes and port them successfully to groff 1.19.2, but I couldn't do it. If you do speak alien and you do port them to the groff shipped with Mac OS X 10.5, leave me a comment. Sticking a new doc.tmac in the site-tmac/ directory would be a lot simpler than installing groff.

Update: the PROBLEMS file that is included with groff mentions this problem (kudos to this thread, where you can follow the whole argument about the same problem in Linux man pages):

  • The UTF-8 output of grotty has strange characters for the minus, the hyphen, and the right quote. Why?

The used Unicode characters (U+2212 for the minus sign and U+2010 for the hyphen) are the correct ones, but many programs can't search them properly. The same is true for the right quote (U+201D). To map those characters back to the ASCII characters, insert the following code snippet into the `troffrc' configuration file:

.if '\*[.T]'utf8' \{\
.  char \- \N'45'
.  char  - \N'45'
.  char  ' \N'39'
.\}

If you stick the above code into /usr/lib/groff/site-tmac/troffrc the output will be ASCII, even with the default Mac OS X groff, but the perldoc output starts with a couple of blank pages and a warning:

<standard input>:138: warning: can't find font `CW'

So compiling groff is still the best solution.