Accept to gettext

In 2003, I needed to write a PHP site that had to be internationalized, and decided to use gettext() for this purpose. While PHP has proper support for gettext, it unfortunately does not have any built-in way to easily convert the data in the Accept-Language and Accept-Charset HTTP headers into something that gettext can understand.

So I sat down, and over the course of an afternoon, I wrote 83 lines of PHP code (excluding comments) that would do just that: parse the above two headers, and return the gettext locale string that best matches them, chosen from a set of strings specified by the caller. It's pretty easy to use, and was written quite generically, so I put a GPL tag on top of it, put it online, added a comment to the documentation of the gettext stuff in the PHP manual advertising the URL, and stopped thinking about it. The project that I originally wrote this file for was never even finished; today, I don't even recall what it was.
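For readers who want to see the idea rather than the history, here is a minimal sketch of that kind of matching. It is not the original accept-to-gettext.inc: it is written in Python rather than PHP, the function names parse_accept and best_locale are made up for this illustration, and it only looks at Accept-Language, leaving the character set out of the matching. It also ignores wildcards and other corner cases of the HTTP specification.

```python
def parse_accept(header):
    """Parse an Accept-* header into (value, q) pairs, highest q first."""
    items = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        value, *params = [p.strip() for p in part.split(";")]
        q = 1.0  # a missing q parameter means q=1
        for param in params:
            name, _, raw = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(raw)
                except ValueError:
                    q = 0.0  # treat an unparsable weight as "not acceptable"
        if q > 0:
            items.append((value.lower(), q))
    # sorted() is stable, so entries with equal q keep their header order
    return sorted(items, key=lambda item: -item[1])


def best_locale(accept_language, available, default):
    """Return the entry of `available` that best matches the header.

    `available` holds gettext-style names such as "nl_BE" or "fr_FR.UTF-8";
    `default` is returned when nothing matches.
    """
    exact = {}    # "nl-be" -> "nl_BE"
    primary = {}  # "nl"    -> first available locale for that language
    for locale in available:
        name = locale.split(".")[0].split("@")[0]  # drop codeset and modifier
        tag = name.replace("_", "-").lower()
        exact.setdefault(tag, locale)
        primary.setdefault(tag.split("-")[0], locale)

    for tag, _q in parse_accept(accept_language):
        language = tag.split("-")[0]
        match = exact.get(tag) or exact.get(language) or primary.get(language)
        if match:
            return match
    return default
```

With the locales en_US, fr_FR and nl_NL available, a browser sending "nl-BE,nl;q=0.8,en;q=0.5" would be given nl_NL, since no Belgian Dutch translation exists and the sketch falls back to the first locale that shares the language. The result could then be handed to setlocale() or whatever the translation system expects.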

In early 2005, I received a mail from Matthew Palmer thanking me for the code, and telling me that he'd incorporated it into IRM, of which he is one of the developers. Which reminded me of its existence. For a short while.

Today, I was cleaning out my INBOX, and stumbled upon this old email again. Which got me curious, so I started googling for those 83 lines of code that I'd entrusted to the Free Software world three years ago. I must say I'm pleasantly surprised. My file accept-to-gettext.inc seems to be used in a variety of free software projects, including of course IRM, but also the GnuCash website, and this French educational thing called SLIS. It seems to be quoted in its entirety on some Turkish PHP-related forum. It has been adapted to support different translation systems. And so on... the list is rather long.

Can't say I'm not proud. And yet, the code contains a rather silly bug: the assumption that character set encodings are somehow linked to translations. They are not; gettext is perfectly capable of transparently converting from one character set to another.

Oh well...