Some thoughts about PHP

I have been using PHP[1] as my primary language for five months now, so I feel somewhat qualified to speak about it. My overall conclusion is that PHP is weak sauce. It is easy to get started with PHP, but its usefulness decreases as the complexity of the application increases. This is primarily because keeping PHP code maintainable requires unnatural levels of discipline. Choosing PHP for a greenfield project is a technical risk that is unnecessary in today’s rich web application ecosystem. Of course, most projects aren’t greenfield, so there are plenty of reasons to have PHP around.

I find PHP to be a deeply frustrating environment in which to work. It seems to have occurred, as opposed to having been designed or even evolved, without much of an overarching vision. In “The Mythical Man-Month” Frederick Brooks claims that “conceptual integrity is the most important consideration in system design”. Unfortunately, PHP has very weak conceptual integrity. It seems mostly to be a collection of decisions that seemed expedient at the time, made with little thought about how they would impact the overall system.

I remember hearing a few years ago about PHP changing its syntax to appear more Java-like. That was back when it looked like there was going to be a real Java hegemony in business programming. At the time I thought it was a slightly odd but defensible idea. Today I see that decision as an indicator of weak conceptual integrity. I think any system willing to give up its character so completely must lack, almost by definition, the conceptual integrity needed to be great.

I covered several concrete issues with PHP in Early impressions of PHP. All of those issues still stand, but my biggest problem with PHP, after working with it for a while, is that it seems designed to actively discourage meta-programming. This means I find myself writing annoying amounts of boilerplate code[2]. I strongly believe that the future of programming is language-oriented. This makes PHP hard for me because even rudimentary language-oriented techniques are simply not feasible in PHP.

A somewhat more minor annoyance is the lack of closures and blocks. I first learned about blocks and closures about two years ago and now find programming without them mildly painful. I think that Mark Jason Dominus got it right when he said:

in another thirty years people will laugh at anyone who tries to invent a language without closures, just as they’ll laugh now at anyone who tries to invent a language without recursion.

There are just so many common classes of problem that are simply and cleanly solved by closures that not having them seems like a crime. I hope it does not take thirty years, though.
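As a small illustration of the kind of problem meant here, the sketch below is written in Python, since PHP 4 has no closures. The function names (make_counter, by_field) are made up for the example and do not come from any library. Each returned function carries its own captured state, so no class or global variable is needed.

```python
def make_counter(start=0):
    """Return a function that remembers its own running count."""
    count = start

    def next_value():
        nonlocal count  # update the variable captured from make_counter
        count += 1
        return count

    return next_value


def by_field(field):
    """Return a sort key function that has captured which field to read."""
    return lambda record: record[field]


counter = make_counter()
other = make_counter(10)   # independent state, captured separately
first, second = counter(), counter()

people = [{"name": "bob", "age": 30}, {"name": "al", "age": 25}]
youngest_first = sorted(people, key=by_field("age"))
```

Without closures, each of these would typically need a small class or a hand-written callback plus some shared variable, which is exactly the sort of boilerplate in question.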

It would not be fair to leave this post without a discussion of the good things about PHP. PHP excels at lowering the barrier to entry. There is no other system I am aware of that even comes close to the ease of getting started with PHP. The weakness of PHP’s conceptual integrity does not seem to noticeably impact productivity in the context of small systems. The idea that you can have a web application by creating one text file and copying it to the web server is radically powerful.

And then there is Smarty. Smarty is a really nice external DSL for generating web pages, i.e. a template engine. The core of Smarty is well thought out and has very nice extension mechanisms. It is a joy to work with. If you are doing PHP work I can strongly recommend Smarty.

  1. As always, I need to point out that I am speaking primarily about PHP 4. I suspect that PHP 5 suffers from many of these issues but I have spent very little time in PHP 5.

  2. Usually this boilerplate code gets written after I have already spent an inordinate amount of time trying and failing to automate it. The usable area in PHP seems remarkably small. That means it is going to take a bit more running into the edges before I completely accept that they are really that close.

Nicer phpDoc Comments

Some days it is the little things that make you happy. This is a story of how a little elisp[1] made me happy today.

We use phpDocumentor to extract the documentation from our PHP code. It is a workable solution but the format is a bit hard to read in plain text. The examples on the phpDocumentor website are all syntax highlighted which makes them easy to read. Without the fancy syntax highlighting, however, it is quite difficult to detect where the meta-data ends and where the descriptions start. To see what I mean look at this example (copied from the phpDocumentor tutorial):

/**
 * Example of unlimited parameters.
 * Returns a formatted var_dump for debugging purposes
 * @param string $s string to display
 * @param mixed $v variable to display with var_dump()
 * @param mixed $v,... unlimited number of additional variables to display with var_dump()
 */
function fancy_debug($s,$v)

That does, in fact, document the method but, man, is it hard to read in plain unformatted text. Even when you gussy it up with some pretty colors it is not really all that readable. I think some printable separator characters would have been a really useful addition to the format, say a ’:’ between the meta-data and the description. However, you can make that block a lot easier to read just by inserting some newlines, which phpdoc handles well.

/**
 * Example of unlimited parameters.
 * Returns a formatted var_dump for debugging purposes
 * @param string $s
 *   string to display
 * @param mixed $v
 *   variable to display with var_dump()
 * @param mixed $v,...
 *   unlimited number of additional variables to display
 *   with var_dump()
 */
function fancy_debug($s,$v)

Much more readable. The problem is, I am a fill-paragraph addict. I type M-q about as often as I breathe when I am writing text. fill-paragraph (M-q is its keyboard shortcut) is the Emacs function that will optimally lay out a paragraph of text. It splits lines that are too long, merges lines that are too short, removes superfluous spaces, etc. The problem was that fill-paragraph thinks that the “@param ...” bit is part of the same paragraph as the description, so it (un)helpfully merges those two lines, leaving the same jumble of text we started out with.

This has been bugging me for weeks, and today I fixed it. See, fill-paragraph detects the boundaries between paragraphs by using a couple of regular expressions that you can configure. So I buckled down and created a paragraph boundary pattern that makes Emacs correctly treat @param, or any other phpDocumentor tag that appears at the beginning of the line, as a paragraph separator line. Now fill-paragraph leaves the description on the line following the meta-data, where it belongs.

Here is the code[2]. You can just paste this into your .emacs file any place after the (require 'php-mode) line. Now that I think about it, this same regexp pattern should work for Javadoc, and probably a lot of other documentation extraction tool formats. You would, of course, need to change the hook, though.

(defun php-doc-paragraph-boundaries () 
  (setq paragraph-separate "^[ \t]*\\(\\(/[/\\*]+\\)\\|\\(\\*+/\\)\\|\\(\\*?\\)\\|\\(\\*?[ \t]*@[[:alpha:]]+\\([ \t]+.*\\)?\\)\\)[ \t]*$")
  (setq paragraph-start (symbol-value 'paragraph-separate)))

(add-hook 'php-mode-user-hook 'php-doc-paragraph-boundaries)
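To see which lines that pattern treats as boundaries, here is a rough sketch in Python. It is a hand translation of the regexp into Python syntax, not something Emacs runs, and it narrows [[:alpha:]] to ASCII letters. The helper names (is_separator, fillable_runs) are invented for the illustration, and grouping lines into runs is only a simplified model of how fill-paragraph finds a paragraph.

```python
import re

# Hand translation of the Emacs regexp above (an approximation:
# [[:alpha:]] is narrowed to ASCII letters).
SEPARATOR = re.compile(
    r"^[ \t]*(?:"
    r"/[/\\*]+"                            # comment openers: //, /*, /**
    r"|\*+/"                               # comment closer: */
    r"|\*?"                                # blank line or a lone *
    r"|\*?[ \t]*@[A-Za-z]+(?:[ \t]+.*)?"   # tag line such as "* @param ..."
    r")[ \t]*$"
)


def is_separator(line):
    """True when the whole line matches the boundary pattern."""
    return SEPARATOR.match(line) is not None


def fillable_runs(lines):
    """Group consecutive non-separator lines; each group models one paragraph."""
    runs, current = [], []
    for line in lines:
        if is_separator(line):
            if current:
                runs.append(current)
                current = []
        else:
            current.append(line)
    if current:
        runs.append(current)
    return runs


DOC_BLOCK = [
    "/**",
    " * Example of unlimited parameters.",
    " * Returns a formatted var_dump for debugging purposes",
    " * @param string $s",
    " *   string to display",
    " * @param mixed $v,...",
    " *   unlimited number of additional variables to display",
    " *   with var_dump()",
    " */",
]

for run in fillable_runs(DOC_BLOCK):
    print(run)
```

In this model the summary lines, the first description, and the second description come out as three separate runs, and the @param lines never end up inside any of them, which is the behavior wanted from M-q.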

Having M-q do the right thing in PHP documentation blocks makes me happy. Probably inordinately so. But like I said, sometimes it is the little things.

  1. elisp is the Lisp dialect in which GNU Emacs is written. It is also the language used to customize Emacs.

  2. I have to say that I despise the way Emacs handles regexen. The profusion of backslashes makes regular expressions really hard to understand. And having some regexp operators require escaping and others not is really annoying. The only other issue I ran into is that the paragraph-start and paragraph-separate patterns must match the entire line. Just matching part of the line is not sufficient. The documentation on paragraphs in Emacs does not explicitly state that.