A quick tip, which I haven’t seen discussed online before, for what to do if you have a site with a bunch of static HTML pages (such as the output generated by PHP Documentor) and you need to add something dynamic to them without modifying the HTML files themselves.

1)

So you’ve got a ton of .html pages and someone’s demanding, for example, a user feedback system to allow inline comments to be displayed on the pages. Think of the kind of output generated by PHP Documentor, or a site like XUL Planet (which desperately needs reader feedback IMO).

Making this output dynamic with PHP is going to be a major headache, right? You’ve got to strip out all the content and store it in a database, then reconstruct the layout with PHP. Only then can you start thinking about adding anything new. And what about all those URLs which Google has nicely indexed for you? A lot of work.

There is another way which may, in many cases, prove very fast to implement. It takes advantage of PHP’s native integration with Apache, the output control functions, and a couple of more obscure php.ini settings.

Step 1. Apache Configuration

The first thing you need to do is tell Apache to parse the HTML pages with PHP. Add something like the following to either httpd.conf (if you have rights to it) or a .htaccess file (assuming Apache is configured to let you use them):

<Files *.html>
ForceType application/x-httpd-php
</Files>

Step 2. PHP Configuration

Next are two often-overlooked php.ini settings: auto_prepend_file and auto_append_file. These “attach” a PHP script before (prepend) or after (append) the script currently being executed by PHP. I think you can see where this is going now...

Using a .htaccess file to set these, you might have:

php_value auto_prepend_file /home/username/before.php
php_value auto_append_file /home/username/after.php

Note it’s probably a good idea to switch off the short_open_tag setting as well, in case the pages contain an XML processing instruction such as <?xml version="1.0"?>, which PHP would otherwise try to parse as code.
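
In the same .htaccess file, that might look like the line below. This is a sketch which assumes PHP is running as an Apache module (as the php_value lines above already do), since php_flag is only understood in that setup.

```apache
php_flag short_open_tag off
```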

Step 3. Output Control

Now you need to “catch” the HTML before it gets served to the browser, a job for which PHP’s output control functions are perfect.

Here’s what the files that are being prepended and appended might include:

<?php
// before.php

// Start output buffering
ob_start();
?>

That gets executed before the original HTML page is “parsed” by PHP.

<?php
// after.php

// Store the contents of the buffer in a variable
$page = ob_get_contents();

// End and clean the buffer
ob_end_clean();

/**
* Manipulate the $page variable here
*/

// Display the page
echo $page;
?>

after.php “captures” the HTML as a string in the $page variable, allowing you to manipulate it before it gets sent to the browser.
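
As one illustration of the manipulation step, here is a minimal sketch that inserts an extra fragment just before the closing </body> tag. The function name insert_before_body_end() and the sample markup are made up for this example and are not part of the original tip; in after.php the $page value would come from the output buffer instead.

```php
<?php
// Hypothetical helper (an assumption for this example): insert $fragment
// immediately before the last closing </body> tag in $page.
function insert_before_body_end($page, $fragment)
{
    // Case-insensitive search for the last occurrence of </body>
    $pos = strripos($page, '</body>');

    if ($pos === false) {
        // No closing tag found: fall back to appending at the end
        return $page . $fragment;
    }

    // A zero-length replacement at $pos inserts without removing anything
    return substr_replace($page, $fragment, $pos, 0);
}

// Stand-in for the HTML captured from the output buffer
$page = '<html><body><h1>Docs</h1></BODY></html>';

echo insert_before_body_end($page, '<div id="comments">No comments yet</div>');
```

The search is case-insensitive because older static pages often mix </body> and </BODY>.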

And that’s basically it. No HTML files are directly modified. All URLs remain the same.

Step 4. Where now?

The rest, actually manipulating the contents of the page, is going to be a problem which will need to be solved on a per-case basis.

Some general thoughts:

  • In manipulating the “captured” page, keep it as simple as possible (e.g. str_replace() first, regular expressions second, PEAR::XML_HTMLSax or the DOM extension’s new HTML functionality if you get desperate).
  • Look for “layout” HTML which appears regularly on every page, such as a horizontal line used at the bottom of every page, and use this as the point to insert the dynamically generated content.
  • URLs make great unique values for a database ($_SERVER['REQUEST_URI']). Using the existing URL structure it should be relatively easy to identify database content to be “attached” to a given HTML page.
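
To make the last two points concrete, here is a rough sketch that uses the request path as a lookup key and inserts the matching content just before a recurring <hr> marker. The function names, the $comments array (standing in for a database table) and the sample markup are all assumptions for illustration.

```php
<?php
// Hypothetical helper: reduce a request URI to its path, so that
// "/docs/page.html?print=1" and "/docs/page.html" share one key.
function page_key($requestUri)
{
    $path = parse_url($requestUri, PHP_URL_PATH);
    return is_string($path) ? $path : $requestUri;
}

// Hypothetical helper: insert $extra before the last $marker in $page,
// leaving the page untouched if the marker is missing.
function insert_before_marker($page, $extra, $marker = '<hr>')
{
    $pos = strripos($page, $marker);
    if ($pos === false) {
        return $page;
    }
    return substr_replace($page, $extra, $pos, 0);
}

// Stand-in for a database table keyed by URL path
$comments = array(
    '/docs/page.html' => '<p class="comment">Great page!</p>',
);

// In after.php these would be $_SERVER['REQUEST_URI'] and the captured buffer
$requestUri = '/docs/page.html?print=1';
$page = '<body><p>Static text</p><hr><p>Footer</p></body>';

$key = page_key($requestUri);
if (isset($comments[$key])) {
    $page = insert_before_marker($page, $comments[$key]);
}

echo $page;
```

Real pages may write the marker differently (<hr />, <HR noshade> and so on), so the marker string needs to match whatever your generator actually emits.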
1) Note: you’ll probably either find this tip blindingly obvious or realise why PHP stands for “Hypertext Preprocessor”.

develop/bringing_static_html_to_life_with_php.txt · Last modified: 2005/10/15 21:47