The Registry provides a mechanism for storing data globally in a well-managed fashion, helping to prevent global meltdown. This article discusses building a registry and offers an insight into the mind of a test-infected developer (Test Driven Development).
The problem with application data
Avoid global data! It leads to the most insidious of errors, that of action at a distance. Because a global variable is accessible from anywhere in the program, anything could accidentally overwrite it. If a global value appears to have changed, what part of the code changed it? Some tools can help here. With a sophisticated debugger you can set a watch on a variable that drops you into a nice window, with a cursor on the offending code. This is helpful when it works, but it assumes that the error is repeatable. In fact it assumes that the error was discovered at all. The problem with testing being your only safeguard is that it is virtually impossible to test every route through the code. The well-known solution to this is to divide and conquer. We reduce the code to encapsulated chunks so that clear design, fine-grained unit testing and rigorous inspection can all be brought to bear. Global data defeats all of these and for this reason we never use it.
The catch, of course, is that things are never that simple. Some data really is global. Well not exactly global as such, but unique to the running application.
An example could be a cached database connection. It is normal for a connection object to be created in one part of the application but used in a completely different part. Many examples in PHP expect the database connection object to be available in the global namespace, for example:
    class ProductFinder {
        function findByName($name) {
            global $db; // Secretly access global instance
            $sql = "SELECT * FROM products WHERE name='$name'";
            return $db->query($sql);
        }
    }
This ties the ProductFinder class to the database class in a way that is not obvious unless you happen to see this piece of code. If such invisible entanglements are painful when writing applications, they are positively rude when writing libraries.
Another approach is to pass the connection object around once it has been created, but this adds an extra parameter to every intermediate class or method it must pass through. For example:
    class ProductView {
        var $_db;
        function ProductView(&$db) {
            $this->_db = &$db;
        }
        function showProductsByName($name) {
            $products = &new ProductFinder($this->_db);
            $result = &$products->findByName($name);
            $this->paintAsTable($result);
        }
    }
That’s clutter we could do without. Worse, some of the methods that we pass through will have nothing to do with database connections at all, and yet we will have to check them if we have a bug. Weren’t we trying to avoid this kind of hassle when we moved from global access to passing as a parameter? Things get even worse if there are several such application objects being passed around.
Despite these drawbacks, passing the data is still preferable. Classes that use globals are difficult to reuse in other applications (what if you get a name clash?) and difficult to test as test cases can interfere with each other in unpredictable ways.
Can we make the preferred system of passing of application data more manageable?
A registry class
The registry is just a simple object bundle. In fact its simplest form is a hash (associative PHP array) populated with objects. This is a reasonable start, but you may want a registry that is a composite of other registries, or perhaps one that can have its behaviour changed for testing, for which you’ll need a class. This may sound like too much abstraction, but if you jump in with a straightforward hash you will have a lot of code to edit later when you change your mind.
For this reason we’ll code the registry as an object from the start. Modern software development practice tells us we should test as we go, thus we express its functionality as a test case (test driven development)...
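For comparison, the simplest form mentioned above might look something like the sketch below. It is written for PHP 7 or later rather than the PHP4 used in the rest of the article, and the `FakeConnection` class, the `countProducts()` function and the `'db'` key are hypothetical names invented for the illustration.

```php
<?php
// Hypothetical stand-in for a real database connection.
class FakeConnection
{
    public function query(string $sql): string
    {
        return "ran: $sql";
    }
}

// The simplest registry: an associative array populated with objects.
$registry = ['db' => new FakeConnection()];

// Consumers receive the whole bundle and pick out what they need.
function countProducts(array $registry): string
{
    return $registry['db']->query('SELECT COUNT(*) FROM products');
}

echo countProducts($registry), "\n";
```

Every consumer here reaches into the array with square brackets, which is exactly the code that has to be edited if the bundle later needs to become a class.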
    class RegistryTestCase extends UnitTestCase {
        function RegistryTestCase() {
            $this->UnitTestCase();
        }
        function testAccess() {
            $registry = &new Registry();
            $this->assertFalse($registry->isEntry('a'));
            $this->assertNull($registry->getEntry('a'));
            $thing = 'thing';
            $registry->setEntry('a', $thing);
            $this->assertTrue($registry->isEntry('a'));
            $this->assertReference($registry->getEntry('a'), $thing);
        }
    }
Here I am using a test tool to drive the script, but you can just as easily hack your own with a few print statements. The registry interface is still basically a hash, but one where access is controlled through methods (setEntry(), getEntry() and isEntry()) rather than the square brackets you would use with a hash.
The code to pass this test, i.e. the Registry itself, is...
    class Registry {
        var $_cache;
        function Registry() {
            $this->_cache = array();
        }
        function setEntry($key, &$item) {
            $this->_cache[$key] = &$item;
        }
        function &getEntry($key) {
            return $this->_cache[$key];
        }
        function isEntry($key) {
            return ($this->getEntry($key) !== null);
        }
    }
From now on, when we create an application-wide object, instead of passing it back directly we insert it into a registry with setEntry(). Besides putting all of our application objects into a single parameter, we also win because the registry usually only needs to be passed down from the application layers above. This means far fewer messages than with direct communication.
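As a rough sketch of what passing the registry down might look like, the example below reworks the earlier product classes so that only a registry travels through the layers. It targets PHP 7.1 or later, where objects are passed by handle and the PHP4 reference operators are unnecessary. The `FakeConnection` class, the `'db'` key and the array return values are assumptions made for the illustration, not part of the original design.

```php
<?php
// Modern-PHP rendering of the article's Registry. Objects are passed by
// handle in PHP 5 and later, so no reference operators are needed.
class Registry
{
    private $cache = [];

    public function setEntry(string $key, $item): void
    {
        $this->cache[$key] = $item;
    }

    public function getEntry(string $key)
    {
        return $this->cache[$key] ?? null;
    }

    public function isEntry(string $key): bool
    {
        return $this->getEntry($key) !== null;
    }
}

// Hypothetical stand-in for a database connection: echoes the bound name back.
class FakeConnection
{
    public function query(string $sql, array $params = []): array
    {
        return [['name' => $params[0] ?? 'unknown']];
    }
}

// Bottom layer: the only class here that knows about the 'db' key.
class ProductFinder
{
    private $registry;

    public function __construct(Registry $registry)
    {
        $this->registry = $registry;
    }

    public function findByName(string $name): array
    {
        $db = $this->registry->getEntry('db');
        return $db->query('SELECT * FROM products WHERE name = ?', [$name]);
    }
}

// Intermediate layer: forwards the registry without looking inside it.
class ProductView
{
    private $registry;

    public function __construct(Registry $registry)
    {
        $this->registry = $registry;
    }

    public function showProductsByName(string $name): array
    {
        $finder = new ProductFinder($this->registry);
        return $finder->findByName($name);
    }
}

// Top layer: create the application objects once and register them.
$registry = new Registry();
$registry->setEntry('db', new FakeConnection());

$view = new ProductView($registry);
$rows = $view->showProductsByName('widget');
echo $rows[0]['name'], "\n";
```

If a second application object were needed, say a logger, the intermediate `ProductView` would not change at all; only the top layer and the class that uses the logger would.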
We can maintain encapsulation throughout all of this with the access keys. By using the PHP define() function we can set up constants that are only known to the portions of code with an interest in the registered object. This can pretty much eliminate whole swathes of suspect code in the event of a bug. A slightly safer approach would be to write a registry specific to your application and use named accessors, but I am keeping things simple for now.
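Both ideas from the paragraph above can be sketched briefly: a constant as the access key, and an application-specific registry with named accessors. This is only one possible arrangement, written for PHP 7.1 or later. The constant `DB_CONNECTION_KEY`, the `ApplicationRegistry` class and its accessor names are hypothetical.

```php
<?php
// Modern-PHP rendering of the article's Registry.
class Registry
{
    private $cache = [];

    public function setEntry(string $key, $item): void
    {
        $this->cache[$key] = $item;
    }

    public function getEntry(string $key)
    {
        return $this->cache[$key] ?? null;
    }

    public function isEntry(string $key): bool
    {
        return $this->getEntry($key) !== null;
    }
}

// Hypothetical key. In a real application this define() would live in a file
// included only by the code that deals with the database connection.
define('DB_CONNECTION_KEY', 'app.database.connection');

// Hypothetical application-specific registry with named accessors, so that
// callers never handle the raw key at all.
class ApplicationRegistry extends Registry
{
    public function setConnection($connection): void
    {
        $this->setEntry(DB_CONNECTION_KEY, $connection);
    }

    public function getConnection()
    {
        return $this->getEntry(DB_CONNECTION_KEY);
    }
}

$registry = new ApplicationRegistry();
$connection = new stdClass(); // placeholder for a real connection object
$registry->setConnection($connection);

var_dump($registry->getConnection() === $connection);
```

With named accessors a mistyped key becomes a call to an undefined method, which fails loudly instead of quietly returning null.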
A singleton registry with testing
If the registry is still being passed around too much, the obvious solution is to make it global. By now this should sound like a bad idea, and generally it is. The whole point of this exercise, though, is to make our code easier to understand, something we may not achieve by adding an extra registry parameter to almost every method. There may also be other constraints, such as callbacks into a framework that do not allow any extra parameters, such as the registry, to be passed. If this is what is happening for you then here are your options...
Change the design. It is possible that some responsibilities are in the wrong classes and a bit of shuffling about may resolve the issue.
Is it really that bad? What I am about to suggest has complications of its own. It is likely that your current situation is already optimal.
Allow the registry to act as a singleton.
Because we have some protection from the hash keys or method names this is less damaging than the individual application objects being singletons. Initially it appears to be only a small backward step.
I’ll assume that you are familiar with the singleton pattern. The singleton semantics in the form of our test case are...
    class RegistryTestCase extends UnitTestCase {
        ...
        function testSingleton() {
            $this->assertReference(
                    Registry::instance(),
                    Registry::instance());
            $this->assertIsA(Registry::instance(), 'Registry');
        }
    }
Implementation of this is straightforward once you have grasped the rather quirky PHP4 references...
    class Registry {
        var $_cache;
        function Registry() {
            $this->_cache = array();
        }
        function setEntry($key, &$item) {
            $this->_cache[$key] = &$item;
        }
        function &getEntry($key) {
            return $this->_cache[$key];
        }
        function isEntry($key) {
            return ($this->getEntry($key) !== null);
        }
        function &instance() {
            static $registry;
            if (!$registry) {
                $registry = new Registry();
            }
            return $registry;
        }
    }
There is a really big problem though.
When we come to test our application we are going to want each test to start in a clean state. This is impossible with our current arrangement. If one test class sets up the registry with, say, a fake database connection and is the only one that goes on to use it, there will be no problem. If another test is added, and it involves a class that will create a new connection if one does not exist, then we are in trouble. The class will see the previous fake entry and use that instead. This could have all sorts of unforeseen consequences, invalidating our tests.
And so there is one final workaround we must add for testing. These are save() and restore() methods to obtain a fresh global registry for the test and then undo the damage immediately afterwards. In test case terms...
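The failure described above can be reproduced in a few lines. The sketch below is written for PHP 7.1 or later; `FakeConnection`, `RealConnection`, `ConnectionFactory` and the `'db'` key are hypothetical names standing in for real test doubles and application code.

```php
<?php
// Modern-PHP rendering of the singleton Registry, without save/restore.
class Registry
{
    private $cache = [];

    public static function instance(): Registry
    {
        static $registry = null;
        if ($registry === null) {
            $registry = new Registry();
        }
        return $registry;
    }

    public function setEntry(string $key, $item): void
    {
        $this->cache[$key] = $item;
    }

    public function getEntry(string $key)
    {
        return $this->cache[$key] ?? null;
    }

    public function isEntry(string $key): bool
    {
        return $this->getEntry($key) !== null;
    }
}

class FakeConnection {}
class RealConnection {}

// Hypothetical class that creates a connection only if none exists yet.
class ConnectionFactory
{
    public static function connection()
    {
        $registry = Registry::instance();
        if (!$registry->isEntry('db')) {
            $registry->setEntry('db', new RealConnection());
        }
        return $registry->getEntry('db');
    }
}

// "Test one" installs a fake and never cleans up after itself.
Registry::instance()->setEntry('db', new FakeConnection());

// "Test two" expects the factory to build a real connection, but it
// silently receives the leftover fake from the previous test.
$connection = ConnectionFactory::connection();
echo get_class($connection), "\n"; // prints "FakeConnection"
```

Nothing in the second test mentions the fake, which is what makes this kind of interference so hard to track down.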
    class RegistryTestCase extends UnitTestCase {
        ...
        function testSaveAndRestore() {
            $registry = &new Registry();
            $a = 'a';
            $registry->setEntry('a', $a);
            $registry->save();
            $this->assertFalse($registry->isEntry('a'));
            $b = 'b';
            $registry->setEntry('a', $b);
            $this->assertReference($registry->getEntry('a'), $b);
            $registry->restore();
            $this->assertReference($registry->getEntry('a'), $a);
        }
    }
This complicates the registry class somewhat because we must add an internal stack...
    class Registry {
        var $_cache_stack;
        function Registry() {
            // The stack always holds at least one cache; index 0 is the active one.
            $this->_cache_stack = array(array());
        }
        function setEntry($key, &$item) {
            $this->_cache_stack[0][$key] = &$item;
        }
        function &getEntry($key) {
            return $this->_cache_stack[0][$key];
        }
        function isEntry($key) {
            return ($this->getEntry($key) !== null);
        }
        function &instance() {
            static $registry = false;
            if (!$registry) {
                $registry = new Registry();
            }
            return $registry;
        }
        function save() {
            // Push a fresh, empty cache in front of the current one.
            array_unshift($this->_cache_stack, array());
        }
        function restore() {
            // Discard the active cache and return to the previous one.
            array_shift($this->_cache_stack);
            // More restores than saves would leave no cache at all.
            if (!count($this->_cache_stack)) {
                trigger_error('Registry lost');
            }
        }
    }
Now we can run tests pretty safely...
    class MyTest extends UnitTestCase {
        function MyTest() {
            $this->UnitTestCase();
        }
        function setUp() {
            $registry = &Registry::instance();
            $registry->save();
        }
        function tearDown() {
            $registry = &Registry::instance();
            $registry->restore();
        }
        function testStuffThatUsesTheRegistry() {
            ...
        }
    }
So have we conquered the application data problem? Well, not really. Using the singleton form of the registry is slightly more complicated for testing and we are in slight danger of a name clash on our registry keys. All we really have is a sliding scale of options to play with.
The test safe singleton
Finally there is one bonus with the singleton registry if you are improving or porting legacy code that already uses singletons. It is much easier to replace them with a testable version of the singleton using our new registry rather than refactor them out of all the code.
The Singleton I am going to use is an (abstract) base class. We would like to inherit from it to get our specific singletons. The usual singleton behaviour can be summed up yet again with a test case...
    class ExampleSingleton extends Singleton {
        function ExampleSingleton() {
            $this->Singleton();
        }
        function &instance() {
            return Singleton::instance(__CLASS__);
        }
    }

    class SingletonTestCase extends UnitTestCase {
        function SingletonTestCase() {
            $this->UnitTestCase();
        }
        function testUniqueness() {
            $this->assertReference(
                    ExampleSingleton::instance(),
                    ExampleSingleton::instance());
        }
    }
Driving this with the Registry class rather than the static variable we used before yields...
    class Singleton {
        function Singleton() {
            $registry = &Registry::instance();
            if ($registry->isEntry('singleton ' . get_class($this))) {
                trigger_error(
                        'Already an instance of singleton ' . get_class($this));
            }
        }
        function &instance($class) {
            $registry = &Registry::instance();
            if (!$registry->isEntry('singleton ' . $class)) {
                // setEntry() takes its item by reference, so the new
                // object must be held in a variable first.
                $instance = &new $class();
                $registry->setEntry('singleton ' . $class, $instance);
            }
            return $registry->getEntry('singleton ' . $class);
        }
    }
By stashing the registry during tests we make sure our singleton acts as a singleton only when we want it to. Testing this we find that it works as advertised...
    class SingletonTestCase extends UnitTestCase {
        ...
        function testRegistrySaveAndRestore() {
            $registry = &Registry::instance();
            $singleton = &ExampleSingleton::instance();
            $registry->save();
            $this->assertCopy(
                    ExampleSingleton::instance(),
                    $singleton);
            $registry->restore();
            $this->assertReference(
                    ExampleSingleton::instance(),
                    $singleton);
        }
    }
Because we can save and restore the whole registry we can be sure that no unknown singletons are carried from test to test. This goes a long way to isolating the sections of the application you are most interested in, whilst still having the convenience of global access.
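For readers on current PHP, the same idea might be sketched as below. This is not the article's PHP4 code: it assumes PHP 7.1 or later and swaps the `__CLASS__` argument for late static binding (`static::class` and `new static()`), so subclasses need no `instance()` override. The exception thrown on an unbalanced `restore()` is likewise an assumption of this sketch.

```php
<?php
// Registry with a stack of caches; index 0 is always the active cache.
class Registry
{
    private $stack = [[]];

    public static function instance(): Registry
    {
        static $registry = null;
        if ($registry === null) {
            $registry = new Registry();
        }
        return $registry;
    }

    public function setEntry(string $key, $item): void
    {
        $this->stack[0][$key] = $item;
    }

    public function getEntry(string $key)
    {
        return $this->stack[0][$key] ?? null;
    }

    public function isEntry(string $key): bool
    {
        return $this->getEntry($key) !== null;
    }

    public function save(): void
    {
        // Push a fresh, empty cache in front of the current one.
        array_unshift($this->stack, []);
    }

    public function restore(): void
    {
        // Refuse to discard the last remaining cache.
        if (count($this->stack) < 2) {
            throw new LogicException('restore() called without a matching save()');
        }
        array_shift($this->stack);
    }
}

// Base class whose instances are stored in the registry, not in a static.
abstract class Singleton
{
    public static function instance(): Singleton
    {
        $key = 'singleton ' . static::class;
        $registry = Registry::instance();
        if (!$registry->isEntry($key)) {
            $registry->setEntry($key, new static());
        }
        return $registry->getEntry($key);
    }
}

class ExampleSingleton extends Singleton {}

$original = ExampleSingleton::instance();

Registry::instance()->save();            // as a test's setUp() would do
$fresh = ExampleSingleton::instance();   // a new object inside the "test"
Registry::instance()->restore();         // as tearDown() would do

$restored = ExampleSingleton::instance();

var_dump($fresh !== $original);    // bool(true)
var_dump($restored === $original); // bool(true)
```

Between `save()` and `restore()` the singleton is a different object, and afterwards the original instance is back.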
Further Reading
Martin Fowler, Patterns of Enterprise Application Architecture (highly recommended purchase). A short description of the Registry pattern is also available in the online catalog that accompanies the book.