PHP4 Iterators Explained

NOTE to PHP5 Users: iterators are built in!

The common PHP application will generally have this layout:

  1. Query the database
  2. Loop over the database result set, and on each loop, add the result to a large array of data.
  3. Loop over that large array of data and format it. (Most applications will just skip this step and do it in the previous one.)
  4. Set the big formatted array of database information to the template
  5. The template engine loops over the array and displays the appropriate thing for each row.

Woah. That's a lot of looping! Now, before I even tell you how iterators work, I'm going to show you how they can change your application. Here is how all of the above would work with a setup using iterators:

  1. Query the database.
  2. Use an iterator to deal with the database result set. On each iteration, a row will be taken from the database result.
  3. Use another iterator on top of the database result set iterator to format the data. On each iteration, the row will be formatted.
  4. Set the data formatting iterator, which has the database result set iterator in it, to the template.
  5. The template loops over the iterator for the first time; on each iteration the result row is fetched from the database, then formatted, then displayed.

At this point you might be thinking that the iterator version of the list is longer and more complicated. Not so! You might not see the benefit of iterators yet either, so let me explain them in more detail...

An iterator works on the principle of only getting a row (be it a row of an array or database result set, etc) when it is needed. Now that is a pretty ambiguous statement. You might be thinking: that's what my foreach / for loops do! Well, before I get into the code, let's look at how the functions of an iterator should work:

Functions:

  current: this function will return the current row. It is very important that you realize this: 'current' is given the current row and it returns it.
  next: this function will tell the iterator to go to the next row. If there is another row, it will return TRUE, otherwise it will return FALSE.
  hasNext: this function is used in 'next'. It will return a boolean saying whether or not another row exists.
  key: returns the current index of the array / result set that we are on.
  reset: resets the iterator back to the start of the array / database result set.

Variables:

  _index: this starts off at -1. This is VERY important because the way you use an iterator is that you would do: if($it->next()) ... The 'next' function, as described above, will tell the iterator whether or not to go to the next index. It does this by calling the 'hasNext' function and then incrementing the '_index' variable. So, on the first iteration, '_index' will be incremented to 0 (zero), representing the first row of the array / result set.
  _array: This is ONLY for an array iterator. This is a reference to the array that we are iterating through.
  _size: Again, this is ONLY for an array iterator. This is the count() or sizeof() (whichever you prefer) of the array. It is used in 'hasNext' to see if we can still iterate through the array.
  _it: This is ONLY for a proxy iterator (will be explained). This is a reference to the iterator that we will be iterating through.

Okay, read over all of that once more. That explains the inner workings of an iterator, but it still might not make the whole idea 'click' with you. So here's an example of an ArrayIterator in use:

// Create the array that we will want to iterate. 

$array = array(
		array('name' => 'Peter', 'age' => 18),
		array('name' => 'David', 'age' => 30),
		);

// pass the array into the iterator.  

$it = &new FAArrayIterator($array);

// loop over the array.  

while($it->next())
{                       
	// get the current row in the array as
	// $temp
	
	$temp = &$it->current();
	
	// this will separately print the two
	// arrays in $array.         
	
	print_r($temp);
}

The comments are pretty straightforward, but to really understand what's going on, you should scroll back up and read the description of the functions again. Now, as described above, the 'next' function tells the iterator to go to the next row and returns a boolean saying whether that next row exists. So, here is a little table that describes what's going on:

status        | what '_index' becomes | what 'hasNext' returned | what 'next' returned | what 'current' returned
instantiated  | -1                    |                         |                      |
loop 1        | 0                     | TRUE                    | TRUE                 | array('name' => 'Peter', 'age' => 18)
loop 2        | 1                     | TRUE                    | TRUE                 | array('name' => 'David', 'age' => 30)
loop 3        | 1                     | FALSE                   | FALSE                |

So, you'll be wondering why there's a loop 3. There isn't actually a loop 3, but 'next' is called three times. The reason is simply how while() works: it checks its condition before every pass. If the condition evaluates to TRUE, the loop body runs; if it evaluates to FALSE, the loop stops. This all means that 'next' is called a third time, returns FALSE, and the loop stops without running the body again.
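
To watch that table come out of real code, here is a minimal sketch that records what each call to 'next' does. It is written in PHP5-and-later syntax so that it runs on a current interpreter, and TraceArrayIterator is a made-up name for a stripped-down stand-in, not one of the classes from this article.

```php
<?php
// Hypothetical stand-in: follows the same next()/hasNext()/current()
// protocol described above, but is not the article's FAArrayIterator.
class TraceArrayIterator
{
    private $rows;
    private $index = -1; // starts before the first row

    public function __construct(array $rows)
    {
        $this->rows = array_values($rows);
    }

    public function hasNext()
    {
        return count($this->rows) > $this->index + 1;
    }

    public function next()
    {
        $ret = $this->hasNext();
        if ($ret) {
            $this->index++;
        }
        return $ret;
    }

    public function key()
    {
        return $this->index;
    }

    public function current()
    {
        return $this->rows[$this->index];
    }
}

$it = new TraceArrayIterator(array(
    array('name' => 'Peter', 'age' => 18),
    array('name' => 'David', 'age' => 30),
));

// Call next() until it returns FALSE, recording every call,
// including the final one that ends the loop.
$trace = array();
do {
    $moved = $it->next();
    $trace[] = array('index' => $it->key(), 'next' => $moved);
} while ($moved);

foreach ($trace as $call => $step) {
    printf("call %d: _index=%d next=%s\n",
        $call + 1, $step['index'], $step['next'] ? 'TRUE' : 'FALSE');
}
```

Two rows produce three recorded calls, and on the last call '_index' stays at 1 because 'hasNext' said no.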

On to the code! Here is the basic iterator class. All iterators will in one way or another extend this class. In PHP5, the built-in Iterator is an interface rather than a class.
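
For the PHP5 side of things, here is a small sketch using the built-in ArrayIterator class, which implements PHP5's Iterator interface. Note that the built-in protocol is a little different from the one in this article: it uses rewind(), valid(), current(), key() and next(), and its next() does not return a boolean. The variable names are just for illustration.

```php
<?php
$rows = array(
    array('name' => 'Peter', 'age' => 18),
    array('name' => 'David', 'age' => 30),
);

// ArrayIterator ships with PHP5's SPL extension.
$it = new ArrayIterator($rows);

// Driving the iterator by hand: valid() plays roughly the role that
// the return value of 'next' plays in this article's iterators.
$names = array();
for ($it->rewind(); $it->valid(); $it->next()) {
    $row = $it->current();
    $names[$it->key()] = $row['name'];
}

// foreach makes the same calls for you.
$ages = array();
foreach ($it as $key => $row) {
    $ages[$key] = $row['age'];
}

print_r($names);
print_r($ages);
```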

class FAIterator {
	function &current() {
		assert(FALSE);
	}
	
	function hasNext() {
		assert(FALSE);
	}
	
	function key() {
		assert(FALSE);
	}
	
	function next() {
		assert(FALSE);
	}
	
	function reset() {
		assert(FALSE);
	}
}

Obviously, this class is never meant to be used directly, hence the assert(FALSE);. So, first I will show you what the code in the ArrayIterator looks like. Then I will go on to explain ProxyIterators.

class FAArrayIterator extends FAIterator {
	var $_array;
	var $_index = -1;
	var $_size = 0;
	
	function FAArrayIterator(&$array) {
		assert(is_array($array));
		
		foreach ($array as $key => $value) {
			$this->_array[] = &$array[$key];
			$this->_size++;
		}
		
		$this->reset();
	}
	
	function &current() {
		if (!isset($this->_array[$this->_index])) {
			trigger_error("Array out of bounds", E_USER_ERROR);
		}
		
		return $this->_array[$this->key()];
	}
	
	function hasNext() {
		return ($this->_size > $this->_index + 1);
	}
	
	function key() {
		return $this->_index;
	}
	
	function next() {
		if ($ret = $this->hasNext()) {
			$this->_index++;
		}
		
		return $ret;
	}
	
	function reset() {
		$this->_index = -1;
		return TRUE;
	}
}

Now, there is a foreach statement in that array iterator's constructor, and you might be thinking: "Peter, you said that we would avoid extra loops!" Well, I was mainly talking about the use of iterators to get database result sets, which have no extra loops. Don't worry, I will show them too.

So, the array iterator seems simple enough, now onto the really cool iterator: the ProxyIterator. Simply put, the ProxyIterator iterates through an existing iterator. That's it! What makes the ProxyIterator so cool is that you can stack classes that extend the proxy iterator. I will give an example later, but first the code:

class FAProxyIterator extends FAIterator {
	var $_it;

	function FAProxyIterator(&$it) {
		assert(is_a($it, 'FAIterator'));
		$this->_it = &$it;
	}

	function &current() {
		return $this->_it->current();
	}

	function hasNext() {
		return $this->_it->hasNext();
	}

	function key() {
		return $this->_it->key();
	}

	function next() {
		return $this->_it->next();
	}

	function reset() {
		return $this->_it->reset();
	}
}

If you look at the code, all the ProxyIterator does is call the methods of the iterator passed to it!

Now, let's say that we have an array of names and ages (the same array as in the example above). We want to make the names bold and make the ages italicized. So, let's use iterators! (NOTE: this is simply an example of stackable iterators. As a real world scenario, this is a terrible usage of iterators.)

class MakeNamesBold extends FAProxyIterator
{
	function MakeNamesBold(&$it)
	{
		parent::FAProxyIterator($it);
	}
	function &current()
	{
		// get the current row from the wrapped iterator

		$temp = &parent::current();
		
		$temp['name'] = '<strong>'. $temp['name'] .'</strong>';
		
		return $temp;
	}
}

class MakeAgesItalic extends FAProxyIterator
{
	function MakeAgesItalic(&$it)
	{
		parent::FAProxyIterator($it);
	}
	function &current()
	{
		// get the current row from the wrapped iterator

		$temp = &parent::current();
	
		$temp['age'] = '<em>'. $temp['age'] .'</em>';
		
		// return the formatted row

		return $temp;
	}
}

$array = array(
		array('name' => 'Peter', 'age' => 18),
		array('name' => 'David', 'age' => 30),
		); 
				
$it = &new FAArrayIterator($array);
$it = &new MakeNamesBold($it);
$it = &new MakeAgesItalic($it); 

while($it->next())
{                       
	$temp = &$it->current();
	print_r($temp);
} 

First, we create the array and then pass it into the ArrayIterator. Then we pass that iterator into the MakeNamesBold iterator. Finally, we pass that iterator into the MakeAgesItalic iterator. Now, what goes on in each loop of the while() is very interesting:

ArrayIterator::current is called. It returns the vanilla array with a person's name and age.
MakeNamesBold::current is called. It makes the person's name bold.
MakeAgesItalic::current is called. It makes the person's age italic.

Now you might be thinking that those functions are called in the wrong order! MakeAgesItalic is now the iterator being used in the while() loop, so its current function will be called first. That is partly true: MakeAgesItalic::current is called first, but the first thing it does is call parent::current. That is FAProxyIterator::current, which forwards the call to the wrapped iterator. And guess what? The wrapped iterator is MakeNamesBold! MakeNamesBold::current does the same thing, and its wrapped iterator is the FAArrayIterator. So the calls start from the outside, but the work finishes from the inside out: the plain row comes back first, then the name is made bold, then the age is made italic. This is the beauty of iterators: they are stackable.
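
For comparison, the same stacking can be sketched with PHP5's built-in classes. This is only a sketch under a few assumptions: BoldNames and ItalicAges are made-up names, the built-in IteratorIterator plays the role of the ProxyIterator, and the #[\ReturnTypeWillChange] lines are only there to keep newer PHP versions from printing deprecation notices (older versions read them as comments).

```php
<?php
// Hypothetical decorator: wraps any iterator and bolds each row's name.
class BoldNames extends IteratorIterator
{
    #[\ReturnTypeWillChange]
    public function current()
    {
        // parent::current() hands back the wrapped iterator's row
        $row = parent::current();
        $row['name'] = '<strong>' . $row['name'] . '</strong>';
        return $row;
    }
}

// Hypothetical decorator: wraps any iterator and italicizes each row's age.
class ItalicAges extends IteratorIterator
{
    #[\ReturnTypeWillChange]
    public function current()
    {
        $row = parent::current();
        $row['age'] = '<em>' . $row['age'] . '</em>';
        return $row;
    }
}

$rows = array(
    array('name' => 'Peter', 'age' => 18),
    array('name' => 'David', 'age' => 30),
);

// Stack them: the array iterator is innermost, ItalicAges outermost.
$it = new ItalicAges(new BoldNames(new ArrayIterator($rows)));

$out = array();
foreach ($it as $row) {
    $out[] = $row['name'] . ' (' . $row['age'] . ')';
}

print_r($out);
```

Each row passes through both decorators in a single pass of the foreach, and the original $rows array is left untouched because each current() works on its own copy of the row.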

So all this time I have been talking about database result set iterators. These are most definitely the best use of iterators, mainly because the ArrayIterator does an extra loop in its constructor, which kind of goes against the whole usefulness of iterators. (You might be wondering why I don't just use array_values in the ArrayIterator constructor: the point is to maintain references to each row ;) )
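
To make the point about references concrete, here is a small sketch comparing array_values with the reference-keeping loop from the ArrayIterator constructor. The variable names are mine, and the string keys are only there to show that both approaches reindex the rows from 0.

```php
<?php
$array = array(
    'first'  => array('name' => 'Peter', 'age' => 18),
    'second' => array('name' => 'David', 'age' => 30),
);

// array_values() reindexes the rows, but each row is a copy.
$copied = array_values($array);

// The constructor's approach: reindex, but keep a reference to each row.
$linked = array();
foreach ($array as $key => $value) {
    $linked[] = &$array[$key];
}

// Change the original array after both lists have been built.
$array['first']['age'] = 19;

echo $copied[0]['age'], "\n"; // the copy does not see the change
echo $linked[0]['age'], "\n"; // the reference does
```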

The real power is with the ProxyIterator, but it requires that an iterator be passed to it. So far, the only other iterator you have is an ArrayIterator, so now I give you the MysqlResultIterator!

class MysqlResultIterator extends FAIterator {
	var $id;
	var $mode;
	var $row = -1;
	var $current;
	var $size;

	function MysqlResultIterator($id, $mode) {
		$this->id = $id;
		$this->mode = $mode;
		$this->size = mysql_num_rows($this->id);
	}

	function &current() {
		return $this->current;
	}

	function hasNext() {
		return ($this->row + 1 < $this->size) ? TRUE : FALSE;
	}

	function key() {
		return $this->row;
	}

	function next() {
		$ret = $this->hasNext();

		if ($ret) {
			$this->current = mysql_fetch_array($this->id, $this->mode);
			$this->row++;
		}

		return $ret;
	}

	function free() {
		return mysql_free_result($this->id);
	}

	function numRows() {
		return $this->size;
	}

	function reset() {
		if ($this->row >= 0)
			mysql_data_seek($this->id, 0);

		$this->row = -1;

		return TRUE;
	}
}

To this iterator, you pass the result of a mysql_query and the mode (e.g.: MYSQL_ASSOC), and then you can use it as if it were a normal iterator. You can stack it with other ProxyIterators, etc!

Where are iterators useful? I don't remember if I covered this, but iterators are generally useful in: database abstraction layers, formatting database result sets and templating.

EDIT: using iterators to format data ISN'T actually the best use of them, as was suggested to me at OneCommune. The Decorator pattern is more appropriate for this.

Comments

Apart from the ability to stack them, I can't see much advantage using these. All the extra classes can make your code messy too.

    by Lewis on Aug 28, 2006 @ 4:22pm