Archives for 2007
Almost finished exams, and…
I'm almost finished exams (for this semester, one to go) and I have been mulling over the idea of making my own programming language. I've always liked this idea; however, I have never been able to do it, nor did I know any languages that would have made the task of interpreting it fast.
I took one computer science course this semester; I got a decent overview and some useful experience working with C. Believe it or not, I actually like C! Unfortunately, I do everything by the command line and don't have any cool debugger for it so when bus errors or segmentation faults happen they are a pain in the ass to find. Oh well. On to what I really want to talk about: the idea behind the language I currently want to make.
First, it is not object-oriented. It is a lot like javascript in terms of javascript's functional nature, but then it still has a bunch of procedural language constructs built in. Second, you can make object-like entities in it, and I will show how. Third, the core concept in the language is procedures; encapsulated code that is run every time it is called on. A function is simply a procedure with arguments. Here is a steady build up of the syntax. You will notice little things that I've taken from the languages I love.
a = 10 // simple variable definition
a => 10 // simple procedure definition
add_two(x) => x+2 // simple function definition
// more complex function
factorial(x) => do:
if x > 1:
return x * factorial(x-1)
;
return 1
;
Okay, so what is the point of the "=>" symbol? That symbol signifies a implicitly returning procedure. Any code in a procedure is executed every time it is called on. If we had a variable that was a procedure to a function call then every time you use the variable that function would be called. Contrast this with having a variable equal a function call; every time you use that variable you will be using the result of the original function call. This distinction is amazingly important and lets one transparently do some really cool things.
Knowing that the procedure is an implicit return, the factorial function above makes sense. The implicit return is a bit disingenuous, it might be better put that a procedure can only have one expression in it. An expression could be doing some math, which has a return value (and so it implicitly returns) or it could be a language construct such as "do" which allows us to encapsulate further expressions. Here are four ways to write the same function:
// Syntax for quickly typed simple functions and more involved functions by
// returning a language construct
add_two(x) => x+2 // simple named function, implicit return
// Named function that uses the "do" language construct to allow for more
// expressions/statements.
add_two(x) => do:
x = x+2
return x
;
// example of lambda (anonymous/unnamed) functions
add_two = (x) => x+2
// example of a lambda within a procedure
add_two => (x) => x+2
Okay, so I am making slow progress showing why the procedure approach is so cool. The last example of the add_two function above hints at it though. When add_two is a procedure to a function then every time add_two is used it will return a function. When using functions only by their name, we are using the function directly. So, in one hand we have something that returns a function, and in the other, something that is a function. How could this be useful? Well, lets say that we wanted to make an object system. My current plans for this language don't plan for that (even though I recognize its benefits) so the most we can do is approximate one. Associative arrays (or dictionaries in Python) will be how we approach this. We know that every time we instantiate a class into an object it needs to have its own state. I will start with a simple Object function.
/* Object from dictionary. Do not confuse this with real objects in OO languages! */
Object(d:dict={}) => do:
this = d
if !("construct" in this):
this["construct"] = () => null
;
return (...) => do:
apply(this["construct"], args, kargs)
return this
;
;
What can be said about the above function? It accepts a dictionary, if that dictionary doesn't have a "construct" key in it then it will create one pointing to an anonymous function. Finally, a function will be returned that accepts any number of arguments, and when that function is called those arguments will be applied to the "construct" function. The idea of "this" is faked by making the "this" variable equal to the dictionary that we are passing off as an object. Here is how we would use this function...
SomeClass = Object({
/* ... */
});
Unfortunately, there is a very subtle problem with SubClass and it is because of the nature of the scoping rules in Object. Every time SomeClass is "instantiated" it will return the same dictionary! More meaningfully, every instance of SomeClass will act on the same dictionary, meaning SomeClass is a singleton of sorts. This is clearly a big problem: instances of classes should have their own state but also have the same state structure as other instances of the same class. Here is where the power of procedures comes in.
Class(d) => (...) => apply(Object(d), args, kargs)
Singleton = Object
// How to use:
Stateful = Class({
/* ... */
})
Registry = Singleton({
/* ... */
})
// note: assuming the syntax allows it, the following should also work:
Class(d) => => Object(d)
There you have it. Using the Object function we now have two simple object types: classes and singletons! The singleton works because all uses of Singleton act on the same result of calling the Object function. The class works because it curries the Object function. Every time Class is called it acts as if Object was called directly: it accepts the same arguments and returns a function that accepts any number of arguments (the constructor). The key is that Object is called only when this mock constructor is called. At that point, a singleton class is built, instantiated, and returned all at once. Pretty amazing!
There are a few other things that I would like to say about the things I want to have working in this language, but that's just it--this language is still only an idea. This is a project that I am extremely interested in pursuing and hopefully I will be able to get something along the above lines working eventually.
Busy Getting Ready For the Frosh
So, I have been training for the last week and a half as a residence advisor at the university that I go to. I have been really busy and thus have not posted. Unfortunately I will be really busy for the next two weeks so don't expect anything from me in terms of blog posts.
12 Things You Should Dislike About PHP
In my last post I outlined my 15 favorite features of PHP that many people tend not to know about. To contrast my previous post I will detail several things I dislike most in PHP.
Note: the intention is not to start a flame war between PHP and any other language and I expect that the comments will reflect this.
1. Naming and Return Value Conventions
Naming Conventions
The first part of this--naming conventions--is obvious to anyone who has used PHP. PHP lacks a specific naming convention. This can easily be demonstrated with many of the string functions. For example, why is the function to find the first position of one string in another called strpos() when the function to repeat a string is called str_repeat()?
Another clear example of this is with PHP's built in classes. Why is the SQLite class called "SQLiteDatabase" whereas the MySQLi class is called "mysqli"? Anothing minor example with PHP's magic functions: __toString() and __set_state().
This might seem like excessive nitpicking; I think it is a serious issue that affects expectations. When programming, productivity is increased when ones expectations about a function or class name are fulfilled given the prior experience of using similar functions or classes.
Return Value Conventions
This was something that was annoying me the other day: parse_str(). The parse string function takes in a URL encoded string and then extracts the variables within into the current scope. As an option, it can take a reference to an array and put the variables into that by indexing them associatively.
Why would this function ever extract these variables? In fact why does it even take an array by reference? I knew of this function but hadn't read the documentation because I assumed it worked in a similar way to parse_url(). It doesn't. It doesn't even return an array.
This might seem like more of a rant than anything, but then lets get back to expectations. Consider that GET variables can have periods, spaces, hyphens, and more in them. PHP doesn't support any of these characters in their variable names so how is the function acting like extract() useful for these cases? Wouldn't it be more intuitive if it acted more like the similarly named parse_url()?
2. Functions
PHP's functions are not objects. More to the point, PHP has no first class functions. Functions cannot be passed by reference and they cannot be self-calling (this is different than a recursive function, which calls itself from within, whereas self-calling is from the outside).
Functions also don't seem to obey their scope in a predictable way. For example, functions can be nested indefinitely in PHP but that doesn't mean that they act like sub functions! Instead it means that any sub functions become accessible in the global scope once the parent function is called. This does have a practical application where different implementations of functions is needed for different PHP versions; however, that is what the if() statement is for. Simply put: nesting functions does not encapsulate them.
PHP has make-believe lambdas using create_function(). Once created, these functions are available in the global namespace and cannot be self-calling. For example: create_function(...)(); does not work because create_function returns the string name of the function it just created.
Consider the implication of returning a string function name from create_function() when it comes to meta-programming in PHP:
class TestLambda {
public $test;
public function __construct() {
$this->test = create_function('$str', 'echo $str;');
}
}
$l = new TestLambda;
// if functions were objects this would work; unfortunately that is not the case.
$l->test("hello world");
// instead this ugliness needs to be done:
call_user_func_array(array($l, $l->test), array("hello world"));
3. Method chaining doesn't work as expected
In my previous post I championed the use of method chaining--not necessarily with the best example--but showed how to do it nonetheless. Unfortunately there is a critical element of method chaining missing from PHP. It doesn't affect the methods that return $this but the constructor and class instantiation itself. The following code is from my previous post...
class Foo {
public function bar() {
return $this;
}
public function baz() {
return $this;
}
}
As is obvious, bar() and baz() can be chained indefinitely. PHP can't do the following even though a class constructor implicitly returns the instance of an object.
(new Foo)->bar()->baz()->...
4. Magic function __toString() doesn't work as expected
Note: This is a solved bug on php 5.2; however, I am stuck using php 5.1 on MAMP and this is an annoyance for me.
The __toString() method provides a nifty alternative to the cryptic "Object id #1". One would expect that coercing an instance of an object into a string would thus call __toString()--it doesn't.
class Foo {
public function __toString() {
return "instance of class Foo";
}
}
$foo = new Foo;
echo $foo; // output: instance of class Foo
$str = (string)$foo;
echo $str; // output: Object id #1 WTF!?
5. Static Methods and Scope Resolution (i.e. No Late Static Binding)
This was a problem that bit the Zend team in the ass early on in their development of the ORM in the Zend Framework. The problem happens when there is a static public method in a parent class that is called by an extending class. One would expect that the function would know it is being called from within the child class (as is the case with instance methods); however, that is not the case. Consider the following example:
class StaticTest {
static public function moo() {
echo __CLASS__; // also redundant given the get_class() function, but whatever.
echo get_class();
}
}
class ChildStaticTest extends StaticTest {
// ...
}
ChildStaticTest::moo();
// output: StaticTest StaticTest < -- wrong class names!
As a bug in PHP this has many annoying implications. It means that--for example--in the Zend Framework they could not do ModelName::find() for quick access to model functions without a monstrous hack involving debug_backtrace(). They ended up changing their finder syntax.
6. __autoload()
I mentioned this function as being a useful in my previous post. I also said not to use it and instead use spl_autoload_register(). To reiterate the point: __autoload() belongs to the global namespace. This means if two or more scripts share files and define an __autoload() function of their own then there will be a huge problem. This problem also sheds light on the fact that PHP does not support redefining; however, that is a limitation and not a problem of PHP.
7. Magic Quotes and Register Globals
To be honest I don't even want to talk about these. It's been rehashed so many times that I think most PHPers generally get the point.
8. Cleaning up after Exceptions
PHP5 introduces Exceptions. This is one of my favorite improvements present in PHP5. Unfortunately there is one gotcha that I hacked a solution for in a previous post. Exceptions can be thrown and caught. As with all classes in PHP5, destructors are generally called when an object leaves scope. Unfortunately, an uncaught exception will never be destroyed! Why is this an issue? PHP has no finally clause for exceptions! The following is not possible in PHP:
try {
throw new Exception;
} catch(Exception $e) {
echo 'caught';
} finally { // whoops, parse error here, missing feature
echo 'clean up';
}
That means any dependable clean up / tear down action must happen in the exception destructor, which won't be called unless the exception is caught! (Note: a hack around this is to explicitly call the __destruct() method of an exception from within its __toString() method.)
Note: if you didn't catch it above, PHP DOES NOT have a finally statement. This point is to show that because of this, destructors are the only way to clean up after an exception and even they are not 100% dependable.
9. Return values in write context
This is a huge WTF. There might be some logic behind it but for everyday programming it can be incredibly annoying. The following code will cause a fatal error:
function return_empty_string() {
return '';
}
empty(return_empty_string()); // PHP Fatal error: Can't use function return value in write context
10. Safe Mode
No. Just No. Safe mode is not safe. Please read the talks by Ilia Alshanetsky. Every hosting company should stop using PHP in safe mode and simply use open_basedir.
11. func_get_args()
This one has always been confusing to me and I will show why with some example code. PHP's func_get_args() cannot be passed as an argument to a function because of scoping issues. Consider that the following in PHP is possible:
strtoupper($string = "hello world"); echo $string; // output: hello world
Notice that $string is available in this scope and hasn't been affected by strtoupper() in any way. Unfortunately the following is not possible:
function test() {
return implode("", func_get_args()); // func_get_args can't be passed as an argument
}
test("hello", "world");
12. Return values again
PHP acts inconsistently with return values of functions. If a function were to return a string then any string functions could easily be applied to it. If a function returns an object then any of the objects instance methods can be called on it. However, if a function returns an array, then array item syntax (using square brackets) cannot be applied directly to it!
function test_with_array() {
return array('a', 'b', 'c');
}
function test_with_array_object() {
return new ArrayObject(array('a', 'b', 'c'));
}
// both fail with an unexpected parse error
echo test_with_array()[0];
echo test_with_array_object()[0];
// works
echo test_with_array_object()->offsetGet(0); // output: a
Although the tone of this post was negative it must be taken with a grain of salt. As a language PHP is constantly improving and I'm eagerly looking forward to PHP6. In its current state I think PHP is an amazing language--despite the above flaws. One might say its biggest flaw is that its too easy to learn. That argument puts the blame in the wrong hands. Many PHP programmers are unaware of security issues and best practices and as a result the language as a whole is tarnished.
This post was meant to contrast my previous one. Looking at what I think are PHP's unknown heroes and what I see as its flaws is not an accurate way of weighting strengths against weaknesses. There is much more to PHP (such as its libraries, community, etc) than what's on these two lists and I encourage people to discover those things for themselves.
15 Cool Things About PHP That Most People Overlook
Note: I have followed this post up with 12 Things You Should Dislike About PHP.
I saw this article linked to from Digg and was disappointed by the list. Here's what I think are some of the more interesting features of PHP.
1. Reflection API
PHP5's Reflection API can introspect PHP functions and classes and even allow them to be rebuilt (see this demo)! For example, PHP provides a call_user_func[_array]() function to call functions/methods given a string representation of their name; however, PHP doesn't provide such a function for classes.
class TestCallFunc {
public function __construct($a, $b) {
echo $a .' '. $b;
}
}
//call_user_func_array(array('TestCallFunc', '__construct'), array('hello', 'sir')); // will not work
$class = new ReflectionClass('TestCallFunc');
$class->newInstanceArgs(array('hello', 'sir')); // works
// the above can come in handy for certain cases where you want to unpack an array of values,
// without knowing how many elements are in that array to a classes constructor.
// If one knows the number of arguments of a constructor, or the constructor does not take a
// variable number of arguments, then this will suffice:
$class = 'TestCallFunc';
new $class('hello', 'sir');
Unfortunately the above example hardly does the Reflection API justice. Consider the following use case. Since the Reflection API can figure out almost everything about PHP code it means that it could be used as a strategy for dependency injection. By using smart type hinting within function definitions (see below) and the reflection classes, function dependencies could be figured out on the fly and be satisfied automatically.
2. Ticks
PHP has an unusual statement called declare. At the moment the statement only supports ticks (functions that are executed after every nth process/statement. The implication of tick handlers for timing and testing applications is huge. Imagine being able to chart how long specific processes take as they happen without the burden of having to manually place timing functions within the code! Another possibility would be to pass state to a tick function for analysis. In some cases this would be immensely useful for tracking down bugs or for simply seeing how data is changed throughout the different processes within a program.
3. list(), extract(), and compact()
list() is a function that PHPers learn early and forget about almost as quickly. list() allows for easy unpacking of numeric arrays, eg: array('a', 'b', 'c', ...). List can be used cunningly for simple tasks such as switching the values of two variables:
$a = 'a'; $b = 'b'; list($a, $b) = array($b, $a); echo $a; // output: b echo $b; // output: a
List also has a practical application for any function that returns an array. Instead of having to deal with the results as $result[0], $result[1], etc one can use the list() construct to set the values of the array to variables without the hassle of having to do each variable setting manually. Essentially, list() is PHP's way of doing multiple variable assignment.
Unpacking associative arrays is also easily accomplished with extract(). Consider the following simple example:
extract(array('a' => 'a', 'b' => 'b'));
echo $a; // a
echo $b; // b
extract() comes in handy when transferring scope variables around from one place to another is needed. By packing scope variables into an associative array using get_defined_vars() or compact() and then unpacking them somewhere else using extract(), scope variables can be transparently transported. This experiment of mine examples this use of extract().
Another use for extract() is for template engines. Many simple template engines allow their variables to be set in one place using an associative array and accessed as normal variables within the templates. This works simply by extracting that associative array from within the templates.
A special note should is given to compact(), which was mentioned in the comments by Tarique Sani and does the opposite of extract(). It acts in a similar way to get_defined_vars(); however, it gives the programmer more control by allowing them to specify which local variables to pack into an associative array.
4. PHP5 SPL
The PHP5 Standard PHP Library is full of gems. Two of my favorites include the Iterator and IteratorAggregate interfaces.
PHP5's Iterators work transparently with the foreach() construct as if they were arrays. Iterators come in handy when--for example--formatting needs to be applied to a set of data. Be it a MySQL result set or a multidimensional array, Iterators allow for the accessing, formatting it, and return of data all at the same time! The implication is that data is only fetched when it is needed and that any intermediate steps between retrieval and return of data happen transparently.
Sometimes making a class implement Iterator to allow a class to be iterated is bulky. This problem is solved by implementing the IteratorAggregate interface. It works is by defining a getIterator() function within a class that returns an iterator. If you're a Python user, getIterator() is tantamount __iter__().
Two other nifty interfaces in the SPL are ArrayAccess and Countable. Consider the following code:
class Foo implements ArrayAccess, Countable {
private $vars = array();
// ArrayAccess
public function offsetGet($key) {
$ret = NULL;
if($this->offsetExists($key))
$ret = $this->vars[$key];
return $ret;
}
public function offsetSet($key, $val) {
$this->vars[$key] = $val;
}
public function offsetUnset($key) {
unset($this->vars[$key]);
}
public function offsetExists($key) {
return isset($this->vars[$key]);
}
// Countable
public function count() {
return count($this->vars);
}
}
$foo = new Foo;
$foo['bar'] = 'baz'; // calls Foo::offsetSet()
echo $foo['bar']; // output: baz, calls Foo::offsetGet()
echo count($foo); // output: 1, calls Foo::count()
As is clear, the Foo class acts in many respects like an array. This functionality is desirable when the convenience of array-like access to a class and the power of class methods are used to enhance productivity and functionality. (As a side note, check out ArrayObject.)
When do any of the SPL classes come in handy? Consider the following object-oriented interface to a MySQL database: iterators are used to fetch the results only when they are needed; Countable allows calling of the native count() function on that query result instead of custom and unintuitive numRows() function; and ArrayAccess is used to represent each row from the database where row objects act like arrays for convenience but can implement more advanced business logic through extension.
The SPL classes allow programmers to interact with their classes more natively and intuitively than ever before. Implementing the different interfaces and classes from the SPL into code allows for transparent interaction with different data types in the same way.
5. __autoload()
__autoload is called transparently by PHP when a class that doesn't exist is instantiated. Autoload will take in a class name and then go and try and find the file that the class is in, include it, and let the program continue running its course.
Although this is a handy function I suggest that it not be implemented directly. The reason why I say don't use __autoload() is because it has no namespace. Simply put, if two (or more) applications are running at the same time and share common files that both happen to define the __autoload() function then there will be errors. Instead, please check out spl_autoload() and spl_autoload_register().
6. Type Hinting
Type hinting is used to make sure that the information being passed into specific functions complies with an expected data type. Unfortunately type hinting only works for objects and arrays, but it's still extremely nifty! Note: as the following example shows, PHP will error when an unexpected data format is passed to the function. The following is an example of type hinting in action for objects and for arrays:
function test(array $a, stdclass $b) {
echo 'yay';
}
test(array(), new stdclass); // works
test('a', 'b'); // results in an error
7. Abstract Classes and Iterfaces
When classes should follow specific contracts interfaces and abstract classes should be used. Interfaces define the required functions and function arguments for all implementing classes. Abstract classes are similar in that respect; however, they also allow for functions to be regularly defined (with bodies).
Note: interfaces nor abstract classes can be instantiated directly. Interfaces need to be implemented (PHP supports multiple interfaces for classes) and abstract classes must be extended.
Obvious use cases for interfaces and abstract classes are for abstraction layers where its useful to define a single interface in one place and let extending classes implement the data-specific logic on their own. Simple examples to this would the database abstraction layers, such as ADOdb.
8. "static" keyword
This is more for php4 people and is really a hack. The most common use of the static keyword from within a function is to implement singleton factories for classes. With improvements in PHPs object-oriented model in PHP5, the static keyword is not as useful because classes allow for static methods and proper function visibilities.
function increment() {
static $count;
if($count === NULL) {
$count = 0;
}
return ++$count;
}
echo increment(); // 1
echo increment(); // 2
echo increment(); // 3
// ...
9. === and !==
These are extremely useful and most PHPers can't tell the difference between == and ===. Essentially, == looks for equality, and by that PHP will generally try to coerce data into similar formats, eg: 1 == '1' (true), whereas === looks for identity: 1 === '1' (false). The usefulness of these operators should be immediately recognized for common functions such as strpos(). Since zero in PHP is analogous to FALSE it means that without this operator there would be no way to tell from the result of strpos() if something is at the beginning of a string or if strpos() failed to find anything. Obviously this has many applications elsewhere where returning zero is not equivalent to FALSE.
10. Variable Assignment from within Conditional Statements
This is another fun trick that can also be a pain in the ass. Mistakingly doing if($result = '1') instead of if($result == '1') often happens and is a difficult bug to trace. Interestingly, this annoyance can also come in handy! Try this: check if a strpos() failed or not and get the proper result of it all from within an if() statement. If you don't know how it would work then take a look at the following code:
if(FALSE !== ($pos = strpos('aaaa', 'a'))) {
echo $pos; // echo's 0
}
11. PHP's Magic Functions
They definitely don't compare to what Python offers but they can be pretty nifty. Couple these functions with the SPL and some pretty cool classes can be made.
__call()
The __call() magic function allows for a class to call functions that it doesn't necessarily have. For example, if there are a lot of redundant functions in a class that work in very similar ways but still require a slight amount of custom logic then maybe __call() would allow for a better implementation. Consider an associative array where the keys represent a white list of functions that __call should work with and their values would somehow allow __call to implement the slight differences between the functions.
If that example seems unlikely then let me detail two ways that I've used __call.
The first way was implementing filters for different MySQL data types. Data for VARCHAR, INT, CHAR, TEXT, et al. can all be handled in very similar ways but require slightly different type casts and regular expressions checks. Well, storing the unique checks in an array with function names for each data type dramatically reduced the amount of redundant code I had to write and made the system more maintainable for when I needed to add more data types to the list.
The second way that I implemented __call() was for credit card processing APIs. I recently made a library to interact with the Authorize.net API and it was expected of me that in the event that the processor was changed that the only thing that would need to be changed is the call to instantiate the appropriate class. If anyone has worked with payment processors before then they will know that the data they accept can be crytpic (eg: x_login, x_customer_cc). To make things simple, I used call to map an associative array of keys (function names) to the processor-specific variables that they modified. Instead of setting an x_login field, I could do: $AIM->login().
__get(), __set(), __isset(), __unset()
These act similarly to ArrayAccesses offsetGet, offsetSet, etc etc, except that these functions allow you to access non-existent instance variables.
__toString()
What happens when you try to echo a class instance? Normally the output will be the unhelpful Object id #1 (or another number). This can be changed though! Changing what is returned from an echo is as simple as defining a custom __toString() method and giving it a custom return value.
12. __halt_compiler(): Halt the Compiler!
Ever wanted to make the installer for a program and the entire program to be installed fit into one file? This can actually be done by using __halt_compiler()! Check it out and make sure to look up how applications such as Phar and FUDForum take advantage of it.
I was asked how this was different from exit() so I will explain. Imagine having a PHP installer script at the top of a file and then having a compressed version of, for example, a forum (in the case of FUDForum) in the rest of the file. When __halt_compiler() is called, PHP literally stops parsing and compiling past that point. That means that the installer can open itself up, look for everything after the __halt_compiler() call, unzip it, and then store it on the server.
13. Variable Composition
Just see this experiment for more interesting implementations, but otherwise the following is possible:
${'a' . 'b'} = 'c';
echo $ab; // c
14. Chaining Method Calls
If you're a fan of jQuery then you will love Javascript's ability to chain methods. PHP5 can also do this! For example: (take note of return $this)
class Foo {
public function bar() {
return $this;
}
public function baz() {
return $this;
}
}
$foo = new Foo;
$foo->bar()->baz()->bar()->.... // these two functions can continue to be chained
The above application doesn't show any practical use for chaining so I encourage people to look at how the CodeIgniter PHP framework uses method chaining for database query building.
15. preg_split
Have you ever use explode() and then end up looping through the results and picking out the empty ones? Isn't that annoying? By using preg_split this tedious process can be avoided:
// instead of...
$str = "a,b,,c,d"
explode(",", $str); // array('a','b','','c','d')
// do...
preg_split("~,~", $str, -1, PREG_SPLIT_NO_EMPTY);
// return: array('a','b','c','d')
The above example might not seem all that useful so consider bb code parsers. One of the major problems with parsing bb code is that most of the bbcode to html translators don't fix improper closing tags, or even check for them! Generally speaking, most bb code parsers will to a flat preg_replace for each of the tags.
A better approach to parsing bb code is to split the text up into parts that are text and parts that are bbcode-tag-like, eg: ~[([^]]*)]~. By doing this one can go through the split up results and parse the bbcode using a stack. This allows for both proper checking of bbcode opening and closing tags along with the ability to implement a generalized solution so that adding bbcodes isn't about making a new preg_replace but rather making a class/function that knows how to correctly handle the opening and closing tags.
In one of my previous posts I use the above mentioned technique to parse HTML for special XML-like template tags.
Download PyrofORM
If you're interested in seeing the inner workings of PyrofORM then you can download it here.
