DevSteve: PHP and XSL Part 2

This is the second entry in a three-part series on the advantages of using XML and XSL to generate a web page. Part one focuses on the downsides of echoing HTML through PHP and how an XML/XSL driven system overcomes these shortcomings. Part two focuses on how PHP uses DOMDocument and the XSLTProcessor to generate XML documents and convert the XML into XHTML with an XSL stylesheet. Part three will introduce the DOMi object, a purpose-built class that is designed to simplify and speed up the process of building XSL driven websites.

Part 2 - Encouraging DOMDocument

In part one of this series, I covered several shortcomings of the traditional system of using the PHP echo function to display page contents to the user. I highlighted three main areas:

Code cleanliness - issues with inconsistent whitespace, faulty syntax highlighting, and the need to reconcile two different code styles or risk losing proper indentation.
Data organization - PHP has no clean methods for easily navigating complex data structures, and thus issues arise when trying to display their contents.
Non-headless system - by echoing the display throughout the execution of the script, it is difficult to fully separate the data from the display, which reduces flexibility of design.

All three of these issues are solved in one fell swoop by switching over to an XSL driven system, and this entry in this series is going to explain how to use DOMDocument and the XSLTProcessor to do this.

What is DOMDocument?

DOMDocument is a class built into PHP that is used to manipulate a document structured according to the Document Object Model. The Document Object Model (DOM) is a standard way of representing a markup document as a tree of nodes, and it is used primarily for web-based applications. Anyone familiar with HTML or XML is already familiar with the structure of DOM. Anyone not familiar with the structure of DOM should go read up on it before proceeding, as I will assume you are already familiar with it.

For someone with knowledge of DOM, using DOMDocument will prove to be easy. Just as a DOM structure contains a document, nodes, attributes and text, so does a DOMDocument contain a DOMDocument, DOMNode, DOMAttr and DOMText. The following code snippet is a very basic DOMDocument being built with a single root node that contains 3 child nodes.

<?php

$Dom = new DOMDocument('1.0', 'UTF-8');

$Root = $Dom->createElement('root');
$Dom->appendChild($Root);

$Root->appendChild($Dom->createElement('child', 'first'));
$Root->appendChild($Dom->createElement('child', 'second'));
$Root->appendChild($Dom->createElement('child', 'third'));

?>

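Serializing the document with $Dom->saveXML() will produce the following output (shown here on separate lines for readability; by default the elements are emitted on a single line unless formatOutput is enabled)...

<?xml version="1.0" encoding="UTF-8"?>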
<root>
<child>first</child>
<child>second</child>
<child>third</child>
</root>

As you can see, DOMDocument usage is not very difficult at all. DOMDocument::createElement is used to create a DOMElement, and DOMDocument, DOMElement and DOMNode support the member method appendChild, which attaches a provided DOMElement to its new parent. However, this series of articles isn't meant to teach how to use DOMDocument, so I'll leave that up to you to learn more than what I've explained here.
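For anyone who wants to go one step further, here is a minimal sketch of the other node types mentioned above: an attribute (DOMAttr) and an explicit text node (DOMText). The id attribute and the "second & last" text are made up for illustration and are not part of the original sample.

```php
<?php

$Dom = new DOMDocument('1.0', 'UTF-8');
$Dom->formatOutput = true; // indent the serialized XML for readability

$Root = $Dom->createElement('root');
$Dom->appendChild($Root);

// An element with a value, plus an attribute (stored as a DOMAttr).
// The "id" attribute is an assumption made for this example.
$First = $Dom->createElement('child', 'first');
$First->setAttribute('id', '1');
$Root->appendChild($First);

// createTextNode builds a DOMText; special characters such as "&"
// are escaped when the document is serialized.
$Second = $Dom->createElement('child');
$Second->appendChild($Dom->createTextNode('second & last'));
$Root->appendChild($Second);

// saveXML returns the whole document as a string
$Xml = $Dom->saveXML();
echo $Xml;
```

Passing text through createTextNode rather than as the second argument of createElement is the safer habit when the value may contain an ampersand, since the text node is escaped for you.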

What is the XSLTProcessor?

The XSLTProcessor is a class provided by PHP's XSL extension that uses XSL, XPath and XSLT to convert an XSL stylesheet and an XML tree into XHTML. The object itself is very simple, as it just combines the XSL stylesheet and the XML tree. The more complex, and more powerful, part of the equation lies in the XSL stylesheet itself.

If we were to add the following lines to the previous sample, we would load an XSL stylesheet and output the results of the XSLTProcessor's conversion.

<?php

$Xsl = new XSLTProcessor();

$Stylesheet = new DOMDocument();
$Stylesheet->load('stylesheet.xsl');

$Xsl->importStylesheet($Stylesheet);

echo $Xsl->transformToXml($Dom);

?>

As you can see, the XSLTProcessor needs to be given a DOMDocument stylesheet, and the transformToXml method receives the xml tree as a DOMDocument and returns an XHTML output. For basic work, it really is that simple. The complex part is the new part - the XSL stylesheet.

What is an XSL stylesheet?

An XSL stylesheet is an XML document that is formatted with special xsl nodes. These nodes, such as value-of, template, call-template, for-each, variable, param, etc., are used to dictate commands to an XSLTProcessor.

Here is an example XSL stylesheet that uses the previously created DOM tree to display a list of the child nodes.

<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<ul>
<xsl:for-each select="root/child">
<li>
<xsl:value-of select="." />
</li>
</xsl:for-each>
</ul>
</xsl:template>
</xsl:stylesheet>

If we break down the document and analyze each node, we can easily see what it is doing.

xsl:stylesheet is the root node to encapsulate the entire stylesheet.

xsl:template is comparable to a PHP function. It is a discrete code snippet that can be executed individually. This template is given the match attribute of "/". Templates can be given a name attribute, a match attribute, or both. Naming a template allows it to be called at will, and matching a template allows it to be called when the XML tree contains the specified node. In this case, the specified node is "/", which means the root of the document itself. In other words, this template will be called for any input document, and is essentially our starting point for the display.
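To make the difference between matched and named templates concrete, here is a small sketch. The template name "label", its "text" parameter and the "Item: " prefix are invented for this example. The stylesheet is loaded from a string with loadXML so the snippet is self-contained.

```php
<?php

$Dom = new DOMDocument('1.0', 'UTF-8');
$Dom->loadXML('<root><child>first</child><child>second</child></root>');

// Nowdoc string, so "$text" below is XSLT syntax and not a PHP variable.
$XslSource = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes" />

  <!-- Matched template: runs for the document root -->
  <xsl:template match="/">
    <ul><xsl:apply-templates select="root/child" /></ul>
  </xsl:template>

  <!-- Matched template: runs once for each child node that is applied -->
  <xsl:template match="child">
    <li>
      <xsl:call-template name="label">
        <xsl:with-param name="text" select="." />
      </xsl:call-template>
    </li>
  </xsl:template>

  <!-- Named template (hypothetical): only runs when called explicitly -->
  <xsl:template name="label">
    <xsl:param name="text" />
    <xsl:value-of select="concat('Item: ', $text)" />
  </xsl:template>
</xsl:stylesheet>
XSL;

$Stylesheet = new DOMDocument();
$Stylesheet->loadXML($XslSource);

$Xsl = new XSLTProcessor();
$Xsl->importStylesheet($Stylesheet);

$Html = $Xsl->transformToXml($Dom);
echo $Html;
```

The matched templates fire because of what is in the data, while the named template behaves like a function call with an argument.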

Next, we put up a simple HTML ul node.

Within the ul node, we use xsl:for-each and a provided XPath to set up a for-each loop. The XPath that was provided is "root/child". Without going outside the scope of this article and explaining XPath, I'll just say that it selects all of the child nodes that we created in our earlier document. An XPath is nothing more than a ruleset that is used to match one or more nodes. These rules can be customized in incredible ways to identify a nodelist, and then the contents of the for-each node are repeated once for each item in the nodelist. This overcomes PHP's foreach shortcoming of only scanning one level of one array at a time, and frees us from the need to store derivative or redundant data.
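If you want to experiment with XPath expressions outside of a stylesheet, PHP's DOMXPath class evaluates the same kind of path against a DOMDocument. This is only a sketch for trying out paths, not something the stylesheet approach requires. The second query, with its predicate, is my own example.

```php
<?php

$Dom = new DOMDocument('1.0', 'UTF-8');
$Dom->loadXML('<root><child>first</child><child>second</child><child>third</child></root>');

$XPath = new DOMXPath($Dom);

// The same nodes the stylesheet selects, written as an absolute path
$All = $XPath->query('/root/child');

// A predicate narrows the node list, here to the last child only
$Last = $XPath->query('/root/child[position() = last()]');

$Values = [];
foreach ($All as $Node) {
    $Values[] = $Node->nodeValue;
}

echo implode(', ', $Values), "\n";     // first, second, third
echo $Last->item(0)->nodeValue, "\n";  // third
```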

The final xsl node listed here is the xsl:value-of node, which is used similarly to PHP's echo. It will take the specified value, which in this case is simply ".", meaning the current node's value, and write it to the output. Because the xsl:value-of sits inside a literal li node, each pass through the loop produces an li node containing the value of the current child.

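In the end, the XML tree and the XSL stylesheet will combine in the XSLTProcessor to create the following HTML (the processor also prepends an XML declaration unless the stylesheet's xsl:output settings say otherwise):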

<ul>
<li>first</li>
<li>second</li>
<li>third</li>
</ul>
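Putting the pieces together, here is a self-contained sketch of the whole round trip. Two things differ from the samples above and are my own choices: the stylesheet is loaded from a string with loadXML instead of from stylesheet.xsl, and an xsl:output node is added to leave out the XML declaration.

```php
<?php

// Build the data tree
$Dom = new DOMDocument('1.0', 'UTF-8');
$Root = $Dom->appendChild($Dom->createElement('root'));
foreach (['first', 'second', 'third'] as $Value) {
    $Root->appendChild($Dom->createElement('child', $Value));
}

// The stylesheet from this article, inlined as a string for the example
$XslSource = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes" />
  <xsl:template match="/">
    <ul>
      <xsl:for-each select="root/child">
        <li><xsl:value-of select="." /></li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>
XSL;

$Stylesheet = new DOMDocument();
$Stylesheet->loadXML($XslSource);

// Combine the two in the XSLTProcessor
$Xsl = new XSLTProcessor();
$Xsl->importStylesheet($Stylesheet);

$Html = $Xsl->transformToXml($Dom);
echo $Html;
```

Without indentation enabled, the list comes out on a single line rather than one li per line, which makes no difference to a browser.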

XSL uses the same tag-based markup as HTML, so the two blend together to create attractive, clean code. Since XSL stylesheets are so commonly used with HTML, almost any IDE that supports XSL will blend the syntax highlighting to keep it nicely viewable. Thus, the first problem with HTML echo is eliminated.

Due to the flexibility of Xpath, complex data structures are navigated with ease, allowing a single xsl:for-each node to loop across data anywhere in the structure, regardless of nesting or location. When you can easily scan and interact with data anywhere in the structure, during the display phase, you no longer have any need to store redundant data in the XML. If you want to derive data from within the tree, you can do it immediately with a single call. Thus, the second problem with HTML echo is eliminated.
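As a rough illustration of both points, the sketch below loops over item nodes at different nesting depths with a single xsl:for-each, and derives a total with count() instead of storing it in the XML. The catalog, section, group and item names are invented for this example.

```php
<?php

// Hypothetical data: items sit at different depths inside sections
$Dom = new DOMDocument('1.0', 'UTF-8');
$Dom->loadXML(
    '<catalog>'
    . '<section name="fruit"><item>apple</item><group><item>pear</item></group></section>'
    . '<section name="veg"><item>leek</item></section>'
    . '</catalog>'
);

$XslSource = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes" />
  <xsl:template match="/">
    <div>
      <ul>
        <!-- "//item" matches every item node, however deeply it is nested -->
        <xsl:for-each select="//item">
          <li><xsl:value-of select="." /> (<xsl:value-of select="ancestor::section/@name" />)</li>
        </xsl:for-each>
      </ul>
      <!-- Derived on the fly; the XML never stores a total -->
      <p><xsl:value-of select="count(//item)" /> items</p>
    </div>
  </xsl:template>
</xsl:stylesheet>
XSL;

$Stylesheet = new DOMDocument();
$Stylesheet->loadXML($XslSource);

$Xsl = new XSLTProcessor();
$Xsl->importStylesheet($Stylesheet);

$Html = $Xsl->transformToXml($Dom);
echo $Html;
```

Each list item also looks back up the tree to find the section it belongs to, so the section name never has to be repeated on the item itself.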

Since the display is dictated by an XSL stylesheet that can be swapped out on the fly, your data and display are now firmly separated. The data tree is pure, unencumbered by redundant data, and not forced into any particular look or order. You now have a headless system, and the third and final problem with HTML echo is eliminated.
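Here is a small sketch of what swapping stylesheets can look like in practice: the same data tree is rendered once as an HTML list and once as plain text. The renderWith function is a hypothetical helper written for this example, and both stylesheets are inlined as strings to keep the snippet self-contained.

```php
<?php

// Hypothetical helper: apply an XSL source string to a data tree
function renderWith(DOMDocument $Data, string $XslSource): string
{
    $Stylesheet = new DOMDocument();
    $Stylesheet->loadXML($XslSource);

    $Xsl = new XSLTProcessor();
    $Xsl->importStylesheet($Stylesheet);

    return (string) $Xsl->transformToXml($Data);
}

$Dom = new DOMDocument('1.0', 'UTF-8');
$Dom->loadXML('<root><child>first</child><child>second</child><child>third</child></root>');

// Display one: an HTML list
$ListXsl = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes" />
  <xsl:template match="/">
    <ul>
      <xsl:for-each select="root/child">
        <li><xsl:value-of select="." /></li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>
XSL;

// Display two: comma separated plain text
$TextXsl = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" />
  <xsl:template match="/">
    <xsl:for-each select="root/child">
      <xsl:value-of select="." />
      <xsl:if test="position() != last()">, </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
XSL;

$AsList = renderWith($Dom, $ListXsl);
$AsText = renderWith($Dom, $TextXsl);

echo $AsList, "\n", $AsText, "\n";
```

The data tree is built once and never touched by either display, which is the separation the headless approach is after.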

What I have demonstrated here is just a taste of the power of an XSL driven system. Add in more templates, XSLT functions and a little creativity, and you can turn an XML data tree into any kind of output that you want. In addition to being headless, the data tree is so easily navigated through Xpath, you never, ever need to put in derivative data. If any piece of data can be obtained by looking at the rest of the data, then it is not needed for an XSL driven system.

The final part of this three piece set will focus on the DOMi object. This open source tool is built specifically for XSL driven websites, and contains tools to rapidly transform PHP data types, such as arrays, into XML data trees, while keeping the structure perfectly intact. In addition, it blends in XSLTProcessor to allow quick rendering without the need for intricate knowledge of XSLTProcessor's commands.

DevSteve

Wednesday, August 13, 2008
