Practical XSLT Examples: Transforming an XML Document to XHTML

Having been involved with only one significant XSLT project using PHP (the PennMed Clinical Trials project), I don’t consider myself an expert. I did run into some issues, however, that required me to go beyond what was available in the online tutorials I found, and to dig into discussion forums, as well as figure out some things on my own. I’ll share some of those experiences here, with practical examples of transforming an XML document to XHTML. This is not a general introduction or tutorial. For that, I recommend the w3schools.com XSLT Tutorial.

  1. Should you perform the transformation on the client side or server side? Unless you have some special reason not to, I recommend transforming on the server side. Why make yourself deal with possible cross-browser compatibility issues in your XSL code, when you can instead have the server do the transformation, and send the browser nice, tidy XHTML instead?
  2. Using PHP’s XSLTProcessor: the PHP portion of the transformation is straightforward. The seven-line example on the php.net site is very similar to the code I used in my application.
  3. The XML file: here’s a sample XML file from my project. It’s a document describing a clinical trial. My examples below will come from this document.
  4. Your XSL stylesheet’s outermost template match: everything I read said the outermost template match in your XSL file should be:
    <xsl:template match="/">

    which indicates the root of the document tree, therefore giving you access to all the document’s content. I disagree with this recommendation, at least as far as my project goes. The clinical trials XML documents have all their content contained in a single “clinical_study” tag. Therefore my outermost template match is:

    <xsl:template match="clinical_study">

    This way, I don’t have to repeat “clinical_study/” in every child XSL tag.

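    As a minimal sketch of that layout (not the full stylesheet from my project), the skeleton below has no match="/" template at all. The xsl:output settings are just one reasonable choice, and the only path used is the eligibility/gender example from this article:

    ```xml
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>

      <!-- No match="/" template: the built-in rules walk down from the
           document root until they reach clinical_study -->
      <xsl:template match="clinical_study">
        <ul>
          <!-- Paths are relative to clinical_study; under match="/" this
               would have to be clinical_study/eligibility/gender -->
          <li>Gender: <xsl:value-of select="eligibility/gender"/></li>
        </ul>
      </xsl:template>
    </xsl:stylesheet>
    ```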
  5. Tags that appear only once: it’s vital to fully understand the XML documents you’re processing, so you know which tags might appear multiple times, and whether they have child tags. Tags that appear only once are the easiest to process. Here’s an example of how to display the value of such a tag; this is from a list of eligibility criteria for a clinical trial:
    <li>Gender: <xsl:value-of select="eligibility/gender"/></li>
  6. Tags that appear multiple times, without children: A clinical trial can address one or more medical conditions. They are listed in the XML like this:
    <condition>Metastatic Anaplastic Thyroid Cancer</condition>
    <condition>Metastatic Differentiated Thyroid Cancer</condition>

    Looping through them requires applying a separate xsl template tag. At the point in the XSL stylesheet where we want the conditions to be displayed, we apply the template like this:

    <ul>
    <xsl:apply-templates select="condition"/>
    </ul>

    Then near the end of the XSL stylesheet, after we close the main “clinical_study” template, we define this template:

    <xsl:template match="condition">
        <li><xsl:value-of select="."/></li>
    </xsl:template>

    The “.” selects the current node, which here is the “condition” tag itself (analogous to a “.” when listing the contents of a directory, which refers to the directory itself).

  7. Tags that appear multiple times, with children: the clinical trials XML documents can have one or more “location” tags (the example here happens to have only one). In our transformation, we want to display the contact information for the studies where the location is the University of Pennsylvania or the Children’s Hospital of Philadelphia. As before, we indicate the template to apply, but this time with a conditional test which I’ll explain below:
    <xsl:if test="location/facility[contains(name,$upenn) or contains(name,$chop)]">
        <h3>Local Contact</h3>
        <xsl:apply-templates select="location" mode="contact"/>
    </xsl:if>

    …And the template:

    <xsl:template match="location" mode="contact">
        <xsl:if test="contains(facility/name, $upenn) or contains(facility/name, $chop)">
            <p>
            <xsl:choose>
              <xsl:when test="contact/last_name">
                <xsl:value-of select="contact/last_name"/>
                <xsl:if test="contact/phone">, <xsl:value-of select="contact/phone"/></xsl:if>
                <xsl:if test="contact/phone_ext"><xsl:text> </xsl:text>x<xsl:value-of select="contact/phone_ext"/></xsl:if>
                <xsl:if test="contact/email">, <a href="mailto:{contact/email}"><xsl:value-of select="contact/email"/></a></xsl:if>
              </xsl:when>
              <xsl:otherwise>
                A local contact person has not been assigned yet.
              </xsl:otherwise>
            </xsl:choose>
            <br />
            <xsl:value-of select="facility/name"/><br />
            <xsl:value-of select="facility/address/city"/>, <xsl:value-of select="facility/address/state"/><xsl:text> </xsl:text><xsl:value-of select="facility/address/zip"/><br />
            </p>
        </xsl:if>
    </xsl:template>

    There’s a lot going on here…

  8. Variable scope: You can define your own variables in XSL:
    <xsl:variable name="chop">Children's Hospital of Philadelphia</xsl:variable>

    It’s important to note that variables can’t be reassigned once they are defined, and that they are scoped tightly. A variable defined inside a loop or a template is gone when that loop or template ends. In this case I defined my variables near the top of the document, before the “clinical_study” template, so they are available for use in any template in the stylesheet.

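    A small sketch of the two scopes, assuming a hypothetical local variable named facility_name (it is not part of my project’s stylesheet):

    ```xml
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <!-- Top-level variable: visible in every template -->
      <xsl:variable name="chop">Children's Hospital of Philadelphia</xsl:variable>

      <xsl:template match="clinical_study">
        <ul>
          <xsl:for-each select="location">
            <!-- Local variable (hypothetical name): exists only inside this
                 xsl:for-each and is bound afresh on each pass -->
            <xsl:variable name="facility_name" select="facility/name"/>
            <li>
              <xsl:value-of select="$facility_name"/>
              <xsl:if test="contains($facility_name, $chop)"> (CHOP)</xsl:if>
            </li>
          </xsl:for-each>
        </ul>
        <!-- Referencing $facility_name here would be an error, because it
             is out of scope. $chop is still available. -->
      </xsl:template>

    </xsl:stylesheet>
    ```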
  9. Testing for a condition in multiple tags: The use of XPath predicates allows us to search through all of the “location” tags in the XML document. Note that this:
    <xsl:if test="location/facility[contains(name,$upenn) or contains(name,$chop)]">

    is not equivalent to:

    <xsl:if test="contains(location/facility/name,$upenn) or contains(location/facility/name,$chop)">

    The former searches all the “location” tags in the document for Penn or CHOP, and we’re using it to determine whether we should show the “Local Contact” section. We use code similar to the latter within the “location” template, as we check each location. If we tried to use the latter in the main clinical_study template, it would check only the first “location” tag in the document, because in XSLT 1.0 contains() converts a set of nodes to the string value of its first node.

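    To make the difference concrete, here is a sketch with made-up input in which the Penn location is listed second. The facility name “Example General Hospital” and the value given to $upenn are assumptions for illustration:

    ```xml
    <!-- Hypothetical input:
      <clinical_study>
        <location><facility><name>Example General Hospital</name></facility></location>
        <location><facility><name>University of Pennsylvania</name></facility></location>
      </clinical_study>
    -->
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <!-- Assumed value; the real stylesheet defines $upenn elsewhere -->
      <xsl:variable name="upenn">University of Pennsylvania</xsl:variable>

      <xsl:template match="clinical_study">
        <!-- Predicate: tested against every facility, so this prints "true" -->
        <p>predicate: <xsl:value-of select="boolean(location/facility[contains(name,$upenn)])"/></p>

        <!-- XSLT 1.0 reduces the node-set to the string value of its first
             node ("Example General Hospital"), so this prints "false" -->
        <p>first node only: <xsl:value-of select="contains(location/facility/name,$upenn)"/></p>
      </xsl:template>
    </xsl:stylesheet>
    ```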
  10. The template “mode” attribute: in my XSL I need to loop through the “location” tags more than one time, and for more than one purpose. I loop through them once to get contact information, which is what this template is for. I loop through them again later in the stylesheet to extract information on the Investigators leading the trials. For that I have a different “location” template with mode="investigator".
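    A stripped-down sketch of two templates that share a match pattern but differ in mode. The investigator/last_name path is an assumption for illustration, and both template bodies are far simpler than the real ones:

    ```xml
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="clinical_study">
        <h3>Local Contact</h3>
        <xsl:apply-templates select="location" mode="contact"/>
        <h3>Investigators</h3>
        <ul>
          <xsl:apply-templates select="location" mode="investigator"/>
        </ul>
      </xsl:template>

      <!-- Used only when apply-templates asks for mode="contact" -->
      <xsl:template match="location" mode="contact">
        <p><xsl:value-of select="facility/name"/></p>
      </xsl:template>

      <!-- Same match pattern, different mode.
           The investigator/last_name path is assumed, not taken from the sample file. -->
      <xsl:template match="location" mode="investigator">
        <xsl:for-each select="investigator">
          <li><xsl:value-of select="last_name"/></li>
        </xsl:for-each>
      </xsl:template>
    </xsl:stylesheet>
    ```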
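  11. Handling quotes: the reason I defined a variable for CHOP instead of running the “contains” test on a plain string is that the apostrophe in “Children’s” breaks the XPath expression. The test attribute is delimited by double quotes, so the string inside it has to be delimited by single quotes, and the apostrophe ends that string early. XPath 1.0 has no escape character for a quote inside a string, so the XSL processor throws an error. Keeping the text in a variable sidesteps the problem.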
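    For comparison, a sketch of three ways to get the apostrophe into a “contains” test. Only the first is what my stylesheet does; the other two rely on standard XML entity escaping, and the output paragraph is a placeholder:

    ```xml
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <!-- Option 1: keep the text in a variable -->
      <xsl:variable name="chop">Children's Hospital of Philadelphia</xsl:variable>

      <xsl:template match="location" mode="contact">
        <xsl:if test="contains(facility/name, $chop)"><p>CHOP site</p></xsl:if>

        <!-- Option 2: delimit the XPath string with double quotes, written as &quot; -->
        <xsl:if test="contains(facility/name, &quot;Children's Hospital of Philadelphia&quot;)"><p>CHOP site</p></xsl:if>

        <!-- Option 3: delimit the attribute with single quotes and write the apostrophe as &apos; -->
        <xsl:if test='contains(facility/name, "Children&apos;s Hospital of Philadelphia")'><p>CHOP site</p></xsl:if>

        <!-- This one fails, because the apostrophe ends the XPath string early:
             test="contains(facility/name, 'Children's Hospital of Philadelphia')" -->
      </xsl:template>
    </xsl:stylesheet>
    ```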
  12. Referencing XML values within an XHTML tag: To get the contact person’s email address into a “mailto” link, we wrap the XPath expression in curly braces: <a href="mailto:{contact/email}">. Inside an attribute of an output tag, the processor evaluates the expression in the braces and inserts its string value.
  13. Adding spaces: the XSL processor aggressively strips whitespace from the stylesheet. It will honor spaces between words, but it strips whitespace that sits on its own between two tags. To force a space, use <xsl:text> </xsl:text>. This one took a while for me to track down, as the discussion forum posts I found on this topic focused on using entities as the solution. That is not an elegant approach: &nbsp; is not a native XML entity, and this wouldn’t be a semantically correct use of it anyway. The processor doesn’t do this just to be annoying. If you were using it to create, for example, a PDF document, you would be glad it aggressively strips spaces, as stray spaces could cause major headaches in that context.
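    A short sketch of the three cases, using the facility address fields from the “location” template above. Matching on “address” directly is a simplification for the example:

    ```xml
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="address">
        <!-- The space between the two tags is a whitespace-only text node,
             so it is stripped: state and zip run together -->
        <p><xsl:value-of select="state"/> <xsl:value-of select="zip"/></p>

        <!-- Text inside xsl:text is always kept: state and zip are separated -->
        <p><xsl:value-of select="state"/><xsl:text> </xsl:text><xsl:value-of select="zip"/></p>

        <!-- Whitespace next to other text is kept too: the space after the comma survives -->
        <p><xsl:value-of select="city"/>, <xsl:value-of select="state"/></p>
      </xsl:template>
    </xsl:stylesheet>
    ```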

3 Comments

  1. Pat W, March 30, 2009

    I have no idea how this will display.

    To add values to tag attributes you can also use the xsl:attribute tag, which is very handy if you need to conditionally add attributes.

    It’s more verbose for simple selects like this, but it pays off when you need to do any testing on the attribute value.

    Also, if you ever have to do XSL on the client, Sarissa makes it nearly painless, and client-side XSL is really fast and great for certain scenarios. http://dev.abiss.gr/sarissa/ I used to build a form building app at UVa that essentially created an XML representation of the form, all on the client. It was cool to use the same XSL from the client and the server, depending on whether you were editing or just using the form.

  2. Pat W, March 30, 2009

    Yay all code examples stripped! Anyway that was talking about the [xsl:attribute] tag. The example was:

    [xsl:element name="a"]
    [xsl:attribute name="href"]
    [xsl:value-of select="contact/email" /]
    [/xsl:attribute]
    [/xsl:element]

  3. Francois Bernard, July 10, 2010

    Hi, as for XML – XSLT – XHTML transformations, I have found a problem when there is some HTML code contained in the XML file. I’m not 100% sure about that – any HTML(XML) content must be “escaped” in an XML file? An example: On http://refsbook.com a user exports his profile in XML format. But because e.g. the user’s “about” info is HTML-encoded, we need to put it into the XML content.
