W3C

XSLT Requirements
Version 2.0

W3C Working Draft 14 February 2001

This version:
http://www.w3.org/TR/2001/WD-xslt20req-20010214
(available in XML or HTML)
Latest version:
http://www.w3.org/TR/xslt20req
Editors:
Steve Muench (Oracle) <[email protected]>
Mark Scardina (Oracle) <[email protected]>

Abstract

This document describes the requirements for the XSLT 2.0 specification.

Status of this document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. The latest status of this document series is maintained at the W3C. This document is the first public XSLT 2.0 Requirements working draft.

This is a W3C Working Draft for review by W3C Members and other interested parties. It is a draft document and may be updated, replaced or made obsolete by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress". This is work in progress and does not imply endorsement by the W3C membership.

This document has been produced as part of the W3C Style activity, following the procedures set out for the W3C Process. The document has been written by the XSL Working Group (W3C members only). The goals of the XSL Working Group are discussed in the XSL Working Group charter. The XSL Working Group feels that the contents of this Working Draft are relatively stable, and therefore encourages feedback on this version.

Comments on this document should be sent to the W3C mailing list [email protected] (archived at http://lists.w3.org/Archives/Public/xsl-editors/). A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR/.

Table of contents

1 Goals
2 Requirements
3 References

Appendices


1 Goals

XSLT 2.0 has the following goals:

In addition, the following are explicitly not goals:

2 Requirements

1 Must Support the XML "Family" of Standards

As part of the evolving family of XML standards, XSLT 2.0 MUST support the W3C XML architecture by integrating well with other standards in the family.

         1.1 Must Maintain Backwards Compatibility with XSLT 1.1

Any stylesheet whose behavior is fully defined in XSLT 1.1 and which generates no errors will produce the same result tree under XSLT 2.0.

         1.2 Must Match Elements with Null Values

A stylesheet SHOULD be able to match elements and attributes whose value is explicitly null.

Ed. Note: Just matching @xsi:null="true" would find elements with this attribute even if the element actually had content like:

<foo xsi:null="true">SomeValue</foo>

or used the xsi:null when the element did not allow its content to be nullable, both of which are invalid.
         1.3 Should Allow Included Documents to "Encapsulate" Local Stylesheets

XSLT 2.0 SHOULD define a mechanism that allows the templates in a stylesheet associated with a secondary source document to be imported and used to format the included fragment, taking precedence over any applicable templates in the current stylesheet.

Use Case

When a MathML document is included in the current source document, that MathML fragment could already contain its own <?xml-stylesheet?> indicating appropriate templates to properly style the Math.

         1.4 Could Support Accessing Infoset Items for XML Declaration

A stylesheet COULD be able to access information like the version and encoding from the XML declaration of a document.

Use Case

A stylesheet should be able to set the output encoding to use the same encoding as the input document.

         1.5 Could Provide QName Aware String Functions

Users manipulating documents (e.g. stylesheets, schemas) that have QName-valued element or attribute content need functions that take a string containing a QName as their argument, convert it to an expanded name using either the namespace declarations in scope at that point in the stylesheet, or the namespace declarations in scope for a specific source node, and return properties of the expanded name such as its namespace URI and local name.
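
The kind of function described above can be sketched outside XSLT. The following Python fragment is only an illustration: the name expand_qname and the prefix-to-URI dictionary (with the empty string standing for the default namespace) are assumptions of the sketch, not proposed XSLT syntax. Whether an unprefixed QName should pick up the default namespace depends on the vocabulary being processed; the sketch assumes that it does.

```python
from typing import Dict, Tuple

def expand_qname(qname: str, in_scope: Dict[str, str]) -> Tuple[str, str]:
    """Resolve a lexical QName to (namespace URI, local name).

    `in_scope` maps prefixes to namespace URIs; the key "" is the default
    namespace. Both the function and this mapping shape are hypothetical.
    """
    prefix, sep, local = qname.partition(":")
    if not sep:
        # No colon: the whole string is the local name.
        prefix, local = "", qname
    if prefix not in in_scope:
        if prefix == "":
            return ("", local)  # unprefixed, and no default namespace declared
        raise KeyError(f"undeclared namespace prefix: {prefix!r}")
    return (in_scope[prefix], local)

# Example bindings, as they might be in scope on some source node.
bindings = {"xs": "http://www.w3.org/2001/XMLSchema", "": "urn:myuri"}
print(expand_qname("xs:string", bindings))
print(expand_qname("foo", bindings))
```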

         1.6 Could Enable Constructing a Namespace with Computed Name

Provide an <xsl:namespace> analog to <xsl:element> for constructing a namespace node with a computed prefix and URI.

         1.7 Could Simplify Resolving Prefix Conflicts in QName-Valued Attributes

XSLT 2.0 COULD simplify the renaming of conflicting namespace prefixes in result tree fragments, particularly for attributes declared in a schema as being QNames. Once the processor knows an attribute value is a QName, an XSLT processor should be able to rename prefixes and generate namespace declarations to preserve the semantics of that attribute value, just as it does for attribute names.

         1.8 Could Support XHTML Output Method

Complementing the existing output methods for html, xml, and text, an xhtml output method could be provided to simplify transformations which target XHTML output.

2 Must Improve Ease of Use

XSLT 2.0 MUST address frequently requested enhancements to make using XPath even more straightforward for handling common use cases.

         2.1 Must Allow Matching on Default Namespace Without Explicit Prefix

Many users stumble trying to match an element with a default namespace. They expect to be able to do something like:

<xsl:stylesheet version="1.0"
         xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
         xmlns="urn:myuri">
  <!-- Expect this matches <foo> in default namespace -->
  <xsl:template match="foo">

expecting that an unprefixed foo in the match pattern will match <foo> elements in the default namespace with the URI urn:myuri. Instead, they are required to assign a non-null prefix to their namespace URI and then match on "someprefix:foo", which has proven to be far from obvious. XSLT 2.0 SHOULD provide an explicit way to handle this scenario to avoid further user confusion.

         2.2 Must Add Date Formatting Functions

One of the more frequent requests from XSLT 1.0 users is the ability to format date information with control similar to that of XSLT's format-number(). XML Schema introduces several kinds of date and time datatypes, which will further increase the demand for date formatting during transformations. The functionality should be similar to that provided by java.text.SimpleDateFormat. A date analog of XSLT's named xsl:decimal-format may be required to handle locale-specific date formatting issues.

Use Case
  1. Given an XML element like: <Period start="2000-05-07" end="2000-05-13"/>

    Format it as: Invoice: 7 May 2000 - 13 May 2000

  2. Given the same element above, format it according to the current locale as:

    Fattura: 7 Maggio 2000 - 13 Maggio 2000
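
The two outputs above can be reproduced with a short Python sketch. It is not a proposal for the XSLT function itself: the MONTHS and LABELS tables are hypothetical stand-ins for real locale data, and format_period is a name invented for the illustration.

```python
from datetime import date
import xml.etree.ElementTree as ET

# Hypothetical locale tables; a real implementation would take month names
# from locale data rather than from a hard-coded dictionary.
MONTHS = {
    "en": ["January", "February", "March", "April", "May", "June", "July",
           "August", "September", "October", "November", "December"],
    "it": ["Gennaio", "Febbraio", "Marzo", "Aprile", "Maggio", "Giugno", "Luglio",
           "Agosto", "Settembre", "Ottobre", "Novembre", "Dicembre"],
}
LABELS = {"en": "Invoice", "it": "Fattura"}

def format_day(iso: str, lang: str) -> str:
    """Format an ISO 8601 date such as 2000-05-07 as '7 May 2000'."""
    d = date.fromisoformat(iso)
    return f"{d.day} {MONTHS[lang][d.month - 1]} {d.year}"

def format_period(period: ET.Element, lang: str = "en") -> str:
    """Format the start and end attributes of a <Period> element."""
    start = format_day(period.get("start"), lang)
    end = format_day(period.get("end"), lang)
    return f"{LABELS[lang]}: {start} - {end}"

period = ET.fromstring('<Period start="2000-05-07" end="2000-05-13"/>')
print(format_period(period))
print(format_period(period, "it"))
```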

         2.3 Must Simplify Accessing IDs and Keys in Other Documents

Currently it is cumbersome to look up nodes by id() or key() in documents other than the source document. Users must first use an xsl:for-each instruction, selecting the desired document() to make it the current node; relative XPath expressions within the scope of the xsl:for-each can then refer to id() or key() as desired.

         2.4 Should Provide Function to Absolutize Relative URIs

There SHOULD be a way in XSLT 2.0 to create an absolute URI. The functionality should accept a node-set and return a string value representing the absolute URI, resolved with respect to the base URI of the current node.
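
As an illustration of the resolution step only, the following Python sketch uses the standard library's urljoin. The function name absolutize and the example.com URIs are assumptions made for the example; in XSLT the base URI would come from the current node rather than from an argument.

```python
from urllib.parse import urljoin

def absolutize(relative: str, base_uri: str) -> str:
    """Resolve a possibly relative URI reference against a base URI."""
    return urljoin(base_uri, relative)

# Hypothetical base URI of the node that contains the reference.
base = "http://example.com/docs/index.xml"
print(absolutize("images/logo.png", base))
print(absolutize("../style.css", base))
print(absolutize("http://other.example/x", base))  # already absolute: unchanged
```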

         2.5 Should Include Unparsed Text from an External Resource

Frequently stylesheets must import text from external resources. Today users have to resort to extension functions to accomplish this because XSLT 1.0 only provides the document() function which, while useful, can only read external resources that are well-formed XML documents.

Use Case

Given an XML document like:

<section>
  <para>The code for the example looks like this:</para>
  <example>
    <external-file href="ParseXML.java"/>
  </example>
</section>

Format the section and include the source of the sample code from the external file in the output.
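
A rough Python sketch of the intended effect follows. It assumes that each <external-file> reference is the only child of its parent, and it writes a stand-in ParseXML.java with made-up contents so that the example runs on its own. Note that the file is read as plain text, so characters such as "<" are escaped on output rather than parsed as markup.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

SECTION = """<section>
  <para>The code for the example looks like this:</para>
  <example>
    <external-file href="ParseXML.java"/>
  </example>
</section>"""

def inline_external_files(root, base_dir):
    """Replace each <external-file href="..."/> with the raw text of that file.

    Assumes the reference is the only child of its parent element.
    """
    # Collect the parents first so the tree is not modified while iterating.
    holders = [el for el in root.iter() if el.find("external-file") is not None]
    for holder in holders:
        ref = holder.find("external-file")
        holder.remove(ref)
        holder.text = (Path(base_dir) / ref.get("href")).read_text(encoding="utf-8")
    return root

# Demonstration with a stand-in file; its contents are invented for the sketch.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "ParseXML.java").write_text("if (a < b) { parse(); }\n", encoding="utf-8")
    root = inline_external_files(ET.fromstring(SECTION), tmp)

serialized = ET.tostring(root, encoding="unicode")
print(serialized)
```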

         2.6 Should Allow Authoring Extension Functions in XSLT

As part of the XSLT 1.1 work done on extension functions, a proposal to author XSLT extension functions in XSLT itself was deferred for reconsideration in XSLT 2.0. This would allow the functions in an extension namespace to be implemented in "pure" XSLT, without resorting to external programming languages.

         2.7 Should Output Character Entity References Instead of Numeric Character Entities

Users have frequently requested the ability to have the output of their transformation use named character entity references instead of numeric character references. The ability to control this preference at the level of the whole document is sufficient. For example, rather than seeing &#160; in the output, the user could request to see the equivalent &nbsp; instead.

         2.8 Should Construct Entity Reference by Name

Analogous to the ability to create elements and attributes, users have expressed a desire to construct named entity references.

Ed. Note: Does this require a change to the data model?

         2.9 Should Support Unicode String Normalization

For reliable string comparison of Unicode strings, users need the ability to apply Unicode normalization before comparing the strings.
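
The problem can be seen in a few lines of Python, which is used here only because its standard library exposes Unicode normalization directly. The choice of Normalization Form C and the helper name nfc_equal are assumptions of the sketch.

```python
import unicodedata

def nfc_equal(a: str, b: str) -> bool:
    """Compare two strings after applying Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

composed = "caf\u00e9"     # e-acute as a single code point
decomposed = "cafe\u0301"  # "e" followed by a combining acute accent

print(composed == decomposed)           # code-point comparison
print(nfc_equal(composed, decomposed))  # comparison after normalization
```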

         2.10 Should Standardize Extension Element Language Bindings

XSLT 1.1 undertook the standardization of language bindings for XSLT extension functions. For XSLT 2.0, analogous bindings SHOULD be provided for extension elements.

         2.11 Could Improve Efficiency of Transformations on Large Documents

Many useful transformations take place on large documents consisting of thousands of repeating "sub-documents". Today transformations over these documents are impractical due to the need to have the entire source tree in memory. Enabling "progressive" transformations, where the processor is able to produce progressively more output as more input is received, amounts to removing the need for XSLT processors to have random access to the entire source document. This might be accomplished by:

  • Identifying a core subset of XPath that does not require random access to the source tree, or
  • Considering a "transform all subtrees" mode where the stylesheet says, "Apply the transformation implied by this stylesheet to each node that matches XXX, considered as the root of a separate tree, and copy all the results of these mini-transformations as separate subtrees on to the final result tree."

Use Case

Transforming an XML document representing the daily closing prices of NASDAQ stocks for 1999 like the example below (over 1.3 million <ClosingQuote> sub-elements) to produce a comma-separated list of Ticker, Date, and Closing Price.

<YearOfNasdaqCloses Year="1999" TotalSecurities="5207">
  <ClosingQuote Ticker="AAABB">
    <Date>01/01/1999</Date>
    <Price>6.25</Price>
    <Percent>0.5</Percent>
  </ClosingQuote>
  <!-- 1,353,818 Additional Entries Removed -->
  <ClosingQuote Ticker="ZVXI">
    <Date>12/31/1999</Date>
    <Price>16.10</Price>
    <Percent>-1.05</Percent>
  </ClosingQuote>
</YearOfNasdaqCloses>
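
To make the "transform each subtree separately" idea concrete, here is a minimal sketch in Python using incremental parsing. It is an analogy, not a description of how an XSLT processor would work: the function name is invented, and the sample is the two-entry excerpt shown above rather than the full document.

```python
import io
import xml.etree.ElementTree as ET

def closing_quotes_as_csv(source):
    """Yield one 'Ticker,Date,Price' line per <ClosingQuote>, reading incrementally."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "ClosingQuote":
            yield ",".join([elem.get("Ticker"), elem.findtext("Date"), elem.findtext("Price")])
            # Discard the children and text just processed. The emptied element
            # itself stays attached to the root, so memory still grows slowly.
            elem.clear()

SAMPLE = b"""<YearOfNasdaqCloses Year="1999" TotalSecurities="5207">
  <ClosingQuote Ticker="AAABB">
    <Date>01/01/1999</Date>
    <Price>6.25</Price>
    <Percent>0.5</Percent>
  </ClosingQuote>
  <ClosingQuote Ticker="ZVXI">
    <Date>12/31/1999</Date>
    <Price>16.10</Price>
    <Percent>-1.05</Percent>
  </ClosingQuote>
</YearOfNasdaqCloses>"""

lines = list(closing_quotes_as_csv(io.BytesIO(SAMPLE)))
print("\n".join(lines))
```

Each line is produced as soon as the corresponding sub-document has been read, which is the property the requirement is after.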

         2.12 Could Support Reverse IDREF Attributes

Given a particular value of an ID, produce a list of all elements that have an IDREF or IDREFS attribute which refers to this ID.

Ed. Note: This functionality can be accomplished using the current <xsl:key> and key() mechanism.
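
The lookup amounts to an index from each ID value to the elements that refer to it. The Python sketch below shows that index. Because no DTD is read, the attribute names treated as IDREF or IDREFS (ref and refs) are assumptions, as are the sample document and the n attribute used to label the referring elements.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical document: "ref" plays the role of an IDREF attribute and
# "refs" the role of an IDREFS attribute.
DOC = """<doc>
  <term id="t1"/>
  <term id="t2"/>
  <use n="a" ref="t1"/>
  <use n="b" refs="t1 t2"/>
  <use n="c" ref="t2"/>
</doc>"""

def build_reverse_index(root, idref_attrs):
    """Map each referenced ID to the elements that refer to it, in document order."""
    index = defaultdict(list)
    for elem in root.iter():
        for attr in idref_attrs:
            # split() handles both a single IDREF and a space-separated IDREFS list.
            for target in elem.get(attr, "").split():
                index[target].append(elem)
    return index

index = build_reverse_index(ET.fromstring(DOC), ["ref", "refs"])
print([e.get("n") for e in index["t1"]])
print([e.get("n") for e in index["t2"]])
```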

         2.13 Could Support Case-Insensitive Comparisons

XSLT 2.0 could expand its comparison functionality to include support for case-insensitive string comparison.

         2.14 Could Support Lexicographic String Comparisons

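XSLT 1.0 does not let users compare strings lexicographically: in an expression like $x > 'a', both operands are converted to numbers before the comparison.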

Ed. Note: i18n issues.

         2.15 Could Allow Comparing Nodes Based on Document Order

Support the ability to test whether one node comes before another in document order.

Ed. Note: Need a Use Case for this.

         2.16 Could Improve Support for Unparsed Entities

In XSLT 1.0 there is an asymmetry in support for unparsed entities. They can be handled on input but not on output. In particular, there is no way to do an identity transformation that preserves them. At a minimum we need the ability to retrieve the Public ID of an unparsed entity.

         2.17 Could Allow Processing a Node with the "Next Best Matching" Template

In the construction of large stylesheets for complex documents, it is often necessary to construct templates that implement special behavior for a particular instance of an element, and then apply the normal styling for that element. Currently this is not possible because <xsl:apply-templates/> specifies that for any given node only a single template will be selected and instantiated.

Currently the processor determines a list of matching templates and then discards all but the one with the highest priority. In order to support this requirement, the processor would retain the list of matching templates sorted in priority order. A new instruction, for example <xsl:next-match/>, in a template would simply trigger the next template in the list of matching templates. This "next best match" recursion naturally bottoms out at the built-in template, which can be seen as the lowest priority matching template for every match pattern.

Use Case

Consider a large, complex stylesheet for a particular document type. In order to support a new application, the schema designer for that document type adds a new global attribute, that is an attribute allowed on every element in the schema. For example, consider the addition of a global attribute named diff for marking changes made between one version of a document and another. You must now augment your stylesheet to support this new behavior.

One would like to add a single new template, or a small number of templates, that would implement the new functionality for the entire doctype. Something like this:

<xsl:template match="*[@diff='new']">
  <div class="new">
    <!-- do whatever you would have done for this element -->
  </div>
</xsl:template>

<xsl:template match="para">
  <p>
    <xsl:apply-templates/>
  </p>
</xsl:template>

When passed a document that contains <para diff='new'>...</para>, it would produce:

<div class="new">
<p>...</p>
</div>
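
The dispatch described in this section can be modeled in a few lines of Python. Everything here is a hypothetical model rather than XSLT: the Stylesheet class, the decorator, and the next_match callback merely stand in for the retained, priority-ordered list of matching templates and for an instruction such as <xsl:next-match/>. The priorities 0.5 and 0 are the XSLT 1.0 default priorities of the two patterns in the use case, applied to an element named para that carries diff='new'.

```python
import xml.etree.ElementTree as ET

class Stylesheet:
    """Toy model: templates are (priority, matches, body) triples."""

    def __init__(self):
        self.templates = []

    def template(self, priority, matches):
        def register(body):
            self.templates.append((priority, matches, body))
            return body
        return register

    def apply(self, node):
        # Keep every matching template, highest priority first, instead of
        # discarding all but the best one.
        candidates = sorted((t for t in self.templates if t[1](node)),
                            key=lambda t: -t[0])

        def run(i):
            if i >= len(candidates):
                return self.builtin(node)  # recursion bottoms out here
            body = candidates[i][2]
            return body(node, lambda: run(i + 1))

        return run(0)

    def apply_children(self, node):
        return (node.text or "") + "".join(
            self.apply(child) + (child.tail or "") for child in node)

    def builtin(self, node):
        # Stand-in for the built-in template: process the children.
        return self.apply_children(node)

sheet = Stylesheet()

@sheet.template(priority=0.5, matches=lambda n: n.get("diff") == "new")
def diff_new(node, next_match):
    return '<div class="new">' + next_match() + "</div>"

@sheet.template(priority=0, matches=lambda n: n.tag == "para")
def para(node, next_match):
    return "<p>" + sheet.apply_children(node) + "</p>"

result = sheet.apply(ET.fromstring('<para diff="new">...</para>'))
print(result)
```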

         2.18 Could Make Coercions Symmetric By Allowing Scalar to Nodeset Conversion

Presently, no datatype can be coerced or cast to a node-set. By allowing a string value to convert to a node-set, some user "gotchas" could be avoided.

3 Must Support XML Schema

XML Schema: Structures and XML Schema: Datatypes enable users to define and use both simple and structured types and associate them to elements and attributes in a schema. XSLT 2.0 MUST provide support for the common operations needed for matching and construction of transformed documents based on a source document containing these typed elements and attributes.

         3.1 Must Simplify Constructing and Copying Typed Content

It MUST be possible to construct XML Schema-typed elements and attributes. In addition, when copying an element or an attribute to the result, it should be possible to preserve the type during the process.

Ed. Note: Use Case needs work.

Use Case
  1. <href xsi:type="urireference">foo.xml</href>

  2. <href xsl:type="urireference"><xsl:value-of select="$foo"/></href>

  3. <href><xsl:typed-value-of select="$foo"/></href>

         3.2 Must Support Sorting Nodes Based on XML Schema Type

XSLT 1.0 supports sorting based on string-valued and number-valued expressions. XML Schema: Datatypes introduces new scalar types (for example, date) with well-known sort orders. It MUST be possible to sort based on this extended set of scalar data types. Since XML Schema: Datatypes does not define an ordering for complex types, this sorting support should only be considered for simple types.

Ed. Note: Should be consistent with whatever we define for the matrix of conversion and comparisons.

         3.3 Could Support Scientific Notation in Number Formatting

Several users have requested the ability to have the existing format-number() function extended to format numbers using Scientific Notation.

         3.4 Could Provide Ability to Detect Whether "Rich" Schema Information is Available

A stylesheet that requires XML Schema type-related functionality COULD be able to test whether a "rich" Post-Schema-Validated Infoset is available from the XML Schema processor, so that the stylesheet can provide fallback behavior or choose to exit with <xsl:message terminate="yes"/>.

4 Must Simplify Grouping

Grouping is complicated in XSLT 1.0. It MUST be possible for users to group nodes in a document based on

  • common string-values
  • common names
  • common values for any other expression

In addition, XSLT must allow grouping based on sequential position, e.g. selecting groups of adjacent <P> elements. Ideally it should also make it easier to do fixed-size grouping, e.g. groups of three adjacent nodes, for laying out data in multiple columns. For each group of nodes identified, it must be possible to instantiate a template for the group. Grouping must be "nestable" to multiple levels so that groups of distinct nodes can be identified, then from among the distinct groups selected, further sub-grouping of distinct nodes in the current group can be done.

Often users express this requirement in different words, asking for a way to easily select the distinct values of an XPath expression relative to a node-set. For example, many users using keys have requested a function like distinct-keys(' keyname ') to return a node-set containing, for each value of the named key that is present in the current document, the first node in document order that has that key value. Others have suggested adding a select-distinct=" XpathExpression " attribute to places where XSLT currently allows a select attribute.
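
The two positional forms of grouping mentioned above, runs of adjacent items and fixed-size groups, can be illustrated with a short Python sketch. The helper names are invented for the example, and plain strings stand in for nodes.

```python
from itertools import groupby

def group_adjacent(items, key):
    """Group consecutive items that share the same key value."""
    return [(k, list(run)) for k, run in groupby(items, key=key)]

def group_fixed(items, size):
    """Split items into consecutive groups of `size`; the last may be shorter."""
    return [items[i:i + size] for i in range(0, len(items), size)]

# Element names in document order, standing in for the nodes themselves.
names = ["h2", "p", "p", "h2", "p", "p", "p"]
adjacent = group_adjacent(names, key=lambda name: name)
print(adjacent)

# Five sorted city names laid out two per row; the last row is left short.
rows = group_fixed(["lyon", "milan", "munich", "paris", "venice"], 2)
print(rows)
```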

Use Case
  1. Group by common values, groups unsorted, with group totals

    Given XML document:

    <cities>
      <city name="milan"  country="italy"   pop="5"/>
      <city name="paris"  country="france"  pop="7"/>
      <city name="munich" country="germany" pop="4"/>
      <city name="lyon"   country="france"  pop="2"/>
      <city name="venice" country="italy"   pop="1"/>
    </cities>

    Produce a 3-column table listing each distinct country in the first column, an alphabetical list of the city names for each country in the 2nd column, and the sum of the population for the cities in each country in the third column:

    <table>
      <tr>
        <th>Country</th>
        <th>City List</th>
        <th>Population</th>
      </tr>
      <tr>
        <td>italy</td>
        <td>milan, venice</td>
        <td>6</td>
      </tr>
      <tr>
        <td>france</td>
        <td>lyon, paris</td>
        <td>9</td>
      </tr>  
      <tr>
        <td>germany</td>
        <td>munich</td>
        <td>4</td>
      </tr>  
    </table>

  2. Group by common values, sorting the groups, with group totals

    Given same XML document as in use case 1 above, produce a 3-column table listing each distinct country in the first column (sorted in alphabetical order), an alphabetical list of the city names for each country in the 2nd column, and the sum of the population for the cities in each country in the third column:

    <table>
      <tr>
        <th>Country</th>
        <th>City List</th>
        <th>Population</th>
      </tr>
      <tr>
        <td>france</td>
        <td>lyon, paris</td>
        <td>9</td>
      </tr>  
      <tr>
        <td>germany</td>
        <td>munich</td>
        <td>4</td>
      </tr>  
      <tr>
        <td>italy</td>
        <td>milan, venice</td>
        <td>6</td>
      </tr>
    </table>

  3. Group by common values, sorting the groups by a group total

    Given same XML document as in use case 1 above, produce a 3-column table listing each distinct country in the first column (sorted in order of decreasing total population), a list of the city names for each country in the 2nd column (sorted in order of decreasing population), and the sum of the population for the cities in each country in the third column:

    <table>
      <tr>
        <th>Country</th>
        <th>City List</th>
        <th>Population</th>
      </tr>
      <tr>
        <td>france</td>
        <td>paris, lyon</td>
        <td>9</td>
      </tr>  
      <tr>
        <td>italy</td>
        <td>milan, venice</td>
        <td>6</td>
      </tr>
      <tr>
        <td>germany</td>
        <td>munich</td>
        <td>4</td>
      </tr>
    </table>

  4. Group by result of an expression (e.g. initial letter, with a count for each group)

    Given the input XML document above, produce the table below which groups by the initial letter of the city name, sorts these first-letters alphabetically, then produces a list of cities whose names begin with that letter. The heading contains a count of entries:

    <h2>L (1)</h2><p>lyon</p>
    <h2>M (2)</h2><p>milan</p><p>munich</p>
    <h2>P (1)</h2><p>paris</p>
    <h2>V (1)</h2><p>venice</p>

  5. Group by patterns of elements in a sequence

    Given the input:

    <body>
      <h2>heading1</h2>
      <p>para1</p>
      <p>para2</p>
      <h2>heading2</h2>
      <p>para3</p>
      <p>para4</p>
      <p>para5</p>
    </body>

    Produce the following output:

    <chapter>
      <section title="heading1">
        <para>para1</para>
        <para>para2</para>
      </section>
      <section title="heading2">
        <para>para3</para>
        <para>para4</para>
        <para>para5</para>
      </section>
    </chapter>

  6. Produce Hierarchical Nested Output from Flat Structure

    Given a source document like:

    <doc>
      <group1>
        <tag>value</tag>
      </group1>
      <group2>
        <tag>value</tag>
      </group2>
      <group2>
        <tag>value</tag>
      </group2>
      <group3>
        <tag>value</tag>
      </group3>
    </doc>

    produce the output:

    <doc>
      <group1>
        <tag>value</tag>
        <group2>
          <tag>value</tag>
        </group2>

        <group2>
          <tag>value</tag>
          <group3>
            <tag>value</tag>
          </group3>
        </group2>
      </group1>
    </doc>

  7. Formatting HTML Term Definition Lists (Case 1)

    Given a source document like:

    <DL>
      <!-- Handle the case with no DD or DT -->
      <DT>One</DT>  
      <DD>One Def</DD>
      <DT>Two</DT>  
      <DD>Two Def</DD>
      <DT>Three</DT>  
    </DL>

    produce the output:

    <OL>
      <LI><B>One</B> - <I>One Def</I></LI>
      <LI><B>Two</B> - <I>Two Def</I></LI>
      <LI><B>Three</B> - <I>(No definition provided)</I></LI>
    </OL>

  8. Formatting HTML Term Definition Lists (Case 2)

    A slightly more complicated version of the HTML term definition list involves multiple terms with a single definition or multiple definitions for a single term. Given the source

    <DL>
      <DT>One</DT>
      <DT>Two</DT>
      <DD>One and Two Def</DD>
    </DL>

    produce the output

    <OL>
      <LI><B>One, Two</B> - <I>One and Two Def</I></LI>
    </OL>

    For the other variation, given the source:

    <DL>
      <DT>One</DT>
      <DD>One Def</DD>
      <DD>Another One Def</DD>
    </DL>

    produce the output:

    <UL>
      <LI>
        <B>One</B>
        <OL>
          <LI>One Def</LI>
          <LI>Another One Def</LI>
        </OL>
      </LI>
    </UL>

  9. Transform Inline <para> Elements to Block <para> Elements

    Transform from a DTD that allows para elements to have nested block-level elements to a DTD that requires para elements to have only inline elements, e.g. transform:

    <p>Do <em>not</em>:
    <ul>
    <li>talk,</li>
    <li>eat, or</li>
    <li>use your mobile telephone</li>
    </ul>
    while you are in the cinema.</p>

    into:

    <p>Do <em>not</em>:</p>
    <ul>
    <li>talk,</li>
    <li>eat, or</li>
    <li>use your mobile telephone</li>
    </ul>
    <p>while you are in the cinema.</p>

  10. Arrange into Fixed-Sized Groups (Across/Down)

    Given the input from use case number 1 above, produce a two-column list of all city names, sorted alphabetically, in "Across/Down" format. The result should correctly format the "left over" cells when the number of items is not a multiple of the number of items in the group.

    <table>
      <!-- Alphabetized Across each row -->
      <tr><td>lyon</td><td>milan</td></tr>
      <tr><td>munich</td><td>paris</td></tr>
      <tr><td>venice</td><td>&nbsp;</td></tr>
    </table>

  11. Arrange into Fixed-Sized Groups (Down/Across)

    Given the input from use case number 1 above, produce a two-column list of all city names, sorted alphabetically, in "Down/Across" format. The result should correctly format the "left over" cells when the number of items is not a multiple of the number of items in the group.

    <table>
      <!-- Alphabetized Down each column -->
      <tr><td>lyon</td><td>paris</td></tr>
      <tr><td>milan</td><td>venice</td></tr>
      <tr><td>munich</td><td>&nbsp;</td></tr>
    </table>

  12. Multi-level Grouping

    Given the XML list of software bugs assigned to developers on different teams...

    <bugs>
      <bug dev="ace" team="ui">
        <desc>Border shows white when it should be grey</desc>
      </bug>
      <bug dev="tom" team="core">
        <desc>Incorrectly handling nulls on entry</desc>
      </bug>
      <bug dev="gary" team="ui">
        <desc>Preferences dialog has two Cancel buttons</desc>
      </bug>
      <bug dev="ace" team="ui">
        <desc>Drag and drop cursor never changes back</desc>
      </bug>
      <bug dev="tim" team="core">
        <desc>Infinite loop in validation code</desc>
      </bug>
      <bug dev="gary" team="ui">
        <desc>Resizing dialog doesn't resize text box</desc>
      </bug>
      <bug dev="ace" team="ui">
        <desc>German text is truncated</desc>
      </bug>
      <bug dev="tim" team="core">
        <desc>Data inserted twice</desc>
      </bug>
    </bugs>

    produce an HTML page that includes:

    • The total number of open bugs in the title
    • A column for each team of developers, with a per-team bug count. The teams should be ordered left to right by decreasing number of open bugs assigned to the team.
    • A vertical list (separated by <br/> tags) of developers on each team, with their bug count. Developers should be listed top down in order of decreasing individual bug count.

    The result is a nested, grouped output giving development managers a bird's-eye view of which teams and which developers are in the most "bug trouble":

    <html>
      <body>
        <h2>Bug Summary (8)</h2>
        <table>
          <tr>
           <th>ui (5)</th>
           <th>core (3)</th>
          </tr>
          <tr>
           <td>
             ace (3)<br/>
             gary (2)
           </td>
           <td>
             tim (2)<br/>
             tom (1)
           </td>
          </tr>
        </table>
      </body>
    </html>

  13. List all the different element names in a document, and for each one list all the attributes used, and for each attribute list the distinct values used in the document. For example, given the document:

    <foo baz="Q">
      <bar baz="3" bop="T"/>
      <foo baz="1">
        <bar bop="S" bip="4" baz="5"/>
      </foo>
    </foo>

    Produce the result:

    <inventory>
      <element name="bar">
        <attribute name="baz">
          <value>3</value>
          <value>5</value>
        </attribute>
        <attribute name="bip">
          <value>4</value>
        </attribute>
        <attribute name="bop">
          <value>S</value>
          <value>T</value>
        </attribute>
      </element>
      <element name="foo">
        <attribute name="baz">
          <value>1</value>
          <value>Q</value>
        </attribute>
      </element>
    </inventory>
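
Returning to use cases 1 through 3 in the list above, the value-based grouping they ask for can be sketched in Python as follows. This only illustrates the expected rows (country, city list, total population) for the three orderings; it does not suggest any particular XSLT syntax, and the helper names are invented.

```python
import xml.etree.ElementTree as ET

CITIES = """<cities>
  <city name="milan"  country="italy"   pop="5"/>
  <city name="paris"  country="france"  pop="7"/>
  <city name="munich" country="germany" pop="4"/>
  <city name="lyon"   country="france"  pop="2"/>
  <city name="venice" country="italy"   pop="1"/>
</cities>"""

def summarize(root):
    """Return (country, member cities, total population) in first-seen order."""
    groups = {}  # a dict keeps insertion order, so groups stay unsorted
    for city in root.iter("city"):
        groups.setdefault(city.get("country"), []).append(city)
    return [(country, members, sum(int(c.get("pop")) for c in members))
            for country, members in groups.items()]

def city_list(members, by_pop=False):
    """Comma-separated city names, alphabetical or by decreasing population."""
    if by_pop:
        ordered = sorted(members, key=lambda c: int(c.get("pop")), reverse=True)
    else:
        ordered = sorted(members, key=lambda c: c.get("name"))
    return ", ".join(c.get("name") for c in ordered)

summary = summarize(ET.fromstring(CITIES))

# Use case 1: groups in order of first appearance.
case1 = [(country, city_list(m), total) for country, m, total in summary]
# Use case 2: groups sorted by country name.
case2 = sorted(case1, key=lambda row: row[0])
# Use case 3: groups and their cities sorted by decreasing population.
case3 = [(country, city_list(m, by_pop=True), total)
         for country, m, total in sorted(summary, key=lambda g: g[2], reverse=True)]

for row in case1 + case2 + case3:
    print(row)
```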

3 References

XML Schema: Structures, http://www.w3.org/TR/xmlschema-1/
XML Schema: Datatypes, http://www.w3.org/TR/xmlschema-2/