XPath position() function returns unexpected results

If you process XML documents generated "manually" (including documents composed within a program without proper XML tools), you might be surprised by the results returned by the position() function. For example, the input XML document ...
<?xml version="1.0" encoding="UTF-8" ?>
<list>
<author>John Brown</author>
<editor>Jane Doe</editor>
<author>Jim Small</author>
<editor>Grace Kelly</editor>
</list>
... processed with a simple XSLT stylesheet ...
<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output method="text" />

<xsl:template match="list">
Results: <xsl:apply-templates />
</xsl:template>

<xsl:template match="author|editor">
<xsl:value-of select="text()" /> (<xsl:value-of select="position()" />)
</xsl:template>

</xsl:stylesheet>
... returns surprising results:
  Results:
John Brown (2)

Jane Doe (4)

Jim Small (6)

Grace Kelly (8)
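The reason for this unexpected behavior is that the whitespace between the author and editor elements forms text nodes, which xsl:apply-templates also selects and position() therefore counts. The children of list are a whitespace text node, author, another whitespace text node, editor, and so on, which puts the elements at positions 2, 4, 6 and 8. To skip the whitespace nodes, either create XML documents without extra whitespace or use a more specific xsl:apply-templates instruction, for example: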
<xsl:template match="list">
Results: <xsl:apply-templates select="*"/>
</xsl:template>
The select="*" attribute in the last example selects only the child elements of the current node and thus skips text nodes (including whitespace-only nodes), so position() counts the elements alone.
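Another way to get the same effect, sketched here as an alternative rather than as part of the original fix, is to leave xsl:apply-templates unchanged and ask the processor to discard whitespace-only text nodes with the standard XSLT 1.0 xsl:strip-space element. The sketch assumes that only the whitespace directly inside list needs to be ignored:

```xml
<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output method="text" />

<!-- Discard whitespace-only text nodes that are children of list elements -->
<xsl:strip-space elements="list" />

<xsl:template match="list">
Results: <xsl:apply-templates />
</xsl:template>

<xsl:template match="author|editor">
<xsl:value-of select="text()" /> (<xsl:value-of select="position()" />)
</xsl:template>

</xsl:stylesheet>
```

With the whitespace nodes stripped from the source tree, the four elements should be numbered 1 through 4. The select="*" variant is the more local fix, while xsl:strip-space affects every template that processes the children of list.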
