Jesper Tverskov, April 27, 2007

TOC for XHTML with XSLT

Making a TOC of nested list elements for an XHTML document by hand or by code is usually among the more tiresome or difficult tasks. With XSLT 2.0 it is relatively easy to transform the XHTML document to itself (identity transform) and let extra templates add the TOC, the links and the numbers.

1. Nested list elements

A TOC is a hierarchical structure and the markup must reflect that. A TOC must use nested ol/li or ul/li elements. Nesting makes the TOC understandable in screen readers, keeps the source code compact, and makes further processing of XHTML documents easier.

The TOC solutions presented in XSLT tutorials and on mailing lists most often only get close. The most common error is not to use nested list elements.

If we go to the W3C or OASIS web sites, we find hundreds of specifications and as many TOCs based on scores of designs, mostly wrong. As an introduction to this tutorial I have also published the article The TOC freak show of W3C and OASIS; you might want to read it first.

Below is a TOC of correctly nested lists. Notice the submenus inside the li elements. Further below you see the visual presentation:

<ol>
  <li>Heading
    <ol>
      <li>Subheading</li>
      <li>Subheading</li>
    </ol>
  </li>
  <li>Heading</li>
  <li>Heading
    <ol>
      <li>Subheading</li>
      <li>Subheading</li>
    </ol>
  </li>
</ol>

The markup above will be rendered in almost any browser similarly to what we see below. That is, the list items are numbered, but we don't get the numbers of the parent elements, like "1.1", automatically.

1. Heading
  1. Subheading
  2. Subheading
2. Heading
3. Heading
  1. Subheading
  2. Subheading

To get the numbers of the parents inserted as well, we need to use CSS "list-style-type: none" to suppress the default. The numbers can then be added with CSS "counters" or as content of the list elements. We are going to see this in detail in a moment.

Ordered lists (ol) can use one or more numbering schemes like decimal numbers, Roman numerals and letters. Unordered lists (ul) use bullets like circles, discs and squares in front of each list item (li). Decimal numbering is the most common for TOCs and is the one we will use in this tutorial:

1. Heading
  1.1 Subheading
  1.2 Subheading
2. Heading
3. Heading
  3.1 Subheading
  3.2 Subheading

The first level must be "1." not "1" in all TOCs of the condensed traditional type. It should be "first", "second", "third" item, not "one", "two" and "three". This is also how it will look in word processors like MS Word.

If we use CSS "counters" for the TOC, the period is missing for the first level. We can make an extra CSS rule to get the period back, as we are going to see in a moment. In more "styled" designs that play around with colors, fonts and font sizes, omitting the period for the first level of the TOC is common.

1.1 Heading elements h1-h6

XHTML has, like HTML, six heading levels, the h1-h6 elements. The draft for XHTML 2.0 also makes it possible to use nested section elements (new) and an h element (new) in order to have an explicit hierarchical structure. One benefit of this is that we can easily merge or split XHTML documents. Based on the number of nested sections browsers can interpret the h element as h1-h6.

Even in printed matter the least important heading levels are rarely used except in academic writing. In a huge specification like XSLT 2.0, only five levels are used (h1-h5). Most of my articles only have two or three levels.

The XSLT markup needed to generate a TOC for the h2 elements only is very simple. An xsl:for-each over the h2 elements would do. It is much more complicated to generate several levels and get the nesting right. But it is nice to have XSLT markup in place that can take care of all levels the day we need them.
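As a minimal sketch of the simple case, a flat TOC of the h2 headings could be produced as shown below. The file name toc-h2.xsl is made up for illustration, and the stylesheet assumes the input is XHTML in the usual XHTML namespace. It outputs only the list itself, without links or numbers, and it would write an empty ol if the document had no h2 headings.

```xml
<?xml version="1.0"?>
<!-- toc-h2.xsl (hypothetical name): a flat TOC of the h2 headings only -->
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns="http://www.w3.org/1999/xhtml"
  xmlns:xhtml="http://www.w3.org/1999/xhtml"
  exclude-result-prefixes="xhtml">

<xsl:template match="/">
  <ol>
    <!-- One list item per h2, in document order -->
    <xsl:for-each select="//xhtml:h2">
      <li><xsl:value-of select="."/></li>
    </xsl:for-each>
  </ol>
</xsl:template>

</xsl:stylesheet>
```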

1.2 TOCs are undervalued

TOCs are not always used on the web even in situations where they could benefit the user. I will go as far as to say that TOCs for web pages are undervalued. Together with a summary, a TOC is the natural way to give an overview of the content of a web page.

1.3 TOC and site map

I have always detested site maps for web sites. They are most often generated automatically with little attention to details and with extremely poor usability, a far cry from even the most basic TOC for printed matter. It is said that a site map is an index to the web pages at a web site; a TOC is an index to the content of one web page.

I prefer to use the word TOC for all types of TOCs. Everybody knows what a "Table of Contents" is and how it can be useful. I often hear the question: "What is a site map?" Very many people don't know what a site map is or what to expect.

Even a web site could benefit from a true Table of Contents to all its web pages. In such a TOC the name of the web site becomes h1, and the h1 of each web page becomes h2, et cetera. Such a TOC for a web site, linking to all headings and subheadings of all the web site's pages, could be a useful entrance.

2. Using XSLT and CSS

I like TOCs even for short articles. Others will find TOCs for short articles overkill. It is likewise a matter of taste and of conflicting usability issues how many heading levels we want to include in the TOC, and how many of the included levels should actually be links, if any.

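Making a TOC consists of four parts:

  1. Generating the TOC.
  2. Generating the links from the TOC items to the headings.
  3. Numbering the TOC items.
  4. Numbering the headings.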

The day the browsers have better support for CSS "counters", we will only need XSLT to generate the TOC and the links from the TOC to the headings. CSS can do the numbering of the items of the TOC and of the headings. Numbers for the headings are not always needed if we only make use of one or two levels or if the document is short.

We will make two solutions, one using XSLT for all four tasks and one where we use CSS "counters" to make the numbers. We need CSS to style the list elements in both solutions. We will include all the heading levels in the TOC. A more advanced solution could have parameters to set the number of levels to include in the TOC and how many levels should be links.

3. Using ordered or unordered lists

Most TOCs are true hierarchies where order and level are important. Nested ordered lists (ol/li elements) using numbers or letters are the right choice for a TOC in most cases. I don't know why so many TOCs, e.g. for the specs at the W3C website, almost always use nested unordered lists (ul/li elements). But there is a good argument for doing so.

If we use CSS "list-style-type: none" to remove the default numbering or bullets from the li elements, and add decimal numbers as content of the list elements server-side, we run into problems if CSS is not supported in some user agents (browsers).

If "list-style-type: none" is not supported or if CSS is not supported, we get both the default number and the added number. This is extremely confusing to look at and even more confusing to listen to in a screen reader. Below we see how the beginning of the TOC of this article would look:

If we use unordered lists, we get both the default bullets (discs, circles, squares) and the added numbers. This is also redundant but at least it is not that confusing. Below we see how the beginning of the TOC of this article would look:

Today I don't know of any browser worth mentioning that does not support "list-style-type: none". But we must also consider the potential damage in the few browsers that do not support CSS at all, like the Lynx text browser, or when the CSS stylesheet for some reason is not found.

4. Generating the TOC

We will make the XSLT templates in such a way that they will work for any valid XHTML document. The headings, h1-h6, can be children of the body element or they can be children of any number of div elements, and the templates will still work.

There are two additional requirements if the XSLT templates are to be used without modification. All headings must be included in the TOC and the heading elements must have been used to make a correct implicit hierarchical structure. We can only have one h1 element starting the hierarchy. The h1 must be followed by h2 and h2 must be followed by another h2 or by h3, and h3 must be followed by another h3 or by h4 or by h2, et cetera.

An implicit hierarchical structure made up of h1-h6 elements is not something that is enforced by validation using W3C's DTD or an XML Schema schema for XHTML. You must either control the document manually or set up some code to test it. An XSLT stylesheet can be made to test if the implicit structure is correct, or you can use Schematron also making use of XSLT to do the same thing.
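A stylesheet for such a test could be sketched roughly as follows. This is only an illustration of the two rules stated above, not a complete validator: the file name check-headings.xsl and the wording of the messages are made up, and the input is assumed to be XHTML in the usual XHTML namespace. It writes one line of text per problem found and nothing if the structure is correct.

```xml
<?xml version="1.0"?>
<!-- check-headings.xsl (hypothetical name): lists breaks in the implicit h1-h6 hierarchy -->
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:xhtml="http://www.w3.org/1999/xhtml"
  exclude-result-prefixes="xs xhtml">

<xsl:output method="text"/>

<xsl:template match="/">
  <!-- All headings in document order -->
  <xsl:variable name="headings" as="element()*"
    select="//xhtml:h1 | //xhtml:h2 | //xhtml:h3 | //xhtml:h4 | //xhtml:h5 | //xhtml:h6"/>

  <!-- Rule 1: exactly one h1, and it comes before all other headings -->
  <xsl:if test="count($headings[self::xhtml:h1]) ne 1 or empty($headings[1][self::xhtml:h1])">
    <xsl:text>Expected exactly one h1, placed before all other headings.&#10;</xsl:text>
  </xsl:if>

  <!-- Rule 2: a heading may be at most one level deeper than the heading before it -->
  <xsl:for-each select="2 to count($headings)">
    <xsl:variable name="i" as="xs:integer" select="."/>
    <xsl:variable name="current" select="$headings[$i]"/>
    <xsl:variable name="previous" select="$headings[$i - 1]"/>
    <xsl:if test="xs:integer(substring(local-name($current), 2))
                  gt xs:integer(substring(local-name($previous), 2)) + 1">
      <xsl:value-of select="local-name($current), 'follows', local-name($previous),
                            'and skips a level:', normalize-space($current)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:if>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

The level of a heading is read from its element name, so "h3" gives 3. Stepping back up any number of levels, for example from h4 to h2, is allowed; only a step down of more than one level is reported.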

4.1 IE focus bug

For keyboard users the IE focus bug, still with us in Internet Explorer 7, can make a document inaccessible. The keyboard user is experiencing the bug when following a link using a fragment identifier and only the visual but not the so-called input focus changes. When the keyboard user tabs on to the next link, the visual focus is instead shifted back to the top of the document. [1]

Despite the importance of TOCs and cross-references in the web documents published by W3C, very many W3C specs (including the WAI specs) are to this very day inaccessible to keyboard users of Internet Explorer. There are several CSS hacks to overcome the IE focus bug. Sadly enough the hacks don't work for the p element in my testing. It is necessary to add div elements styled with the hack to all sections of your document just to get the TOC going. [2]

I have for a long time wanted to use an explicit hierarchical structure of nested div elements to get sections, which are so useful in other markup languages like DocBook and WordprocessingML and are also introduced in the draft for XHTML 2.0. An explicit hierarchical structure of nested div elements would solve the IE focus bug for the TOC and minimize the problem for footnotes and cross-references. [3] The div sections could also be handy for further processing of the XHTML documents.

Generating an explicit hierarchical structure for XHTML as well is outside the scope of this tutorial. I will show how to do it in another tutorial. [4] By chance many of the W3C specs have a mess of div elements all over the place. Just applying the CSS rule div{height:0} to those specs would overcome the IE focus bug. We will do just that at the end of this tutorial.

4.2 XSLT: toc.xsl

The new xsl:for-each-group element in XSLT 2.0, and this element's "group-starting-with" attribute, make generation of a TOC relatively easy. Below I have made an XSLT stylesheet, toc-raw.xsl, with the templates needed to generate a TOC. The solution is generic and should work for any XHTML input document. We will add markup to generate the links and the numbers in a moment.

Most of the templates below are identical except for the heading number. We could merge these templates into one template, but it would be rather complex, making it difficult to see what is going on.

<?xml version="1.0"?>
<!-- toc-raw.xsl generates the TOC without links and numbers -->

<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns="http://www.w3.org/1999/xhtml"
  xmlns:xhtml="http://www.w3.org/1999/xhtml"
  exclude-result-prefixes="xhtml">

<xsl:template name="toc"> <!-- see footnote [5] -->
  <xsl:for-each-group select="//xhtml:h1|//xhtml:h2|//xhtml:h3|//xhtml:h4|//xhtml:h5|//xhtml:h6" group-starting-with="xhtml:h1"> <!-- see footnote [6] -->
    <xsl:apply-templates select="." mode="toc"/>
  </xsl:for-each-group>
</xsl:template>

<xsl:template match="xhtml:h1" mode="toc">
  <xsl:if test="following::xhtml:h2[1][preceding::xhtml:h1[1] = current-group()]"> <!-- see footnote [7] -->
    <ol>
      <xsl:for-each-group select="current-group() except ." group-starting-with="xhtml:h2">
        <xsl:apply-templates select="." mode="toc"/>
      </xsl:for-each-group>
    </ol>
  </xsl:if>
</xsl:template>

<xsl:template match="xhtml:h2" mode="toc"> <!-- see footnote [8] -->
  <li>
    <xsl:value-of select="."/>
    <xsl:if test="following::xhtml:h3[1][preceding::xhtml:h2[1] = current-group()]">
      <ol>
        <xsl:for-each-group select="current-group() except ." group-starting-with="xhtml:h3"> <!-- see footnote [9] -->
          <xsl:apply-templates select="." mode="toc"/>
        </xsl:for-each-group>
      </ol>
    </xsl:if>
  </li>
</xsl:template>

<xsl:template match="xhtml:h3" mode="toc">
  <li>
    <xsl:value-of select="."/>
    <xsl:if test="following::xhtml:h4[1][preceding::xhtml:h3[1] = current-group()]">
      <ol>
        <xsl:for-each-group select="current-group() except ." group-starting-with="xhtml:h4">
          <xsl:apply-templates select="." mode="toc"/>
        </xsl:for-each-group>
      </ol>
    </xsl:if>
  </li>
</xsl:template>

<xsl:template match="xhtml:h4" mode="toc">
  <li>
    <xsl:value-of select="."/>
    <xsl:if test="following::xhtml:h5[1][preceding::xhtml:h4[1] = current-group()]">
      <ol>
        <xsl:for-each-group select="current-group() except ." group-starting-with="xhtml:h5">
          <xsl:apply-templates select="." mode="toc"/>
        </xsl:for-each-group>
      </ol>
    </xsl:if>
  </li>
</xsl:template>

<xsl:template match="xhtml:h5" mode="toc">
  <li>
    <xsl:value-of select="."/>
    <xsl:if test="following::xhtml:h6[1][preceding::xhtml:h5[1] = current-group()]">
      <ol>
        <xsl:for-each-group select="current-group() except ." group-starting-with="xhtml:h6">
          <xsl:apply-templates select="." mode="toc"/>
        </xsl:for-each-group>
      </ol>
    </xsl:if>
  </li>
</xsl:template>

<xsl:template match="xhtml:h6" mode="toc">
  <li>
    <xsl:value-of select="."/>
  </li>
</xsl:template>

</xsl:stylesheet>

In the stylesheet above, ordered lists (ol) are used, being the proper element from an XHTML spec point of view. But considering the potential problems if CSS is not supported, unordered list elements are most often the better choice, as discussed in section 3.

The templates of the above stylesheet, toc-raw.xsl, only generate the TOC. We need an XSLT stylesheet with an "identity" template that can include it, and we need to modify the stylesheet above in order to generate links from TOC items to the headings, and to generate the numbers for the TOC items and for the headings.
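For comparison, the merged variant mentioned earlier, where one template handles all heading levels, could be sketched as below. This is not the stylesheet used in the rest of the tutorial. The names toc-merged.xsl, toc-merged, toc-level and max-level are made up for illustration, the sketch starts directly at the h2 level instead of grouping by h1, and it assumes the correct implicit heading hierarchy described in section 4. The max-level parameter is an assumed way of limiting how many heading levels go into the TOC.

```xml
<?xml version="1.0"?>
<!-- toc-merged.xsl (hypothetical name): one recursive template instead of six -->
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns="http://www.w3.org/1999/xhtml"
  xmlns:xhtml="http://www.w3.org/1999/xhtml"
  exclude-result-prefixes="xs xhtml">

<!-- Assumed parameter: deepest heading level to include (6 = all levels) -->
<xsl:param name="max-level" as="xs:integer" select="6"/>

<xsl:template name="toc-merged">
  <xsl:variable name="headings" as="element()*"
    select="//xhtml:h2 | //xhtml:h3 | //xhtml:h4 | //xhtml:h5 | //xhtml:h6"/>
  <xsl:if test="exists($headings)">
    <ol>
      <xsl:call-template name="toc-level">
        <xsl:with-param name="headings" select="$headings"/>
        <xsl:with-param name="level" select="2"/>
      </xsl:call-template>
    </ol>
  </xsl:if>
</xsl:template>

<xsl:template name="toc-level">
  <xsl:param name="headings" as="element()*"/>
  <xsl:param name="level" as="xs:integer"/>
  <!-- Each group starts at a heading of the current level and holds its subheadings -->
  <xsl:for-each-group select="$headings"
    group-starting-with="*[local-name() eq concat('h', $level)]">
    <li>
      <xsl:value-of select="."/>
      <!-- Recurse only if the group has subheadings and the depth limit allows it -->
      <xsl:if test="$level lt $max-level and exists(current-group()[2])">
        <ol>
          <xsl:call-template name="toc-level">
            <xsl:with-param name="headings" select="current-group() except ."/>
            <xsl:with-param name="level" select="$level + 1"/>
          </xsl:call-template>
        </ol>
      </xsl:if>
    </li>
  </xsl:for-each-group>
</xsl:template>

</xsl:stylesheet>
```

The trade-off is the one already mentioned: the recursion is shorter, but it is harder to see what happens at each level, and adding level-specific markup later means adding tests on the level parameter.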

5. Links and numbers

What should we use as the value of the id attributes in the heading elements? Best practice is to use the number of the heading. If the heading is "2.1", we would use "2.1" as the id attribute value. But the id attribute in XHTML must have a value of type ID, which can't begin with a digit. I put an "s", short for section, in front of the number. The id then gets the value "s2.1" and the fragment identifier in the href of the TOC's list item becomes href="#s2.1".

5.1 Hrefs and numbering

If we take the basic toc-raw.xsl from before, we need a lot of extra markup to generate the numbers and the links to the headings. The template matching xhtml:h2 now looks like the following. The other heading levels use xsl:number several times to generate the full number (see the final XSLT stylesheets for details, or the variable "nh6" in the next section).

<xsl:template match="xhtml:h2" mode="toc">
  <!-- Number of this h2, for example "2." -->
  <xsl:variable name="nh2">
    <xsl:number count="xhtml:h2" level="any" format="1."/>
  </xsl:variable>

  <!-- Position among all TOC items, used for the alternating background-color -->
  <xsl:variable name="pos">
    <xsl:number level="any" count="xhtml:h2|xhtml:h3|xhtml:h4|xhtml:h5|xhtml:h6"/>
  </xsl:variable>

  <li>
    <span>
      <xsl:if test="$pos mod 2 ne 0">
        <xsl:attribute name="class">altc</xsl:attribute>
      </xsl:if>
      <span>
        <xsl:if test="xs:integer(substring-before($nh2, '.')) lt 10">
          <!-- xsl:text keeps the surrounding line breaks out of the output -->
          <xsl:text>&#160;</xsl:text>
        </xsl:if>
        <xsl:value-of select="$nh2"/>
      </span>
      <xsl:text> </xsl:text>
      <a href="#s{$nh2}">
        <xsl:value-of select="."/>
      </a>
    </span>
    <xsl:if test="following::xhtml:h3[1][preceding::xhtml:h2[1] = current-group()]">
      <ol class="toc">
        <xsl:for-each-group select="current-group() except ." group-starting-with="xhtml:h3">
          <xsl:apply-templates select="." mode="toc"/>
        </xsl:for-each-group>
      </ol>
    </xsl:if>
  </li>
</xsl:template>

I use the variable "pos" to find out if an item is even or odd in order to use alternating background-color. For that reason the content of the list item must be inside a span element. I prefer navy blue links, no underlining but alternating colors to make the TOC less dominating but still distinct and easy to read. Others prefer standard blue and underlining of each link.

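The test="xs:integer…" is used to add a non-breaking space before single-digit numbers so that numbers below 100 are right-aligned at the h2 level. For the test to work, the xs prefix must be bound to the XML Schema namespace, http://www.w3.org/2001/XMLSchema, in the xsl:stylesheet element. I only use it for h2, where we often have more than nine items. The extra spaces are not worth it for the other levels.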

The number is inside its own span element to make it possible to style it independently from the proper content of the list element. I use the same font for the numbers but reduce the font-size to 80% as we are going to see in the CSS section.

To create the fragment identifier in the link from the TOC item to its heading, an "s" is inserted before the number. The xsl:text element is used to create a space between the number and the proper content of the list item.

5.2 Ids and numbering

The "identity" template (we will look at it in a moment) recreates the headings of the input XHTML document, but the templates for the headings in toc.xsl are more specific and overrule the "identity" template. Below you see the template for xhtml:h2.

Note that the variable "nh2" is used twice, both for the id with an "s" and in front of the heading. Also note that the number is inserted in a span element to make it possible to style it differently than the proper content of the heading.

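<xsl:template match="xhtml:h2">
  <xsl:variable name="nh2">
    <xsl:number count="xhtml:h2" level="any" format="1."/>
  </xsl:variable>

  <h2 id="s{$nh2}">
    <!-- Attributes must be copied before any child nodes are added; the generated id replaces an existing one -->
    <xsl:apply-templates select="@* except @id"/>
    <span>
      <xsl:value-of select="$nh2"/>
    </span>
    <xsl:text> </xsl:text>
    <xsl:apply-templates select="node()"/>
  </h2>
</xsl:template>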

Below is the template for xhtml:h6 to give you an idea of how to use xsl:number again and again to cover all the levels. For all the details see the final XSLT stylesheet.

<xsl:template match="xhtml:h6">
  <!-- One xsl:number per level, each counting from the nearest heading of the level above -->
  <xsl:variable name="nh6">
    <xsl:number count="xhtml:h2" level="any" format="1."/>
    <xsl:number count="xhtml:h3" from="xhtml:h2" level="any" format="1."/>
    <xsl:number count="xhtml:h4" from="xhtml:h3" level="any" format="1."/>
    <xsl:number count="xhtml:h5" from="xhtml:h4" level="any" format="1."/>
    <xsl:number count="xhtml:h6" from="xhtml:h5" level="any"/>
  </xsl:variable>

  <h6 id="s{$nh6}">
    <!-- Attributes must be copied before any child nodes are added; the generated id replaces an existing one -->
    <xsl:apply-templates select="@* except @id"/>
    <span>
      <xsl:value-of select="$nh6"/>
    </span>
    <xsl:text> </xsl:text>
    <xsl:apply-templates select="node()"/>
  </h6>
</xsl:template>

5.3 XSLT: toc.xsl versions

The basic toc-raw.xsl has now become toc.xsl also having markup creating the links from TOC items to headings. This is all we need if we want to use CSS "counters" (see later) to generate the numbers for the TOC items and for the headings client-side. In toc-plus.xsl, we use XSLT also to generate the numbers server-side.

6. XSLT: identity.xsl

Some of us make XHTML web pages by hand or generate them from relational databases or from some XML input document like WordprocessingML, ODF, etc. In this tutorial we use XHTML both as input document and as output document.

This is actually how I work. I transform WordprocessingML to what I call server-side XHTML or my XHTML data store. This XHTML has no navigation menu, no TOC, no footnote section, no numbering. A second XSLT stylesheet generates my XHTML for presentation, adding navigation, TOC, footnote section, etc.

The following XSLT stylesheet, identity.xsl, contains the "identity" template and a template to overrule the identity template at the place where the TOC is to be inserted. The templates to generate the TOC and the links from TOC to headings, and the numbering of TOC and headings, are in the included XSLT stylesheet, toc.xsl. In exactly the same way other templates can be included to create a navigation menu, a footnote section, et cetera.

<?xml version="1.0"?><!-- identity.xsl -->
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns="http://www.w3.org/1999/xhtml"
  xmlns:xhtml="http://www.w3.org/1999/xhtml"
  exclude-result-prefixes="xhtml"> <!-- see footnote [10] -->

<xsl:strip-space elements="*"/> <!-- see footnote [11] -->

<xsl:include href="toc.xsl"/>

<xsl:template match="/">
  <xsl:result-document method="xml" href="myxhtml.html"
    doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
    doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
    indent="yes">
    <xsl:apply-templates select="node()"/>
  </xsl:result-document>
</xsl:template>

<!-- Identity template, see footnote [12] -->
<xsl:template match="@*|node()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

<!-- This template overrules the identity template at a particular place -->
<xsl:template match="xhtml:p[1]">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
  <xsl:call-template name="toc"/>
</xsl:template>

</xsl:stylesheet>

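The last template overrules the identity template at the place where the TOC is to be inserted. What to look for depends on the input document. Note that the pattern xhtml:p[1] matches the first p child of every parent element, so it only inserts a single TOC if all paragraphs share the same parent. We must find some paragraph like the first one if it is the "summary" paragraph, or the paragraph having an id attribute with the value "summary":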

match="xhtml:p[@id = 'summary']"

We could also look for some subheading like match="xhtml:h2[text() = 'Abstract']" and insert the TOC before it or after it. It is the method that is important. Locate the node, restart the copying of the identity template and insert the TOC or the other way round. For a huge TOC we might also want to create a heading for the TOC just before we call the TOC template: <h2>Table of Contents</h2>.
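A sketch of the variant that inserts the TOC before an "Abstract" subheading could look like this. It is a fragment meant to replace the xhtml:p[1] template inside identity.xsl, so it relies on the namespace declarations and the "toc" template already in place there. The use of normalize-space() and of xsl:next-match is my assumption about how to make it robust, not something taken from the final stylesheets.

```xml
<!-- Sketch: goes into identity.xsl in place of the xhtml:p[1] template -->
<xsl:template match="xhtml:h2[normalize-space(.) = 'Abstract']">
  <!-- Optional heading for the TOC itself, then the TOC -->
  <h2>Table of Contents</h2>
  <xsl:call-template name="toc"/>
  <!-- Hand the Abstract heading on to the next best template:
       the h2 template of toc.xsl if present, else the identity template -->
  <xsl:next-match/>
</xsl:template>
```

Because the pattern has a predicate, it gets a higher default priority than a plain xhtml:h2 pattern, so it wins for this one heading, and xsl:next-match lets the ordinary processing of the heading continue afterwards. To insert the TOC after the heading instead, the xsl:next-match instruction would be moved to the top of the template.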

7. Styling XHTML with CSS

When we use XSLT to generate the TOC, the links from the TOC to the section heading elements (h2-h6), and the numbers for both the TOC and the headings, we still need CSS for styling in general. Below is the beginning of the generated TOC markup:

<ul id="toc" class="toc">
  <li>
    <span class="ac">
      <span>1.</span>
      <a href="#s1.">Nested list elements</a>
    </span>
  </li>
  <li>
    <span>
      <span>2.</span>
      <a href="#s2.">Using XSLT and CSS</a>
    </span>
  </li>
  <li>
    <span class="ac">
      <span>3.</span>
      <a href="#s3.">Generating the TOC</a>
    </span>
    <ul class="toc">
      <li>
        <span>
          <span>3.1</span>
          <a href="#s3.1">For-each-group</a>
        </span>
      </li>
    </ul>
  </li>
<!-- et cetera -->

7.1 Styling ol/ul elements

When we generate the numbers for the TOC with XSLT we must get rid of the default numbers or bullets used by the ol/ul elements. Also, depending on how we have styled the ol/ul elements in general, we must make sure that margin-top and margin-bottom are 0em, so that the first li elements of nested lists are presented like the rest of the li elements. In the following, ul is used as an example:

ul.toc{list-style-type:none; margin-top: 0em; margin-bottom: 0em}

In the source code of section "7" we have both an id="toc" and a class="toc" in the first ul element. The id can be used to style the TOC's ul container differently from other ul elements of the TOC and the class="toc" makes it possible to style all ul elements of the TOC the same for most properties but differently from ul elements not part of the TOC.

7.2 Alternating background-color

The content of each li element is inside a span element to make it possible to use alternating background-colors for easier reading and to make the TOC more distinct if underlining of the links is not used. For alternating background-color I use #F5F5F5, white smoke.

7.3 Styling the list numbers

The numbers we generate are also inside span elements in order to style them differently from the rest of the content of the li element. I prefer to use the same font for the number but to make it smaller than the proper content. We don't need a CSS class attribute for the second span. In the CSS stylesheet we can get to it like this:

.toc li span span{font-size: 80%}

When we generate the numbers with XSLT we must remember to generate them in such a way that we get a non breaking space (&#160;) in front of numbers lower than 10 in order to get all numbers lower than 100 aligned to the right (see section 5.1). I only find this relevant for the h2 level.

7.4 Styling the links

We don't need a class inside the a elements of the TOC. In the CSS stylesheet we can get to them like this:

#toc a:link{…}

It is not easy to style the links in the TOC in a way that looks nice. I feel that underlining makes the TOC too dominating and ugly, especially if we have a long TOC with many levels, but it is mostly a matter of taste.

One solution could be only to underline the first level. I have decided to use no underlining at all but that makes the TOC look rather pale and blurred with standard blue color. To make the list items distinct and easy to read I use navy blue and alternating background-color.

Since all the links in the TOC are internal (they point to different sections in the same page), it is meaningless to use a special color for visited links.

7.5 Styling the heading numbers

Most web pages being short don't need numbering in front of the headings. But I recommend numbering of headings for more serious writing even for short articles having only 4-5 h2 headings.

Numbering makes it easier to see at a glance where you are in the document. If you see a heading number saying "4." you know you are four headings down the page. This is a great help in web pages where page numbers don't make sense.

<h2 id="s1."><span>1.</span> Nested list elements</h2>

It is convenient to have the number in a span element in order to style the number differently than the proper content of the heading. We don't need a CSS class for the span element. In the CSS stylesheet we can get to it like this:

h2 span, h3 span, h4 span, h5 span, h6 span{font-size: 80%}

For my web documents like the article you are reading, I use a smaller font-size for the numbers than for the rest of the content. I want to make it easier to distinguish the number from the heading's proper content, and I don't want the number to attract too much attention.

8. CSS counters for lists

The solution so far has used XSLT not only to generate the TOC and the links but also to generate the numbers. The ideal is to let Cascading Style Sheets (CSS) take care of the numbers in order not to pollute our document with redundant information. It gives us nice and clean source code, and makes further processing of XHTML documents easier.

The Opera browser is one of the few that supports CSS "counters" both for the TOC's list elements and as "content" for the heading elements. Firefox 2.0 does not support the latter. Internet Explorer 7 supports CSS "counters" neither for list elements nor for "content" (see test document in a moment).

We could still decide to make several versions of our web pages, some using CSS "counters" and some using server-side techniques like XSLT to make the numbers. Since we cannot test the browsers for objects available in this case, we would also need to set up problematic browser-sniffing techniques.

8.1 CSS can not add markup

Even the day CSS "counters" work well in most browsers, we might decide not to make use of them for one good reason. With CSS we can only add text, not markup. Both for the list elements in the TOC and for headings I prefer that the number is styled differently than the content. I prefer to make it clear visually that the number is distinct from the content.

For the numbers I use a smaller font-size. You might have other requirements. The bottom line is, that if we want to style the numbers differently than the proper content, we must wrap them in span elements. We can't do that with CSS.

8.2 No period for first level

The CSS rules below are what we need to number the TOC. Notice the space in quotes in the middle of the last line to get a space between the period and the rest of the content.

ol {counter-reset: item}
li {display: block}
li:before {content: counters(item, ".") " "; counter-increment: item}

The "item" in the above CSS is just a name. You can make up another one if you like.

But the CSS above does not give most of us what we want. In browsers supporting the above CSS our XHTML will look like this:

1 Heading
  1.1 Subheading
  1.2 Subheading
2 Heading
3 Heading
  3.1 Subheading
  3.2 Subheading

Note that a period after the first level is missing. "1" is pronounced "one", "1." is pronounced "first". In lists we have "first", "second" and "third" item. "One", "two" and "three" are simply not grammatically correct.

8.3 Extra CSS rule to get period

We need to add one more rule to our CSS in order to have a period also after the first level. I have made up a "first" CSS class to be used in all list elements of first TOC level. It will give us the output we want, but the list elements of first level (h2) must now use the class attribute, <li class="first">:

li.first:before{content: counters(item, ".") ". " ; counter-increment: item}

8.4 Browser support

If CSS is not supported the fallback is the default numbers or bullets of XHTML, which could be considered good enough. But many browsers like IE7 support "list-style-type: none" but not CSS "counters" for list elements. Here fallback is a TOC with default numbering or bullets removed but no new numbers.

Considering that a nice looking TOC is important for a first good impression of a web page, it is almost impossible to start using CSS "counters" before they are widely supported. We would need too much and too dubious browser-sniffing to serve nice documents to most browsers.

I have made a test document, toctest.html, using only XSLT to generate the TOC (for simplicity the links have not been generated). The CSS rules discussed above and in the next section are included in the head section to generate the numbers for the TOC and for the headings.

9. CSS counters for content

For long articles, documentation, reports, etc., not only the TOC but also the headings made with h2-h6 markup could make good use of numbers like: "3.2.2", like in this article. Such numbers can also be made with CSS rules using "counters":

h2:before{content: counter(h2) ". "; counter-reset: h3; counter-increment: h2}
h3:before{content: counter(h2) "." counter(h3) " "; counter-reset: h4; counter-increment: h3}
h4:before{content: counter(h2) "." counter(h3) "." counter(h4) " "; counter-reset: h5; counter-increment: h4}
h5:before{content: counter(h2) "." counter(h3) "." counter(h4) "." counter(h5) " "; counter-reset: h6; counter-increment: h5}
h6:before{content: counter(h2) "." counter(h3) "." counter(h4) "." counter(h5) "." counter(h6) " "; counter-increment:h6}

Note that "h2" in "counter(h2)" and "h3" in "counter(h3)" are just names. In the CSS spec and in most tutorials the names in a similar example would probably be "chapter", "section", "subsection". I use the heading names for brevity, and it makes sense that a counter for the h2 level is called "h2".

9.1 Browser support

The above CSS for "content" is even less supported by browsers than CSS "counters" for list elements. It works in Opera but not in Firefox 2.0, and not in Internet Explorer 7, just to give you an impression (see the test document above).

When we are able to use CSS all the way, the source code can be made very tidy and the XHTML document will be much easier to process if it is one day needed. Someone might want to split a big XHTML document into its sections, or merge sections marked up in individual documents into one document, or to transform your XHTML document into another format.

10. Using a W3C spec for testing

We have come a long, long way. Let us recapitulate a few guidelines for a TOC:

  1. Nested ordered list elements are the natural choice for a TOC.
  2. CSS counters should be used when supported by browsers.
  3. If numbers are made as content, unordered lists are recommended.
  4. Most TOCs should also have a period after the first level.
  5. The item number should not be included in the link.
  6. The number should also be used for the fragment identifier.
  7. The TOC must overcome the IE focus bug.

For a final test of the TOC solution in this tutorial, I have selected the W3C XSLT 2.0 Recommendation. It is published as valid XHTML, which is easy to manipulate with standard XML tools. Let us give it my TOC. Some modifications of our templates were needed. Firstly, the spec has three h2 headings at the beginning of the document that should not be included in the TOC. Secondly, the spec also has appendices.

I have added a colon to the first level of appendices to make them easier to read and to understand in screen readers: "A: References" is better than "A References". I have also added conditional comments with a CSS hack to overcome the IE focus bug.

It is a matter of taste whether the new TOC for the spec, xslt20spec-new.html, looks better. We are so used to the old one. But we get nested list elements and a period also after the first level, and this is one of the very few W3C specs that also works for keyboard users of Internet Explorer.

Footnotes

[1]

Jim Thatcher, Skip Navigation Links (2005) and WebAim.org, "Skip navigation" links are good introductions to the IE focus bug. For more advanced discussions see Gez Lemon, Keyboard Navigation and Internet Explorer (2005) and the excellent article by Ingo Chao, On having layout (2006).

[2]

Not only IE but also other browsers might have similar bugs for other reasons. My Opera 9.1 does not jump to the first link after the anchor when tabbing with the "a" letter but to one of the following links!

[3]

Until Microsoft gets its act together, a complete elimination of the IE focus bug using CSS hacks would require that each anchor linked to is inside its own container element, not shared with another anchor. My tests actually indicate that nesting is not necessary. Any div element before the anchor (also an empty element like <div style="height:0"/>) would work.

[4]

In the month of May 2007 my articles and tutorials are examples of documents that have not overcome the IE focus bug. I have two excuses. Firstly, my documents have so few internal links except for TOC and footnotes, that navigation will not be that difficult for keyboard users of Internet Explorer. Secondly, my documents would need a new structure with div elements all over the place. Somehow a little drastic to overcome a bug.

[5]

The first template is named "toc" so we can call it at the right moment from the stylesheet containing the identity template. The rest of the templates match the headings (h1-h6) but have a mode="toc" attribute to make sure that they are only fired when the TOC is generated.

[6]

Note that the select uses the descendant shorthand, "//", to get to all the headings of the XHTML document no matter how the markup is made. The headings can be children of the body element or nested inside div elements, etc.

[7]

The template for h1 is a little tricky since we only have one h1 element. Basically the template only generates the ol/ul container for the TOC if it finds an h2. When you understand the next template go back to this one.

[8]

The template says: "For each h2 element generate a li element and copy (xsl:value-of) the content of the h2 element over. If, after the h2 element, no matter how far from it, we have an h3 element, the first one we find, and if before this h3 element we have an h2 element (we have at least the one we are processing), the first one we find, and if this h2 element is the one we are processing (current-group), then create an ol/ul element, etc."

[9]

<xsl:for-each-group select="current-group() except ." group-starting-with="xhtml:h3"> is a little difficult to understand. The current group starts with an h2, and the content of this group, except the h2 itself, should be subdivided into h3 groups, etc.

[10]

Strangely enough we must declare the XHTML namespace twice. This is the secret behind transforming XHTML to XHTML, see my article: Transform XHTML to XHTML with XSLT.

[11]

The strip-space element is most often necessary in XSLT processors like SAXON. Since many elements of XHTML have mixed content, it might also be necessary to use the preserve-space element if we have whitespace-only text nodes that we don't want to strip as part of such content. See my article: Tricky whitespace handling in XSLT.

[12]

If the identity template is new to you, see my article: Identity Template: xsl:copy with recursion.

Updated 2009-08-06