Difficulty: ★☆☆ (easy)
Keywords: XInclude, modularization

Problem

You need a method to split your document into several “modules” and put them back together afterwards.

Solution

Use XInclude. It is a W3C specification that defines the elements xi:include and xi:fallback. They are not DocBook elements (they are defined by the W3C, not by OASIS); however, they have been integrated into version 5.x[3]. Note that XIncludes work in DocBook regardless of which version (4.x or 5.x) you use.

If you want to use XIncludes, you need these things:

  • An XML parser that supports XIncludes.

  • The XInclude namespace http://www.w3.org/2001/XInclude, usually bound to the xi prefix.

  • The element xi:include. In general, it can be placed anywhere you can place DocBook elements. It is a placeholder for the real content and works as a reference.

  • The attribute href inside the <xi:include> start tag. It is a URI that refers to your included file.

  • The file that is referenced by the href attribute. The content of the file will replace the xi:include element. It usually contains a DocBook element.

The following example shows a book that pulls in its chapters from separate files:

Example 1.3. A Book with XIncluded Chapters
<book version="5.0"
  xmlns="http://docbook.org/ns/docbook"
  xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>...</title>
  <xi:include href="intro.xml"/>
  <xi:include href="conceptual-overview.xml"/>
</book>

The above book contains an introduction (file intro.xml) and a conceptual overview (file conceptual-overview.xml). Both are referenced by the href attribute of an xi:include element.

Before you transform your document, you need to resolve your XIncludes first, either by your XML parser or “manually” by an XSLT transformation. The following procedure shows a typical workflow:

Procedure 1.1. Typical Workflow with XIncludes
  1. Write your document structure; usually it will be a book or an article. Do not forget to declare the XInclude namespace http://www.w3.org/2001/XInclude on the root element, commonly bound to the prefix xi.

  2. Add xi:include elements for the content you want to maintain in separate files. Typically, this can be an appendix, chapter, preface, glossary, or any other DocBook element.

  3. Resolve your XIncludes. Use an XML parser that supports XIncludes, for example, xmllint from the libxml2 library. It provides the --xinclude option to resolve all your xi:include elements in one step:

    xmllint --xinclude --output big.xml book.xml

    The above command resolves all XIncludes and saves the result in the file big.xml. Note that this does not perform any validation; it only replaces each xi:include element with the content of the referenced file. After the XInclude elements are resolved, the file looks like this:

    <book version="5.0"
      xmlns="http://docbook.org/ns/docbook"
      xmlns:xi="http://www.w3.org/2001/XInclude">
      <title>...</title>
      <chapter>
        <title>Introduction</title>
        <para>...</para>
      </chapter>
      <chapter>
        <title>Conceptual Overview</title>
        <para>...</para>
      </chapter>
    </book>

  4. Validate the result (in our example, big.xml) against your DocBook schema.

  5. Transform the result file with your stylesheets into your target format.
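
The resolve step of this workflow can also be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard library module xml.etree.ElementInclude as a stand-in for xmllint, and the FILES dictionary and the loader and resolve functions are hypothetical names that keep the example self-contained instead of reading intro.xml and conceptual-overview.xml from disk.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

DB = "{http://docbook.org/ns/docbook}"
XI = "{http://www.w3.org/2001/XInclude}"

# Hypothetical in-memory "files"; a real project would read them from disk.
FILES = {
    "intro.xml": (
        '<chapter xmlns="http://docbook.org/ns/docbook">'
        "<title>Introduction</title><para>...</para></chapter>"
    ),
    "conceptual-overview.xml": (
        '<chapter xmlns="http://docbook.org/ns/docbook">'
        "<title>Conceptual Overview</title><para>...</para></chapter>"
    ),
}

BOOK = """\
<book version="5.0"
  xmlns="http://docbook.org/ns/docbook"
  xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>...</title>
  <xi:include href="intro.xml"/>
  <xi:include href="conceptual-overview.xml"/>
</book>
"""


def loader(href, parse, encoding=None):
    """Return a parsed element for parse="xml", or the raw text otherwise."""
    data = FILES[href]
    return ET.fromstring(data) if parse == "xml" else data


def resolve(source):
    """Parse the source and replace every xi:include by the referenced content."""
    root = ET.fromstring(source)
    ElementInclude.include(root, loader=loader)
    return root


book = resolve(BOOK)
titles = [chapter.findtext(DB + "title") for chapter in book.findall(DB + "chapter")]
print(titles)
```

As with xmllint --xinclude, nothing is validated here; the sketch only swaps each xi:include element for the root element of the referenced file.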

Discussion

The previous procedure showed a book with XIncluded chapters. You can go deeper and, for example, include a section in a chapter; there is no limit on the nesting depth. Just make sure that you do not create circular references (file A includes file B, and B includes A).
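
To check a set of modules for circular references before processing them, a depth-first walk over the href attributes is usually enough. The following Python sketch is a minimal illustration, not part of any XInclude tool: the file names a.xml, b.xml, and c.xml, the FILES dictionary, and the find_cycle function are all hypothetical, and hrefs are treated as plain keys rather than being resolved relative to the including file.

```python
import xml.etree.ElementTree as ET

XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"

# Hypothetical files: a.xml includes b.xml, and b.xml includes a.xml again.
FILES = {
    "a.xml": '<chapter xmlns:xi="http://www.w3.org/2001/XInclude">'
             '<xi:include href="b.xml"/></chapter>',
    "b.xml": '<section xmlns:xi="http://www.w3.org/2001/XInclude">'
             '<xi:include href="a.xml"/></section>',
    "c.xml": "<section><para>no includes here</para></section>",
}


def find_cycle(href, stack=()):
    """Return the chain of hrefs that forms a cycle, or None if there is none."""
    if href in stack:
        return list(stack[stack.index(href):]) + [href]
    root = ET.fromstring(FILES[href])
    for inc in root.iter(XI_INCLUDE):
        # Only XML includes can nest further; text includes are leaves.
        if inc.get("parse", "xml") != "xml":
            continue
        cycle = find_cycle(inc.get("href"), stack + (href,))
        if cycle:
            return cycle
    return None


print(find_cycle("a.xml"))  # the chain a.xml, b.xml, a.xml
print(find_cycle("c.xml"))  # None
```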

As XIncludes are very common nowadays, resolving xi:include and transforming into the output format can be done in one step. This is the case for most tools:

xsltproc from the libxslt library

Use the --xinclude option as shown:

xsltproc --xinclude STYLESHEET XMLFILE

Saxon 6

Unfortunately, Saxon 6 needs some more configuration. Most Linux distributions already ship a saxon6 command. However, it can be difficult to customize it correctly, so this is the command line you need:

java -Dorg.apache.xerces.xni.parser.XMLParserConfiguration=org.apache.xerces.parsers.XIncludeParserConfiguration \
   -cp JARPATH/saxon6.jar:JARPATH/xml-commons-apis.jar:JARPATH/jaxp_parser_impl.jar:JARPATH/xml-commons-resolver.jar \
   com.icl.saxon.StyleSheet \
   -x org.apache.xml.resolver.tools.ResolvingXMLReader \
   -y org.apache.xml.resolver.tools.ResolvingXMLReader \
   -r org.apache.xml.resolver.tools.CatalogResolver \
   ARGS

The command consists of the following parts:

  • The org.apache.xerces.xni.parser.XMLParserConfiguration property selects the Xerces parser configuration that resolves XIncludes while parsing.

  • The JARPATH is the path to your JAR files. In most FHS-conformant Linux distributions, this is usually /usr/share/java.

  • Additionally, with the xml-commons-resolver.jar file, Saxon 6 is able to resolve catalogs. To enable this, set the -r option to the URI resolver class, and the -x and -y options to the SAX parser class for the source file and the stylesheet, respectively.

  • ARGS are the specific arguments for Saxon and contain source document and stylesheet. To list all available options, use -h.

Saxon 9

Version 9 contains the -xi option to resolve XIncludes (assuming you have a script saxon9 that does all the heavy Java lifting):

saxon9 -xi -xsl:STYLESHEET -s:XMLFILE

The previous section showed a general method for working with XIncludes. In most cases this is enough. However, XInclude offers more features, which are covered in the following subsections.

Fallbacks

If the file referenced by the xi:include element is not available, the XInclude step will fail. How can you avoid that? The XInclude specification also defines the xi:fallback element. This element supplies replacement content for the case that a referenced resource cannot be retrieved:

Example 1.4. Fallback Possibility with xi:fallback
<xi:include href="revhistory.xml">
  <xi:fallback>
    <para>The revision history could not be retrieved.</para>
  </xi:fallback>
</xi:include>

The previous code does the following: when the xi:include element is processed, the XML parser tries to include the file revhistory.xml. If the file cannot be retrieved, the XML parser falls back to the xi:fallback element and includes its contents instead. In the above case, it includes a para element that reports the failed attempt.

This method is useful when you want to process files that might not always be available. For example, the revision history above needs to be generated first. However, it is not certain that it can always be generated, for example when the version control system is offline.
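
The fallback logic itself is simple enough to sketch by hand. The following Python example is a simplified illustration of the behavior described above, not the implementation of any particular XInclude processor: the FILES dictionary and the resolve_with_fallback function are hypothetical, revhistory.xml is deliberately missing, and the sketch ignores text-only fallback content and the whitespace that follows the xi:include element.

```python
import xml.etree.ElementTree as ET

XI = "{http://www.w3.org/2001/XInclude}"

# Hypothetical store of available files; revhistory.xml is deliberately missing.
FILES = {}

SOURCE = """\
<info xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="revhistory.xml">
    <xi:fallback>
      <para>The revision history could not be retrieved.</para>
    </xi:fallback>
  </xi:include>
</info>
"""


def resolve_with_fallback(parent):
    """Replace each xi:include by its target or, failing that, its fallback."""
    new_children = []
    for child in list(parent):
        if child.tag != XI + "include":
            resolve_with_fallback(child)  # descend into ordinary elements
            new_children.append(child)
            continue
        href = child.get("href")
        if href in FILES:
            new_children.append(ET.fromstring(FILES[href]))
            continue
        fallback = child.find(XI + "fallback")
        if fallback is None:
            # Without a fallback, a missing resource is an error.
            raise FileNotFoundError(href)
        new_children.extend(list(fallback))  # the fallback's child elements
    parent[:] = new_children


info = ET.fromstring(SOURCE)
resolve_with_fallback(info)
print([child.tag for child in info])
print(info[0].text)
```

If the xi:fallback element were removed from SOURCE, the same call would raise an error, which corresponds to the failing XInclude step mentioned at the start of this subsection.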

Including Text

The previous examples dealt with included resources in XML only. If you need to include text, this can also be done with XInclude.

The most common use-case is including source code that is maintained separately. The following example points to C source code that needs to be included as text:

Example 1.5. Included Text in a Programlisting
<programlisting language="c"><xi:include
   parse="text"
   href="parser.c"/></programlisting>

The important line is parse="text". It instructs the XInclude processor to handle the referenced file as text and not as XML. The default value for parse is xml.

It is recommended to remove any whitespace inside programlisting, as shown above, to avoid spurious indentation or line breaks.
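
The effect of parse="text" can be observed with a small Python sketch. It assumes the standard library module xml.etree.ElementInclude as the XInclude processor; the content of parser.c (PARSER_C) and the loader function are made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical content of parser.c; it contains characters special to XML.
PARSER_C = "int main(void) {\n  return 1 < 2 && 3 > 2;\n}\n"

LISTING = (
    '<programlisting language="c" xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include parse="text" href="parser.c"/></programlisting>'
)


def loader(href, parse, encoding=None):
    """Serve parser.c as plain text; anything else is unexpected here."""
    if (href, parse) != ("parser.c", "text"):
        raise OSError("unexpected include: %s (%s)" % (href, parse))
    return PARSER_C


listing = ET.fromstring(LISTING)
ElementInclude.include(listing, loader=loader)

# The file content is now the text of programlisting, not a child element.
print(listing.text == PARSER_C)
# On serialization, the special characters are escaped automatically.
serialized = ET.tostring(listing, encoding="unicode")
print(serialized)
```

Because the included text is appended to whatever text already precedes the xi:include element, a line break or indentation between the programlisting start tag and xi:include would end up at the beginning of the listing. This is why the example keeps both on the same line.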

More explanations can be found in Section 1.8, “Incorporating External Files in Code Listings”.


[3] To use XInclude with DocBook 5.x, use the docbookxi.rnc RELAX NG schema.
