Building Variant DTDs Based on DocBook

Chapter 1. Building Variant DTDs Based on DocBook

DocBook is "cannibalized" as often as it is used in its original form. It has a modular structure and uses parameter entities to ease the process of using the desired parts of DocBook and modifying them as necessary. This book explains how to create variants of DocBook using its modules and parameter entities.

There are two main methods for building DocBook variants: you can start with both of the standard main modules and customize them, or you can reuse one of the modules in building a new DTD. You can also combine the two methods, for example, by using just one module and customizing it. Regardless of the techniques you use, building a variant DTD requires careful planning. Keep the following guidelines in mind:

  • Build variants only by redefining the original entities, if possible; don't edit any of the original modules except for the driver file.

  • Favor markup changes that place tighter validation restrictions on documents (subsets), rather than changes that would allow instances that no longer conform to standard DocBook (extensions).

  • Negotiate all changes with your interchange partners and document not only the substance of the changes, but the reasons for them.

You can make two kinds of changes to DocBook "for free" because they don't fundamentally alter its markup model. You can make these changes in the main DocBook DTD file, in your own driver file, or in an internal subset without your version being considered a variant:

  • Declaring additional data content notations

    DocBook declares many common non-SGML notations. Your environment may need additional notations.

  • Declaring and referencing general entities of any kind (even if the entity declarations are stored in an external parameter entity)

    DocBook declares and references all 19 of the ISO 8879 annex entity sets. You may need additional character entities and/or may need to declare general entities that can be used as boilerplate text. If you remove any declarations of ISO entities, your DTD will be considered a variant of DocBook because valid DocBook documents that use any of the missing entities would be invalid under your DTD.

Even though you can make these changes at will, you should still document them in your answers to the interchange checklist (provided in the Overview) when you share your files.
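As a rough sketch of the two kinds of "free" changes, the fragment below declares one additional notation and two general entities. The notation name, the entity names, the replacement text, and the file name are all hypothetical, chosen only for illustration; none of them comes from DocBook itself.

```sgml
<!-- Hypothetical additional data content notation -->
<!NOTATION PNG SYSTEM "PNG">

<!-- Hypothetical boilerplate text entity, referenced in a
     document as &prodname; -->
<!ENTITY prodname "Widget Builder">

<!-- Hypothetical external parameter entity that stores further
     general entity declarations in a separate file -->
<!ENTITY % boilerplate SYSTEM "boilerplate.ent">
%boilerplate;
```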

All other changes, no matter how small, should result in the use of a formal public identifier (if you use one) that is different from that of the original module or driver file. You should change both the owner identifier and the description. The original DocBook formal public identifiers use the following syntax (note that the owner identifier has changed from "HaL and O'Reilly" to "Davenport"):

-//Davenport//{DTD|ELEMENTS} DocBook description version//EN

Your own formal public identifiers should use the following syntax in order to record their DocBook derivation:

-//your-owner-ID//{DTD|ELEMENTS} DocBook Vn.n[.n]-Based [Subset|Extension|Variant] your-descrip-and-version//lang

For example:

"-//DocTools//DTD DocBook V2.3-Based Subset V1.1//EN"

If your DTD is a proper subset, you can advertise this status by using the "Subset" keyword in the description. If your DTD contains any markup model extensions, you can advertise this status by using the "Extension" keyword. If you'd rather not characterize your variant specifically as a subset or an extension, you can leave out this field entirely, or, if you prefer, use the "Variant" keyword.

Setting Up DocBook for Customization

To customize DocBook by starting from the standard version:

  1. Decide whether you will store the customizations in a .dtd driver file that will function as "the DTD" for your purposes, or in an internal subset (between the square brackets in a DOCTYPE declaration directly inside a document). Most likely, you will want to create a new driver file so that you can easily use your variant DTD with many documents. (Note that SGML-aware applications that compile DTDs usually don't handle markup customizations in internal subsets.) Your set of customizations can be thought of as a "customization layer."

  2. From within your customization layer, reference as parameter entities either the original DocBook driver file or both of the main modules, depending on your customization needs. The original driver file has been set up to be usable for nearly all kinds of customization, but you may need or want to construct your own driver file from scratch. Keep in mind that creating your own driver file without including the notations and ISO entity sets found in the original driver file will result in a DocBook subset in this respect.

  3. Add your customizations around the references to the original DocBook material.

The main method for customizing DocBook elements and attributes is to override existing parameter entities: you declare a parameter entity with the same name as one that is declared later in the linear flow of the DTD. Because a parser uses the first declaration of an entity that it encounters and ignores any later ones, your declaration is the one that takes effect. The redeclarations go in your driver file or internal subset in the appropriate locations to override the original values. With certain redeclarations in place, you might also declare new element types and attribute lists and supply replacement declarations for existing element types and attribute lists.
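The overriding mechanism can be illustrated with a small fragment. The entity name demo.mix and the element type Demo are hypothetical and do not exist in DocBook; they only show how an earlier declaration shadows a later one with the same name.

```sgml
<!-- Customization layer, read first: this declaration takes effect -->
<!ENTITY % demo.mix "#PCDATA | Emphasis">

<!-- Original module, read later: the parser ignores this second
     declaration because demo.mix is already defined -->
<!ENTITY % demo.mix "#PCDATA | Emphasis | Link">

<!-- The content model therefore expands to (#PCDATA | Emphasis)* -->
<!ELEMENT Demo - - (%demo.mix;)*>
```

In a real customization, only the first declaration would appear in your own file; the other two would sit untouched in the original module.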

Following are templates for DocBook customizations, starting with the most common and desirable. Example 1-1 shows a customization template using a reference to the original driver file from a higher-level driver file (which might be named myvariant.dtd).
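A higher-level driver file of this kind might be laid out roughly as follows. This is a sketch, not the original template: the entity name DocBookDTD is an assumption, and the version string in the public identifier is borrowed from the example identifier shown earlier, so replace it with the identifier shipped with your copy of DocBook.

```sgml
<!-- myvariant.dtd: hypothetical higher-level driver file -->

<!-- 1. Overriding parameter entity declarations go here,
        before the reference to the original DTD -->

<!-- 2. Reference to the original DocBook driver file
        (entity name and version string are assumptions) -->
<!ENTITY % DocBookDTD PUBLIC "-//Davenport//DTD DocBook V2.3//EN">
%DocBookDTD;

<!-- 3. New or replacement element and attribute list
        declarations go here, after the reference -->
```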

Example 1-2 shows a customization template using direct references to the original modules from a higher-level driver file. You will need to supply your own notation declarations and references to ISO entity sets in this scenario; leaving out any that DocBook itself supplies will create a subset in this respect.
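A driver file that references the modules directly might look something like the sketch below. The entity names dbpool and dbhier, the module descriptions in the public identifiers, the version string, and the PNG notation are assumptions made for illustration; check them against the modules you actually have. Only one ISO entity set is shown, and a driver that omits the others would be a subset in that respect.

```sgml
<!-- Hypothetical driver file that references the two main
     modules directly -->

<!-- Notation declarations and ISO entity sets that you
     supply yourself -->
<!NOTATION PNG SYSTEM "PNG">
<!ENTITY % ISOnum PUBLIC
  "ISO 8879:1986//ENTITIES Numeric and Special Graphic//EN">
%ISOnum;

<!-- Overriding parameter entity declarations go here -->

<!-- References to the original modules (entity names and
     public identifiers are assumptions) -->
<!ENTITY % dbpool PUBLIC
  "-//Davenport//ELEMENTS DocBook Information Pool V2.3//EN">
%dbpool;
<!ENTITY % dbhier PUBLIC
  "-//Davenport//ELEMENTS DocBook Document Hierarchy V2.3//EN">
%dbhier;

<!-- New or replacement element and attribute list
     declarations go here -->
```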

Example 1-3 shows a customization template using a reference to the original driver file from an internal subset. Note that parsers read the declarations in the subset before the declarations in the remote portion (the original DTD), so that the local declarations take precedence over whatever is in the original file. Note also that it is generally considered bad practice (even if supported in your software) to change the markup model of a DTD "on the fly" like this. If you must make any customizations that rely on declarations positioned after the original DTD file (such as new element declarations that contain references to entities defined earlier), you can't use this method.
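An internal-subset customization could be sketched as follows. The version string in the public identifier is an assumption, and demo.mix is a hypothetical entity name standing in for whichever DocBook parameter entity you intend to override.

```sgml
<!DOCTYPE Book PUBLIC "-//Davenport//DTD DocBook V2.3//EN" [
<!-- Overriding parameter entity declarations go here; the parser
     reads them before the external DTD, so they take precedence -->
<!ENTITY % demo.mix "#PCDATA | Emphasis">
]>
```

Nothing in the subset can be positioned after the external DTD, which is why declarations that depend on entities defined in the original file cannot be added this way.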