The Entity Manager
******************
SGML can refer to an external file (really an entity) with an _external
identifier_, which is a _public identifier_, a _system identifier_, or
both.
A typical public identifier looks like
PUBLIC "ISO 8879:1986//ENTITIES Added Latin 1//EN"
where "ISO 8879:1986" is the owner, "ENTITIES" is the text class,
"Added Latin 1" is the text description, and "EN" is the language.
A system identifier looks like
SYSTEM "htmlplus.dtd"
where "htmlplus.dtd" is a system-specific identifier.
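As a rough illustration of how the parts of the public identifier shown
above relate to each other, the following Python sketch splits it on its
`//' separators. The names `PublicId' and `split_public_id' are
hypothetical and not part of PSGML, and the sketch assumes the simple
four-part form shown here; it does not handle owner prefixes such as
`-//' or `+//'.

```python
from typing import NamedTuple


class PublicId(NamedTuple):
    owner: str
    text_class: str
    description: str
    language: str


def split_public_id(pubid: str) -> PublicId:
    """Split a public identifier of the form OWNER//CLASS DESCRIPTION//LANG."""
    # Everything before the first `//' is the owner.
    owner, rest = pubid.split("//", 1)
    # Everything after the last `//' is the language.
    text_part, language = rest.rsplit("//", 1)
    # The text class is the first word; the remainder is the description.
    text_class, description = text_part.split(" ", 1)
    return PublicId(owner, text_class, description, language)


parts = split_public_id("ISO 8879:1986//ENTITIES Added Latin 1//EN")
print(parts)
```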
To map external identifiers to file names, PSGML first searches
entity catalog files and then searches the list of file name templates
in the variable `sgml-public-map'.
The catalog format follows SGML Open's resolution on entity
management. The catalog consists of a series of entries and comments.
A comment is delimited by `--', as in a markup declaration. The entry
types recognized are described in the following table.
`public PUBID FILE'
The FILE will be used for the entity text of an entity with the
public identifier PUBID.
`entity NAME FILE'
The FILE will be used for the entity text of an entity with the
name NAME. If the NAME starts with a `%' the rest of the name
will be matched against parameter entities.
`doctype NAME FILE'
The FILE will be used for the entity text of an entity used as
the external subset of a document type declaration with NAME as
the document type name.
`sgmldecl FILE'
Used to specify a default SGML declaration. Recognized but not
used by PSGML other than to pass to an external validation command
(`sgml-validate-command').
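To make the entry types in the table concrete, here is a minimal Python
sketch that reads catalog text into a list of entries. It is not PSGML's
parser: the function name `parse_catalog', the file names in the sample
(other than `htmlplus.dtd'), and the choice to match keywords without
regard to case are all assumptions made for illustration.

```python
import re

# A token is a `--'-delimited comment, a quoted literal, or a bare word.
TOKEN = re.compile(
    r"""(?P<comment>--.*?--)|"(?P<dq>[^"]*)"|'(?P<sq>[^']*)'|(?P<word>\S+)""",
    re.DOTALL,
)

# Number of arguments taken by each entry type in the table above.
ARITY = {"public": 2, "entity": 2, "doctype": 2, "sgmldecl": 1}


def parse_catalog(text):
    """Return a list of (keyword, arg, ...) tuples found in TEXT."""
    tokens = []
    for match in TOKEN.finditer(text):
        if match.group("comment") is not None:
            continue  # comments carry no entries
        for group in ("dq", "sq", "word"):
            if match.group(group) is not None:
                tokens.append(match.group(group))
                break

    entries = []
    i = 0
    while i < len(tokens):
        keyword = tokens[i].lower()  # assumption: keywords ignore case
        arity = ARITY.get(keyword)
        if arity is None:
            i += 1  # unrecognized token: skip it
            continue
        args = tokens[i + 1 : i + 1 + arity]
        if len(args) < arity:
            break  # truncated entry at end of input
        entries.append((keyword, *args))
        i += 1 + arity
    return entries


SAMPLE = """
-- character entity sets --
public "ISO 8879:1986//ENTITIES Added Latin 1//EN" iso-lat1.ent
entity %local local.ent
doctype html "htmlplus.dtd"
sgmldecl html.decl
"""

entries = parse_catalog(SAMPLE)
for entry in entries:
    print(entry)
```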
When PSGML is looking for the file containing an external entity, the
following things will be tried in order:
1. Try the system identifier, as a file name, if there is a system
identifier and the variable `sgml-system-identifiers-are-preferred'
is non-`nil' and there are no elements containing `%s' in
`sgml-public-map'. If the system identifier is a relative file
name it will be relative to the directory containing the defining
entity.
2. Look through each catalog in `sgml-local-catalogs' and
`sgml-catalog-files' in order. For each catalog look first for
entries matching the public identifier, if any. Then look for
other matching entries in the order they appear in the catalog.
Currently a matching entry is ignored if its file does not exist
or is unreadable. (This is under reconsideration; perhaps it
should signal an error instead.)
3. Try the system identifier, if any, as a file name, if
`sgml-system-identifiers-are-preferred' is `nil' and there are no
elements containing `%s' in `sgml-public-map'.
4. Try the entries in `sgml-public-map'. Using the catalogs is
preferred. The `sgml-public-map' may disappear in a future version
of PSGML (not soon though).
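The four steps above can be sketched in Python as follows. This is a
simplified model, not PSGML's code: the function `resolve_entity' and its
parameters are hypothetical, catalogs are assumed to be lists of
`(keyword, key, file)' tuples, only `public' and `entity' entries are
considered, and relative system identifiers are not resolved against the
defining entity's directory.

```python
import os


def resolve_entity(sysid, pubid, name, catalogs, public_map,
                   sysid_preferred, expand, readable=os.path.isfile):
    """Return the first readable candidate file name, or None.

    catalogs   -- list of catalogs, each a list of (keyword, key, file)
    public_map -- list of file name templates
    expand     -- callable turning one template into a file name
    readable   -- predicate telling whether a file can be used
    """
    # The system identifier is tried directly only when no template
    # in the public map mentions `%s'.
    sysid_usable = sysid is not None and not any("%s" in t for t in public_map)

    # Step 1: the system identifier first, when it is preferred.
    if sysid_usable and sysid_preferred and readable(sysid):
        return sysid

    # Step 2: each catalog in order; entries matching the public
    # identifier first, then other matches in catalog order.
    for catalog in catalogs:
        by_pubid = [f for kw, key, f in catalog
                    if kw == "public" and pubid is not None and key == pubid]
        by_name = [f for kw, key, f in catalog
                   if kw == "entity" and key == name]
        for candidate in by_pubid + by_name:
            if readable(candidate):  # unreadable matches are ignored
                return candidate

    # Step 3: the system identifier, when it is not preferred.
    if sysid_usable and not sysid_preferred and readable(sysid):
        return sysid

    # Step 4: the templates of the public map, in order.
    for template in public_map:
        candidate = expand(template)
        if readable(candidate):
            return candidate
    return None


# Demonstration with a pretend file system instead of real files.
files = {"cat/iso-lat1.ent", "htmlplus.dtd"}
catalogs = [
    [("public", "PUB", "missing.ent")],
    [("public", "PUB", "cat/iso-lat1.ent")],
]
found = resolve_entity("htmlplus.dtd", "PUB", "x", catalogs, [],
                       False, str, files.__contains__)
print(found)
```

Passing `readable' as a parameter keeps the sketch independent of the
real file system, which makes the ordering easy to experiment with.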
The `sgml-public-map' variable can contain a list of file name
templates where `%P' will be substituted with the whole public
identifier, owner is substituted for `%O', public text class for `%C',
and public text description for `%D'. The text class will be converted
to lower case and the owner and description will be transliterated
according to the variable `sgml-public-transliterations'. The
templates in the list are tried in order until an existing file is
found. The `sgml-public-map' is modeled after the `sgmls' environment
variable `SGML_PATH', and PSGML understands the following substitution
characters: %%, %N, %P, %S, %Y, %C, %L, %O, %T, and %V. The default
value of `sgml-public-map' is taken from the environment variable
`SGML_PATH'.
Given the public identifier above and the file name template
`/usr/local/lib/sgml/%o/%c/%d', the resulting file name is
/usr/local/lib/sgml/ISO_8879:1986/entities/Added_Latin_1
Note: blanks are transliterated to `_' (and also `/' to `%') and the
text class is converted to lower case.
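The substitution described above can be approximated with a short Python
sketch. It is only an illustration: `expand_template' and `transliterate'
are hypothetical names, the transliteration is hard-coded to the two
replacements mentioned in the note rather than read from
`sgml-public-transliterations', only `%P', `%O', `%C', `%D' and `%%' are
handled, and accepting the substitution letters in either case is an
assumption based on this section using both spellings.

```python
def transliterate(text):
    """Replace blanks with `_' and `/' with `%'."""
    return text.replace(" ", "_").replace("/", "%")


def expand_template(template, pubid):
    """Expand a file name template for a public identifier."""
    owner, rest = pubid.split("//", 1)
    text_part = rest.split("//", 1)[0]
    text_class, description = text_part.split(" ", 1)
    values = {
        "p": pubid,                       # whole public identifier
        "o": transliterate(owner),        # owner
        "c": text_class.lower(),          # text class, lower case
        "d": transliterate(description),  # text description
        "%": "%",                         # literal percent sign
    }
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch == "%" and i + 1 < len(template):
            key = template[i + 1].lower()
            # Codes this sketch does not know are copied through unchanged.
            out.append(values.get(key, template[i:i + 2]))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


result = expand_template(
    "/usr/local/lib/sgml/%o/%c/%d",
    "ISO 8879:1986//ENTITIES Added Latin 1//EN",
)
print(result)
```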
- User Option: sgml-catalog-files
This is a list of catalog entry files. The files are in the
format defined in the SGML Open Draft Technical Resolution on
Entity Management. The Emacs variable is initialized from the
environment variable `SGML_CATALOG_FILES' or, if that variable is
undefined, the default is
("CATALOG" "/usr/local/lib/sgml/CATALOG")
- User Option: sgml-local-catalogs
A list of SGML entity catalogs to be searched first when parsing
the buffer. This is used in addition to `sgml-catalog-files' and
`sgml-public-map'. This variable is automatically local to the
buffer.
- User Option: sgml-system-identifiers-are-preferred
If `nil', PSGML will look up external entities by searching the
catalogs in `sgml-local-catalogs' and `sgml-catalog-files' and
only if the entity is not found in the catalogs will a given system
identifier be used. If the variable is non-`nil' and a system
identifier is given, the system identifier will be used for the
entity. If no system identifier is given the catalogs will be
searched.
- User Option: sgml-public-map
This should be a list of file name templates. This variable is
initialized from the environment variable `SGML_PATH'. This is
the same environment variable that `sgmls' uses. If the
environment variable is undefined the default is
("%S" "/usr/local/lib/sgml/%o/%c/%d")