SP - Features Summary

SP

A free, object-oriented toolkit for SGML parsing and entity management

Features summary

  • Includes nsgmls
    • Compatible with sgmls
    • Also generates RAST (ISO/IEC 13673)
  • Provides access to all information about an SGML document
    • Access to DTD and SGML declaration as well as document instance
    • Access to markup as well as abstract document
    • Sufficient to recreate a character-for-character identical copy of any SGML document
  • Supports almost all optional SGML features
    • Arbitrary concrete syntaxes
    • SHORTTAG, OMITTAG, RANK
    • SUBDOC
    • LINK (SIMPLE, IMPLICIT and EXPLICIT)
    • Only DATATAG and CONCUR are not supported
  • Sophisticated entity manager
    • Supports ISO/IEC 10744 Formal System Identifiers
    • Supports SGML Open catalogs
    • Supports WWW (entities can be retrieved by URL)
    • Can be used independently of parser
  • Supports multi-byte character sets
    • Parser can use 16-bit characters internally
    • 16-bit characters can be used in tag names and other markup
    • Supports ISO/IEC 10646 (Unicode) using both UCS-2 and UTF-8
    • Supports Japanese character sets (Shift-JIS, EUC)
  • Object-oriented
  • Written in C++ from scratch
    • Not a modified version of a parser originally written in C
    • Reentrant
    • Sophisticated architecture
  • Fast
    • Up to twice as fast as sgmls on large documents
  • Portable
    • All major Unix variants
    • MS-DOS
    • Win32: Windows 95/Windows NT
    • OS/2
  • Production quality
    • Version 1.0 recently released, after a year of test releases
    • Tested using several SGML test suites
    • Already used in several new commercial products
    • Written by James Clark, previously responsible for turning arcsgml into sgmls
  • Free
    • Includes source code
    • No restrictions on commercial use
  • Disadvantages
    • Programmer-level documentation covers only the generic API, not the native API

James Clark
jjc@jclark.com