This document is the official description of the Plucker format.
Overview
The Plucker document format supports a multi-page (in the Web sense of 'page') hyperlinked information structure containing both 'rich' text and images. Only links internal to the document can be followed; external links, in standard URL form, may be included and displayed, but not followed. Images may either be embedded in a text page, as with HTML, or be included as separate stand-alone pages.
Plucker documents are structured so that they can be used both with a file-system-oriented operating system such as Unix or Windows, and with PalmOS, which is not file-system-oriented. To this end, they always begin with a standard PalmOS record database prefix, which consists of four parts: the database header, a record-id list, an AppInfo block, and a SortInfo block. The Plucker format does not use the SortInfo block, which is therefore null and occupies no space in the document prefix.
The record database prefix is then followed by a sequence of application-specific records. In a Plucker document, this sequence consists of one index record, followed by a series of data records. The index record contains information about the data records, along with some global information, such as the type of compression used. Each data record contains either a page, an image, or data about the document, such as bookmarks or URL data.
The format is big-endian; any multi-byte numeric values specified in this document are big-endian. Images are stored in the Palm image format; for more information on this format please consult http://www.palmos.com/dev/tech/docs/.
The database header is a fixed-size structure of 72 bytes. It contains the name of the database, the Plucker version number, various timestamps (creation, modification, last backup), and several flags. All timestamps are given using the PalmOS standard, seconds since 12:00 AM on January 1, 1904.
Field             Bytes  Type       Notes
docName           32     String     Must contain a NUL-terminated 7-bit ASCII string (only character codes 0x20-0x7E are valid) giving the name of the document. Because of the terminating NUL character at end, only 31 bytes can actually be used for the name of the document. The first 26 bytes of this string are used by Plucker as a unique ID for the document; names should be unique in the first 26 characters.
flags             2      Bitfield   Most bits in this field are unused. Unused bits should be set to zero on document creation, but reader software should not expect them to stay at this value. Valid bits are as follows. All numeric values given are big-endian.
                                      Name            Value   Meaning
                                      CopyPrevention  0x0040  Indicates that system should not allow copying of this document.
                                      Launchable      0x0200  Indicates that this document should be presented as a first-class object on desktop renderings. If this bit is set, an AppInfo block must be included.
                                      Backup          0x0008  Indicates that this document should be backed up, if the system includes such a capability.
version           2      Numeric    Version of the Plucker format used in this document. Must have the value 1.
creationDate      4      Timestamp  Time of document creation
modificationDate  4      Timestamp  Time document last modified
unused1           8      Numeric    Must be zero at document creation, but any specific value should not be relied upon.
appInfoOffset     4      Numeric    Either zero, if no appInfo is present, or the offset from the beginning of the document to the start of the appInfo block.
sortInfoId        4      Numeric    Must be zero.
magic             8      String     Must be the 8 ISO Latin-1 characters "DataPlkr". No terminating NUL character.
unused2           4      Numeric    Must be zero at document creation, but any specific value should not be relied upon.
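As an illustration of the database header layout, the following is a minimal Python sketch that unpacks the 72 bytes with the standard `struct` module. The names `HEADER`, `FLAGS`, `PALM_EPOCH` and `parse_header` are hypothetical helpers introduced for this example; they are not part of Plucker or PalmOS.

```python
import struct
from datetime import datetime, timedelta

# Field sizes follow the database header table: 72 bytes, big-endian, no padding.
# docName, flags, version, creationDate, modificationDate, unused1,
# appInfoOffset, sortInfoId, magic, unused2
HEADER = struct.Struct(">32sHHII8sII8sI")

# Timestamps count seconds from January 1, 1904.
PALM_EPOCH = datetime(1904, 1, 1)

# Flag bits named in the flags field description.
FLAGS = {"CopyPrevention": 0x0040, "Launchable": 0x0200, "Backup": 0x0008}


def parse_header(data):
    """Decode the first 72 bytes of a document into a dict (illustrative only)."""
    (name, flags, version, created, modified, _unused1,
     app_info_offset, sort_info_id, magic, _unused2) = HEADER.unpack_from(data, 0)
    if magic != b"DataPlkr":
        raise ValueError("not a Plucker document")
    return {
        # Everything after the first NUL in docName is padding.
        "docName": name.split(b"\0", 1)[0].decode("ascii"),
        # Only the three documented bits are reported; other bits are ignored.
        "flags": {label for label, bit in FLAGS.items() if flags & bit},
        "version": version,
        "creationDate": PALM_EPOCH + timedelta(seconds=created),
        "modificationDate": PALM_EPOCH + timedelta(seconds=modified),
        "appInfoOffset": app_info_offset,
        "sortInfoId": sort_info_id,
    }
```

The unused fields are read but deliberately not validated, since readers should not rely on their values.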
The record-ID list consists of a six-byte list header, followed by one ID entry for each record in the document (the index record included). The list header has the structure:
Field             Bytes  Type     Notes
nextRecordListID  4      Numeric  Must be zero.
numRecords        2      Numeric  Number of records in the document, including the index record.
This is then followed by numRecords entries of the following structure:
Field         Bytes  Type      Notes
recordOffset  4      Numeric   Number of bytes from the start of the document to the beginning of the record
attributes    1      Bitfield  Record attributes -- should be zero.
uniqueID      3      Numeric   A local (document-specific) unique ID for the record. This is not used by Plucker (because it is not preserved by PalmOS through beaming of a document), but must still be different for each record.
Finally, there are two bytes of zero-padding to bring the structure alignment back to 4 bytes.
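The record-ID list can be walked with a few lines of Python. This is a sketch under stated assumptions: the list is taken to start immediately after the 72-byte database header, and `record_spans` assumes that each record extends to the offset of the next one. That convention is usual for PalmOS record databases but is not stated in this document. All function and variable names are hypothetical.

```python
import struct

LIST_HEADER = struct.Struct(">IH")  # nextRecordListID, numRecords
ENTRY = struct.Struct(">IB3s")      # recordOffset, attributes, uniqueID (3 bytes)


def parse_record_list(data, start=72):
    """Return one (recordOffset, attributes, uniqueID) tuple per record."""
    _next_list_id, num_records = LIST_HEADER.unpack_from(data, start)
    pos = start + LIST_HEADER.size
    entries = []
    for _ in range(num_records):
        offset, attributes, unique_id = ENTRY.unpack_from(data, pos)
        # The 3-byte ID is big-endian like every other numeric value.
        entries.append((offset, attributes, int.from_bytes(unique_id, "big")))
        pos += ENTRY.size
    return entries


def record_spans(entries, total_length):
    """Pair each record's start with its end (assumption: end = next start)."""
    offsets = [entry[0] for entry in entries] + [total_length]
    return list(zip(offsets, offsets[1:]))
```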
Typically, the AppInfo block is only present when the Launchable flag is set in the flags field of the database header. No Plucker data aside from icon display information and a versioning string is stored in this block. This block has the following structure:
Field          Bytes              Type     Notes
signature      4                  Numeric  Must contain the value 0x6C6E6368.
hdrVersion     2                  Numeric  Must have the value 3.
hdrEncoding    2                  Numeric  Must have the value 0.
verStrWords    2                  Numeric  The number of two-byte words following, containing the version string.
verStr         2 * verStrWords    String   NUL-terminated ISO Latin-1 string, padded at end if necessary with a zero byte to an even-byte boundary, containing a version string to display to the user containing version information for the document.
pqaTitleWords  2                  Numeric  The number of two-byte words in the following pqaTitleStr.
pqaTitleStr    2 * pqaTitleWords  String   NUL-terminated ISO Latin-1 string, padded at end if necessary with a zero byte to an even-byte boundary, containing a title string for iconic display of the document.
iconWords      2                  Numeric  Number of two-byte words in the following icon image.
icon           2 * iconWords      Image    Image (32x32) in Palm image format to be used as an icon to represent the document on a desktop-style display. The image may not use a custom color map.
smIconWords    2                  Numeric  Number of two-byte words in the following icon image.
smIcon         2 * smIconWords    Image    Small image (15x9) in Palm image format to be used as an icon to represent the document on a desktop-style display. The image may not use a custom color map.
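Because every variable-length field in the AppInfo block is preceded by a count of two-byte words, the block can be read with a single helper. The sketch below is illustrative only; `read_counted`, `c_string` and `parse_app_info` are hypothetical names, and the icon images are returned as raw bytes without decoding the Palm image format.

```python
import struct


def read_counted(data, pos):
    """Read a 2-byte word count, then that many 2-byte words of payload."""
    (words,) = struct.unpack_from(">H", data, pos)
    start = pos + 2
    end = start + 2 * words
    return data[start:end], end


def c_string(raw):
    """Cut at the terminating NUL (dropping any pad byte) and decode Latin-1."""
    return raw.split(b"\0", 1)[0].decode("latin-1")


def parse_app_info(data, pos=0):
    """Decode an AppInfo block starting at pos (the header's appInfoOffset)."""
    signature, hdr_version, hdr_encoding = struct.unpack_from(">IHH", data, pos)
    if signature != 0x6C6E6368:
        raise ValueError("unexpected AppInfo signature")
    pos += 8
    ver_str, pos = read_counted(data, pos)
    title, pos = read_counted(data, pos)
    icon, pos = read_counted(data, pos)
    sm_icon, pos = read_counted(data, pos)
    return {
        "hdrVersion": hdr_version,
        "hdrEncoding": hdr_encoding,
        "verStr": c_string(ver_str),
        "pqaTitleStr": c_string(title),
        "icon": icon,        # raw Palm image bytes
        "smIcon": sm_icon,   # raw Palm image bytes
        "end": pos,          # offset of the first byte after the block
    }
```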
The index record includes information about the compression type used for the Plucker document and about which IDs the reserved records use. The viewer uses this record to find the reserved records and to determine whether it needs support for ZLib compression. This record should always be the first record in the Plucker document (i.e. at index 0).
Field     Bytes      Type     Notes
uid       2          Numeric  unique ID for record, always 0x0001
version   2          Numeric  0x0002 if data is ZLib compressed, 0x0001 if DOC compressed
records   2          Numeric  number of reserved records
reserved  4*records  Numeric  reserved ID array
The reserved ID array consists of a series of name/ID pairs,
where the ID is the unique ID (2 bytes) for
the record and the name is a value (2 bytes)
from the following list.
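A short Python sketch of reading the index record follows. It assumes that within each reserved pair the 2-byte name comes before the 2-byte ID, as the phrase "name/ID pairs" suggests; the function name and the `COMPRESSION` mapping are hypothetical and exist only for this example.

```python
import struct

# Values of the version field, as given in the index record table.
COMPRESSION = {0x0001: "DOC", 0x0002: "ZLib"}


def parse_index_record(record):
    """Return (uid, compression name, [(name, record ID), ...])."""
    uid, version, count = struct.unpack_from(">HHH", record, 0)
    # Each reserved entry is two 2-byte values: 4 * count bytes in total.
    values = struct.unpack_from(">%dH" % (2 * count), record, 6)
    # Assumption: the name precedes the ID inside each pair.
    reserved = list(zip(values[0::2], values[1::2]))
    return uid, COMPRESSION.get(version, "unknown"), reserved
```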
A text data record starts with a header, followed by a series of paragraph headers and then the (compressed or uncompressed) data; all other record types have only a header and data.
For text data, the data record header is followed by a series of paragraph headers, each representing a paragraph block in the text data.
Field       Bytes  Type      Notes
size        2      Numeric   Total length of paragraph before compression. NOTE: No text data should be larger than 32k. If the original document is larger than 32k, the parser has to split it into several records.
attributes  2      Bitfield  Paragraph info. The high-order 13 bits are reserved for future use and should be set to zero; the 3 low-order bits contain a numeric value in the range [0..7] giving the amount of extra paragraph spacing (2*value pixels).
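The paragraph header layout can be decoded as in the sketch below. The number of paragraph headers is taken as a parameter because it comes from the data record header, which is not described in this part of the document; `parse_paragraph_headers` is a hypothetical name.

```python
import struct


def parse_paragraph_headers(data, count, pos=0):
    """Return ([(size, extra spacing in pixels), ...], offset after the headers)."""
    paragraphs = []
    for _ in range(count):
        size, attributes = struct.unpack_from(">HH", data, pos)
        # Only the 3 low-order bits are defined; spacing is 2 * value pixels.
        extra_spacing = 2 * (attributes & 0x0007)
        paragraphs.append((size, extra_spacing))
        pos += 4
    return paragraphs, pos
```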
The (uncompressed) text data contains a character stream of ISO Latin-1 characters, interspersed with 'functions'.
A function is introduced in the text stream by a NUL character (0x00), followed by a one-byte function code and up to 7 bytes of data. The 3 least significant bits of the function code byte give the length of the function data that follows; the 5 most significant bits denote the actual function. The following functions are valid:
Function Code  Description                 Bytes  Arguments
0x0A           Page link begins            2      record ID
0x0C           Paragraph link begins       4      record ID, paragraph offset
0x08           Link ends                   0      no data
0x11           Set font                    1      font specifier
0x1A           Embedded image              2      image record ID
0x22           Set margin                  2      left margin, right margin
0x29           Alignment of text           1      alignment
0x33           Horizontal rule             3      height, width (pixels), width (%)
0x38           New line                    0      no data
0x40           Italic text begins          0      no data
0x48           Italic text ends            0      no data
0x5C           Multiple embedded image     4      alternate image record ID, image record ID
0x60           Underline text begins       0      no data
0x68           Underline text ends         0      no data
0x70           Strike-through text begins  0      no data
0x78           Strike-through text ends    0      no data
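Since the data length is carried in the low 3 bits of every function code, a reader can separate text from functions without knowing each function's meaning. The following Python sketch shows one way to do this on uncompressed text data; `tokenize` and the tuple shapes it returns are hypothetical choices made for this example.

```python
def tokenize(text):
    """Split uncompressed text data into ("text", bytes) and ("func", code, args) items."""
    items = []
    pos = 0
    run_start = 0  # start of the current run of plain characters
    while pos < len(text):
        if text[pos] != 0x00:
            pos += 1
            continue
        # A NUL ends the current text run and introduces a function.
        if pos > run_start:
            items.append(("text", text[run_start:pos]))
        code = text[pos + 1]
        length = code & 0x07  # 3 low bits: number of data bytes that follow
        args = text[pos + 2:pos + 2 + length]
        items.append(("func", code, args))
        # Skip the NUL, the code byte and the data (which may itself contain zero bytes).
        pos += 2 + length
        run_start = pos
    if run_start < len(text):
        items.append(("text", text[run_start:]))
    return items
```

For example, a page link to record 5 wrapped around the word "link" appears as the function 0x0A with arguments `00 05`, then the text, then the function 0x08 with no arguments.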
The function arguments have the following definitions:
Argument          Bytes  Notes
record ID         2      reference to record in Plucker document
image record ID   2      reference to image in Plucker document
paragraph offset  2      paragraph number (starting from 0) to jump to
font specifier    1      The font concept used in Plucker is that of a 'standard' font, along with bold and italic versions of that font. There is no font notion corresponding to HTML's <BIG> or <SMALL>. In this markup, boldness and size are specified with a font specifier; italic is specified with a separate function code. There are currently 9 font specification values, with the following meanings (the actual PalmOS fonts used by the Palm viewer are also given):
                           Value  Description                                     PalmOS 2.x  PalmOS 3.x
                           0      Regular text.                                   stdFont     stdFont
                           1      Suitable for <H1> HTML tags.                    boldFont    largeBoldFont
                           2      Suitable for <H2> HTML tags.                    boldFont    largeBoldFont
                           3      Suitable for <H3> HTML tags.                    boldFont    largeFont
                           4      Suitable for <H4> HTML tags.                    boldFont    largeFont
                           5      Suitable for <H5> HTML tags.                    stdFont     boldFont
                           6      Suitable for <H6> HTML tags.                    stdFont     boldFont
                           7      Regular text, but bold.                         stdFont     boldFont
                           8      Fixed-width text, suitable for <TT> HTML tags.  stdFont     fixedWidthFont
left margin       1      left margin in pixels
right margin      1      right margin in pixels
alignment         1      alignment code (left = 0, right = 1, center = 2)
height            1      height of horizontal rule in pixels; if not given, a default value of 2 pixels will be used
width (pixels)    1      width in pixels, should be 0 if percentage value should be used
width (%)         1      width as the percentage between the current left and right margins. The default is 100%
The image data consists of an image in Palm image format, compressed or uncompressed as specified in the document's index record. The image may in addition be internally compressed, via any of the compression techniques allowed in the Palm image format. Images must be less than 60k, uncompressed.
The mailto data contains information about the e-mail addresses referenced by mailto anchors. All offsets are counted from the end of the header.
Field           Bytes  Type             Notes
to_offset       2      Numeric          offset to TO string
cc_offset       2      Numeric          offset to CC string
subject_offset  2      Numeric          offset to SUBJECT string
body_offset     2      Numeric          offset to BODY string
strings         0+     String sequence  A concatenated sequence of one or more NUL-terminated US-ASCII strings. Each contains a header-value, which follows the constraints on header values laid down in IETF RFC 2822. Header folding is not allowed. Any of the four headers shown above may be absent; header values should be accessed via the above offsets.
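The sketch below shows how the four offsets might be used to pull out the header values. It rests on two assumptions that this document does not spell out: that "the end of the header" means the end of the data record header, so the offsets are measured from the first of the four offset fields, and that an offset of zero marks an absent header. `parse_mailto` is a hypothetical name.

```python
import struct

FIELDS = ("to", "cc", "subject", "body")


def parse_mailto(body):
    """Decode mailto data; body starts right after the data record header (assumption)."""
    offsets = struct.unpack_from(">4H", body, 0)
    result = {}
    for name, offset in zip(FIELDS, offsets):
        if offset == 0:
            # Assumption: a zero offset means this header is absent.
            result[name] = None
            continue
        end = body.index(b"\0", offset)  # each string is NUL-terminated
        result[name] = body[offset:end].decode("ascii")
    return result
```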
The URL data contains a list of URLs. Each record contains up to 200 URLs; additional records are created if needed.
Field  Bytes  Type             Notes
URLs   1+     String sequence  A concatenated sequence of NUL-terminated URL strings following the constraints of IETF RFC 1738. The list may contain up to 200 URLs. Only text and image records are included; other records are represented only by the presence of a NUL, that is, by an empty string.
These records may or may not be compressed, as indicated by the type in the header. They are used by the Details form to display the URL of the current record and by the External Reference form to display the URLs of pages that were not collected. From either form you can copy the URL to a Memo to remind you to pluck it at a later date.
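Splitting a URL record into its entries is a matter of cutting at the NUL bytes while keeping the empty placeholders in position. This is a minimal sketch that expects already uncompressed data; `parse_urls` is a hypothetical name, and empty entries are reported as `None`.

```python
def parse_urls(data):
    """Return the URLs in order, with None for records that have no URL."""
    parts = data.split(b"\0")
    # The final NUL terminator leaves one trailing empty piece; drop it.
    if parts and parts[-1] == b"":
        parts.pop()
    return [part.decode("ascii") or None for part in parts]
```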
Each Plucker document can be assigned to a number of named categories. This record stores the names of the document's default categories as a concatenated series of NUL-terminated strings.
There should be only one record of this type per document. The record begins with a two-byte numeric value giving the number of subrecords, followed by that many subrecords. The subrecords are a sequence of tagged variable-length items. Each subrecord consists of three fields:
Field      Type                  Bytes       Description
type code  Numeric               2           Specifies what piece of extra information is in this subrecord
length     Numeric               2           Number of 2-byte words in the argument
argument   (type code specific)  2 * length  Data
The following table describes the valid subrecord type codes, and describes the structure of the associated data for each subrecord type. Subrecords with unknown type codes should be ignored.
Type code 1, CharSet
  Description: This is the character set and encoding used by text records in this document, unless otherwise specified for particular records.
  Argument: a two-byte numeric value, specifying the IETF IANA MIBenum value for the character set. See the IANA registry of character sets for valid values.
Type code 2, ExceptionalCharSets
  Description: This is a list of text records which use a charset other than that specified by the default CharSet. Note that if no default CharSet is specified, the default charset should be thought of as "unknown".
  Argument: a sequence of (length / 2) record-ID, IANA-MIBenum pairs, where MIBenum values are as specified for CharSet. The invalid MIBenum value of 0 (zero) is used for records which have an unknown charset, if necessary.
    Field      Bytes  Type     Notes
    record ID  2      Numeric  unique ID for record
    MIBenum    2      Numeric  IANA MIBenum for the character set used in this record
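The tagged subrecord structure lends itself to a simple loop that skips anything it does not recognise. The Python sketch below is illustrative; `parse_metadata` and its return shape are hypothetical.

```python
import struct


def parse_metadata(data):
    """Return (default MIBenum or None, {record ID: MIBenum})."""
    (count,) = struct.unpack_from(">H", data, 0)
    pos = 2
    default_charset = None
    exceptions = {}
    for _ in range(count):
        type_code, length = struct.unpack_from(">HH", data, pos)
        pos += 4
        argument = data[pos:pos + 2 * length]  # length counts 2-byte words
        pos += 2 * length
        if type_code == 1:    # CharSet: a single MIBenum value
            (default_charset,) = struct.unpack(">H", argument[:2])
        elif type_code == 2:  # ExceptionalCharSets: record ID, MIBenum pairs
            values = struct.unpack(">%dH" % length, argument)
            exceptions.update(zip(values[0::2], values[1::2]))
        # Subrecords with unknown type codes are ignored, as required.
    return default_charset, exceptions
```

In the IANA registry, MIBenum 4 is ISO-8859-1 and 106 is UTF-8, so a document whose default is Latin-1 with one UTF-8 record would yield `(4, {record_id: 106})`.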