The Digital
Log Interchange Standard (DLIS) defines data organization on two levels:
the Logical Format level and
the Physical Format level.
The Logical Format concerns the syntactic and semantic organization
of a sequence of data characters. The Physical Format concerns
the location and physical organization of data on various media.
The Logical Format is embedded in a Physical Format,
but the two formats are very loosely bound to support reasonable
access efficiency without unduly constraining the elements of the
Logical Format. Consequently, before an application on a given
system can process DLIS data recorded on a given medium, it must
- understand the Logical Format of the data, and
- be able to map elements between the Logical
Format and the Physical Format.
The DLIS
Logical Format consists of the following elements:
- the Logical Record, a group of 8-bit
bytes or data characters
- the Logical File, a group of related Logical Records.
The relationship and ordering of these elements
is illustrated in Figure 2-1. At the byte level, DLIS data is an
ordered stream of 8-bit bytes, in which byte k precedes byte k+1.
Within each byte, bit 1 is the high-order bit.
Figure 2-1. Logical Format
Each distinct piece of information in the Logical Format
has a well-defined representation
that extends across one or more bytes.
All permissible representations
are listed in Appendix B both by symbolic name and by the one-byte
Representation Code used to designate the representation
explicitly in the Logical Format.
Symbolic names of the Representation Codes are
provided for use in this specification. It is likely that these
names will correspond to identifiers in program code, and they are
restricted to six characters for this reason. A two-byte signed
integer quantity, then, is said to have Representation Code SNORM,
whereas a single precision floating point number has Representation
Code FSINGL, a variable-length string of text has Representation
Code ASCII, and so on.
Logical
Records form the basic coherent bodies of information in the DLIS
Logical Format. They encapsulate semantically related information
within a Logical File. Each Logical Record consists of one or more
consecutive Logical Record Segments, which provide the interface
between the Logical Format and the Physical Format. Logical Record
Segmentation is dependent on the type of Physical Format.
For example, it is the responsibility of the
Logical Record Segments, not the Logical Records, to align with
Physical Format boundaries. Segmentation also permits processing
Logical Records of indefinite length.
A Logical Record Segment is composed of four
mutually disjoint parts:
- a Logical Record Segment Header,
- an optional Logical Record Segment Encryption Packet,
- a Logical Record Segment Body, and
- an optional Logical Record Segment Trailer.
The term "Logical Record" distinguishes
the DLIS element from a physical disk or tape record.
Each Logical
Record Segment begins with a Logical Record Segment Header (LRSH). The
LRSH format is defined in Figure 2-2.
Where information in the LRSH applies to a Logical
Record, rather than to a Logical Record Segment, that information
must be used consistently in all Segments of a Logical Record.
Redundant recording of such information permits a uniform structure
for the LRSH and provides knowledge about the Logical Record (e.g.,
what kind of information has been lost) that may not be otherwise
available if, for example, the first Logical Record Segment is damaged.
Entry | Representation Code | Comments
Logical Record Segment Length | UNORM | 1
Logical Record Segment Attributes | (not defined) | 2
Logical Record Type | USHORT | 3
Figure 2-2. Logical Record Segment Header
Comments:
- 1.
The Logical Record Segment Length
is a two-byte, unsigned integer (Representation Code UNORM) that
specifies the length, in bytes, of the Logical Record Segment.
The Logical Record Segment Length is required
to be even. The even length ensures that 2-byte checksums can be
computed, when present, and permits some operating systems to handle
DLIS data more efficiently without degrading performance with other
systems. There is no limitation on a Logical Record length.
Logical Record Segments must contain at least
sixteen (16) bytes. This requirement facilitates mapping the Logical
Format to those Physical Formats that require a minimum physical
record length.
- 2.
The Logical Record Segment Attributes
consist of a one-byte bit string that specifies the Attributes of
the Logical Record Segment. Its structure is defined in Figure
2-3. Since its structure is defined explicitly in Figure 2-3, no
Representation Code is assigned to it.
- 3.
The Logical Record Type is a one-byte,
unsigned integer (Representation Code USHORT) that specifies the
Type of the Logical Record. Its value indicates the general semantic
content of the Logical Record. The same value must be used in all
Segments of a Logical Record.
Logical Record Types are specified in Appendix A.
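The header layout of Figure 2-2 can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the names `SegmentHeader` and `parse_lrsh` are hypothetical, and the sketch assumes that UNORM is stored high-order byte first (the byte order is defined with the Representation Codes in Appendix B, not in this section).

```python
import struct
from typing import NamedTuple


class SegmentHeader(NamedTuple):
    length: int       # Logical Record Segment Length (UNORM)
    attributes: int   # Logical Record Segment Attributes (one-byte bit string)
    record_type: int  # Logical Record Type (USHORT)


def parse_lrsh(data: bytes, offset: int = 0) -> SegmentHeader:
    """Decode the four-byte LRSH that starts at `offset` (hypothetical helper)."""
    if len(data) - offset < 4:
        raise ValueError("a Logical Record Segment Header needs 4 bytes")
    # Assumption: UNORM is stored high-order byte first, hence ">H".
    length, attributes, record_type = struct.unpack_from(">HBB", data, offset)
    if length < 16:
        raise ValueError("Logical Record Segment shorter than 16 bytes")
    if length % 2:
        raise ValueError("Logical Record Segment Length must be even")
    return SegmentHeader(length, attributes, record_type)


# A 32-byte Segment with Attributes 10100110 and Logical Record Type 0.
header = parse_lrsh(bytes([0x00, 0x20, 0xA6, 0x00]))
print(header.length, format(header.attributes, "08b"), header.record_type)
```

The two length checks mirror the requirements in comment 1: a Segment must be at least 16 bytes long and its length must be even.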
Bit | Description | Comments
1 | Logical Record Structure: 0 = Indirectly Formatted Logical Record; 1 = Explicitly Formatted Logical Record | 1
2 | Predecessor: 0 = This is the first Segment of the Logical Record; 1 = This is not the first Segment of the Logical Record | 2
3 | Successor: 0 = This is the last Segment of the Logical Record; 1 = This is not the last Segment of the Logical Record | 3
4 | Encryption: 0 = No encryption; 1 = Logical Record is encrypted | 4
5 | Encryption Packet: 0 = No Logical Record Segment Encryption Packet; 1 = Logical Record Segment Encryption Packet is present | 5
6 | Checksum: 0 = No checksum; 1 = A checksum is present in the LRST | 6
7 | Trailing Length: 0 = No Trailing Length; 1 = A copy of the LRS length is present in the LRST | 7
8 | Padding: 0 = No record padding; 1 = Pad Bytes are present in the LRST | 8
Figure 2-3. Logical Record Segment Attributes
Comments:
- 1.
The Logical Record Structure bit
specifies which Logical Record syntax the DLIS reading software
can expect (see Chapter 3). This bit must be applied the same way
in all Segments of a Logical Record.
- 2.
The Predecessor bit indicates whether
this is the first Segment in the Logical Record.
If the Predecessor bit is not set,
then this is the first Segment. If the Predecessor
bit is set, then this is not the first Segment.
- 3.
The Successor bit indicates whether
this is the last Segment in the Logical Record. If the Successor
bit is not set, then this is the last Segment. If the Successor
bit is set, then this is not the last Segment.
- 4.
The Encryption bit indicates whether
the Logical Record Segment Body and Pad Bytes (see comment 8) are
encrypted. This bit must be applied the same way in all Segments
of a Logical Record. In particular, encryption is applied at the
Logical Record level and, therefore, cannot result in encryption
of only part of a Logical Record Body. Logical Record Segment Headers,
Encryption Packets (see comment 5), checksums (see comment 6), and
Trailing Lengths (see comment 7) are never encrypted. Pad Bytes
are included when encryption is applied to allow for encryption
algorithms that require a minimum number of bytes or certain multiples
of bytes.
Encryption is used by an organization to
record information that is considered proprietary.
- 5.
The Encryption Packet bit indicates
whether there is an Encryption Packet in the Logical Record Segment.
The Encryption Packet bit may not be set unless the Encryption
bit is also set. It must be set in the first Segment of
a Logical Record whenever the Encryption bit is set. It is optional
otherwise.
Section 2.2.2.2 describes the location, contents
and use of the Encryption Packet.
- 6.
The Checksum bit indicates whether
there is a checksum in the Logical Record Segment Trailer (LRST).
- 7.
The Trailing Length bit indicates
whether there is a copy of the Logical Record Segment Length in
the LRST. This bit must be applied the same way in all Logical
Record Segments of a Logical File: either all Logical Record
Segments in a Logical File have Trailing Lengths or none do.
This requirement exists to permit traversal of a Logical
File in a backward direction.
- 8.
The Padding bit indicates the presence
of Pad Bytes and a Pad Count in the LRST. If Padding
is present, the Pad Count is a single byte (Representation Code
USHORT) that contains a count of Pad Bytes present in the LRST.
The Pad Count is considered one of the Pad Bytes, so the Pad Count
may have the value 1. The remaining Pad Bytes, if any, immediately
precede the Pad Count in the LRST. The Pad Count precedes the checksum
and the Trailing Length when these are present.
Padding is a transparent mechanism for achieving
minimum Logical Record Segment size and even Logical Record Segment
length when these conditions would not otherwise occur. Pad Bytes,
other than the Pad Count, have no meaning, and their values are
arbitrary.
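The attribute bits of Figure 2-3 can be decoded as sketched below. Because bit 1 is the high-order bit, bit n corresponds to the mask 0x80 shifted right by n-1. The names `SegmentAttributes` and `decode_attributes` are hypothetical; the two consistency checks follow comment 5.

```python
from typing import NamedTuple


class SegmentAttributes(NamedTuple):
    explicit_format: bool        # bit 1: Logical Record Structure
    has_predecessor: bool        # bit 2: not the first Segment
    has_successor: bool          # bit 3: not the last Segment
    encrypted: bool              # bit 4: Encryption
    has_encryption_packet: bool  # bit 5: Encryption Packet
    has_checksum: bool           # bit 6: Checksum
    has_trailing_length: bool    # bit 7: Trailing Length
    has_padding: bool            # bit 8: Padding


def decode_attributes(value: int) -> SegmentAttributes:
    """Decode the one-byte Attributes bit string (hypothetical helper)."""
    if not 0 <= value <= 0xFF:
        raise ValueError("Attributes must fit in one byte")
    # Bit 1 is the high-order bit, so bit n has mask 0x80 >> (n - 1).
    attrs = SegmentAttributes(*(bool(value & (0x80 >> i)) for i in range(8)))
    if attrs.has_encryption_packet and not attrs.encrypted:
        raise ValueError("Encryption Packet bit set without Encryption bit")
    if attrs.encrypted and not attrs.has_predecessor and not attrs.has_encryption_packet:
        raise ValueError("first Segment of an encrypted record needs an Encryption Packet")
    return attrs


first = decode_attributes(0b10100110)
print(first.has_predecessor, first.has_successor, first.has_checksum)
```

For the value 10100110, the decoded flags describe an Explicitly Formatted first Segment that has a successor, a checksum, and a Trailing Length, but no padding.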
The Logical
Record Segment Encryption Packet, if present, immediately follows
the Logical Record Segment Header. The format of the Encryption
Packet is described in Figure 2-4.
Bytes | Description | Comments
1-2 | Size of Encryption Packet in bytes (UNORM) | 1
3-4 | Producer's Company Code (UNORM) | 2
5-end | Encryption information | 3
Figure 2-4. Definition of Logical Record Segment Encryption Packet
Comments:
- 1.
Bytes 1-2 are mandatory and specify the
size of the Encryption Packet in bytes. The Encryption Packet must
consist of an even number of bytes.
- 2.
Bytes 3-4 are mandatory and specify the
integer Company Code of the Producer of the Logical Record (see
§4.1.9).
- 3.
The encryption information is optional
(i.e., the size of the Encryption Packet may be 4 bytes) and contains
information used by the Producer to identify and/or initialize
encryption of data in the current Logical Record.
The encryption or decryption of information
in a Logical Record must not be dependent on information in the
Encryption Packets of another Logical Record, since there is no
guarantee that the other Logical Record will be available when the
information is needed.
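The Encryption Packet layout of Figure 2-4 can be read as sketched below. The function name `parse_encryption_packet` is hypothetical, the Company Code in the example is an arbitrary value, and UNORM is again assumed to be stored high-order byte first.

```python
import struct
from typing import Tuple


def parse_encryption_packet(data: bytes, offset: int = 0) -> Tuple[int, int, bytes]:
    """Return (size, company_code, encryption_info) for the packet at `offset`."""
    if len(data) - offset < 4:
        raise ValueError("an Encryption Packet needs at least 4 bytes")
    # Assumption: both UNORM fields are stored high-order byte first.
    size, company_code = struct.unpack_from(">HH", data, offset)
    if size < 4:
        raise ValueError("Encryption Packet size must cover its two mandatory fields")
    if size % 2:
        raise ValueError("Encryption Packet must have an even number of bytes")
    if offset + size > len(data):
        raise ValueError("Encryption Packet extends past the available data")
    # Bytes 5-end are optional and opaque to everyone but the Producer.
    info = data[offset + 4:offset + size]
    return size, company_code, info


size, company, info = parse_encryption_packet(b"\x00\x06\x01\x02\xaa\xbb")
print(size, company, info)
```

The encryption information is returned uninterpreted, since its meaning is private to the Producer identified by the Company Code.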
The Logical
Record Segment Body is an ordered set of 8-bit bytes that immediately
follow the Logical Record Segment Encryption Packet, when the Encryption
Packet is present, or otherwise immediately follow the Logical Record
Segment Header.
The Logical
Record Segment Trailer, if present, immediately follows the Logical
Record Segment Body and consists of any combination of: Padding,
a 2-byte checksum (see Appendix E), and/or a Trailing Length. Padding
precedes a checksum or a Trailing Length. A checksum precedes a
Trailing Length.
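Because the trailer parts have a fixed order (Padding, then checksum, then Trailing Length), an unencrypted Segment can be taken apart from its end, as sketched below. The function `split_body_and_trailer` is hypothetical. It expects the bytes that follow the Header (and Encryption Packet, if any) together with the relevant attribute flags, and it assumes the Segment is not encrypted, since the Pad Count of an encrypted Segment cannot be read directly.

```python
import struct
from typing import Optional, Tuple


def split_body_and_trailer(
    payload: bytes, has_padding: bool, has_checksum: bool, has_trailing_length: bool
) -> Tuple[bytes, int, Optional[int], Optional[int]]:
    """Return (body, pad_count, checksum, trailing_length) for one Segment."""
    needed = 2 * has_trailing_length + 2 * has_checksum + 1 * has_padding
    if len(payload) < needed:
        raise ValueError("payload too short for the indicated trailer")
    end = len(payload)
    checksum = trailing_length = None
    pad_count = 0
    if has_trailing_length:           # last two bytes of the Segment
        end -= 2
        (trailing_length,) = struct.unpack_from(">H", payload, end)
    if has_checksum:                  # precedes the Trailing Length
        end -= 2
        (checksum,) = struct.unpack_from(">H", payload, end)
    if has_padding:                   # Pad Count is the last Pad Byte
        pad_count = payload[end - 1]
        if not 1 <= pad_count <= end:
            raise ValueError("invalid Pad Count")
        end -= pad_count
    return payload[:end], pad_count, checksum, trailing_length


payload = b"BODY" + b"\x00\x00\x03" + b"\x12\x34" + b"\x00\x20"
print(split_body_and_trailer(payload, True, True, True))
```

In the example, the three Pad Bytes end with the Pad Count (3), which counts itself.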
The Logical
Record Body consists of the ordered union of the Logical Record
Segment Bodies of all Logical Record Segments that make up the Logical
Record.
Figure 2-5 illustrates a sample Logical Record
decomposed into three Logical Record Segments.
LRSL | 10100110 | LRT | body | CHECKSUM | TRAILING LRSL
LRSL | 11100110 | LRT | body | CHECKSUM | TRAILING LRSL
LRSL | 11000111 | LRT | body | PADDING | CHECKSUM | TRAILING LRSL
Figure 2-5. Illustration of a Three-Segment Logical Record
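Reassembly of a Logical Record Body from its Segments can be sketched as follows. The function `assemble_record` is hypothetical. It takes one `(attributes, record_type, body)` tuple per Segment, with trailers already removed, and checks the Predecessor and Successor bits as well as the values that must be identical in every Segment of a Logical Record.

```python
from typing import List, Tuple

PREDECESSOR = 0x40  # bit 2
SUCCESSOR = 0x20    # bit 3
PER_RECORD = 0x90   # bits 1 and 4: structure and encryption, fixed per record


def assemble_record(segments: List[Tuple[int, int, bytes]]) -> bytes:
    """Concatenate Segment Bodies into a Logical Record Body (hypothetical helper)."""
    if not segments:
        raise ValueError("a Logical Record has at least one Segment")
    first_attrs, first_type, _ = segments[0]
    last = len(segments) - 1
    body = bytearray()
    for i, (attrs, record_type, segment_body) in enumerate(segments):
        if bool(attrs & PREDECESSOR) != (i > 0):
            raise ValueError(f"Predecessor bit is wrong in Segment {i + 1}")
        if bool(attrs & SUCCESSOR) != (i < last):
            raise ValueError(f"Successor bit is wrong in Segment {i + 1}")
        if record_type != first_type:
            raise ValueError("Logical Record Type differs between Segments")
        if (attrs ^ first_attrs) & PER_RECORD:
            raise ValueError("structure or encryption bit differs between Segments")
        body += segment_body
    return bytes(body)


# Attribute values taken from Figure 2-5; the body bytes are made up.
record = assemble_record([(0xA6, 0, b"ab"), (0xE6, 0, b"cd"), (0xC7, 0, b"ef")])
print(record)
```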
A Logical
File consists of a sequence of one or more Logical Records, beginning
with a File Header Logical Record (FHLR, see §5.1 and Appendix
A), and containing no other FHLRs. A Logical File is terminated
when another FHLR is encountered or when no more Logical Records
are available for the Logical File.
The term "Logical File" distinguishes
the DLIS element from a physical disk or tape file.
Any FHLR
must consist of exactly one Logical Record Segment. It is useful
to be able to handle an FHLR as a file label, and this is one of
the requirements necessary to make that possible.
Physical Format is the way in which recorded data is located and organized
on a physical medium, such as a magnetic tape or disk. The specific
binding of Logical Format to Physical Format depends on the medium
and the access mechanism.
This section defines bindings for Record-Structured
Physical Formats, including the industry-standard 9-track magnetic
tapes as a special case. Bindings for other Physical Formats are
not defined here.
The term
Storage Unit is defined loosely as something that contains
recorded data and that is manageable as a unit at the human level.
When applied to magnetic tape, Storage Unit refers to a single
physical reel of tape. When applied to disks, Storage Unit refers
to a single file. The term is used only when no distinction between
different media is intended; the common terms "tape" and
"file" are used when the context is targeted strictly
at magnetic tapes or at disk files, respectively.
A sequence of Logical Files can reside on a single
Storage Unit or a single Logical File can span multiple Storage
Units. When a Logical File begins on one Storage Unit and ends
on another, the Storage Units that it intersects constitute part
of a Storage Set. Further definition of a Storage Set is
provided in §2.3.4, "Storage Set Requirements."
All access mechanisms apply a structure to data
recorded on a Storage Unit. The following structure is covered
in this specification:
- Sequential Record Structure permits
access to data in sequential, variable-length records. Records
must be written sequentially in a forward direction, but can be
read sequentially in either a forward or backward direction, either
directly or by backspacing over a record and then reading it in
a forward direction. A Storage Unit on which the Physical Format
has been written with a Sequential Record Structure is called a
Record Storage Unit.
A Physical Format can be partitioned into three
mutually disjoint parts: the Logical Format, the Invisible Envelope,
and the Visible Envelope. These parts are
illustrated in Figure 2-6. The Logical Format is data that is of
interest to applications. The Invisible Envelope is data that is
managed by the access mechanism and is not part of normal data read
and write transactions. Invisible Envelope data is typically part
of the control interface between the access mechanism and applications
or is available through special queries. For example, most disk
operating systems maintain file header control information that
is separate from the file data. Record Structure files can also
contain record lengths that are passed as control between the operating
system and the application but are not passed as data. The Visible
Envelope is information that is passed as data and is important
in defining a particular Physical Format, but that is not part
of the Logical Format.
Except for industry-standard magnetic tapes,
a specification of the Invisible Envelope is beyond the scope of
this document.
Figure 2-6. Partitions of a Physical Format
The first
80 bytes of the Visible Envelope consist of ASCII characters and
constitute a Storage Unit Label (SUL).
Figure 2-7 defines
the format of the SUL.
Field | Size in Bytes | Comments
Storage Unit Sequence Number | 4 | 1
DLIS Version | 5 | 2
Storage Unit Structure | 6 | 3
Maximum Record Length | 5 | 4
Storage Set Identifier | 60 | 5
Figure 2-7. Format of Storage Unit Label
Comments:
A Storage
Unit must have exactly one Storage Unit Label that must appear before
any Logical Format data. The first record in the Visible Envelope
of a Record Storage Unit must consist of an SUL.
A Storage Unit must contain an integer number
of Logical Record Segments. It need not contain an integer number
of Logical Records.
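The fixed field widths of Figure 2-7 allow the 80-byte label to be cut into fields, as sketched below. The function `parse_sul` and the dictionary keys are hypothetical, and the sample label values are purely illustrative. The fields are returned as text with surrounding blanks removed; any further interpretation (for example, conversion to integers) is an assumption left to the caller.

```python
from typing import Dict

SUL_LENGTH = 80
# Field names are illustrative; the widths are those of Figure 2-7.
SUL_FIELDS = (
    ("sequence_number", 4),
    ("dlis_version", 5),
    ("structure", 6),
    ("max_record_length", 5),
    ("storage_set_id", 60),
)


def parse_sul(label: bytes) -> Dict[str, str]:
    """Split the first 80 bytes of a Storage Unit into SUL fields."""
    if len(label) < SUL_LENGTH:
        raise ValueError("a Storage Unit Label is 80 bytes long")
    # The label consists of ASCII characters; decoding fails otherwise.
    text = label[:SUL_LENGTH].decode("ascii")
    fields = {}
    position = 0
    for name, width in SUL_FIELDS:
        fields[name] = text[position:position + width].strip()
        position += width
    return fields


# Illustrative label only; the values are not taken from this section.
sample = b"   1" + b"V1.00" + b"RECORD" + b" 8192" + b"Example Storage Set".ljust(60)
print(parse_sul(sample))
```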
A Storage
Set was introduced in §2.3.1 as a group of Storage Units in which
a single Logical File spans at least two of the units. With the introduction
of an SUL, it is now possible to complete the definition of Storage
Set. A Storage Set is a group of one or more Storage Units that
satisfies the following conditions:
- All Storage Units have the same Structure.
- The Storage Set Identifier field in the
SUL is identical for all Storage Units.
- The set of Storage Unit Sequence Numbers
in the SULs forms the sequence 1, 2, 3, ..., n, where n is the number
of Storage Units in the Storage Set.
- The Storage Set contains a single Logical
Format, which is partitioned sequentially into n parts. For any
k, part k is contained in Storage Unit k.
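The first three conditions can be checked from the labels alone, as sketched below. The function `labels_form_storage_set` is hypothetical and expects one dictionary per Storage Unit with the keys `structure`, `storage_set_id`, and an integer `sequence_number` (names chosen for illustration). The fourth condition concerns the Logical Format itself and cannot be verified from the labels.

```python
from typing import Dict, List, Union

Label = Dict[str, Union[str, int]]


def labels_form_storage_set(labels: List[Label]) -> bool:
    """Check the label-level Storage Set conditions (hypothetical helper)."""
    if not labels:
        return False  # a Storage Set has one or more Storage Units
    same_structure = len({label["structure"] for label in labels}) == 1
    same_identifier = len({label["storage_set_id"] for label in labels}) == 1
    numbers = sorted(label["sequence_number"] for label in labels)
    # The Sequence Numbers must be exactly 1, 2, 3, ..., n.
    sequential = numbers == list(range(1, len(labels) + 1))
    return same_structure and same_identifier and sequential


reels = [
    {"structure": "RECORD", "storage_set_id": "SET-A", "sequence_number": 2},
    {"structure": "RECORD", "storage_set_id": "SET-A", "sequence_number": 1},
]
print(labels_form_storage_set(reels))
```

Sorting the Sequence Numbers lets the check accept Storage Units presented in any order.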
The Storage Set is provided to cover situations
in which a Logical File overflows a Storage Unit and must be continued
on another. This typically need only occur with magnetic tapes,
although it is permitted to occur with any type of Storage Unit.
Notice, however, that the actual requirements stated above do not
demand Logical File continuation across members of a Storage Set,
nor do they demand that the Storage Set Identifier be distinct
for all Storage Sets. The implementation of Storage Sets and Storage
Set Identifiers is left to users, who shall decide how these can best
suit their needs.
A Storage
Unit may simply run out of data, which is one way for it to terminate.
On industry standard 9-track magnetic tapes a
Storage Unit is terminated by two consecutive Tape Marks (see §2.3.7.1).
These marks belong to the Invisible Envelope of the tape.
A Visible
Record on a Record Storage Unit consists of all data bytes passed
to an application as a result of a normal record read operation.
On a Record Storage Unit, each Visible Record
other than those in the Invisible Envelope or those that contain
a Storage Unit Label must contain a positive integer number of Logical
Record Segments, and all Segments must belong to the same Logical
File. That is, a Logical Record Segment cannot span Visible Records
and Visible Records cannot intersect more than one Logical File.
This requirement permits a DLIS reader always
to locate the beginning of the next Logical Record on the Storage
Unit. If Trailing Lengths are recorded, it permits backward recovery
of Logical Record Segments when the first Logical Record Segment
Length in the Visible Record is damaged.
According
to sections 2.3.1 and 2.3.6, a Visible Record consists of Visible
Envelope data plus one or more Logical Record Segments. Other than
the Storage Unit Label, a Visible Record on a Record Storage Unit
must contain the following parts in the order described:
- A Visible Record Length, expressed
in terms of Representation Code UNORM (part of the Visible Envelope)
- A two-byte Format Version Field (part
of the Visible Envelope, see section 2.3.6.2)
- One or more complete Logical Record Segments
(part of the Logical Format)
The Visible Record Length specifies the sum of
the lengths in bytes of these three parts.
Following
the Visible Record Length in each Visible Record is a two-byte field,
called the Format Version. This belongs to the Visible Envelope.
The first byte is FF (hex), which distinguishes the Visible Record
from records of other, older formats. The second byte is an integer
(USHORT) specifying the major version number of the format, which
for this specification is the value 1.
For a Storage
Set consisting of Record Storage Units, the Visible Envelope consists
of
- the Storage Unit Label(s)
- the Visible Record Lengths
- the Format Versions
No explicit
minimum record length is required. Note that Logical Record Segments
must be at least 16 bytes long. When the Visible Record Length
and Format Version are included, the combination yields an implicit
minimum length of 20 bytes, which is sufficient for known devices
to handle.
The maximum Visible Record Length permitted on a Record
Storage Unit is 16,384
bytes.
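The Visible Record layout and the length limits above can be sketched as a routine that splits one Visible Record into its Logical Record Segments. The function `split_visible_record` is hypothetical, and UNORM is assumed to be stored high-order byte first.

```python
import struct
from typing import List

MAX_VISIBLE_RECORD_LENGTH = 16384
MIN_SEGMENT_LENGTH = 16


def split_visible_record(record: bytes) -> List[bytes]:
    """Return the Logical Record Segments contained in one Visible Record."""
    if len(record) < 4:
        raise ValueError("missing Visible Record Length or Format Version")
    # Assumption: the Visible Record Length (UNORM) is high-order byte first.
    length, marker, version = struct.unpack_from(">HBB", record, 0)
    if length != len(record):
        raise ValueError("Visible Record Length does not match the record size")
    if length > MAX_VISIBLE_RECORD_LENGTH:
        raise ValueError("Visible Record exceeds 16,384 bytes")
    if marker != 0xFF or version != 1:
        raise ValueError("unexpected Format Version field")
    segments = []
    position = 4
    while position < length:
        if length - position < 4:
            raise ValueError("truncated Logical Record Segment Header")
        (segment_length,) = struct.unpack_from(">H", record, position)
        if segment_length < MIN_SEGMENT_LENGTH or segment_length % 2:
            raise ValueError("invalid Logical Record Segment Length")
        if position + segment_length > length:
            raise ValueError("Segment spans beyond the Visible Record")
        segments.append(record[position:position + segment_length])
        position += segment_length
    if not segments:
        raise ValueError("a Visible Record holds at least one Segment")
    return segments


segment = struct.pack(">HBB", 16, 0x80, 0) + bytes(12)
record = struct.pack(">HBB", 4 + 2 * len(segment), 0xFF, 1) + segment + segment
print(len(split_visible_record(record)))
```

The smallest record this routine accepts is 20 bytes long, which matches the implicit minimum stated above: a 2-byte length, a 2-byte Format Version, and one 16-byte Segment.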
2.3.7 Industry-Standard 9-Track Magnetic Tape
No constraint
is imposed on the type of magnetic media that a company uses to
record DLIS information for its private use. However, any standard
DLIS tape access utility may be required to read or write DLIS information
recorded only on industry-standard 9-track tapes that are written
at a density of 800, 1600, or 6250 bits per inch.
To ensure uniformity of access to tape, which
is a removable medium, the Invisible Envelope of a Physical Format
that is recorded on industry-standard 9-track magnetic tape must
consist of physical tape marks. The complete Physical Format encountered
on such a tape, then, consists of tape marks, Storage Unit Label,
Visible Record Lengths, Format Versions, and Logical Record Segments.
This is illustrated in Figure 2-8. The use of tape marks is described
in "Physical Tape Marks," which is the following section.
Figure 2-8. Illustration of Magnetic Tape Physical Format (1st Reel)
Physical tape marks constitute the Invisible Envelope on industry-standard
9-track magnetic tapes. Such tapes contain two indelible marks,
called BOT and ETW in Figure 2-8. BOT is near the physical beginning
of the tape and indicates the start of the region in which recorded
information is permitted. ETW is required to be a minimum distance
from the physical end of the tape and serves as a warning; with
many systems ETW can be sensed only when writing.
Industry-standard 9-track magnetic tapes also
employ a form of tape mark that is not indelible and that can appear
multiple times. Marks of this type are called TM in Figure 2-8.
A TM is a distinct form of recorded information and takes the place
of a physical record. When an industry-standard 9-track magnetic
tape serves as a Storage Unit, the TM shall be used as follows:
-
There shall be a single TM following the
BOT and immediately preceding the Storage Unit Label. When reading
a tape, any information found between the BOT and the first TM shall
be discarded as "noise" attributable to the mechanical
variation in recording devices. The Storage Unit Label shall be
the first record on the tape following the first TM.
-
There shall be at least two consecutive
TMs following the last of the Logical Format or the Visible Envelope
information that is recorded on tape. Two consecutive TMs function
as a Storage Unit Terminator.
-
Exactly one TM shall separate the first
Visible Record of a Logical File from any preceding Logical Format
or Visible Envelope information.
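The tape mark rules above can be sketched against a simplified model of a tape. The model is an assumption made for illustration: a tape is a list of items read in order after BOT, where the sentinel `TM` stands for a tape mark and each `bytes` object stands for one physical record. The function `split_tape` is hypothetical. It returns the Storage Unit Label and the groups of records delimited by single tape marks.

```python
from typing import List, Sequence, Tuple

TM = object()  # stand-in for a tape mark reported by a hypothetical tape driver


def split_tape(items: Sequence[object]) -> Tuple[bytes, List[List[bytes]]]:
    """Return (storage_unit_label, record_groups) for one tape."""
    items = list(items)
    marks = [i for i, item in enumerate(items) if item is TM]
    if not marks:
        raise ValueError("no tape mark found after BOT")
    # Anything before the first TM is discarded as noise.
    rest = items[marks[0] + 1:]
    if not rest or rest[0] is TM:
        raise ValueError("the Storage Unit Label must follow the first tape mark")
    label, rest = rest[0], rest[1:]
    groups: List[List[bytes]] = []
    current: List[bytes] = []
    for i, item in enumerate(rest):
        if item is not TM:
            current.append(item)
            continue
        if current:
            groups.append(current)
            current = []
        if i + 1 < len(rest) and rest[i + 1] is TM:
            return label, groups  # two consecutive TMs terminate the unit
    raise ValueError("tape is not terminated by two consecutive tape marks")


tape = [b"noise", TM, b"SUL", TM, b"r1", b"r2", TM, b"r3", TM, TM]
print(split_tape(tape))
```

Records that follow the label without an intervening tape mark end up in the first group; under the third rule above they cannot start a new Logical File.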
Programs
that move DLIS information from one Physical Format to another need
to carry enough knowledge of the standard to ensure that the result
is a valid DLIS Physical Format. The knowledge required depends
on the sophistication of the program and can include one or more
of the following:
- Tape marks
- Storage Unit Labels
- Record Storage Units
- Visible Record Lengths
- File Header Logical Records
- Logical Record Segments
- Arbitrary Logical Records
- Logical Files