Secrets of OpenVMS® File I/O



Chapter 3
VMS BACKUP I/O CONSIDERATIONS

3.1 Data-Block Layout

VMS BACKUP generates two types of specially formatted data-blocks. The first type contains pure data and consists of a sequence number, the data itself, and a CRC value.

The second type of data-block contains redundancy group data. These data-blocks are used for readback error CORRECTION.

 
  
            A VMS BACKUP PURE DATA BLOCK  
+--------------------------------------------------+  
|seq#|data|data|data|data|data|data|data|......|crc|  
+--------------------------------------------------+  
  
  
         A VMS BACKUP REDUNDANCY DATA BLOCK  
+--------------------------------------------------+  
|redundancy data for last group of user data-blocks|  
+--------------------------------------------------+  
  

3.2 The Effects of Block Size

The size of each data-block is determined by the /BLOCK_SIZE qualifier. The use of large blocks can greatly speed up backup operations and allow more data to be put onto a single reel of tape. However, when using low-density tape devices, care must be taken.

Increasing the block size speeds up writes to tape. It also improves disk performance, because a larger block of data can be read from disk in each operation.

VMS BACKUP will use from three to five memory buffers to do its work. Each buffer contains one data-block. A 9-track reel of tape has only 10 feet of writable surface past the end-of-tape reflector. Therefore, when BACKUP detects the EOT reflector, it must write all remaining buffers to tape within that 10 feet.

If /BLOCK_SIZE=40960 is used with a low-density tape device (1600 BPI), about 25.6 inches of tape are consumed each time a data-block is written to tape (40960/1600 = 25.6). If five buffers are being used by BACKUP, over 10 feet of tape (5 × 25.6 = 128 inches) will be consumed once the EOT reflector is encountered. This could cause the tape to spin off the end of the reel. If BACKUP gets a tape error while writing one of the blocks, it will try to rewrite the block to tape. If the tape is already positioned past the EOT reflector, the tape may spin off the end of the reel while trying to complete the write operation.
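The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the sketch assumes one byte per tape frame (so bytes per inch equals the BPI rating), and it ignores inter-record gaps, which would consume additional tape.

```python
# Sketch: tape consumed past the EOT reflector on a 9-track drive.
# Assumptions (not from BACKUP itself): one byte per frame, so the
# bytes-per-inch figure equals the BPI rating; inter-record gaps ignored.

EOT_MARGIN_FEET = 10  # writable tape past the EOT reflector, per the text


def inches_per_block(block_size: int, density_bpi: int) -> float:
    """Length of tape, in inches, occupied by one data-block."""
    return block_size / density_bpi


def feet_to_flush(block_size: int, density_bpi: int, buffers: int) -> float:
    """Tape needed, in feet, to write every outstanding buffer."""
    return buffers * inches_per_block(block_size, density_bpi) / 12


for block_size in (40960, 16384):
    feet = feet_to_flush(block_size, density_bpi=1600, buffers=5)
    verdict = "exceeds" if feet > EOT_MARGIN_FEET else "fits within"
    print(f"/BLOCK_SIZE={block_size}: {feet:.2f} ft {verdict} the margin")
```

With five buffers at 1600 BPI, a 40960-byte block needs more than the 10-foot margin, while a 16384-byte block needs well under half of it.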

3.3 Sequence Number Usage

Each data-block is written with a BACKUP-supplied sequence number. If the tape drive detects a write error, BACKUP rewrites the data-block using the same sequence number. The sequence number is incremented only after a "good" block has been written. When restoring, data-blocks are written only when the sequence number changes. In this way, only the "good" data-blocks are restored.
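One way to picture this restore rule is the following Python sketch. It is an interpretation rather than BACKUP's actual code: it assumes that a rewritten block follows its failed copy on tape, so the last copy seen for a given sequence number is the good one. The function name and the (sequence number, payload) tuple layout are hypothetical.

```python
def restore_good_blocks(tape_blocks):
    """Yield one payload per sequence number, keeping the last copy seen.

    tape_blocks is an iterable of (sequence_number, payload) pairs in the
    order they appear on tape. A block is released only when the sequence
    number changes, so an earlier failed copy is silently replaced.
    """
    held = None  # most recent (sequence_number, payload) not yet released
    for seq, payload in tape_blocks:
        if held is not None and seq != held[0]:
            yield held[1]  # sequence number changed: the held block is good
        held = (seq, payload)
    if held is not None:
        yield held[1]  # release the final block at end of tape


# Block 2 was written twice because the first attempt hit a write error.
tape = [(1, "A"), (2, "B (bad write)"), (2, "B"), (3, "C")]
print(list(restore_good_blocks(tape)))
```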

3.4 Error Detection and Correction

By default, VMS BACKUP does both error DETECTION and error CORRECTION. This is because VMS BACKUP must work on a variety of tape and disk devices and yet still maintain data integrity.

3.4.1 Cyclic Redundancy Check

For error DETECTION, BACKUP calculates a CRC (cyclic redundancy check) value, or "checksum," for each data-block. When restoring a VMS save-set, the CRC for each data-block is recalculated and compared with the CRC stored within the data-block. If the two values do not match, BACKUP knows that the data-block is bad. The /CRC qualifier controls the use of this feature. /NOCRC turns off the feature.
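The store-then-recompute idea can be illustrated with a short Python sketch. Python's zlib.crc32 is used here purely as a stand-in; the CRC algorithm and block layout that BACKUP actually uses are not described in this chapter, and the function names and 4-byte trailer are assumptions made for the example.

```python
import zlib

CRC_BYTES = 4  # assumed trailer size for this sketch


def seal_block(data: bytes) -> bytes:
    """Append a CRC of the data, as is done when a block is written."""
    return data + zlib.crc32(data).to_bytes(CRC_BYTES, "big")


def block_is_good(block: bytes) -> bool:
    """Recompute the CRC on read-back and compare it with the stored one."""
    data, stored = block[:-CRC_BYTES], block[-CRC_BYTES:]
    return zlib.crc32(data).to_bytes(CRC_BYTES, "big") == stored


block = seal_block(b"payroll records")
damaged = bytes([block[0] ^ 0x01]) + block[1:]  # flip one bit of the data

print(block_is_good(block))    # the intact block passes the check
print(block_is_good(damaged))  # the damaged block fails the check
```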

3.4.2 Redundancy Group Data

For error CORRECTION, BACKUP uses redundancy group data. This data is used to reconstruct bad data-blocks being restored. A setting of ten (the default) means that if any one data-block in a group of ten data-blocks restored is bad, the one data-block can be reconstructed using the redundancy group data from the surrounding nine data-blocks. The default setting of ten causes BACKUP to devote over 17% of each data-block to redundancy group data. The /GROUP=nn qualifier controls the use of this feature. /GROUP=0 turns off the feature.
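A minimal sketch of how one lost block in a group can be rebuilt follows. It assumes the redundancy data is a byte-wise exclusive-OR (XOR) of the blocks in the group, which this chapter does not state explicitly; the function names are hypothetical and all blocks are assumed to be the same length.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, redundancy_block):
    """Recover the single missing block from the survivors and the XOR block."""
    return xor_blocks(list(surviving_blocks) + [redundancy_block])


# A group of ten 4-byte data-blocks and its redundancy block.
group = [bytes([n] * 4) for n in range(1, 11)]
redundancy = xor_blocks(group)

# Pretend block index 3 failed its CRC check during the restore.
survivors = group[:3] + group[4:]
print(rebuild_missing(survivors, redundancy) == group[3])
```

Under this scheme only one bad block per group can be repaired, which matches the "any one data-block in a group of ten" wording above.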


The following are performance-based suggestions. These suggestions have been tried successfully at Touch Technologies. However, TTI makes no representations or warranties regarding the effectiveness or risks associated with the use of these commands provided by Digital Equipment Corporation.

When using VMS BACKUP, the following BACKUP qualifiers are suggested:

If you have a tape device that does hardware-level error detection (check with the manufacturer...most do), use the /NOCRC qualifier. This will eliminate the extra overhead of both BACKUP and the tape device calculating CRC values.

If you have a tape device that does hardware-level ECC (error correction code), use the /GROUP=0 qualifier. This will eliminate the extra overhead of having both BACKUP and the tape device calculate ECC data and store it on the tape. (TK50/70s and 6250 bpi drives all have hardware-level ECC.)

If you have a high-density tape device (6250 bpi or more), use the /BLOCK=40960 qualifier. This will reduce the stop/start processing that is required for inter-record gap creation. The number 40960 is a multiple of both 512 (a TK50/70 blocksize) and 8192 (blocking used by helical scan tape devices).

If you have a low-density tape device (1600 bpi or less), use the /BLOCK=16384 qualifier. This will reduce the stop/start processing that is required for inter-record gap creation.
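The four suggestions above can be summarized as a small decision helper. This Python sketch is only a restatement of those rules; the function name and its parameters are hypothetical, and densities between 1600 and 6250 bpi are left without a block-size suggestion because the text does not cover them.

```python
def suggested_qualifiers(density_bpi: int, hardware_crc: bool,
                         hardware_ecc: bool) -> str:
    """Return the BACKUP qualifiers suggested in the text for a tape device."""
    qualifiers = []
    if density_bpi >= 6250:
        qualifiers.append("/BLOCK=40960")   # high-density devices
    elif density_bpi <= 1600:
        qualifiers.append("/BLOCK=16384")   # low-density devices
    if hardware_crc:
        qualifiers.append("/NOCRC")         # device already detects errors
    if hardware_ecc:
        qualifiers.append("/GROUP=0")       # device already corrects errors
    return "".join(qualifiers)


# 40960 is a whole multiple of both 512 and 8192, as noted above.
print(40960 % 512 == 0 and 40960 % 8192 == 0)

print(suggested_qualifiers(6250, hardware_crc=True, hardware_ecc=True))
print(suggested_qualifiers(1600, hardware_crc=True, hardware_ecc=False))
```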

3.4.3 Tape Devices with Hardware CRC and ECC

The following tape devices do their own hardware-level error detection (CRC) and error correction (ECC):

For these hardware-level CRC/ECC devices, both /NOCRC and /GROUP=0 should be used. Use /NOCRC because the tape device is already generating CRC data and writing it to the tape. Use /GROUP=0 because the tape device is already generating ECC redundancy data and writing it to the tape. A typical BACKUP command might be:

 
  
  $ backup/rewind/image dua1: -  
      mub0:backup/save/BLOCK=40960/NOCRC/GROUP=0  
  

3.4.4 Tape Devices with Minimal Error Correction

Low-density tape devices (800/1600 bpi) perform error detection, but only minimal error correction. For these devices a typical BACKUP command might be:

 
  
  $ backup/rewind/image dua1: -  
      msa0:backup/save/BLOCK=16384/NOCRC  
  


Appendix A
VOCABULARY - MEASURING HOW I/O BOUND YOU ARE

PHYSICAL I/O

When VMS has to go to a device to satisfy an I/O request.

VIRTUAL I/O

When an I/O request can be satisfied from a data cache. No physical I/O is needed.

VIRTUAL MEMORY

VMS is a virtual memory system. Programs can be run that are bigger (have more address space) than actual physical memory available.

I/O REQUEST

When a process requests to read or write data.

CACHE

Contains copies of data recently used by an application.

HOT FILE

A file with a high level of activity (high I/O count).

BIT

Consists of a single 1 or 0.

BYTE

One character. One byte = 8 bits (1's and 0's). "DAN" = 3 bytes.

BLOCK

A block is 512 bytes.

KILOBYTE

Approximately one thousand (1,024) bytes.

MEGABYTE

Approximately one million (1,048,576) bytes.

GIGABYTE

Approximately one billion (1,073,741,824) bytes.
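The size units in this appendix are successive powers of 1,024, which is why each is only "approximately" a round decimal number. The short Python sketch below spells out the values and converts a byte count into 512-byte blocks; the constant and function names are chosen for illustration only.

```python
BLOCK = 512            # bytes in one block
KILOBYTE = 1024        # 2**10 bytes
MEGABYTE = 1024 ** 2   # 1,048,576 bytes
GIGABYTE = 1024 ** 3   # 1,073,741,824 bytes


def blocks_needed(size_in_bytes: int) -> int:
    """Number of 512-byte blocks needed to hold the data, rounded up."""
    return -(-size_in_bytes // BLOCK)  # ceiling division


print(f"{MEGABYTE:,} bytes per megabyte")
print(f"{GIGABYTE:,} bytes per gigabyte")
print(blocks_needed(MEGABYTE), "blocks per megabyte")
print(blocks_needed(len("DAN")), "block for a 3-byte file")
```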


Appendix B
VOCABULARY - REDUCING FILE I/O BOTTLENECKS

SHADOW SET

Data is duplicated on two disks to provide redundancy in case of a disk failure. Also speeds up I/O operations.

HOST BASED DATA CACHING

Uses free memory for high-speed data caching.

RMS LOCAL BUFFERING

RMS can set aside the most commonly used blocks of a file in a memory buffer (cache). These blocks are available only to the process that has them cached. Reads from memory are faster than reading from disk. As the number of RMS local buffers is increased, more I/O requests can be satisfied from this local buffer cache.

RMS local buffers are not shared among users.

RMS GLOBAL BUFFERING

The same as local buffering except that all processes accessing the globally buffered file have access to the cached blocks. (Global data buffers are shared among processes.) Speeds up file reads. As the number of RMS global buffers is increased, more I/O requests can be satisfied from this global buffer cache.

CONTIGUOUS

Physically adjacent; all in one piece.

FRAGMENTED FILE

Files that are not physically contiguous.

FILE DEFRAGMENTATION

Making files contiguous.

DISK DEFRAGMENTATION

A process that causes all the files on the disk to become physically contiguous. Speeds up reads and writes.

RMS FILE CONVERSION

As RMS based files are written to, they become internally fragmented and disorganized. Over time, both read and write operations cause extra physical I/O operations to the RMS file. The Digital-provided CONVERT utility is used to defragment and reorganize RMS files.


Appendix C
VOCABULARY - VMS BACKUP I/O CONSIDERATIONS

DEVICE DRIVER

Software that acts as an interpreter between the operating system (software) and a controller (firmware).

CONTROLLER

Firmware that acts as an interpreter between a device driver (software) and a device (hardware).

PSEUDO DEVICE DRIVER

Software that acts as an interpreter between the operating system and another device driver.

9-TRACK TAPE DRIVE

There are nine tracks of information written to the tape at a time.

TAPE SUB-SYSTEM

Consists of the tape controller (firmware) and the tape drive (hardware).

8MM TAPE DRIVE

Tape drive that holds an 8mm tape. Can store 2.3 gigabytes (billion bytes) of data.

EXABYTE, 8MM, HELICAL SCAN

Synonyms people use to refer to Exabyte 8mm Tape Drive.

DENSITY OF TAPE

Amount of data that can be stored per inch of tape. Measured in BPI (Bits Per Inch), per Track.

GCR

Group Coded Recording. 6250 BPI 9-track tape drives use GCR.

PE TAPE DRIVE

Phase Encoded. 1600 BPI 9-track tape drives use PE.

KB/SEC (KILOBYTES PER SECOND)

1,024 bytes per second (KB/sec).

IRG

Inter-Record Gap. Gap between blocks of data written to tape.

START-STOP TAPE DRIVES

The tape physically stops at the end of a block of data for inter-record gaps, then starts again.

STREAMING TAPE DRIVES

The tape is kept continuously moving as long as there is enough data. If there is NOT enough data, the tape has to stop.

ECC - ERROR CORRECTION CODE

Error Correction Code has two purposes:

ERROR DETECTION

There are two kinds of error detection: parity and CRC.

PARITY

A parity check detects errors.

PARITY DETECTION

Each bit is read from the tape. If the parity is incorrect, there is a parity error.

VERTICAL PARITY

Vertical parity on each byte.

HORIZONTAL PARITY

Horizontal parity on each data block.

CRC and CRC DETECTION

Cyclic Redundancy Check (same as CHECKSUM). Used in ERROR DETECTION.

BACKUP

Data is moved from the immovable disk onto removable tapes and is stored off-site for safe keeping.

IMAGE BACKUP

Backs up the entire disk. Copies all files unless they are flagged not to be backed up.

INCREMENTAL BACKUP

Backs up only those files that have been modified/created since the last time that backup was run.

RESTORE

Restore information - Transfer data from a tape back onto the disk. (This is the real reason that tape backups are made.)

Backup/Restore reads the file headers. The qualifiers that BACKUP used when writing the save-set are stored in the file headers. Backup/Restore then uses the same qualifiers when writing from tape to disk, and it ignores any qualifiers given in the Backup/Restore command.

