Using the DB2 for z/OS LOAD utility

Date: 2023-12-19 23:37:08

Use the LOAD online utility to load one or more tables of a table space. The LOAD utility loads records into the tables and builds or extends any indexes that are defined on them.

If the table space already contains data, you can choose whether you want to add the new data to the existing data or replace the existing data.

The loaded data is processed by any edit or validation routine that is associated with the table, and any field procedure that is associated with any column of the table. The LOAD utility ignores and does not enforce informational referential constraints.

The LOAD utility uses a number of data sets during its operation.

The following table lists the data sets that LOAD uses. The table lists the DD name that is used to identify the data set, a description of the data set, and an indication of whether it is required. Include statements in your JCL for each required data set, and any optional data sets that you want to use. Alternatively, you can use the TEMPLATE utility to dynamically allocate some of these data sets.

Table 1. Data sets that LOAD uses
| Data set | Description | Required? |
| --- | --- | --- |
| SYSIN | Input data set that contains the utility control statement. | Yes |
| SYSPRINT | Output data set for messages. | Yes |
| STPRIN01 | A data set that contains messages from the sort program (usually, SYSOUT or DUMMY). This data set is used when statistics are collected on at least one data-partitioned secondary index. | No (note 1) |
| Input data set | The input data set that contains the data that is to be loaded. Specify its template or DD name with the INDDN option of the utility control statement. The default name is SYSREC. It must be a sequential data set that is readable by BSAM. The input file can be an HFS or zFS file, in which case use a template with the PATH option. | Yes (notes 3, 14) |
| Sort data sets | Two temporary work data sets for sort input and sort output. Specify their DD or template names with the WORKDDN option of the utility control statement. The default DD name for sort input is SYSUT1. The default DD name for sort output is SORTOUT. | Yes (notes 4, 5, 14) |
| Mapping data set | Work data set for mapping the identifier of a table row back to the input record that caused an error. Specify its template or DD name with the MAPDDN option of the utility control statement. The default DD name is SYSMAP. | Yes (notes 4, 6) |
| UTPRINT | Contains messages from the sort program (usually, SYSOUT or DUMMY). | Yes |
| Discard data set | A work data set that contains copies of records that are not loaded. It must be a sequential data set that is readable by BSAM. Specify its DD or template name with the DISCARDDN option of the utility control statement. The default DD name is SYSDISC. | Yes (notes 8, 14) |
| Error data set | Work data set for error processing. Specify its DD or template name with the ERRDDN option of the utility control statement. The default DD or template name is SYSERR. | Yes |
| Copy data sets | One to four output data sets that contain image copy data sets. Specify their DD or template names with the COPYDDN and RECOVERYDDN options of the utility control statement. | No (note 9) |
| FlashCopy® image copies | For table space or index space level copies, a VSAM data set for the output FlashCopy image copy of each partition or piece. For a partition level or piece level copy, a VSAM data set for the output FlashCopy image copy of the partition or piece. | No (note 13) |
| Sort work data sets | Temporary data sets for sort input and output when sorting keys. If index build parallelism is used, the DD names have the form SWnnWKmm. If index build parallelism is not used, the DD names have the form SORTWKnn. | Yes (notes 10, 12) |
| Sort work data sets | Temporary data sets for sort input and output when collecting inline statistics on at least one data-partitioned secondary index. The DD names have the form ST01WKnn. | No (notes 2, 11, 12) |
Note:
  1. STPRIN01 is required if statistics are being collected on at least one data-partitioned secondary index, but LOAD dynamically allocates the STPRIN01 data set if UTPRINT is allocated to SYSOUT.
  2. Required when collecting inline statistics on at least one data-partitioned secondary index.
  3. As an alternative to specifying an input data set, you can specify a cursor with the INCURSOR option.
  4. Required if referential constraints exist and ENFORCE(CONSTRAINTS) is specified (this option is the default).
  5. Used for tables with indexes.
  6. Required for discard processing when loading one or more tables that have unique indexes.
  7. Required if a sort is done.
  8. If you omit the DD statement for this data set, LOAD creates the data set with the same record format, record length, and block size as the input data set.
  9. Required for inline copies.
  10. Required if any indexes are to be built or if a sort is required for processing errors.
  11. If the DYNALLOC parm of the SORT program is not turned on, you need to allocate the data set. Otherwise, the sort program dynamically allocates the temporary data set.
  12. It is recommended that you use dynamic allocation by specifying SORTDEVT in the utility statement because dynamic allocation reduces the maintenance required of the utility job JCL.
  13. Required if you specify either FLASHCOPY YES or FLASHCOPY CONSISTENT.
  14. If the SYSREC data set is on tape, and you use templates for the SYSUT1, SYSOUT, or SYSDISC data sets, include the SPACE parameter in the TEMPLATE utility control statements.
The following object is named in the utility control statement and does not require a DD statement in the JCL:
Table
Table that is to be loaded. (If you want to load only one partition of a table, you must use the PART option in the control statement.)

Defining work data sets

Use the formulas and instructions in the following table to calculate the size of work data sets for LOAD. Each row in the table lists the DD name that is used to identify the data set and either formulas or instructions that you should use to determine the size of the data set. The key for the formulas is located at the bottom of the table.

Table 2. Size of work data sets for LOAD jobs
Work data set Size
SORTOUT max(f,e)
ST01WKnn
  • 2 × (maximum record length × numcols × (count + 2) × number of indexes)
SYSDISC Same size as input data set
SYSERR e
SYSMAP
  • Simple table space for discard processing:
    • m
  • Partitioned or segmented table space without discard processing:
    • max(m,e)
SYSUT1
  • Simple table space:
    • max(k,e)
  • Partitioned or segmented table space:
    • max(k,e,m)
  • If you specify an estimate of the number of keys with the SORTKEYS option:
    • max(f,e) for a simple table space
    • max(f,e,m) for a partitioned or segmented table space
Note: The formulas use the following variables:
k
Key calculation
f
Foreign key calculation
m
Map calculation
e
Error calculation
max()
Maximum value of the specified calculations
numcols
Number of key columns to concatenate when you collect frequent values from the specified index
count
Number of frequent values that DB2® is to collect
maximum record length
Maximum record length of the SYSCOLDISTSTATS record that is processed when collecting frequency statistics. (You can obtain this value from the RECLENGTH column in SYSTABLES.)
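As an illustration of the ST01WKnn formula in Table 2, the following Python sketch evaluates it for one set of inputs. The function name and all of the input figures are hypothetical and chosen only to make the arithmetic easy to follow; they do not come from any real system.

```python
def st01wk_size(max_record_length, numcols, count, number_of_indexes):
    """ST01WKnn size per Table 2:
    2 x (maximum record length x numcols x (count + 2) x number of indexes)
    """
    return 2 * (max_record_length * numcols * (count + 2) * number_of_indexes)


# Hypothetical inputs: a 200-byte maximum record length, 2 key columns
# concatenated, 10 frequent values collected, and 3 indexes.
size = st01wk_size(max_record_length=200, numcols=2, count=10,
                   number_of_indexes=3)
print(size)
```

With these figures the inner product is 200 × 2 × 12 × 3 = 14400, which the leading factor of 2 doubles.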
  • Calculating the key: k

    If a mix of data-partitioned secondary indexes and nonpartitioned indexes exists on the table that is being loaded or a foreign key exists that is exactly indexed by a data-partitioned secondary index, use this formula:

    max(longest index key + 15, longest foreign key + 15) * (number of extracted keys).

    Otherwise, use this formula:

    max(longest index key + 13, longest foreign key + 13) * (number of extracted keys).

    For nonpadded indexes, the length of the longest key means the maximum possible length of a key with all varying-length columns padded to their maximum lengths, plus 2 bytes for each varying-length column.

  • Calculating the number of extracted keys:
    1. Count 1 for each index.
    2. Count 1 for each foreign key that is not exactly indexed (that is, where foreign key and index definitions do not correspond identically).
    3. For each foreign key that is exactly indexed (that is, where foreign key and index definitions correspond identically):
      1. Count 0 for the first relationship in which the foreign key participates if the index is not a data-partitioned secondary index. Count 1 if the index is a data-partitioned secondary index.
      2. Count 1 for subsequent relationships in which the foreign key participates (if any).
    4. Multiply count by the number of rows that are to be loaded.
  • Calculating the foreign key: f

    If a mix of data-partitioned secondary indexes and nonpartitioned indexes exists on the table that is being loaded or a foreign key exists that is exactly indexed by a data-partitioned secondary index, use this formula:

    max(longest foreign key + 15) * (number of extracted keys)

    Otherwise, use this formula:

    max(longest foreign key + 13) * (number of extracted keys)

  • Calculating the map: m

    The data set must be large enough to accommodate one map entry (length = 21 bytes) per table row that is produced by the LOAD job.

  • Calculating the error: e

    The data set must be large enough to accommodate one error entry (length = 560 bytes) per defect that is detected by LOAD (for example, conversion errors, unique index violations, violations of referential constraints).

  • Calculating the number of possible defects:
    • For discard processing, if the discard limit is specified, the number of possible defects is equal to the discard limit.

      If the discard limit is the maximum, calculate the number of possible defects by using the following formula:

      number of input records +
      (number of unique indexes * number of extracted keys) +
      (number of relationships * number of extracted foreign keys)
    • For nondiscard processing, the data set is not required.
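The calculations above can be combined into a small sizing script. The following Python sketch is only an illustration under stated assumptions: the function names, the `dpsi_case` flag, and the workload figures are hypothetical, and results are treated as bytes because the key, map entry, and error entry lengths are given in bytes. It covers k, f, m, e, and the number of extracted keys, then applies the Table 2 formulas for a partitioned or segmented table space.

```python
MAP_ENTRY_BYTES = 21     # one map entry per table row
ERROR_ENTRY_BYTES = 560  # one error entry per detected defect


def extracted_keys(rows, indexes, fks_not_exactly_indexed,
                   exactly_indexed_fks=()):
    """Number of extracted keys, following steps 1 to 4 above.

    exactly_indexed_fks holds one (relationships, index_is_dpsi) pair
    for each foreign key that is exactly indexed.
    """
    count = indexes + fks_not_exactly_indexed        # steps 1 and 2
    for relationships, index_is_dpsi in exactly_indexed_fks:
        if relationships > 0:
            count += 1 if index_is_dpsi else 0       # step 3.1
            count += relationships - 1               # step 3.2
    return count * rows                              # step 4


def key_overhead(dpsi_case):
    """15 when DPSIs are mixed with nonpartitioned indexes or a foreign
    key is exactly indexed by a DPSI; otherwise 13."""
    return 15 if dpsi_case else 13


def key_size(longest_index_key, longest_foreign_key, n_keys,
             dpsi_case=False):
    extra = key_overhead(dpsi_case)
    return max(longest_index_key + extra,
               longest_foreign_key + extra) * n_keys


def foreign_key_size(longest_foreign_key, n_keys, dpsi_case=False):
    return (longest_foreign_key + key_overhead(dpsi_case)) * n_keys


def map_size(rows):
    return MAP_ENTRY_BYTES * rows


def error_size(defects):
    return ERROR_ENTRY_BYTES * defects


# Hypothetical workload: 1,000,000 rows, 2 indexes, 1 foreign key that is
# not exactly indexed, longest index key 20 bytes, longest foreign key
# 10 bytes, and a discard limit of 1,000 used as the number of defects.
rows = 1_000_000
n_keys = extracted_keys(rows, indexes=2, fks_not_exactly_indexed=1)
k = key_size(20, 10, n_keys)
f = foreign_key_size(10, n_keys)
m = map_size(rows)
e = error_size(1_000)

sizes = {
    "SYSUT1": max(k, e, m),  # partitioned or segmented table space
    "SORTOUT": max(f, e),
    "SYSERR": e,
}
for name, size in sizes.items():
    print(f"{name}: {size:,} bytes")
```

In this example the key calculation dominates SYSUT1 because three keys are extracted per row; a table with fewer indexes and a high discard limit could instead be dominated by e.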

Allocating twice the space that is used by the input data sets is usually adequate for the sort work data sets. Two or three large SORTWKnn data sets are preferable to several small ones.

Sort work data sets cannot span volumes. Smaller volumes require more sort work data sets to sort the same amount of data; therefore, large volume sizes can reduce the number of needed sort work data sets. It is recommended that at least 1.2 times the amount of data to be sorted be provided in sort work data sets on disk.
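The sizing guidance for sort work data sets can be turned into a rough estimate. The Python sketch below assumes that, because a sort work data set cannot span volumes, each data set is limited to the usable space on one volume. That limit, the function name, and the capacity figures are assumptions made for illustration, not values taken from DB2 or from any particular device type.

```python
def sort_work_estimate(bytes_to_sort, usable_bytes_per_volume):
    """Return (total_bytes, data_set_count) for sort work space.

    total_bytes is 1.2 times the data to be sorted, rounded up.
    data_set_count assumes one data set per volume at most.
    """
    total_bytes = -(-bytes_to_sort * 12 // 10)                  # ceiling of 1.2x
    data_set_count = -(-total_bytes // usable_bytes_per_volume)  # ceiling division
    return total_bytes, data_set_count


# Hypothetical case: 50 GB to sort.
small_volumes = sort_work_estimate(50_000_000_000, 8_000_000_000)
large_volumes = sort_work_estimate(50_000_000_000, 50_000_000_000)
print(small_volumes, large_volumes)
```

The same amount of sort work space needs several data sets on the smaller volumes but only two on the larger ones, which matches the preference for a few large SORTWKnn data sets.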