Monday, 27 August 2012

DFSORT Tutorial

    • Other names that can be used to invoke SORT are ICEMAN,IERRCO00,IGHRCOOO and SYNCSORT.

    • Sorting can be done from ISPF prompt also. For this open the file to be sorted in edit mode. Then use the command 'SORT 1 22' to sort the file from 1st to 22nd characters.

    • DFSORT is invoked using JCL.

    Sample JCL:

    //SORTIN DD DSN=G1SG00AT.SORTIN1,DISP=SHR Input for sort
    //SORTINnn DD DSN=G1SG00AT.SORTIN2,DISP=SHR Input for merge
    //                 DISP=(NEW,CATLG,DELETE),
    //                 UNIT=SYSDA,
    //                 SPACE=(CYL,(2,2),RLSE),
    //                 DCB=(LRECL=80,BLKSIZE=0,RECFM=FB,DSORG=PS)
    //SORTOFnn DD DSN=G1SG00AT.SORTOUT, Output data sets
    //                 DISP=OLD
    //SORTXSUM DD DSN=G1SG00AT.SORTOUT, Records eliminated by SUM
    //                 DISP=OLD
    //SYSIN DD *<control statements>

    Following DDNAMES are required to code a DFSORT job.

    1. //JOBNAME JOB -->
      Signals the initiation of a job. JOB statement contains all normal parameters as in normal JCL.

    1. //STEPNAME EXEC -->
      Signals the initiation of a step. Tells OS which program is to be executed. For DFSORT the program to be executed is SORT.

    1. //SYSOUT DD -->
      Defines the output dataset for messages.

    1. //SORTIN -->
      Defines the input dataset that needs to be sorted. Input Could be sequential, extended sequential, PDSE member, PDS member or a VSAM dataset. Files can be concatenated if the RECFM is the same for all files. For variable length files the file with largest record length must be first.

    1. //SORTINxx -->
      Used for merge request input files . Up to 100 SORTINnn DD's may be specified. The 'nn' suffix can be any number in the range 00 thru 99; they may be skipped or used out of order. The files must already be sorted in the proper sequence for the MERGE.

    1. //SORTWknn -->
      From 1 to 100 sort work files can be allocated using the SORTWKnn DD statement. The 'nn' suffix can be any number in the range 00 thru 99.SORTWKnn DD's are not used for a MERGE or a copy operation, or if the DYNALLOC parmis used. Each SORTWKnn MUST be on one unit & one volume.

    1. //SORTOUT -->
      Defines the output dataset. It can be sequential, extended sequential data set ,a PDS member or a PDSE member.

    1. //SORTOFnn -->
      Output file for a SORT, MERGE, or copy function. Up to 100 SORTOFnn DD's may be specified. The 'nn' suffix can be any number in the range 00 through 99. They may be skipped or used out of order. The FNAMES/FILES parameter on the OUTFILE control statement specifies the DD names of the output files and the output records that are written to each SORTOFnn data set.

    1. //SORTXSUM -->
      Output file for a SORT or MERGE function. The records eliminated by SUM processing are written to the SORTXSUM DD.

    1. //SYSIN DD * -->
      Defines the sort cards.

    • Sorting data:

    Sorting means Rearranging data in either ascending or descending order.

    SORT FIELDS=(Starting position of field, Length of the field, datatype, Ascending or Descending order)

    SORT FIELDS=(pos1,len1,type1,opt1,pos2,len2,type2,opt2,...)
         {,SIZE=n | En}
         {,FILSZ=n | eN}
         {,DYNALLOC=(OFF) | (d,n){,RETRY=(OFF | (nn,mm)}}
         {,SKIPREC=n}  --> skip these many records and then start the sort operation
         {,STOPAFT=n}  --> Stop after these records
         {,EQUALS | NOEQUALS}
         {,CKPT | CHKPT}{,CENTWIN={0 | s | f}

    Determines whether or not the sort will preserve the order of records with identical sort fields.
    NOEQUALS is the default, and causes equal-keyed records to bewritten in a random order, not in the order they were encountered.


    SORT FIELDS=(16,15,CH,A) --> Sorting on single field(in ascending order)

    SORT FIELDS=(16,15,CH,A,31,15,CH,A) --> Sorting on multiple Fields.

    We can also write the above statement as
    SORT FIELDS=(16,15,A,31,15,A),FORMAT=CH --> Use FORMAT if the format of all fields being sorted is same.

    Following are the available data formats:
    1. EBCDIC character: CH
    1. ASCII character: AC
    1. BINARY : BI (COMP fields)
    1. ZONED Decimal: ZD (numeric fields with USAGE as display)
    1. Packed Decimal : PD (COMP-3)
    1. Alternate Sequence: AQ

    • Merging datasets:

    Merge two or more datasets to form a single dataset.
    Requirements of merging two datasets is:
    1. Datasets should have same LRECL and RECFM
    1. Datasets should be sorted prior to Merge.

    Remember that merging is different from concatenating. Concatenating means data is just appended back to back. With merging sort order is maintained.

    Merge Syntax is :



    MERGE FIELDS=(16,15,A,31,15,A),FORMAT=CH

    The sort order 'A' indicates the order in which the data is already sorted in the input files.

    When using Merge do not use SORTWKnn DD statement.
    Instead of SORTIN DD we use SORTINnn DD. One SORTINnn statement for each of the input file. Nn stands for 01 to 16.

    • Processing Order of control statements:
      1. INCLUDE /OMIT
      1. INREC Statement
      2. SORT/MERGE
      3. OUTREC Statement

    • Copying datasets:

    The SORT/MERGE control statement with FIELDS=COPY defines the application as a COPY application.


    Note that with sort, merge we should always use FIELDS= (even of we are just requesting a copy application)

    Following is the format:

    SKIPREC indicates the number of records to be skipped.(With OUTFIL equivalent is STARTREC. Note that STARTREC can be used only with OUTFIL)

    STOPAFT indicates the number of records to process. (With OUTFIL equivalent is ENDREC. Note that ENDREC can be used only with OUTFIL)

    CKPT/CHKPT indicates a checkpoint is desired at the end of volume of a SORTOUT data set when OUTFIL is not used.

    Note that we could also code OPTION COPY to perform the copy application

    • INCLUDE Statement:

    Used to establish selection criteria for records to be included in the output dataset.

    We can compare the contents of its fields to a constant or another field in the record.

    Include Statement cannot appear with OMIT statement on the same SORT operation.

    Format Of include:

    INCLUDE COND={({c1,{AND | OR},c2},...){,FORMAT=x} }

    • COND=ALL means all input records are to be included.

    • COND=NONE means none of the input records are to be included.

    • AND may be specified as &. OR may be specified as ¦.
    The AND and OR relational operators are used to link together as many repetitions of the 'expression' field as desired.


    SORT FIELDS=(16,15,CH,A)
    INCLUDE COND=(16,15,CH,EQ,31,15,CH) --> comparing with other field in the record

    INCLUDE COND=(46,10,BI,GT,2000,OR,46,10,BI,LT,10000) --> Comparing Against constants

    INCLUDE COND=((10,3,CH,EQ,C'REV'),AND,(67,2,CH,NE,C'GA')) --> Using logical operator AND
    The INCLUDE above will only pass through records with the character string 'REV' inposition 10 and the state abbreviation 'GA' in position 67.

    INCLUDE COND=((21,1,BI,EQ,X'22'),OR,(21,1,BI,EQ,X'47'),OR,(21,1,BI,EQ,X'F3')) --> Using logical operator OR.

    • OMIT statement:
    Used to set up selection criteria for records to be omitted from the output dataset.

    We can omit a record by comparing contents of its fields to a constant or another field in the record.

    Omit Statement cannot appear with INCLUDE on the same SORT execution.

    Format of OMIT:

    OMIT COND={({c1,{AND | OR},c2},...){,FORMAT=x} }


    OMIT COND=(16,15,CH,EQ,31,15,CH)

    OMIT COND=((10,3,CH,EQ,C'REV'),AND,(67,2,CH,NE,C'GA')) OR
    OMIT COND=((10,3,EQ,C'REV'),AND,(67,2,NE,C'GA')),FORMAT=CH

    Comparison Operators that can be used with INCLUDE and OMIT are EQ,NE,GT,GE,LE,LT

    • SUMming fields:

    The SUM control statement deletes records with equal control fields and optionally summarizes specified numeric fields on those records. If numeric fields are to be summarized, the data in the summary fields is added, the sum is placed in one of the records, and the other record is deleted.

    The format of the SUM statement is:
        {FIELDS={NONE} }
    SUM {FIELDS=(p1,l1,f1 {,p2,l2,f2) ... ) }
        {FIELDS=(p1,l1 {,p2,l2) ... ),FORMAT=f } {,XSUM}

    Where XSUM means the dropped records are written in the dataset specified by SORTXSUM data set.

    SORT FIELDS=(31,15,,CH,A)
    SUM FIELDS=(46,10,BI)

    The result will contain a Single row which is top mostrow of the Sorted fields including the condition mentioned in INCLUDE Statement, and the Price Fields will contain the Sum of prices of all books whose Author is MIKE.

    • Eliminating duplicates on a field:


    We could use XSUM if we want to save off the remaining records.


    In this case we must have a additional DD called SORTXSUM. This is where the discards will go.

    • Reformatting output:

    We can perform reformatting using INREC,OUTREC and OUTFIL.

    We can use INREC,OUTREC,OUTFIL to
    1. Delete fields
    2. Reorder fields
    3. Insert strings or blanks
    1. Convert data to hexadecimal representation.
    1. Convert case.
    2. Edit numeric  values
    3. Convert numeric values to another format.
    4. Perform arithmetic operations.
    5. Insert sequence number.
    6. Change record Length.

    INREC: Formatting will be done before sorting , merging or copying of the records is done. Hence we will refer the original file layout when using INREC. In short operation will be performed on the input file.

    Use of INREC improves the sort performance by reducing the number of bytes that must be processed.

    The format of the INREC statement is INREC FIELDS=(...).

    The FIELDS parameter simply identifies the fields that should be processed.

    A sample INREC statement is as follows:
    INREC FIELDS=(1:1,20,21:40,15,ZD,PD,29:60,5)

    In this statement, 3 data fields are specified as follows:

    • The first field begins in byte 1 of the input record and is 20 bytes long.
    It will be in position 1 of the output record.

    • The second field begins in byte 40 of the input record and is a 15-byte ZD field.
    The field will be converted to PD. It will be in position 21 of the output record.

    • The third field begins in byte 60 of the input record and is 5 bytes long.
    It will be in position 29 of the output record.

    IN the above example we used absolute positioning using ":". We can also just specify the position in input file and it will create fields back to back.

    OUTREC: Formatting will be done after sorting or merging is done. In short Operation will be performed on the output file.

    • Delete or repeat segments of the input records.
    • Insert character strings between data fields.
    • Insert binary zeros.
    • Create a sequence number field.
    • Convert numeric data to printable format or to another numeric data format.
    • Perform arithmetic operations (multiply, divide, add, subtract) and minimum and maximum functions with numeric fields and constants.
    • Convert data to printable hexadecimal format.Select, realign, and reorder data fields.
    • Convert a variable length record input file to a fixed length record output file.

    OUTFIL: The OUTFIL control statement describes the output file or files.

    OUTFIL is used for following purposes:

    • Create multiple output files:

    • Use the Sort Writer facility:
    This task uses these parameters: HEADER1, HEADER2,LINES, NODETAIL, SECTIONS, TRAILER1, TRAILER2.

    • Reformatting records: This task uses these parameters: OUTREC.

    The format of the OUTFIL statement is:

    OUTFIL {FILES={fileid}{(fileid1 {fileid2}...}  --> Specifies fileid's . This is used when the output DDNAMES are of type SORTOFnn. The number nn will be used as fileid.
           {,FNAMES={ddname}{(ddname1 {ddname2} ... )} --> Here we specify the actual ddname. This will be used if we don't use the names of output files as SORTOFnn.
           {ALL}{NONE}{ {,AND,} }
           {,INCLUDE={(c1,{,&, } C2... } -->  specifies the include criteria to select records to be copied to a file specified by FILES or FNAMEs
           {,OMIT= { {,OR, } }{ {,¦, } } --> Specified the omit criteria to skip the records from being copied to a file specified by FILES or FNAMES
           {,STARTREC=n} --> Indicates the record number of start working for output file specified by FNAMES or FILES
           {,ENDREC=n} --> Indicates the record number of stop working for output file specified by FNAMES or FILES
           {,SPLIT} --> Split data from input equally among output files
           {,OUTREC=(field1, {,field2} ... )} --> similar to ourrec processing. Note that on OUTFIL we use OUTREC and not INREC.
           {,HEADER1=(field1, {,field2} ... )} --> used for reporting purpose.
           {,HEADER2=(field1, {,field2} ... )}
           {,TRAILER1=(field1, {,field2} ... )}
           {,TRAILER2=(field1, {,field2} ... )}
           {,SECTIONS=(field1, {,field2} ... )}
           {,LINES={n | ANSI | (ANSI,n)}}{,NODETAIL}


    1. Deleting a field:
    OUTREC FIELDS=(16,15,31,15)

    This will Create a output file without the 1st field(1-15characters)

    1. Reordering the fields:
    OUTREC FIELDS=(31,15,16,15)

    Thus the fields will be reordered.

    1. Insert a new field with zeros:

    OUTREC FIELDS=(31,15,16,15,4,4Z)

    1. Inserting Blanks:

    OUTREC FIELDS=(20X,31,15,16X,31,15)

    This will put 20 blank fields before 1st field and 16 blank spaces before 2nd field.

    1. Creating multiple copies of same dataset:


    OUTPUT1,OUTPUT2,OUTPUT3 are the 3 ddnames in the JCL.

    1. Selecting records to different files with STARTREC and ENDREC options


    STARTREC,ENDREC are used only with OUTFIL processing. When not using OUTFIL we can use STOPAFT.

    1. Separating records to different files based upon record contents:


    Note that we don’t write COND when using INCLUDE on the OUTFIL.

    1. BUILD command:

    Build command can us be used with OUTREC and INREC to recreate the record layout. OVERLAY can be used to keep the existing layout same and add/replace some data column.


    SEQNUM can be used to add sequence numbers to rows.

    1. SPLIT

    Suppose we have a file and we don’t know the number of records and we want to divide them equally among 3 files then we could use the SPLIT option as shown below:


    Above example will create 3 files each of 1/3rd  size of the  original file.

    • Editing Numeric values:

    There are Around 27 Predefined Edit masks named M0 to M26, which can be used to convert non Edited data in to the Edited form.

    We can also manually create edited forms:


    SORT FIELDS=(16,15,CH,A)
    OUTREC FIELDS=(5X,16,15,5X,46,8,BI,EDIT=(SIII,III,TT.TT),SIGNS=(,-))

    T:- if there is no value or it is a zero then Zero will be printed Else if there is Some value that value will be printed.
    e.g if EDIT=(TTTT) is give and value is 4 then it will be printed as 0004 in the report.

    I:-If there is no value Zero will not be inserted
    e.g  if EDIT=(IIII) is given and if the value in input file Is 0004 then it will be printed as 4.

    S:-indicated the Sign.
    SIGNS=(,-)  Will Print ±if the input value is negative and does space when it is Positive.