Operations Automation
Part Two - Challenge #13

Background:

Enter the following to begin this challenge -
tso submit jcl(job13a)

Automation of operational procedures can be accomplished with software products or available commands accompanied by a small bit of programming.

Large enterprise computing operations are highly automated. The environment normally runs on autopilot, but when a problem occurs beyond the scope of automated operations, the technical staff must react quickly to identify and resolve it.

A large computing environment's infrastructure is constantly changing as a result of upgrades to hardware, software, network, and business applications. Automated procedures require adjustments to accommodate those changes. A change to the environment can also produce a problem that automation is unable to handle, in which case operations or systems technical staff must identify and resolve the problem.

Therefore, like a fireman or an aircraft pilot, an operator should spend time reviewing the automated procedures. The more the operator learns about the underlying technology, the better equipped the operator is to manage unexpected problems or even find an opportunity to contribute additional automation.

Automation related to security is receiving significant attention as a result of web enablement of critical systems and data that were not previously available to the general public. When someone or something is continuously looking for open doors to access critical data that does not belong to them, action is needed. A couple of decades ago, those doors to the critical data did not exist. Even when the doors are closed, they may not be sufficiently locked. The issue is that data must pass between the highly fortified back end systems and the general public. While authentication, encryption, and other fortifying technologies are applied, the criminally minded are constantly probing for weaknesses to exploit.

Some highly secure computer systems have no network connectivity at all. In others, the network is only able to read while updates are not possible, or the network is only able to send updates while reading existing updates is not possible. Many large enterprise IBM mainframe environments must enable both reading and writing to serve the general public. The IBM mainframe has achieved the highest level of security for such an environment. If you are curious about computing security levels, search the internet for "Common Criteria", "Evaluation Assurance Level, EAL", "Multi-Level Security, MLS", or "Trusted Operating System". IBM z Systems hardware running z/OS has achieved an EAL 5+ rating, which is the best rating achieved by a publicly accessible general purpose computing environment. An example of a computer system rated EAL 6 is one where the operator of the computing system is a military pilot flying the most advanced aircraft.

While IBM z Systems running z/OS has EAL 5+ capabilities, the organization is still responsible for implementing and maintaining security best practices.

Many IBM z Systems z/OS environments are tightly locked down, so designated users can access only what they are explicitly permitted to access. The result is an inability to learn about and explore z/OS capabilities in order to develop z/OS skills. Your participation in the Master the Mainframe contest provides access to capabilities unavailable to many internal technical staff members within a large enterprise operation.

Operations technical staff are trusted staff members. Operations technical staff will have access to capabilities necessary for their job responsibilities that are not available to application development and other Information Technology staff. Operations is a great place to gain the big picture of the organization.

Challenge Scenario

You are a trusted member of the operations staff. Security and systems programming staff have discussed the potential need for additional firewall rules based upon suspected unauthorized external probing of the system. You are instructed to collect data when a specific symptom occurs. The data gives the security and systems programming staff immediate facts about the environment that may prove or disprove their theory and help them make effective firewall adjustments.

As a trusted operator, you are asked to do the following:

  1. Use SDSF SYSLOG facility
    =sd ; log
  2. Enter the following MVS command
    /d net,csmuse,pool=4khcom
      write down detail total for 4KHCOM pool
  3. Enter the following command from either SDSF or ISPF panel
    tso netstat stats
      write down values for:
        - IPV4 Packets Received
        - IPV4 Received Header Errors
        - IPV4 Received Address Errors
        - IPV4 Received Packets Discarded
        - IPV4 ICMP Messages Received
        - IPV4 ICMP Messages Sent
        - Current Established Connections
        - Current Stalled Connections
        - Current Servers in Connection Flood
    Hint: You can copy the entire output of the command. For example, with Vista TN3270, use the Edit pulldown Copy, then repeat Copy Functions > Copy Append to place the entire output on your clipboard.
  4. Wait several minutes, then
        - record a second set of values by repeating steps 1, 2, and 3 above
        - record differences in values
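
The "record differences" step can be sketched in a few lines of Python. This is only an illustration, not the contest's REXX routine. It assumes each counter appears on its own line as "label = number"; the real command output may be laid out differently (for example, the same label can repeat under several section headings), so the pattern would need adjusting. The names parse_stats, diff_stats, RUN1, and RUN2 are hypothetical, and the sample values are invented.

```python
import re

# Assumption: each counter is on its own line as "<label> = <integer>".
# Real command output may differ; adjust the pattern to what you captured.
STAT_LINE = re.compile(r"^\s*([A-Za-z][A-Za-z0-9 ]*?)\s*=\s*(\d+)\s*$")


def parse_stats(text):
    """Return {label: value} for every 'label = number' line in text."""
    stats = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def diff_stats(first, second):
    """Return {label: second - first} for labels present in both runs."""
    return {label: second[label] - first[label]
            for label in first if label in second}


# Hypothetical captures from two runs; the numbers are made up.
RUN1 = """
Packets Received                 = 1000
Received Header Errors           = 0
Current Established Connections  = 12
"""
RUN2 = """
Packets Received                 = 1450
Received Header Errors           = 2
Current Established Connections  = 15
"""

if __name__ == "__main__":
    deltas = diff_stats(parse_stats(RUN1), parse_stats(RUN2))
    for label, delta in deltas.items():
        print(f"{label}: {delta:+d}")
```

Keeping parsing separate from differencing means the same diff_stats can be reused for the 4KHCOM pool total once that value is extracted in whatever way fits its output.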

Note: MVS command /d tcpip,,netstat,stats produces the same output as tso netstat stats.

Challenge:

The operational symptoms described by security and systems programming staff are occurring several times a day, which makes collecting the requested data a repeated, labor intensive task. Operations staff decided to automate the above procedure. However, the automated procedure has a syntax error that must be corrected before it will work.

You previously submitted JCL(JOB13A), which accomplished steps 1, 2, and 3 outlined in the above challenge scenario. JCL(JOB13A) wrote the output from the first execution of the commands into hlq.CH13.OUTPUT(RUN1).

JCL(JOB13B) writes the output from the second execution of the commands into hlq.CH13.OUTPUT(RUN2). A REXX routine reports the differences in the values of interest to security and systems programming staff.

The REXX routine works properly. If you care to view the REXX routine, browse ('b') ZOS.MTM2017.PUBLIC.CLIST(CH13).

Big hint: The REXX routine assumes correct command syntax producing command output. Review hlq.CH13.OUTPUT(RUN2) after submitting JOB13B for more useful problem symptoms.

Enter the following -
tso submit jcl(job13b)

Use SDSF to review the JOB13B output. Enter ? to the left of JOB13B, then select (s) DDNAME SYSTSPRT with the associated StepName REPORT.

Observe that the SYSTSPRT output has no values for the second execution of the commands. For example, the literal 'R2.4KHCOM.POOL' appears in the report where the value for the 4KHCOM pool should be. The REXX routine uses the variable R2.4KHCOM.POOL to store and report that value, and other variables in the REXX routine also begin with the R2. prefix. If the output contains literals that begin with R2., something is still incorrect. The REXX routine is NOT the problem: it reads hlq.CH13.OUTPUT(RUN2), and the content of RUN2 is incorrect as a result of an error in JOB13B.
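
The literals appear because an unassigned REXX variable evaluates to its own name in uppercase, so a value that was never read from RUN2 prints as the variable name itself. A quick way to confirm whether a report is "good" is to scan it for leftover tokens that start with R2. The Python sketch below illustrates that check; it is not part of the challenge materials. The function name unresolved_placeholders and the sample report layout are hypothetical, and the real SYSTSPRT output will be formatted differently.

```python
def unresolved_placeholders(report_text, prefix="R2."):
    """Return whitespace-separated tokens in report_text that start with prefix.

    Surrounding quotes and commas are stripped before the comparison so that
    a quoted literal such as 'R2.4KHCOM.POOL' is still detected.
    """
    found = []
    for line in report_text.splitlines():
        for token in line.split():
            cleaned = token.strip("'\",")
            if cleaned.startswith(prefix):
                found.append(cleaned)
    return found


# Hypothetical report text; the real report layout is an assumption here.
SAMPLE_REPORT = """
4KHCOM pool total, run 1: 120
4KHCOM pool total, run 2: R2.4KHCOM.POOL
"""

if __name__ == "__main__":
    leftovers = unresolved_placeholders(SAMPLE_REPORT)
    if leftovers:
        print("Report still has unresolved values:", ", ".join(leftovers))
    else:
        print("No R2. literals found.")
```

An empty result only shows that no R2. literals remain; the report values should still be reviewed by eye before saving.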

JCL(JOB13B) is creating hlq.CH13.OUTPUT(RUN2). Therefore, a correction to JCL(JOB13B) will produce a good report.

Once a good report is produced, write the report to hlq.P2.OUTPUT(#13) as follows:

  • Enter xdc to the left of DDNAME SYSTSPRT associated with StepName REPORT as follows:

  • Another panel is displayed enabling member #13 to be written to p2.output as follows:
    BLKS in the Space units field is required. However, the values below Space units (BLKS) will be ignored if Disposition is shr. In the screen shot below, all attributes in red must be specified; then press Enter.

Successful completion will be a valid report in hlq.P2.OUTPUT(#13)
