9 Steps to Agile Mainframe Ops - NEDB2UG Steps to... · 2019-10-04 · 9 Steps to Agile Mainframe...


1

9 Steps to Agile Mainframe Ops Getting Maximum Long-term Value from Your Core Systems

Keith Sisson, Dave Seibert

September 2019

2

DevOps?

“DevOps is not a technology challenge. Although technology plays a part, DevOps is largely about people, culture and process.”

3

Why Now?

I. Mainframes are critically important

II. Development teams are moving faster

III. Big changes require time

4

Time to Market

5

Digital Age = Fast Beats Slow

5

#1 concern: being caught flat-footed by competitors AND not being able to respond

6

Waterfall

TIME

Planning Design Development QA

Ops First Look

TIME

7

[Figure: Waterfall vs. Agile over time. Waterfall runs Planning, Design, Development and QA once, in sequence. Agile repeats short Planning, Design, Development and QA cycles, with regular demos along the way.]

8

What about us?

Understand dev part of DevOps … What about ops?

Where do we start?

• Concrete steps

• Reasoning

• Success Indicators

9

1. Mainframe stewardship

2. Understand mainframe in digital age

3. Seamless integration

4. Culture of cooperation

5. Success metrics

6. Process improvement

7. Problem resolution

8. Automation

9. Promote your success

Mainframe DevOps From an Ops Perspective

10

MAINFRAME STEWARDSHIP

Step 1:

11

Ignoring Situation Usually Doesn’t Make It Better

• Mainframe is here to stay

• Need people to support platform

• Need explicit executive support

12

John Deere Combine

13

14

Five Levels of Onboarding

1. Physical Needs: Good salary, nice place to work

2. Safety: Job security and career advancement

3. Love/Belonging

• Working on big important projects

• Explore new techniques

• Collaboration with other teams

4. Esteem/Confidence

• Training

• Creating automation

• Solving big problems

5. Self Actualization: Realization of your full potential

15

Maslow’s Hierarchy of Needs: Self Actualization

Encourage them to make impact in system and in organization

Encourage Initiatives

• Open-minded

• Responsibility

• Lead projects

• Change organizational systems

• Improve own working environment

Multi-disciplinary Training

• Self learning

• Training in other fields

• Certifications

16

“Business leaders are broadly:

Type 1: Passionate product innovators

Type 2: Passionate about making the company run well—process oriented.”

- Ben Horowitz

“Software is eating the world, in all sectors. In the future every company will become a software company.”

- Marc Andreessen

17

1. Exploiters: Recognize mainframe as valuable asset

2. Status Quo: Mainframe must go slow

3. Cloud Only: Get off mainframe

Camps of Thought Around Mainframe

Mainstream the mainframe

18

• Defined commitment to the mainframe

• Mission champions and allies are found

• Hire new people onto mainframe teams

Success Indicators

19

UNDERSTAND THE MAINFRAME IN THE DIGITAL AGE

Step 2:

20

1964 Chevy Corvette

21

2019 Chevy Corvette ZR1

22

The Mainframe Simply Works

• People don’t really understand mainframe’s power

• Sometimes overlooked precisely because it always works

23

Legacy technology

No new developers to support platform

Can’t be Agile

Running code from 1960s

Expensive to operate

24

25

26

UK Bank Success Story

• Each developer delivers 36 story points per release, as opposed to 8

• Reduced time spent writing and checking code during each sprint from average of 40 hours to 7 hours

• Project delivery timeframe cut by 1/3

• Reduced time to test large applications from 2 weeks to 5 minutes

• Removed manual effort and enabled creation of repeatable tasks in Jenkins

• Overall developer productivity increased by 400%

26

“We are innovating faster on the mainframe than we are in our digital channels, which is just incredible”

27

• Stakeholder confirmation and understanding

• Champions and allies identified and ready to help

• Explicit confirmation of the understanding of the mission and its importance to the success of the business

Success Indicators

28

SEAMLESS INTEGRATION

Step 3:

29

Two-platform IT

• Enterprise platform strategy we’ve employed for 4+ years

• Helped us save $5 million annually

• Focus on continuously improving customer experience, instead of demonizing particular platform

• Most mission-critical workloads and strategic IP leverage mainframe

• Compulsory apps not providing competitive advantage are server-based

• Compulsory apps became SaaS, cloud-consumed service

30

Our Data Center

31

IBM Commercial (2006)

32

• Moving on-prem x86 to cloud SaaS saves $

• Re-writing or re-platforming does not save $

• Migrating strategic IP to cloud costs millions, takes years, introduces new risks and results in failure or reduced service levels

• Even if successful, only at same place you started!

• Executive sponsorship

• Business unit buy in and co-lead

Two-platform IT

33

REST Web Service Server

REST Web Service Server

HTTP Response

HTTP Request

{"name":"Product",
 "properties":
 { "id":
   { "type":"3090",
     "description":"hammer",
     "required":true
   },
   "name":
   { "description":"deluxe driver",
     "type":"implement",
     "required":true
   },
   "price":
   { "type":"11.49",
     "minimum":0,
     "required":true
   },
   "tags":
   { "type":"array",
     "items":
     {
       "type":"red"
     }
   }
 }}

JSON Object

REST Web Service Client

Create HTTP PUT

Read HTTP GET

Update HTTP POST

Delete HTTP DELETE

REST Web Service Server

RESTful API

34

Accepted standard for what “cloud service” is:

• On-demand self-service

• Broad network access

• Resource pooling

• Rapid elasticity

• Measured service

*NIST publication SP800-145

Source: http://www.redbooks.ibm.com/redbooks/pdfs/sg248347.pdf

• Distributed caching solution

• Unique identifier generator

• Crypto service

• NoSQL data store

All accomplished on their mainframe!

35

Native CICS APIs

CICS APIs that satisfy requirement for HTTP-based communications for most cloud services:

EXEC CICS WEB EXTRACT

Gets HTTP information about incoming requests

EXEC CICS WEB RECEIVE

Gets request information from consumers

EXEC CICS WEB SEND

Sends response to consumer

Also useful APIs:

EXEC CICS WEB READ HTTPHEADER

Gets HTTP header fields and information

EXEC CICS WEB WRITE HTTPHEADER

Creates HTTP header fields and information

REST APIs

36

//O&number JOB (TECH)
//*
// EXPORT SYMLIST=ORDER
// SET ORDER='&number'
//*
//PROCSTEP EXEC PRODUCTP
//*
//SYMBOLIC.SYSUT1 DD *
Description &description
Type &type
Price &price

JSON to JCL

{"name":"Product","properties": {"Order":{"Number":"0001405","required":true},
 "id":{"id":"3090","description":"hammer","required":true},
 "name":{"description":"deluxe","type":"implement","required":true},
 "price":{"price":"11.49","minimum":0,"required":true}
}}

//O0001405 JOB (TECH)
//*
// EXPORT SYMLIST=ORDER
// SET ORDER='0001405'
//*
//PROCSTEP EXEC PRODUCTP
//*
//SYMBOLIC.SYSUT1 DD *
Description hammer
Type implement
Price 11.49
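The JSON-to-JCL substitution on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not the presenters' implementation: the function name `json_to_jcl`, the `{placeholder}` template syntax, and the choice of which JSON field feeds which JCL value are assumptions read off the before/after text on the slide.

```python
import json

# JCL skeleton from the slide; {placeholders} stand in for the slide's &variables.
JCL_TEMPLATE = """\
//O{number} JOB (TECH)
//*
// EXPORT SYMLIST=ORDER
// SET ORDER='{number}'
//*
//PROCSTEP EXEC PRODUCTP
//*
//SYMBOLIC.SYSUT1 DD *
Description {description}
Type {type}
Price {price}
"""

# The JSON object shown on the slide.
ORDER_JSON = """
{"name":"Product","properties": {"Order":{"Number":"0001405","required":true},
 "id":{"id":"3090","description":"hammer","required":true},
 "name":{"description":"deluxe","type":"implement","required":true},
 "price":{"price":"11.49","minimum":0,"required":true}
}}
"""

def json_to_jcl(payload: str) -> str:
    """Fill the JCL skeleton from the JSON payload (field mapping is assumed)."""
    props = json.loads(payload)["properties"]
    return JCL_TEMPLATE.format(
        number=props["Order"]["Number"],
        description=props["id"]["description"],
        type=props["name"]["type"],
        price=props["price"]["price"],
    )

jcl = json_to_jcl(ORDER_JSON)
print(jcl)
```

In a real service the generated text would then be handed to whatever job submission mechanism the site uses; that part is out of scope here.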

37

Services vs. Functions vs. IP

Services = Generic operations that provide valuable utility. Example: Generating unique value or temporarily storing data

Functions = Provide specific operation or result. Example: Date difference calculation, string manipulation, data/numeric conversions

IP = Code that uses logic, design and creativity to produce results that often provide strategic value

Q. What should I run on the mainframe?

A. ALL OF IT! (*anything with a lot of transactions)

38

How to Do Two-platform IT

Step 1: Conduct Audit

• Analyze all apps running

• Separate compulsory from business-critical

• Business-critical stay on mainframe

Step 2: Identify Risk

• Impact from failure

❑ Cost of failure?

❑ Who is impacted?

❑ Workarounds available?

Step 3: Have Business Units Make Functional Decisions

• Elect subject matter expert (SME) to understand business unit needs

• Partner IT with SME and vet process to ensure right decisions are made

Step 4: Decide What to Do with Data from Deprecated Technology

Have to make it accessible in some way

• Port it into new system

• Archive it in some new system

• Keep old system (eliminates some savings)

39

Step 5: Conduct Intensive Audit on Data Quality

• Begin to uncover hidden constraints

• Benefit: Data is ultimately scrubbed = fresh start

Step 6: Roll Out

• Train everyone on how to use new system once up and running

• Burn the boats

Step 7: Reallocate IT and Other Business Resources

• Not cost-cutting exercise

• Re-allocate people and resources

How to Do Two-platform IT

40

Success Indicators

• Realization that big things can be accomplished

• No longer have groups bigoted towards certain platforms

• Stop wasting time moving workload for no real value

• Focus shifts to innovation

41

INITIATE A CULTURE OF COLLABORATION

Step 4:

42

Zero Tolerance for Silos

You can no longer afford to manage your mainframe environment as an isolated silo apart from your cross-platform enterprise.

Normalization will reduce your costs, improve your outcomes and extend the useful life of your mainframe for decades to come.

43

Agile Input

Three Amigos

Product Council

Sprint Planning and Estimation

Sprint Review and Demo

Retrospective

Sprint Cycles

44

Three Amigos

Operations

Bringing new perspectives

Product Owner

Developer

Customer

DBA Team

Marketing

45

Agile Methodology Kanban

Backlog Accepted

Next to

WIP

WIP

Next to Accepted

46

• Test automation

• Packaging

• Security

• Performance tests/benchmarking

• Build automation

• Deployment updates

• Storage demands

• Capacity

Development Release Backlog

47

Agile Methodology Kanban

Backlog Accepted WIP

48

Success Indicators

• Team problem-solving without finger-pointing

• Increased use of shared digital tools

• Tangible increases in collaborative ‘face-to-face’ meetings

• Shared and/or aligned KPIs

49

SUCCESS METRICS

Step 5:

50

KPI Success Factors

• Communicate KPIs clearly to team: Won’t effect much change if people don’t explicitly know how they are being measured

• Velocity and efficiency are as important as quality; place equal importance on measuring and continuously improving all 3

• Set small number of initial goals that are challenging within current processes but achievable by embracing new methods

51

Don’t Be Stuck in “Department of No” Mode

• Don’t solely be centered around “five 9s” availability and tight time-windows for batch completion.

• What is most important to you?

• What is your biggest constraint to moving faster?

52

SWOT Chart

Strengths Weaknesses

Opportunities Threats

53

Things take too long.

When will the new “X” be ready?

Measure the elapsed time from

“Go” to “Done”

Velocity
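Measuring elapsed time from “Go” to “Done” can start with nothing more than two timestamps per work item. The sketch below is a hypothetical illustration: the item IDs, dates and the `elapsed_days` helper are invented, and a real shop would pull the timestamps from its own tracking tool.

```python
from datetime import datetime
from statistics import mean, median

# Hypothetical work items: (id, "Go" timestamp, "Done" timestamp).
ITEMS = [
    ("REQ-1", "2019-09-02 09:00", "2019-09-06 17:00"),
    ("REQ-2", "2019-09-03 09:00", "2019-09-04 09:00"),
    ("REQ-3", "2019-09-05 12:00", "2019-09-12 12:00"),
]

FMT = "%Y-%m-%d %H:%M"

def elapsed_days(go: str, done: str) -> float:
    """Calendar days between the 'Go' and 'Done' timestamps."""
    delta = datetime.strptime(done, FMT) - datetime.strptime(go, FMT)
    return delta.total_seconds() / 86400

# Elapsed time per item, plus two simple summary figures to track over time.
lead_times = {item: elapsed_days(go, done) for item, go, done in ITEMS}
average_days = mean(lead_times.values())
median_days = median(lead_times.values())

for item, days in lead_times.items():
    print(f"{item}: {days:.1f} days")
print(f"average {average_days:.1f} days, median {median_days:.1f} days")
```

Tracking the median alongside the average helps keep one unusually long item from hiding the typical experience.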

54

High priority work takes a backseat to emergencies!

It’s hard to anticipate emergencies, because they are, by definition, unplanned work

People utilized at 100 percent do not have the capacity to handle unplanned or urgent requests without dropping other high-priority work

WIP

55

Your top priorities may not align with other teams’ top priorities

When a decision is made to do one thing, you are delaying something else

People may just be unaware that there will be an impact

Alignment

56

The process to get things approved slows down everything

Process Efficiency

Change control, ITIL, CMM, Quality Gate, etc

57

We have to put on maintenance, install a new release, defrag DASD, improve performance…

Technical Debt will eventually catch up to you.

Melvin Conway’s 2nd law: “There is never enough time to do something right, but there is always enough time to do it over”

Technical Debt

58

Measure and track the “Flow Rate”:

Where was the time spent from “Go” to “Done”?

Categorize and measure the time that is spent on unplanned work

Categorize and measure the time that is spent on high priorities of others

Measure the time that is spent waiting on process

https://flowframework.org Dr. Mik Kersten

Big Three
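One way to answer “where was the time spent from Go to Done?” is to tag logged hours with the categories named on this slide and report each as a share of the total. This is a rough sketch under stated assumptions: the hours are invented, the `flow_breakdown` helper is hypothetical, and the “planned work” category is added here only so the shares sum to 100 percent.

```python
from collections import defaultdict

# Hypothetical time log for one work item: (category, hours).
TIME_LOG = [
    ("planned work", 10.0),             # assumed category, not on the slide
    ("unplanned work", 6.0),
    ("others' high priorities", 4.0),
    ("waiting on process", 20.0),
]

def flow_breakdown(log):
    """Return each category's share of total elapsed hours, in percent."""
    totals = defaultdict(float)
    for category, hours in log:
        totals[category] += hours
    grand_total = sum(totals.values())
    return {c: round(100 * h / grand_total, 1) for c, h in totals.items()}

breakdown = flow_breakdown(TIME_LOG)
for category, share in breakdown.items():
    print(f"{category}: {share}%")
```

In this made-up log, half of the elapsed time is spent waiting on process, which is the kind of roadblock the measurement is meant to surface.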

59

zAdviser

60

Success Indicators

• Basic understanding of flow roadblocks

• Improvement ideas, especially when waiting

• Roadmap created for successive rounds of process improvements

61

CONTINUOUS, QUANTIFIED PROCESS IMPROVEMENT

Step 6:

62

• Change control

• Compliance

• Security

• Audit

• Performance tests/benchmarking

• Performance monitoring

• Customer/user support

• Test automation

• Packaging

• Build automation

• Automatic deployment

• Automatic back-out

• Deployment updates

• Training

• Marketing

• Legal

DevOps

63

Insert Existing Processes into the Flow

Three Amigos

Product Council

Sprint Planning and Estimation

Sprint Review and Demo

Retrospective

Sprint Cycles

Change Control

Customer Support

Automation

Performance Monitoring

Long-range Planning

Auditing/Compliance

Operations

64

DevOps

Planning

Design

Development

QA

PO

Operations

Dev

QA

TIME

Systems

Network

Customer Success

Security/Audit

Data

65

10 Items to Consider around Process

1. Continuously examine your processes and eliminate meaningless or low-value ones

2. Create similar processes for all of development

3. Align approval and vetting processes with your Agile development schedule

4. Accomplish as much as you can using collaboration tools like Confluence, Slack or HipChat

5. Operational change control should be easier

6. Automate testing process

7. Automate deployment process

8. Have reliable method to verify code that was tested is same code that was deployed in production

9. Be able to quickly and automatically back out failed deployments

10. Train and constantly reinforce
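Item 8 above (verifying that the code tested is the code deployed) is commonly approached by comparing cryptographic fingerprints of the two artifacts. The following is a minimal sketch of that general technique, not a description of any particular product: the function names are hypothetical and the artifacts are stand-in byte strings rather than real load modules.

```python
import hashlib

def fingerprint(artifact: bytes) -> str:
    """SHA-256 digest of an artifact's bytes, as a hex string."""
    return hashlib.sha256(artifact).hexdigest()

def same_artifact(tested: bytes, deployed: bytes) -> bool:
    """True when the deployed bytes are identical to the tested bytes."""
    return fingerprint(tested) == fingerprint(deployed)

# Stand-in artifacts for illustration only.
tested_build = b"PAYROLL module build 17"
deployed_ok = b"PAYROLL module build 17"
deployed_changed = b"PAYROLL module build 17 (hot fix)"

print(same_artifact(tested_build, deployed_ok))       # identical bytes
print(same_artifact(tested_build, deployed_changed))  # bytes differ
```

Recording the fingerprint at test time and checking it again at deployment time gives an auditable record that nothing changed in between.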

66

Success Indicators

• 3-5 KPIs that align with improving quality, velocity and efficiency

• New incentives that support KPIs

• Eliminate incentives that are counterproductive

• Get explicit feedback

67

PROBLEM RESOLUTION: “SHIFT LEFT”

Step 7:

68

Business Unit Wants It Fast

99.999 SLA

Automated Continuous Delivery

99.999 goal

FASTSLOW FASTSLOW

Speed and Coordination

69

Continuous Deployment

Continuous Delivery

Continuous Integration

Development

1. Analyze, edit and compile code

2. Unit test

Automation

• Ideation

• Product management

3. Code quality review

4. Product build

5. Integration test

6. Acceptance test

7. Deploy to production
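The seven numbered steps can be pictured as an ordered pipeline in which a failure at any stage stops everything after it, so problems surface as early ("as far left") as possible. The sketch below is a toy model under that assumption: `run_pipeline` is a hypothetical helper, and each stage is a stand-in function returning pass or fail rather than a real build or test tool.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[], bool]]

def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order; stop at the first failure.

    Returns (overall success, names of the stages that completed).
    """
    completed = []
    for name, check in stages:
        if not check():
            return False, completed
        completed.append(name)
    return True, completed

# Stage names follow the slide; the pass/fail results are simulated.
STAGES: List[Stage] = [
    ("Analyze, edit and compile code", lambda: True),
    ("Unit test", lambda: True),
    ("Code quality review", lambda: True),
    ("Product build", lambda: True),
    ("Integration test", lambda: False),  # simulated failure
    ("Acceptance test", lambda: True),
    ("Deploy to production", lambda: True),
]

ok, done = run_pipeline(STAGES)
print("pipeline passed" if ok else f"stopped after: {done[-1]}")
```

Because the simulated integration test fails, the last two stages never run and nothing reaches production.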

70

71

AutoPLEX

Deploy

V16 | V17

Test Assets

Hiperstation Batch Jobs

V16 | V17 | V18

V16 | V17

V16 | V17

V16 | V17

V16 | V17

ZAPI

Test Results

Email with URL

CW01

V16 | V17

72

CW13/CW14 ISPW Promote/Deploy Process

Promote to PROD

Promote to REGR

Promote to INTG

Promote to PTFS

– Deployed to Regression libraries

– Automated tests run with product REGR libraries

– Deployed to Integration libraries

– Automated tests run with product INTG libraries

– Approved PTFs auto applied to targets

– Automated tests run with product SMP/E target libraries

– Automated SMP/E product build-install

– Automated tests run against all installed product builds

73

74

75

76

77

78

Success Indicators

• Measurable shift of QA tasks to earlier in software lifecycle

• Reduced occurrence of specific issues and alert types later in lifecycle

• Fewer customer-discovered defects

79

INCREASE AUTOMATION

Step 8:

80

• Primary focus = eliminating line separating development and operations/system programming efforts

• Enable functions formerly considered domain of operations or systems personnel and extend to dev teams with auditable, secure means

• Will not discuss integration of application database design changes into DevOps methodology

How Db2 Fits into Our DevOps Strategy

81

• Db2 tables contain our repository of PTF metadata

• Interfaces to Jira software for requirements and problem reports

• Use queries to build:

– Lists of PTFs that are customer requests

– Lists of PTFs that customers do not have installed

• Recursive Rexx and SQL

– When customer requests 1+ PTFs, our programs retrieve available prerequisite PTFs for requested PTFs

Db2 in Receive Order

82

WITH PREREQPTF (PTF, PREREQ) AS
(
  SELECT PRE.PTF_ID, PRE.PREREQ_PTF_ID
  FROM PTF_PREREQ PRE
  WHERE PRE.PTF_ID = 'SBG009A'
  UNION ALL
  SELECT CHILD.PTF_ID, CHILD.PREREQ_PTF_ID
  FROM PREREQPTF PARENT, PTF_PREREQ CHILD
  WHERE PARENT.PREREQ = CHILD.PTF_ID
)
SELECT DISTINCT PTF, PREREQ
FROM PREREQPTF
ORDER BY PTF, PREREQ

Recursive SQL to get pre-reqs
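To see how the recursive query walks a prerequisite chain, it can be tried against a small in-memory table. The sketch below uses Python's `sqlite3` module as a stand-in for Db2, so it shows the shape of the technique rather than the production setup: the `WITH RECURSIVE` spelling is SQLite's, the `prerequisites` helper is hypothetical, and every PTF ID other than SBG009A is invented for the demonstration.

```python
import sqlite3

# Same logic as the slide's query, parameterized on the starting PTF.
PREREQ_QUERY = """
WITH RECURSIVE PREREQPTF (PTF, PREREQ) AS (
    SELECT PRE.PTF_ID, PRE.PREREQ_PTF_ID
    FROM PTF_PREREQ PRE
    WHERE PRE.PTF_ID = ?
    UNION ALL
    SELECT CHILD.PTF_ID, CHILD.PREREQ_PTF_ID
    FROM PREREQPTF PARENT, PTF_PREREQ CHILD
    WHERE PARENT.PREREQ = CHILD.PTF_ID
)
SELECT DISTINCT PTF, PREREQ
FROM PREREQPTF
ORDER BY PTF, PREREQ
"""

def prerequisites(conn: sqlite3.Connection, ptf_id: str):
    """Return every (PTF, PREREQ) pair reachable from ptf_id."""
    return conn.execute(PREREQ_QUERY, (ptf_id,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PTF_PREREQ (PTF_ID TEXT, PREREQ_PTF_ID TEXT)")
conn.executemany(
    "INSERT INTO PTF_PREREQ VALUES (?, ?)",
    [
        ("SBG009A", "SBG008A"),  # invented sample data
        ("SBG008A", "SBG007A"),
        ("SBG008A", "SBG006A"),
        ("SBX001A", "SBX000A"),  # unrelated chain, should not appear
    ],
)

chain = prerequisites(conn, "SBG009A")
for ptf, prereq in chain:
    print(ptf, "requires", prereq)
```

The sample data has no cycles. With `UNION ALL`, circular prerequisite data would recurse without end, so real metadata would need either a guarantee of no cycles or a depth limit.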

83

Do until PreReqs_done
  l = l+1
  if ptot = ptfs.0 then resultlist = currPTFList
  SQLSTMT = ,
    " SELECT DISTINCT PREREQ_PTF_ID ",
    " FROM "creator".PTF_PREREQ ",
    " WHERE PTF_ID IN ",
    " ( " ,
    currPTFList,
    " ) ",
    " AND PREREQ_PTF_ID NOT IN ( ",
    resultList ")"

  "EXECSQL DECLARE C1 CURSOR FOR S1"
  "EXECSQL PREPARE S1 FROM :SQLSTMT"
  if sqlcode <> 0 then /* & sqlcode <> 347 then */
    do;DB2Function = 'Prepare Statement'; call Display_Sqlca;exit 12;end
  "EXECSQL OPEN C1"
  if sqlcode <> 0 then
    do;DB2Function = 'Open cursor'; call Display_Sqlca;exit 12;end
  do i= 1 to 9999 until SQLCODE <> 0
    "EXECSQL FETCH C1 INTO :PTFID"
    select
      when sqlcode = +100 & i=1 then
        if ptot>0 then

Recursive Rexx & SQL

84

• Receive from network uses same Db2 metadata tables as Receive Order

• Receive from network typically retrieves entire product and all maintenance

• Our experience: Used less often for maintenance only; handled through receive order

Db2 in Receive from Network

85

• Db2 tables of nightly backup data for critical resources

– Dsname

– Volume serial

– Backup date

– Backup type

• Merge data from VTOC and Catalog information

• 24 million row table

• Replaced SAS database with Db2

• Tables queried to prep for our DR tests

• Tables also queried to build JCL for emergency DS recovery

Db2 in Disaster Recovery

86

SQL to populate DR main table

87

Before & after SQL

Having altered the logging attribute, we need an image copy to make the table recoverable.

88

WITH BACKUPS AS

(SELECT BACKUP_DATE,BACKUP_TYPE,COUNT(*) COUNTS

FROM DR.RECOV_ALL_REPORT

GROUP BY BACKUP_DATE , BACKUP_TYPE

)

SELECT COUNT(*), BACKUP_TYPE, ' '

FROM BACKUPS

GROUP BY BACKUP_TYPE

UNION ALL

SELECT * FROM

(SELECT COUNTS,BACKUP_DATE,BACKUP_TYPE

FROM BACKUPS

ORDER BY BACKUP_TYPE,DATE(BACKUP_DATE) DESC)

UNION ALL

SELECT SUM(COUNTS) ,' ',' '

FROM BACKUPS

-- S.B CW01 D=003 W=006 M=036

-- FROM DISASTER.MASTER.FILE(GDG)

Common Table Expression to query the DR table

This is a query I created to answer three questions I have of the DR data.

I want to avoid traversing the 24 million rows 3 times. I found (finally) a use for a CTE in the real world.

From the 24 million rows, I want to query:

1. The number of backups by type

2. The number of datasets on each backup date by backup type

3. The total number of datasets backed up

89

-- Define the CTE
WITH BACKUPS AS
 (SELECT BACKUP_DATE,BACKUP_TYPE,COUNT(*) COUNTS
  FROM DR.RECOV_ALL_REPORT
  GROUP BY BACKUP_DATE , BACKUP_TYPE
 )
-- Get backup type counts
SELECT COUNT(*), BACKUP_TYPE, ' '
FROM BACKUPS
GROUP BY BACKUP_TYPE
UNION ALL
-- Count backups by date
SELECT * FROM
 (SELECT COUNTS,BACKUP_DATE,BACKUP_TYPE
  FROM BACKUPS
  ORDER BY BACKUP_TYPE,DATE(BACKUP_DATE) DESC)
UNION ALL
-- Show total count
SELECT SUM(COUNTS) ,' ',' '
FROM BACKUPS
-- S.B CW01 D=003 W=006 M=036
-- FROM DISASTER.MASTER.FILE(GDG)

Common Table Expression to query the DR table
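The three-questions-from-one-CTE pattern can be tried on a handful of rows with Python's `sqlite3` module. This is a sketch on a stand-in database, not the Db2 query itself: the table is created without the DR schema qualifier, the sample rows are invented, and an explicit SECTION column with a final ORDER BY is added here so the row order is deterministic. It shows the result shape only and says nothing about how Db2 would execute the CTE against 24 million rows.

```python
import sqlite3

DR_QUERY = """
WITH BACKUPS AS (
    SELECT BACKUP_DATE, BACKUP_TYPE, COUNT(*) AS COUNTS
    FROM RECOV_ALL_REPORT
    GROUP BY BACKUP_DATE, BACKUP_TYPE
)
-- 1. number of backups by type
SELECT 1 AS SECTION, BACKUP_TYPE, '' AS BACKUP_DATE, COUNT(*) AS N
FROM BACKUPS
GROUP BY BACKUP_TYPE
UNION ALL
-- 2. number of datasets on each backup date, by type
SELECT 2, BACKUP_TYPE, BACKUP_DATE, COUNTS
FROM BACKUPS
UNION ALL
-- 3. total number of datasets backed up
SELECT 3, '', '', SUM(COUNTS)
FROM BACKUPS
ORDER BY 1, 2, 3 DESC
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE RECOV_ALL_REPORT (BACKUP_DATE TEXT, BACKUP_TYPE TEXT, DSNAME TEXT)"
)
conn.executemany(
    "INSERT INTO RECOV_ALL_REPORT VALUES (?, ?, ?)",
    [
        ("2019-09-01", "DAILY", "A.B"),   # invented sample rows
        ("2019-09-01", "DAILY", "C.D"),
        ("2019-09-02", "DAILY", "A.B"),
        ("2019-09-01", "WEEKLY", "A.B"),
    ],
)

report = conn.execute(DR_QUERY).fetchall()
for row in report:
    print(row)
```

Each section of the output answers one of the three questions, with the section number making it easy to split the result back apart in a report.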

90

SELECT DATE(BACKUP_DATE)
      ,BACKUP_TYPE
      ,VOLSER
      ,DSNAME
      ,FULLGDG_DSNAME
      ,COMMENT
      ,DATA_VOLSER
      ,DATA_PORTION_NAME
      ,INDEX_VOLSER
      ,INDEX_PORTION_NAME
      ,INDEX_GDG_DSNAME
FROM DR.RECOV_ALL_REPORT
WHERE DSNAME [like or =] Inputdsname
ORDER BY Date(BACKUP_DATE) desc

Query to recover dataset from DR backup

91

// jobcard
/*ROUTE XEQ CWCC
/*JOBPARM S=CWCC
//*
//STEP1 EXEC PGM=ADRDSSU,TIME=1440
//SYSPRINT DD SYSOUT=*
//TAPE1 DD DSN=BACKUP.DAILY.PCC004.G0573V00,DISP=OLD,
// UNIT=V3590,
// VOL=SER=(896081),
// LABEL=(001,SL,EXPDT=98000)
//DISK1 DD UNIT=3390,VOL=SER=PCC004,DISP=SHR
//SYSIN DD *
 RESTORE IDD(TAPE1) ODD(DISK1) TOL(ENQF) -
 CATALOG -
 DS(INCLUDE(-
 BFHDJS0.DEV.EXEC -
 )) SHR -
 RENAMEUNCONDITIONAL( -
 (BFHDJS0.DEV.EXEC, -
 BFHDJS0.DEVVV.EXEC))

JCL generated for recovery

92

• Fed from problem reporting and availability facilities

• Parse syslog archives for outage and warning messages

• Store these in Db2 for analysis and recommended remedial actions

• Queries:

– What are most common messages and abends?

– What warning messages lead to abends?

– Are there clusters of messages?

– Any correlations we can find?

• Possible tie-ins to Compuware products

Db2 in Anomaly Detection
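The first query above (“what are most common messages and abends?”) can be prototyped before anything is loaded into Db2 by counting message IDs in a syslog extract. The sketch below is illustrative only: the log lines and their layout are invented, and the two regular expressions are simplifying assumptions about what message IDs and abend codes look like, so a real parser would need to follow the site's actual syslog format.

```python
import re
from collections import Counter

# Invented syslog excerpt; real layouts vary by installation.
SYSLOG = """\
08:01:12 IEF450I PAYROLL1 STEP1 - ABEND=S0C7 U0000
08:01:13 IEF404I PAYROLL1 - ENDED
08:15:40 IEC161I 052-084,BILLING2,STEP3,VSAMFILE
08:20:02 IEF450I BILLING2 STEP3 - ABEND=S0C4 U0000
08:45:55 IEF450I PAYROLL1 STEP1 - ABEND=S0C7 U0000
"""

# Assumed shapes: 3-5 letters, 3-5 digits, one type letter (e.g. IEF450I);
# system abends as S plus three hex digits, user abends as U plus four digits.
MSG_ID = re.compile(r"\b([A-Z]{3,5}\d{3,5}[A-Z])\b")
ABEND = re.compile(r"ABEND=(S[0-9A-F]{3}|U\d{4})")

message_counts = Counter()
abend_counts = Counter()
for line in SYSLOG.splitlines():
    message_counts.update(MSG_ID.findall(line))
    abend_counts.update(ABEND.findall(line))

print(message_counts.most_common(3))
print(abend_counts.most_common(3))
```

Once counts like these are stored per day, the remaining questions on the slide (which warnings precede abends, whether messages cluster) become queries over the same table.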

93

• Automated testing groups rely on syncing test cases with test data required for exercising those cases

• Db2 provides baseline and set of utility executions supporting groups

• Still have mix of testing group-managed Db2 procedures and ops-executed procedures

• In process of enabling all to be test-group managed

Db2 in Automated Testing

94

• REST with Db2

• SA z/OSMF POC lack of value

• Python/YAML dynamic Db2 table build

Futures for Db2 & Dev Ops

95

Success Indicators

• Elimination of repetitive ops tasks

• More frequent promotion of new code

• More cooperation between mainframe and non-mainframe ops

• Linked cross-platform deployment

96

PROMOTE YOUR SUCCESS

Step 9:

97

Success Indicators

• Publicly recognized ops successes

• Business people recognize your progress

• Improved recruitment of ops talent onto mainframe team

98

Recommended Reading

Helpful Reference Materials

• The Phoenix Project: A Novel About IT, DevOps and Helping Your Business Win

• Starting and Scaling DevOps in the Enterprise

• The DevOps Handbook: How to Create World-Class Agility, Reliability, and Security in Technology Organizations

• “Digital Transformation Needs Mainframe DevOps”

• “The Need For Speed: Drive Velocity And Quality With DevOps”

• “Use Four Key Categories To Measure What Matters In Continuous Deployment”

• “Project to Product” Flow Framework and Value Stream Metrics

• Link to Craig Mullins Db2 presentation referenced

Learn more about future-enabling your mainframe ops at compuware.com.

Learn about bringing modern DevOps best practices to the mainframe by downloading our eBook

“Ten Steps to True Mainframe Agility.”

99

THANK YOU

For more information contact:

Keith.Sisson@compuware.com or

Dave.Seibert@Compuware.com

100

© 2019 Compuware Corporation. All rights reserved.
