ITTM: Troubleshooting Skill Manual

DESCRIPTION

Skill profiling and modelling IT troubleshooting

IT Troubleshooting Management

Troubleshooting Skill Manual

Analysts are often asked how a problem got fixed, not how the problem got identified.

Chapter 1 [skill principles]

Chapter 2 [skill profile]

Chapter 3 [skill model]

Chapter 4 [skill exercise]

TROUBLESHOOTING

MAIN GOAL: Identify the cause of the problem

MAIN TASK: Test every suspect component until the fault is identified

What makes one problem more challenging to troubleshoot than another?

It depends on the number of suspect components that need to be tested.

The higher the number of suspect faults, the harder the problem is to troubleshoot.

[Figure: a problem with a few suspect components compared with a problem with 100+ suspect components]

First troubleshooting objective: Locate the fault

Second troubleshooting objective: Identify the fault

How do you troubleshoot?

To troubleshoot components with physical boundaries, use your eyes.

To troubleshoot components with no physical boundaries, use logic.

To troubleshoot a digital component inside another digital component, use logical isolation.

Chapter 2 [skill profile]

Let's define troubleshooting

SPKG

Problem X: an unknown and undocumented technical issue

Replacing or reinstalling every component until problem X is resolved is NOT troubleshooting.

Applying known popular fixes with the hope of fixing problem X is NOT troubleshooting.

Troubleshooting is the use of technical deduction and logical isolation to identify the cause of problem X.

Let's define technical deduction

technical knowledge + deductive logic = technical deduction

Take technical deduction skill out of the equation and troubleshooting will always start at:

“Is the power ON?”

Technical deduction objective: subtract functional components from suspect components to minimize the number of components to isolate/test

Example

Problem X’s suspect components:

Web Server
Web Application
Web Application Account
Web Application Account Password

Based on the problem analysis, the user’s neighbor can log in and work on the same web application.

[Figure: Web Server and Web Application are subtracted from the four suspect components]

Therefore, troubleshooting efforts should be limited to the Web Application Account and the Web Application Account Password.

Remaining suspect components:

Web Application Account
Web Application Account Password
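The subtraction in this example can be sketched in a few lines of Python. This is a minimal illustration, not part of the original manual: the function name `apply_technical_deduction` is hypothetical, and the component names are copied from the web login example.

```python
# Suspect components from the web login example.
suspect_components = [
    "Web Server",
    "Web Application",
    "Web Application Account",
    "Web Application Account Password",
]

# The neighbor can log in and work on the same web application,
# so the components both users share are treated as functional.
functional_components = {"Web Server", "Web Application"}


def apply_technical_deduction(suspects, functional):
    """Return the suspects left after subtracting functional components."""
    return [component for component in suspects if component not in functional]


remaining = apply_technical_deduction(suspect_components, functional_components)
print(remaining)
```

The list keeps the original order of the suspects, so the remaining components can be tested in the order the analyst wrote them down.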

An analyst with good deductive skill can minimize the number of suspect components on any given problem.

An analyst with excellent deductive skill can locate the fault on any given problem.

Technical deduction is a required skill for any front-line analyst.

let's define logical isolation

Isolation definition: to set one technical component, or a group of them, apart from the others

Why isolate? To test the component’s functionality

Individual components with physical boundaries can be easily isolated and tested.

Types of Isolation

Testing A by INCLUSION
Testing A by EXCLUSION

But components with no physical boundaries need to be isolated logically.

[Figure: component A shown as a hardware component and as a software component]

The faster one can isolate and test any given suspect component, the faster one can identify the problem cause.

fault isolation efficiency = # faults isolated / time
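As a worked example of the ratio above, the sketch below computes fault isolation efficiency for two hypothetical analysts. The function name, the choice of minutes as the time unit, and the sample figures are assumptions made for illustration; the source does not specify a unit.

```python
def fault_isolation_efficiency(faults_isolated, minutes):
    """Faults isolated per minute (the time unit is an assumption)."""
    if minutes <= 0:
        raise ValueError("minutes must be greater than zero")
    return faults_isolated / minutes


# Hypothetical figures: both analysts isolate 6 faults.
analyst_a = fault_isolation_efficiency(6, 30)  # takes 30 minutes
analyst_b = fault_isolation_efficiency(6, 12)  # takes 12 minutes

print(f"Analyst A: {analyst_a:.2f} faults per minute")
print(f"Analyst B: {analyst_b:.2f} faults per minute")
```

With the same number of faults isolated, the analyst who needs less time has the higher efficiency.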

Take isolation skill out of the equation and troubleshooting will always start at:

re-installing every component

let's define the methods in logical isolation

4 methods for logically isolating (hardware and software) faults:

[simplify] [shorten] [compare] [error for error]

[simplify]

Process of excluding a complex component and substituting it with a basic component.
“Is there a substitute or alternative core component?”
Example: wireless router to LAN cable; production server to test server; setting the application settings to plain vanilla mode

[shorten]

Process of minimizing the process or excluding unnecessary components without altering the intended goal.
“Can this component be taken out without altering the overall process?”
Example: remote application to local application; network printing to local printing

[compare]

Process of taking a suspected component and comparing it with a known working component or environment.
“Is there a working model to compare this with?”
Example: side-by-side comparison; a tested component inserted in place of a suspected fault

[error for error]

Process of injecting a known error into the target fault. The order in which the expected error and the original error appear helps determine the location or identity of the fault.
“Is there a documented and reversible error that can be applied to this fault?”
Example: intentionally enter an incorrect password. Which comes first, the “wrong password” error or the original error?
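The [compare] method lends itself to a small sketch when the suspected component is a settings file. The example below is illustrative only: the function name `compare_settings` and every setting name and value are hypothetical, and it assumes the settings have already been loaded into dictionaries.

```python
def compare_settings(suspect, known_good):
    """Return {key: (suspect_value, known_good_value)} for every difference.

    A key that is missing on one side is reported with None on that side.
    """
    differences = {}
    for key in sorted(set(suspect) | set(known_good)):
        if suspect.get(key) != known_good.get(key):
            differences[key] = (suspect.get(key), known_good.get(key))
    return differences


# Hypothetical settings, invented for illustration only.
known_good = {"proxy": "none", "timeout": 30, "theme": "default"}
suspect = {"proxy": "10.0.0.5", "timeout": 30, "theme": "default"}

differences = compare_settings(suspect, known_good)
print(differences)
```

Every key that differs is a candidate fault location; identical keys can be subtracted from the suspect list.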

Logical isolation objective: identify problem cause

An analyst with good isolation skill has the ability to isolate any given software sub-component.

An analyst with excellent isolation skill has the ability to narrow down the number of suspect components by 50% on the first isolation task.
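To see why a 50% reduction per isolation task matters, the sketch below compares it with testing components one at a time. It rests on two assumptions that the source does not state: there is exactly one fault, and every isolation task (not only the first) rules out half of the remaining suspects. The function names are hypothetical.

```python
def isolation_tasks_needed(suspect_count):
    """Isolation tasks needed when each task halves the suspect list."""
    if suspect_count < 1:
        raise ValueError("suspect_count must be at least 1")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, integers only.
    return (suspect_count - 1).bit_length()


def one_by_one_tests_worst_case(suspect_count):
    """Worst case when testing one component at a time.

    The last untested component is implied, so it needs no test.
    """
    return max(suspect_count - 1, 0)


for count in (2, 20, 100):
    print(
        f"{count} suspects: {isolation_tasks_needed(count)} halving tasks, "
        f"up to {one_by_one_tests_worst_case(count)} one-by-one tests"
    )
```

Under these assumptions the gap widens quickly as the number of suspect components grows, which matches the manual's point that problems with more suspects are harder to troubleshoot.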

Chapter 3 [skill model]

troubleshooting skill model

Problem X

1. Analyze Problem
2. List all potential faults
3. Apply technical deduction to eliminate suspected faults
4. Test / Isolate remaining faults until problem cause is identified
5. Apply Fix

Analyst encounters Problem X

Analyst performs DUE DILIGENCE:

Recreate the issue and check policies and procedures; determine if it’s a procedural issue.
Check the knowledge base or Google; determine if it’s a known issue.
If the issue is unknown, the analyst proceeds to the next task.

Perform Root Cause Analysis based on:

Technical knowledge
Awareness of technical components in play
Hardware and software behavior

The analyst will theorize what caused Problem X.

The analyst makes a mental list of initial suspected faults.
“If the 1st component tested is not the fault, what’s the next potential fault?”
Ideal number of faults = 2 or greater

Sample Potential Faults: Fault 1, Fault 2, Fault 3, Fault 4, Fault 5

The analyst then gathers more information to qualify or disqualify faults by asking questions. Using the sample Task Table below, asking if Task E can be performed is the most efficient, because a single answer covers Faults 1, 2 and 3.

Task     Fault Tested
Task A   Fault 1
Task B   Fault 2
Task C   Fault 3
Task D   Fault 1 & 2
Task E   Fault 1, 2 & 3

Technical Deduction

Since Task E can be performed, the number of suspected faults is down to 2.

Fault 1 (X), Fault 2 (X), Fault 3 (X), Fault 4, Fault 5
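The choice of Task E can be expressed as picking the task that tests the most suspected faults. The sketch below encodes the sample Task Table as a dictionary; the function names are hypothetical. The handling of a task that cannot be performed is an assumption added here (single fault, and the task fails only because of a fault it tests); the source only covers the case where the task succeeds.

```python
# Sample Task Table: task name -> set of fault numbers it tests.
task_table = {
    "Task A": {1},
    "Task B": {2},
    "Task C": {3},
    "Task D": {1, 2},
    "Task E": {1, 2, 3},
}
suspected_faults = {1, 2, 3, 4, 5}


def most_efficient_task(tasks, suspects):
    """Pick the task that tests the largest number of current suspects."""
    return max(tasks, key=lambda name: len(tasks[name] & suspects))


def eliminate(tasks, suspects, task_name, can_be_performed):
    """Update the suspect set from the answer to one deduction question."""
    if can_be_performed:
        # Every fault the task tests is functional, so subtract them.
        return suspects - tasks[task_name]
    # Assumption (not from the source): the fault is among those tested.
    return suspects & tasks[task_name]


best_task = most_efficient_task(task_table, suspected_faults)
remaining_faults = eliminate(task_table, suspected_faults, best_task, True)
print(best_task, sorted(remaining_faults))
```

One question removes three of the five suspects, which is why the group question is preferred over asking about Tasks A, B and C separately.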

Next troubleshooting procedure is to isolate / test the 2 remaining suspected faults. Using the sample Isolation Table below, executing Method Z is the most efficient.

Isolation Method   Fault Isolated   Notes
Method X           Fault 4          Positive Result = fault 4 is functional
Method Y           Fault 5          Positive Result = fault 5 is functional
Method Z           Fault 4 & 5      Positive Result = fault 4 is functional; Negative Result = fault 5 is functional

Fault Isolation

Applying Method Z produced a negative result, which means Fault 5 is functional and the problem cause is Fault 4.
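The reason Method Z is the most efficient choice can be checked with a small sketch of the sample Isolation Table. The dictionary layout and the function name are hypothetical, and the code follows only what the table's Notes column states: where the table gives no note for an outcome, the suspect list is left unchanged.

```python
# Sample Isolation Table: method -> result -> faults shown to be functional.
isolation_table = {
    "Method X": {"positive": {4}},
    "Method Y": {"positive": {5}},
    "Method Z": {"positive": {4}, "negative": {5}},
}


def remaining_after(method, result, suspects):
    """Subtract the faults that this result shows to be functional."""
    cleared = isolation_table[method].get(result, set())
    return suspects - cleared


suspects = {4, 5}

# Method Z narrows the list to one fault whichever way it turns out.
z_negative = remaining_after("Method Z", "negative", suspects)
z_positive = remaining_after("Method Z", "positive", suspects)

# Method X only has a note for a positive result in the sample table.
x_negative = remaining_after("Method X", "negative", suspects)

print(sorted(z_negative), sorted(z_positive), sorted(x_negative))
```

Both outcomes of Method Z leave exactly one suspect, so a single isolation task identifies the problem cause.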

Apply the fix to confirm that the fault is identified.

Chapter 4 [skill exercise]

fault consistency exercise

As a team, pick an error or symptom from your existing knowledge base without looking at the problem resolution. Write down all potential faults. Now compare your list with those of the other team members. The faults listed should be more or less the same.

Objective: test the analyst’s technical knowledge and component awareness; expose the team’s knowledge inconsistency

technical deduction exercise 1

1) Pick one component (hardware or software) and write down the task that can be performed if this component is functional.

2) Convert this statement into the form of a question. If the analyst interfaces with a non-technical customer, use a non-technical question (NTQ).

3) Repeat the process.

Objective: test the analyst’s technical knowledge and develop individual deductive skill

technical deduction exercise 1

Example

Component: Database Application
Statement: Ability to access all of a customer’s activity and history
Question: Are you able to query past customer activity?
NTQ: Are you able to see yesterday’s activity?

technical deduction exercise 2

1) List 2 components and write down a common task shared by both components if they are functional.

2) Convert this statement into the form of a question. If the analyst interfaces with a non-technical customer, use a non-technical question (NTQ).

3) Increase the number of components and repeat the process.

Objective: test the analyst’s technical knowledge and develop efficient deductive skill by grouping

technical deduction exercise 2

Example

Components: file server and folder access
Statement: ability to open the folder and copy / store files
Question: Are you able to browse the folder contents and manipulate them?
NTQ: Are you able to see all the files in that folder and copy files?

technical deduction exercise 3

1) List 4 technical faults.

2) Use the smallest combination of individual and group deduction questions that eliminates every fault. The fewer questions, the better.

3) Increase the number of technical faults and repeat the process.

NOTE: start this as a written exercise, then get a partner to make it a verbal exercise

Objective: develop reflexive deductive skill
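For the written version of this exercise, a question plan can be checked with a short sketch. The code below uses a greedy strategy: it repeatedly picks the question that covers the most faults not yet covered. Greedy selection is a swapped-in technique chosen for brevity and is not guaranteed to find the true minimum; the fault names, question labels, and function name are hypothetical, and the plan assumes each question is answered "yes".

```python
def plan_questions(questions, faults):
    """Greedily choose questions until every fault is covered."""
    uncovered = set(faults)
    plan = []
    while uncovered:
        # Question that eliminates the most still-uncovered faults.
        best = max(questions, key=lambda q: len(questions[q] & uncovered))
        if not questions[best] & uncovered:
            raise ValueError(f"no question covers: {sorted(uncovered)}")
        plan.append(best)
        uncovered -= questions[best]
    return plan


# Hypothetical exercise sheet: question -> faults it eliminates on "yes".
faults = {"Fault 1", "Fault 2", "Fault 3", "Fault 4"}
questions = {
    "Q1 (individual)": {"Fault 1"},
    "Q2 (individual)": {"Fault 4"},
    "Q3 (group)": {"Fault 1", "Fault 2"},
    "Q4 (group)": {"Fault 1", "Fault 2", "Fault 3"},
}

plan = plan_questions(questions, faults)
print(plan)
```

Here one group question plus one individual question covers all four faults, compared with four individual questions.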

logical isolation exercise 1

1. List one technical component

2. Write down which isolation methods can be applied and how

3. Increase the number of components and repeat the process

Objective: test the analyst’s individual isolation skill

logical isolation exercise 1

Example

Component: Application Setting file
Simplify Method: Yes, change the setting to a plain vanilla or generic mode
Shorten Method: No
Compare Method: Yes, copy a known working file in place of this file or vice versa
Error for Error Method: Yes, changing the parameter X to Y will create Error 123

logical isolation exercise 2

1. List 2 components

2. Write down how to isolate all components in 1 step. Using a common task for both components to isolate them is also acceptable.

3. Increase the number of components and repeat the process

Objective: develop efficient group isolation skill

logical isolation exercise 2

Example

Components: PDF File and Network Printer
One Step: Print PDF document

troubleshooting exercise 2

Verbal Exercise

1. Ask a partner to state 5 components

2. Use the smallest combination of deductive questions and isolation tasks that covers all components. The lower the number, the better.

3. Increase the number of components

Objective: develop efficient troubleshooting skill

Online IT Troubleshooting Manager’s course can be found at

https://www.udemy.com/ittm_manager
