What is a problem? A problem is an obstacle that has to be surmounted. Solving a problem means overcoming that obstacle. Or, more generally: problem solving is the process of getting from an unsatisfactory situation to a satisfactory one.
Most of us get paid for solving problems. It’s irrelevant whether you are paid for solving a technical problem (e.g. “My computer doesn’t work”) or for creating solutions for customers (e.g. designing the infrastructure for a Citrix XenApp farm). In the end, you solve a problem.
Every problem has characteristics that can be used to describe it.
Not every problem is solvable. Think about “squaring the circle”. But often a problem only seems to be unsolvable because it’s not well defined. If the initial situation, the obstacle and the target situation are not clearly formulated, you won’t be able to solve the problem.
If you can decompose a problem into multiple subproblems, it is a hierarchical problem. Otherwise, it’s an elemental problem.
The effort required to solve a problem varies. A problem may be theoretically solvable, but require such a high effort that it is practically unsolvable.
Even if a problem is well defined, its complexity appears different to different people.
How to start?
First of all
- Understand and define the problem
This is the most important part. Before you try to solve a problem, make sure that you have really understood it. Then you should define the problem. Only a clearly defined problem can be solved, and it’s much easier to solve a clearly defined problem than a vague one. If it’s a complex problem, then you should try to
- Simplify or decompose the problem
A simplified problem can help you stay focused. If you can’t simplify a problem, you can try to divide it into subproblems. With a clearly defined, simplified or structured problem, you can start to
- Find the root cause
Collecting information is the key. Collect information about what happened before, during, and after the problem occurred. Identifying the root cause of a problem can be a time-consuming task. But let me say this clearly: information is the key. Information that helps to find the root cause is not limited to observations (e.g. logs, error messages etc.). You can also use the results of systematic tests. Collect as much data as you can.
Sometimes it can be useful to create a hypothesis.
Scientists generally base scientific hypotheses on previous observations that cannot satisfactorily be explained with the available scientific theories.
If you see that system A is affected, and system B should be affected too but is not, it might be time to create a hypothesis. With a hypothesis in mind, you can try to prove or disprove it by performing tests and collecting data. This strategy is called “hypothesis testing”.
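One way to arrive at hypothesis candidates is to compare an affected system with an unaffected one and look only at what differs. The following is a minimal sketch of that idea in Python. The function name and the inventory data (attribute names, versions) are invented for illustration and do not come from a real environment.

```python
def differing_attributes(affected: dict, unaffected: dict) -> dict:
    """Return the attributes whose values differ between two systems.

    Each difference is a candidate for a hypothesis, not a proven cause.
    """
    keys = affected.keys() | unaffected.keys()
    return {
        key: (affected.get(key), unaffected.get(key))
        for key in sorted(keys)
        if affected.get(key) != unaffected.get(key)
    }


# Hypothetical inventory data, invented for this example.
system_a = {"os": "Windows Server 2012 R2", "nic_driver": "1.2.0", "patch_level": "2015-03"}
system_b = {"os": "Windows Server 2012 R2", "nic_driver": "1.1.4", "patch_level": "2015-03"}

candidates = differing_attributes(system_a, system_b)
for name, (a_value, b_value) in candidates.items():
    print(f"Hypothesis candidate: {name} (A={a_value}, B={b_value})")
```

Each candidate still has to be tested, for example by changing that single attribute on one system and observing whether the behaviour changes.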
At some point, you should have identified the root cause. With the now known root cause, you can
- Create solutions and select the best one
Sometimes it’s easy. But sometimes it’s not that easy. A trade-off analysis can help to identify the best of multiple solutions.
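A simple form of trade-off analysis is a weighted scoring matrix: rate each solution against a few criteria, weight the criteria, and compare the totals. The sketch below assumes a 1 to 5 rating where higher is better. The criteria, weights, solutions and ratings are all invented for illustration.

```python
def weighted_score(scores: dict, weights: dict) -> int:
    """Sum of rating * weight over all criteria."""
    return sum(weights[criterion] * scores[criterion] for criterion in weights)


# Hypothetical criteria weights (higher weight = more important).
weights = {"effort": 3, "risk": 5, "cost": 2}

# Hypothetical ratings from 1 (bad) to 5 (good) for each solution.
solutions = {
    "disable feature": {"effort": 5, "risk": 3, "cost": 5},
    "apply vendor patch": {"effort": 3, "risk": 4, "cost": 4},
    "replace component": {"effort": 1, "risk": 5, "cost": 1},
}

# Rank the solutions from best to worst total score.
ranked = sorted(
    solutions,
    key=lambda name: weighted_score(solutions[name], weights),
    reverse=True,
)
for name in ranked:
    print(f"{name}: {weighted_score(solutions[name], weights)}")
```

The result depends entirely on the chosen weights, so it is worth agreeing on them before rating the solutions.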
- Create an action plan
Even if you only have to disable a specific feature, it’s a good idea to formulate an action plan, even if it consists of only three lines. You should state clearly
- WHAT you do
- WHY you have to do it, and
- HOW you plan to check it
With these steps, you should be well prepared. It doesn’t matter what kind of problem you are trying to solve: The process is basically the same.
Other problem solving methods
Over the years, many problem-solving methods have been developed. Kepner-Tregoe is one of them. Other well-known methods are:
- A3 Problem Solving
- Eight Disciplines (8D) Problem Solving
- Failure mode and effects analysis (FMEA)
A3 Problem Solving was developed at Toyota for the Toyota Production System (TPS). It’s a frequently used method in Lean Manufacturing. A3 helps to solve problems by prescribing a structure (WHAT IS and WHAT IS NOT the problem, problem description, root cause, solution etc.). This structure is placed on a sheet of A3 paper (that’s why it’s called A3). The process is based on the principles of Deming’s PDCA cycle.
PDCA, or Plan-Do-Check-Act (sometimes called the Shewhart cycle), was made popular by Dr. W. Edwards Deming. Plan-Do-Check-Act refers to the four phases of this cycle.
- Plan: Plan the change
- Do: Implement the change
- Check: Check the success of the implemented change
- Act: Take action based on the results of “Check”
Eight Disciplines (8D) Problem Solving was developed by the Ford Motor Company. The D0 phase is the starting point for the 8D process, but it’s not counted as one of the eight disciplines.
- D0: Plan for solving the problem and determine the prerequisites
- D1: Establish a team of people with the required skills and knowledge
- D2: Describe the problem
- D3: Define and implement containment actions
- D4: Determine and verify the root causes
- D5: Plan permanent corrective actions for the observed problem
- D6: Implement the best permanent corrective actions
- D7: Modify management systems to prevent a recurrence
- D8: Congratulate your team!
Failure mode and effects analysis (FMEA) is a highly structured, systematic approach to failure analysis. There are different types of FMEA.
FMEA is based on inductive reasoning (forward logic). Its process can be represented as follows.
- Structural analysis: A system is divided into its components
- Functional analysis: Identify the function of each component
- Failure analysis: Identify the possible failures for each component
- Calculate the risk: Risk Priority Number = occurrence ranking × detection ranking × highest severity ranking
- Optimize: Optimize the component to mitigate the risk
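The risk calculation can be illustrated with a short Python sketch. It assumes rankings on a 1 to 10 scale, which is a common convention but not stated above. The class name, components, failure descriptions and rankings are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FailureMode:
    component: str
    description: str
    effect_severities: List[int]  # one severity ranking per possible effect
    occurrence: int               # how likely the failure is to occur
    detection: int                # how hard the failure is to detect

    @property
    def rpn(self) -> int:
        """Risk Priority Number = occurrence * detection * highest severity."""
        return self.occurrence * self.detection * max(self.effect_severities)


# Hypothetical failure modes, invented for this example.
failure_modes = [
    FailureMode("power supply", "fails under load", [7, 9], occurrence=3, detection=4),
    FailureMode("fan", "bearing wear", [4], occurrence=6, detection=2),
    FailureMode("disk", "silent data corruption", [8, 10], occurrence=2, detection=9),
]

# Work on the failure modes with the highest RPN first.
for mode in sorted(failure_modes, key=lambda m: m.rpn, reverse=True):
    print(f"{mode.component}: {mode.description} -> RPN {mode.rpn}")
```

Sorting by RPN gives a starting order for the optimization step. The number is only a prioritization aid, so a failure mode with a very high severity may deserve attention even if its RPN is low.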
No matter what, stay organized
The key to successfully solving problems is to stay organized. Solving problems isn’t magic. It is a very structured process that gets better with increasing experience. Try to create your own structured method, or use one of the problem-solving methods mentioned above. But in general:
- Always try to describe a problem
- Try to simplify or break it into smaller problems
- Search for the root cause and verify it
- Develop a solution