As a product manager, you’re guaranteed to come across at least one large product-related crisis at work, if not multiple ones. Therefore, crisis management is a core skill for any product manager.
Through observing senior product leadership, learning from peers, and reflecting on my own experiences, I’ve found that the most productive way to tackle a crisis is to break it down into four stages: identification, mitigation, diagnosis, and prevention.
But before I dive into each of the four stages of crisis management, I want to highlight a particular mindset that you need to have as you resolve the issue.
The Golden Mindset
Never blame others.
Blame is counterproductive. When you blame someone, you place that person on the defensive, which means they are far less likely to cooperate with you to resolve the issue.
Too many times, I’ve seen the person who could fix the issue the fastest be exactly the person who was blamed. The result was resentment, disempowerment, and ultimately a less effective team.
Furthermore, blame doesn’t actually solve the issue. It might feel good in the moment, but you still haven’t taken any active steps towards resolving the crisis.
But what if you’re not blaming others, but others are blaming you?
That’s completely fine. If people aren’t willing to move forward without blaming someone, then take the blame yourself and keep the ball rolling. As the product manager, it’s your responsibility to get the team across the finish line, even if it means sacrificing your pride for a moment.
By being mature enough to absorb blame, you’ll find that others will respect you as a calm and thoughtful leader.
Side note: if you do find that you work in a culture where you’re always getting blamed, consider working in a more supportive environment. Our PMHQ community is full of fantastic people who have open positions at their companies!
Now that we’ve discussed the correct mindset for crisis management, let’s work through each stage of a crisis.
Identification
You’re hearing reports of a problem, but you’re not quite sure what it is.
Your first task is to understand exactly what the problem is. What parts of the product aren’t working correctly? How many people is it impacting? How frequent and how severe is the problem?
It’s critical to first identify what exactly the problem is since that sets the foundation for your efforts moving forward. If you misidentify the problem, you’ll wind up working in the wrong direction.
Work quickly but calmly. One of the best ways to identify the issue is to reproduce it yourself and document the conditions under which you reproduced it.
Is there a particular operating system involved, or a particular web browser? Is it only on mobile devices? Does it only happen in particular user flows and not in other ones?
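One way to document reproduction conditions is to log each attempt in a structured form and then check which conditions the failures have in common. The sketch below is only an illustration: the `ReproAttempt` fields, the `reproduction_rates` helper, and every value in the sample log are hypothetical, not taken from a real incident.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ReproAttempt:
    """One attempt to reproduce the issue, with the conditions it ran under."""
    os: str
    browser: str
    device: str       # e.g. "desktop" or "mobile"
    user_flow: str
    reproduced: bool


def reproduction_rates(attempts, field):
    """Return {value: fraction of attempts that reproduced} for one condition."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for attempt in attempts:
        value = getattr(attempt, field)
        totals[value] += 1
        if attempt.reproduced:
            hits[value] += 1
    return {value: hits[value] / totals[value] for value in totals}


# Hypothetical log of attempts; every value here is made up for illustration.
attempts = [
    ReproAttempt("iOS", "Safari", "mobile", "checkout", True),
    ReproAttempt("iOS", "Chrome", "mobile", "checkout", True),
    ReproAttempt("Android", "Chrome", "mobile", "checkout", True),
    ReproAttempt("Windows", "Chrome", "desktop", "checkout", False),
    ReproAttempt("macOS", "Safari", "desktop", "checkout", False),
    ReproAttempt("iOS", "Safari", "mobile", "signup", False),
]

for field in ("os", "browser", "device", "user_flow"):
    print(field, reproduction_rates(attempts, field))
```

In this made-up log, the issue reproduces only on mobile devices and only in the checkout flow, which narrows down where to look next. A shared spreadsheet works just as well; what matters is recording the same conditions for every attempt.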
Mitigation
Now that you know what the problem is, you need to determine whether you can mitigate the issue before fully resolving it.
The reason is that mitigations generally take less time and fewer resources to implement than a full solution to the problem.
After all, the goal of crisis management is to reduce the impact and the duration of the problem as much as possible, across its entire lifespan.
By mitigating upfront, you drastically cut the total impact of the problem.
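To make the arithmetic concrete, here is a rough sketch that treats total impact as affected users per hour multiplied by hours elapsed. All of the numbers are hypothetical, and the sketch assumes that shipping the mitigation does not delay the full fix.

```python
def total_impact(users_per_hour, hours):
    """Impact of one phase: affected users per hour times hours elapsed."""
    return users_per_hour * hours


# All numbers below are hypothetical, chosen only to illustrate the arithmetic.
UNMITIGATED_RATE = 1000   # users affected per hour with no mitigation
MITIGATED_RATE = 100      # users affected per hour once a workaround is live
HOURS_TO_MITIGATE = 2     # time needed to ship the mitigation
HOURS_TO_FULL_FIX = 24    # time needed to ship the full fix, from the start

# Option A: go straight to the full fix.
fix_only = total_impact(UNMITIGATED_RATE, HOURS_TO_FULL_FIX)

# Option B: mitigate first, then finish the full fix.
mitigate_first = (
    total_impact(UNMITIGATED_RATE, HOURS_TO_MITIGATE)
    + total_impact(MITIGATED_RATE, HOURS_TO_FULL_FIX - HOURS_TO_MITIGATE)
)

print(fix_only)        # 24000
print(mitigate_first)  # 2000 + 2200 = 4200
```

Under these assumptions, mitigating first cuts the total impact from 24,000 affected user-hours to 4,200, even though the full fix lands at the same time in both cases.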
As you identify mitigations, determine the tradeoffs that are required for each particular mitigation.
How many resources will you allocate to the problem? What do you wind up losing by tackling the problem in this way?
Contain the impact by proactively communicating with relevant stakeholders, so that there are no surprises down the road. After all, one of the largest impacts of an unexpected problem is the amount of surprise and confusion it creates. By reducing surprise and confusion, you reduce the negative impact as well.
Where possible, create backup plans or workarounds so that the majority of users can at least complete their critical tasks on your product.
Diagnosis
No matter whether you can mitigate the issue or not, you must diagnose the problem correctly to fully resolve it.
Note that diagnosis is different from identification. To use a medical analogy, identification is finding symptoms, like a cough and a fever. Diagnosis is finding the actual cause, like a bacterial infection.
Start with your investigation. You should already have a sense of the different flows, scenarios, and variables that wound up triggering the problem. What underlying cause do these symptoms point to?
If you have a team to help you split up the work, document each thread of inquiry and move in parallel.
Once you’ve found the root cause, determine how long it will take to resolve the issue, and what risks you might be taking in resolving it. Remember that fixing a bug can wind up creating even more bugs, so be thoughtful and cautious.
Once you’re confident that your diagnosis is correct, go ahead and resolve the problem.
Ensure that you inform stakeholders of what the cause of the issue is, how long it will take until the problem is resolved, and how long it will take to confirm that no new problems were created from the resolution. Spin up a dedicated Slack channel so that you can keep everyone in the loop.
Prevention
Just because you fixed the issue doesn’t mean that you’re done with crisis management! At this point, your goal is to ensure that problems of this nature never appear again.
I recommend the 5 Why’s framework to dig deep into underlying systemic problems. Gather up your core team and key stakeholders, set aside sufficient time to thoughtfully work together, and ask "why" in a non-judgmental manner. Again, never assign blame!
Don’t be afraid to branch out. The more thoroughly you conduct this exercise, the less likely you are to see bugs of this kind again. The goal is to determine how to shore up existing processes or to create new processes that will prevent the issue from happening again.
Let’s walk through an example to show how you can use the 5 Why’s framework.
Initial Problem Statement: Conversion rate for all forms dropped dramatically, and we didn’t know about it for a week.
Why #1: Why didn’t we know about the drop?
Answer #1: We didn’t know about the drop because we don’t have any monitoring systems in place to throw alerts when key metrics drop below some threshold.
Resolution #1: Create a monitoring system to check on key metrics that will alert the team accordingly.
Why #2: Why did conversion rate drop?
Answer #2: It dropped because there was a pop-up ad in the way that we didn’t know about.
Resolution #2: Remove the pop-up ad, if it hasn’t already been removed.
Why #3: Why was the pop-up ad there, and why didn’t we know about it?
Answer #3: The pop-up ad was there because there was a new initiative to increase ad revenue, and the form is the most prominent place on the page. We didn’t know about the ad because there are no communication channels between the ads team and the forms team.
Resolution #3: Create a communication channel between the ads team and the forms team.
Why #4: Why don’t we communicate across teams more regularly?
Answer #4: We don’t have communication because we haven’t needed to work together before. Both teams used to own different pages, but now they’re influencing each other.
Resolution #4: Implement a regular check-in with the ads team to understand where ads will appear in the future. Also, regularly inform the ads team of any upcoming changes to the forms or flows that may impact them.
You can see here that by digging deeper and providing a resolution at each layer of the 5 Why’s, we create redundancy and multiple fail-safes that reduce the likelihood of a future crisis.
Note that 5 Why's doesn't actually mean that you ask "why" 5 times - rather, you should iteratively ask "why" until you have satisfactorily found root causes and processes to fix. In our example, we only needed to ask 4 times before we got to a root process issue.
Caution: don't use a fixed number of "why's", or else you'll incentivize teammates to answer in a shallow manner. Be thoughtful and flexible as you conduct this exercise with the team.
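Resolution #1 in the example calls for a monitoring system that alerts the team when key metrics drop below a threshold. As a minimal sketch of that idea, the check below compares the latest metric values against configured minimums. The `MetricThreshold` type, the `find_breaches` function, the metric name, and the numbers are all hypothetical; in practice, your analytics or observability tooling will likely offer alerting of its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricThreshold:
    """The lowest acceptable value for one key metric."""
    name: str
    minimum: float


def find_breaches(latest_values, thresholds):
    """Return alert messages for metrics that are missing or below minimum."""
    alerts = []
    for threshold in thresholds:
        value = latest_values.get(threshold.name)
        if value is None:
            # Missing data is worth an alert too: tracking may be broken.
            alerts.append(f"{threshold.name}: no data received")
        elif value < threshold.minimum:
            alerts.append(
                f"{threshold.name}: {value:.2%} is below "
                f"the minimum of {threshold.minimum:.2%}"
            )
    return alerts


# Hypothetical threshold and reading, for illustration only.
thresholds = [MetricThreshold("form_conversion_rate", 0.03)]
latest_values = {"form_conversion_rate": 0.012}

for alert in find_breaches(latest_values, thresholds):
    print(alert)
```

Run on a schedule and wired to the team's alerting channel, a check like this would have surfaced the conversion drop within hours rather than a week.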
It’s never fun to deal with crises. Remember that everyone’s going to be stressed - so as a leader, you need to reduce stress and resolve the issue as soon as possible.
Never blame others. Identify the issue, mitigate it where possible, diagnose and resolve it, then prevent issues like these from appearing again.
By holding these crisis management principles in mind, you’ll find that you, your team, and your company will grow stronger and stronger from each crisis.
Have thoughts that you'd like to contribute around crisis management? Chat with other product managers around the world in our PMHQ Community!
Clement Kao is a Co-Founder of Product Manager HQ. He is currently a Product Manager at Blend, an enterprise technology company that is inventing a simpler and more transparent consumer lending experience while ensuring broader access for all types of borrowers.