Respond to and resolve a major incident

Outcome
Your team will demonstrate their response, communication, investigation, resolution, and documentation of an unplanned outage in Jira Service Management that impacts customers.
Processes
  • Define incident roles
  • Train incident teams
  • Acknowledge alerts
  • Engage team
  • Communicate effectively
  • Investigate the incident
  • Engage internal teams
  • Linked issue
  • Change request
  • Monitor fix
  • Post-incident review
  • Include alert and trigger data
  • Problem ticket
Practice area(s)
Incident management
Product(s) in scope
Jira Service Management (Premium or Enterprise), Opsgenie, and Statuspage
Role(s)
Technical SME, Incident Manager, Problem Manager, Change Manager, Technical Lead, and Team Member
     
     
    Step
    Process
    Notes/Resources
    1
    DETECT
    Make sure you select the appropriate service when creating an incident or alert.
    Path 1:
    1. Create and link incidents.
    a. Create multiple "user reported" incidents as an end user.
    b. As an agent, link multiple related incidents to the parent incident.
    c. As an agent, create a major incident from the parent incident.
    2. As an agent, create a related alert in Opsgenie and an incident on Statuspage. Creating the major incident should trigger both automatically.
    Path 2:
    1. As an agent, create an alert of critical priority.
    2. Acknowledge the alert.
    3. As an end user, report an incident.
    4. As an agent, associate the alert with the end-user incident.
    5. Create a major incident from the alert: select the checkbox next to the alert, select Incident Options, and then select Create Incident from x Alerts. Ensure the affected service is selected.
    Note: Repeat steps 2 to x for this major incident as well.
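Path 2 starts from a critical-priority alert. As a rough illustration, the sketch below builds (but does not send) a request for the Opsgenie Alert API. The `build_alert_request` helper, the service name, and the placeholder API key are hypothetical, and recording the affected service as a tag is an assumption rather than something this playbook prescribes. Treating "critical" as `P1` follows Opsgenie's P1 to P5 priority scale. Check the endpoint against your own Opsgenie region and plan before relying on it.

```python
import json

# US endpoint; EU-hosted accounts use api.eu.opsgenie.com instead.
OPSGENIE_ALERTS_URL = "https://api.opsgenie.com/v2/alerts"
VALID_PRIORITIES = {"P1", "P2", "P3", "P4", "P5"}


def build_alert_request(message, service, api_key, priority="P1"):
    """Return (url, headers, body) for a create-alert call. Nothing is sent."""
    if priority not in VALID_PRIORITIES:
        raise ValueError(f"unknown priority: {priority}")
    headers = {
        "Authorization": f"GenieKey {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "message": message,
        "priority": priority,  # P1 is the critical level
        # Assumption: the affected service is carried as a tag and a detail.
        "tags": [service],
        "details": {"affected_service": service},
    }
    return OPSGENIE_ALERTS_URL, headers, json.dumps(payload)


# Hypothetical values, for illustration only.
url, headers, body = build_alert_request(
    "Checkout API returning errors", "checkout-api", api_key="YOUR_API_KEY"
)
print(url)
print(body)
```

Building the request separately from sending it keeps the sketch safe to run in a practice environment; passing the three values to an HTTP client of your choice would perform the actual call.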
     
    2
    COMMUNICATE
    1. Validate that Statuspage has displayed the appropriate communication message related to the major incident.
    2. As an end user, validate that a communication message is displayed on the portal.
    3. As an agent, navigate to the major incident and open the command center. Invite users or other relevant subject matter experts.
    Notes/Resources:
  • Integrate Opsgenie with Statuspage
  • Set up a Jira Service Management integration
     
    3
    INVESTIGATE
    1. Initiate an investigation (swarming) by starting a Zoom video conference, creating a Slack channel, or opening the built-in Incident Command Center in Opsgenie.
    2. Investigate potential causes using the incident investigation feature. Review the deployment timelines.
    3. Create a bug in Jira Software if a bug caused the incident.
    4. Communicate with the development team.
    5. Open a change request and link it to the major incident.
    Notes/Resources:
  • Use the Incident Command Center (ICC)
  • Use Zoom for the Incident Command Center
  • Use chat rooms for incident collaboration
  • Investigate an incident
  • Integrate Opsgenie with Jira
     
    4
    COMMUNICATE
    1. As an agent, send a status update from the command center.
    2. Update the status in Statuspage with a message that a fix is being deployed.
    3. As an end user, validate the updated message by refreshing the status page.
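The Statuspage update in this step can also be made through the Statuspage REST API. The sketch below only assembles the request and does not send it. The `build_incident_update` helper and the page ID, incident ID, and API key values are hypothetical, and using the `identified` status for a "deploying fix" message is an assumption, not a rule from this playbook.

```python
import json

STATUSPAGE_API = "https://api.statuspage.io/v1"
# Status values accepted for real-time incidents.
VALID_STATUSES = ("investigating", "identified", "monitoring", "resolved")


def build_incident_update(page_id, incident_id, status, message, api_key):
    """Return (method, url, headers, body) for an incident update. Nothing is sent."""
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status}")
    url = f"{STATUSPAGE_API}/pages/{page_id}/incidents/{incident_id}"
    headers = {
        "Authorization": f"OAuth {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"incident": {"status": status, "body": message}})
    return "PATCH", url, headers, body


# Hypothetical identifiers, for illustration only.
method, url, headers, body = build_incident_update(
    "PAGE_ID", "INCIDENT_ID", "identified",
    "A fix has been identified and is being deployed.", "YOUR_API_KEY",
)
print(method, url)
print(body)
```

The same helper could be reused at resolution time with the `resolved` status and a service-restored message.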
     
    5
    RESOLVE
    1. Resolve the major incident.
    2. Resolve the user-reported incidents (if any).
    3. Update Statuspage with a service-restored message and status.
    4. As an end user, validate the updated message by refreshing the status page.
    5. Close the related alert.
     
    6
    REVIEW
    1. Create a postmortem report.
    2. Export the postmortem to a Confluence page.
    3. Close the major incident.
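Steps 5 and 6 list several tasks that should be finished before the major incident is closed. As a practice aid, the sketch below models them as a simple checklist and reports what is still outstanding. The `IncidentState` class and its field names are hypothetical and do not correspond to any Jira Service Management, Opsgenie, or Statuspage API.

```python
from dataclasses import dataclass

# Hypothetical field names mapped to the tasks in steps 5 and 6, in order.
CLOSE_PREREQUISITES = {
    "major_incident_resolved": "Resolve the major incident",
    "user_incidents_resolved": "Resolve the user-reported incidents",
    "statuspage_restored": "Update Statuspage with restored status",
    "alert_closed": "Close the related alert",
    "postmortem_created": "Create a postmortem report",
    "postmortem_exported": "Export the postmortem to Confluence",
}


@dataclass
class IncidentState:
    major_incident_resolved: bool = False
    user_incidents_resolved: bool = False
    statuspage_restored: bool = False
    alert_closed: bool = False
    postmortem_created: bool = False
    postmortem_exported: bool = False


def blockers_to_close(state):
    """List the outstanding tasks, in playbook order, that block closing."""
    return [
        label
        for field, label in CLOSE_PREREQUISITES.items()
        if not getattr(state, field)
    ]


state = IncidentState(major_incident_resolved=True, alert_closed=True)
for task in blockers_to_close(state):
    print("Outstanding:", task)
```

An empty list from `blockers_to_close` means every task in the two steps has been marked done and the major incident is ready to close.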
     
