Comms lead role - PaaS Incidents
What to do first:
Learn about PagerDuty
- PagerDuty calls or alerts you
- Go to Slack and open the PaaS incident channel (and the PaaS internal channel)
- Set up a hangout for you and the engineer (add the tenant only if it will be useful)
- Make a copy of the incident report, name the document and start to fill it in
- Record the timeline of events as they happen in the incident report
Statuspage - creating an incident:
Learn about Statuspage
- Log in to Statuspage and create an incident (give it a name)
- Select ‘apply template’ and pick a template (for example, ‘possible issue being investigated’ or ‘we’re having an incident’)
- Fill in the template with relevant details
- You can choose the components (for example, API, apps, billing) that are affected
- Select ‘send notifications’ to email tenants. If it’s a small issue, you may not want to send them a notification (in this case, only our status page will be updated)
- Subscribers on Statuspage will get the notifications
If you need to escalate to SMT on call:
- For example, if it’s affecting coronavirus services: go to the rotas app and select the current on-call individual to get their contact info
Don’t forget:
- Your aim is to do just enough support out of hours to get through to working hours :)
- You can update the x-gov Slack PaaS channel if relevant
Response times for P1 incidents
During working hours (9am to 5pm Monday to Friday)
- Start work and respond: 20 minutes
- Tenant updated: 1 hour
Outside working hours
- Start work and respond: 40 minutes
- Tenant updated: 1 hour
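
If it helps to see the P1 response times above as concrete deadlines, here is a minimal sketch. It is illustrative only and not part of any PaaS tooling. The function names, the `P1_TARGETS` table and the idea that the clock starts when the alert fires are all assumptions. It also assumes 5pm itself counts as outside working hours and ignores public holidays.

```python
from datetime import datetime, timedelta

# P1 targets in minutes, copied from the response times above.
P1_TARGETS = {
    "working_hours": {"respond": 20, "tenant_update": 60},
    "out_of_hours": {"respond": 40, "tenant_update": 60},
}


def in_working_hours(t: datetime) -> bool:
    """True from 9am up to (not including) 5pm, Monday to Friday.

    Assumption: t is already in local time and public holidays are ignored.
    """
    return t.weekday() < 5 and 9 <= t.hour < 17


def p1_deadlines(alerted_at: datetime) -> dict:
    """Return the latest time for each P1 target.

    Assumption: the clock starts at the moment the alert fires.
    """
    key = "working_hours" if in_working_hours(alerted_at) else "out_of_hours"
    return {
        name: alerted_at + timedelta(minutes=minutes)
        for name, minutes in P1_TARGETS[key].items()
    }


if __name__ == "__main__":
    alert = datetime(2024, 1, 6, 10, 0)  # a Saturday morning
    for name, deadline in p1_deadlines(alert).items():
        print(f"{name}: by {deadline:%H:%M}")
```

For an alert at 10:00 on a Saturday this gives a respond deadline of 10:40 and a tenant update deadline of 11:00.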