Resolved -
This incident has been resolved.
On November 27, 2023 at 18:46 UTC, we attempted to rotate our OpenID Connect (OIDC) authentication flow certificates. Due to an error in the certificate formatting, we uploaded an invalid certificate configuration, and the error was not caught in our pre-production testing. Our background job servers were unable to start because a valid configuration is required at worker startup. As a result, users experienced delays in Pull Requests, Webhooks, Issues, Actions, and Projects. Rollback of the change was slowed by the invalid certificate because our deployment system relied on the same certificate. Rollback was completed at 20:35 UTC. Most services recovered by 20:44 UTC.
Delayed updates to Issues and Pull Requests were applied normally once the changes were rolled back. After the change was rolled back, a large queue of Actions-related jobs built up, which included Pull Request, Pull Request review, and Pull Request review comment events. About 2.3% of Actions jobs failed during the incident. Job queue times returned to normal once all remaining jobs were processed.
We are working to improve our certificate testing and rotation process to reduce the risk of customer-impacting errors.
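As an illustration of the kind of formatting check that can run before a certificate is uploaded, here is a minimal sketch. It is not our actual tooling: the function name `check_pem_certificate_format` is hypothetical, and the check assumes the certificate is supplied as a single PEM block. It verifies only the PEM shape (header line, footer line, base64 body, leading DER SEQUENCE tag) and does not validate the certificate's contents, signature, or expiry.

```python
import base64
import binascii
from typing import List

PEM_HEADER = "-----BEGIN CERTIFICATE-----"
PEM_FOOTER = "-----END CERTIFICATE-----"


def check_pem_certificate_format(pem_text: str) -> List[str]:
    """Return a list of formatting problems (hypothetical helper).

    An empty list means the text has the shape of a single PEM certificate.
    This is a shape check only; it does not parse or verify the certificate.
    """
    problems = []
    lines = [line.strip() for line in pem_text.strip().splitlines()]

    # The first and last lines must be exactly the PEM markers.
    if not lines or lines[0] != PEM_HEADER:
        problems.append("missing or malformed BEGIN CERTIFICATE line")
    if not lines or lines[-1] != PEM_FOOTER:
        problems.append("missing or malformed END CERTIFICATE line")
    if problems:
        return problems

    body = "".join(lines[1:-1])
    if not body:
        return ["empty certificate body"]

    # validate=True rejects characters outside the base64 alphabet,
    # such as literal "\n" sequences left behind by bad escaping.
    try:
        der = base64.b64decode(body, validate=True)
    except binascii.Error as exc:
        return [f"body is not valid base64: {exc}"]

    # A DER-encoded certificate is an ASN.1 SEQUENCE, which starts with 0x30.
    if not der or der[0] != 0x30:
        problems.append("decoded body does not start with a DER SEQUENCE tag")
    return problems
```

A check like this could run as a gate in a rotation pipeline, so that a certificate with escaped newlines or a truncated body is rejected before any service that requires it at startup sees the new configuration.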
Nov 27, 21:11 UTC
Update -
Webhooks is operating normally.
Nov 27, 20:44 UTC
Update -
Issues is operating normally.
Nov 27, 20:44 UTC
Update -
Pull Requests is operating normally.
Nov 27, 20:43 UTC
Update -
Actions customers are experiencing workflow start delays as part of the ongoing Pull Requests incident. We are seeing previously delayed runs start and will continue to monitor.
Nov 27, 20:39 UTC
Update -
Actions is experiencing degraded performance. We are continuing to investigate.
Nov 27, 20:30 UTC
Update -
Customers are also experiencing delays in webhook delivery and issue updates. We are seeing recovery and are continuing to monitor.
Nov 27, 20:22 UTC
Update -
Webhooks is experiencing degraded performance. We are continuing to investigate.
Nov 27, 20:16 UTC
Update -
Issues is experiencing degraded performance. We are continuing to investigate.
Nov 27, 20:16 UTC
Update -
Customers are seeing delays in pushed commits appearing on pull requests. We are currently investigating.
Nov 27, 19:46 UTC
Investigating -
We are investigating reports of degraded performance for Pull Requests.
Nov 27, 19:43 UTC