At around 9:30am EST on Wednesday, Nov 26, engineers were alerted that email sending was absent from the logs and had not run for over 15 minutes. Engineering then started investigating the root cause.
At 9:45am EST, Engineering found that the root cause was an AWS resource that was not working as intended. Engineering tried recreating the resource, without success.
Immediately afterwards, Engineering created and spun up a temporary solution to remediate the problem. Engineering then notified AWS of the issue through support channels.
At 10:30am EST, AWS informed Engineering that several resources were affected by a cascading problem relating to AWS Kinesis Streams, which was also affecting AWS CloudWatch Events.
At 7:00pm EST, Engineering observed recovery of AWS CloudWatch Events and turned off the temporary solution. At 10:00pm EST we saw full recovery.
Our hope is to rely less on external utilities to run Interseller and to eliminate complex dependencies like these in the future.