CEO Outage

Incident Report for SNworks

Postmortem

On the evening of February 17th, at about 8:15pm EST, the main CEO API, which powers both CEO2 and CEO3, suffered a severe outage. Service began recovering around 8:45pm EST and was fully recovered by 9pm EST. In total, CEO incurred about 24 minutes of downtime.

Automated alerting systems notified engineers of the outage within minutes, at 8:21pm EST.

What went wrong

The core of the issue is that the main CEO database ran out of storage space. Normally, when the database's free storage capacity drops below 20%, an automatic process will:

  • Swap the secondary database in as the primary.
  • Resize the original primary database’s storage.
  • Restart the original primary.
  • Swap the original primary back in for the secondary.
  • Resize the secondary.

All of this happens in the background with very limited, if any, service interruption.
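
As a rough illustration only, the sequence above can be sketched as an ordered procedure. This is not our actual automation: the `Database` class, the `resize_pair` function, the instance names, and the storage figures are all hypothetical, and the "restart" step is only recorded in a log rather than performed.

```python
from dataclasses import dataclass


@dataclass
class Database:
    """Hypothetical stand-in for one database instance."""
    name: str
    storage_gb: int
    used_gb: int
    role: str  # "primary" or "secondary"

    def free_fraction(self) -> float:
        """Fraction of allocated storage that is still free."""
        return (self.storage_gb - self.used_gb) / self.storage_gb


def resize_pair(primary: Database, secondary: Database,
                new_storage_gb: int, threshold: float = 0.20) -> list:
    """Run the five steps in order and return a log of what was done."""
    log = []
    if primary.free_fraction() >= threshold:
        return log  # enough free space, nothing to do

    # 1. Swap the secondary in as the primary.
    primary.role, secondary.role = "secondary", "primary"
    log.append(f"promoted {secondary.name}")

    # 2. Resize the original primary's storage.
    primary.storage_gb = new_storage_gb
    log.append(f"resized {primary.name}")

    # 3. Restart the original primary (only logged in this sketch).
    log.append(f"restarted {primary.name}")

    # 4. Swap the original primary back in for the secondary.
    primary.role, secondary.role = "primary", "secondary"
    log.append(f"promoted {primary.name}")

    # 5. Resize the secondary.
    secondary.storage_gb = new_storage_gb
    log.append(f"resized {secondary.name}")
    return log
```

Calling `resize_pair(Database("db-a", 100, 85, "primary"), Database("db-b", 100, 85, "secondary"), 200)` walks through all five steps, because only 15% of the primary's storage is free.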

Last night's failure was triggered by a required security certificate update, which the database needs for connectivity. While database storage is being resized, no other modifications can be made to the database. The storage resize and the certificate update were attempted at the same time, and the update failed as a result.

Since the primary and secondary are exact copies of each other, the same issue impacted both databases at the same time.
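
The conflict can be pictured with a small model in which an instance accepts only one modification at a time. This is a simplified sketch, not the database's real behavior or API: the `Instance` class, the change names, and the choice of which change is pending first are assumptions made for illustration.

```python
from typing import Optional


class ModificationInProgress(Exception):
    """Raised when a second change is requested while one is still pending."""


class Instance:
    """Hypothetical database instance that allows one modification at a time."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.pending: Optional[str] = None

    def begin(self, change: str) -> None:
        """Start a change, or refuse if another change is already pending."""
        if self.pending is not None:
            raise ModificationInProgress(
                f"{self.name}: cannot start {change!r} "
                f"while {self.pending!r} is pending"
            )
        self.pending = change

    def finish(self) -> None:
        """Mark the pending change as complete."""
        self.pending = None


def try_resize(instances) -> list:
    """Attempt a storage resize on every instance; return the names that failed."""
    failed = []
    for inst in instances:
        try:
            inst.begin("storage-resize")
        except ModificationInProgress:
            failed.append(inst.name)
    return failed
```

If both instances carry the same pending change, `try_resize` reports both as failed, which mirrors why an identical primary and secondary were affected together.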

How we recovered

After identifying the core issue, we found a back-channel method for forcing the database to resize without requiring further changes. Once the primary was recovered, we re-enabled connections and recovered the secondary.

Moving forward

We’ve updated our storage capacity alerting to provide “louder” warnings as free database storage drops below 20%, then 15%, then 10%.
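
A minimal sketch of how tiered thresholds like these might be evaluated is shown below. The severity names and the `storage_alert` function are hypothetical, and our real alerting configuration may differ; only the 20%, 15%, and 10% thresholds come from this report.

```python
# Hypothetical tiers: (free-space threshold, severity), least to most urgent.
TIERS = [(0.20, "warning"), (0.15, "urgent"), (0.10, "critical")]


def storage_alert(free_fraction: float):
    """Return the most severe tier the free space has dropped below, or None."""
    severity = None
    for threshold, name in TIERS:
        # Later tiers have lower thresholds, so a later match is more severe.
        if free_fraction < threshold:
            severity = name
    return severity
```

With these tiers, a database with 12% free storage would raise the second-level alert, and one with 25% free would raise none.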

If you have any specific questions, please do not hesitate to reach out to support@getsnworks.com.

Posted Feb 18, 2026 - 12:52 EST

Resolved

At approximately 8:15pm EST, CEO suffered an outage related to the primary database. Engineers were notified by automated alerting at 8:21pm EST. The issue was identified and a fix was implemented. Service recovery began at approximately 8:45pm EST. Postmortem to follow.
Posted Feb 17, 2026 - 20:30 EST