Issue Summary:
On February 26, 2026, at 07:48 PM, customers experienced significantly elevated response times across the platform, which in several cases resulted in temporary service unavailability. Users reported slow page loads, delayed API responses, and intermittent timeouts while accessing the system. The incident impacted multiple customer environments and lasted approximately one hour before full service restoration.
Root Cause:
The issue was attributed to a sudden and unanticipated spike in database requests, which caused a sharp increase in database CPU and connection utilization. This surge led to resource contention at the database layer, resulting in slower query execution times and increased request queuing.
Corrective Action:
Upon identification of the issue, the engineering team took immediate action to scale up database resources. This included increasing compute capacity and optimizing database throughput to handle the elevated request volume.
Preventive Measures: