On February 10, 2022, the Top-of-Rack switch (AS-R40) in our datacenter provider's Dallas network malfunctioned during a standard operation. The event occurred while a new customer was being installed via a trunk hand-off to the switch. Unfortunately, our provider’s engineering department has not yet identified the root cause. The DC’s engineering team will attempt to reproduce the issue in their lab to better understand what caused it.
While implementing the new customer setup, the DC engineers noticed multiple BGP sessions flapping as well as Layer 2 issues. They also found that one of the virtual-chassis members was functioning abnormally, which was causing these issues.
After extensive troubleshooting, the DC’s engineers decided to remove the FPC member causing the disruption. Once the malfunctioning FPC was removed, normal operation resumed. The FPC was then wiped clean of its configuration and added back to the virtual chassis.
We sincerely apologize for the inconvenience this event has caused you. In our 10+ years of operation, this is the first downtime event to affect so many clients and services for more than one hour. Most clients were not affected; however, some had intermittent network connectivity for up to 4 hours. If you were affected by this downtime, please open a support ticket and we’ll issue the necessary SLA refunds as described in our TOS.