Description: Our primary upstream provider had ~2 minutes of downtime (12:09 PM to 12:11 PM PST) due to a bad BGP route filter being applied. Our monitoring platform indicated that our secondary upstream provider was down, but this was a false positive. Customer traffic failed over to secondary upstreams as expected, with less than 1 minute of total downtime while connectivity was re-established.
Root Cause: Maintenance work upstream from us applied a bad BGP route filter on transit ports. This affected everyone using those transit ports, not just Crunchbits.
Resolution: We failed over to secondary upstream providers as expected, but some users will have noticed an interruption to active connections during this period. Angry message sent on behalf of all customers :)
We are continuing to monitor the situation and receive updates from the primary upstream transit provider. We don't expect any more issues, but if they occur we may elect to force all traffic over our secondary providers until things stabilize, to minimize interruptions for our customers.
Tuesday, February 21, 2023