Notice history

May 2026

Resolved
May 28, 2026 at 3:13 PM UTC
We have implemented a new upstream link to restore service for the time being. GTT is still investigating the issue on their end; we suspect either a dark fiber cut or a core system failure. So far they have acknowledged the issue and will provide updates over the next hour.
Once GTT service is restored, we will be able to cut back over without any interruption to traffic, and the new link will remain in place as a backup to increase resilience. We do not anticipate any further issues at this time.
Update
May 28, 2026 at 2:13 PM UTC
We are working to spin up an emergency link to restore service until the GTT connection at LAX Equinix is repaired.
For context: GTT is used to backhaul traffic between our CoreSite and Equinix facilities. While we do maintain redundant links for this purpose, both are currently down, which has effectively left us without connectivity until the issue is resolved. To address this, we are bringing an additional link online through a separate upstream provider — both to restore service now and to provide added resilience against a similar failure in the future.
Update
May 28, 2026 at 12:51 PM UTC
We have confirmed that the issue is not caused by a broken uplink on the upstream device as initially suspected. Instead, the GTT port we receive at this location is experiencing problems. We have shifted our focus accordingly and are in contact with GTT for further updates.
Identified
May 28, 2026 at 12:32 PM UTC
We have identified an issue affecting an upstream device at our LAX Equinix facility. Our LAX core network is hosted separately at CoreSite, so traffic continues to flow normally through our filters and proxies; however, client-facing services hosted in Equinix may experience degraded performance or interruptions. We are actively working with the facility to restore full service as quickly as possible and will provide updates as more information becomes available.

Resolved
May 28, 2026 at 1:55 PM UTC
As of this morning, the data center has confirmed that the rooftop condensers are fully back in working order, along with the AC units that tripped under the increased pressure when the condensers failed. We are now seeing ambient temperatures restored to normal levels within the facility and do not anticipate any further issues.
Update
May 27, 2026 at 11:11 AM UTC
Ambient temperatures in the London data center have returned to near-normal levels. An HVAC technician is on site this morning to complete repairs on the remaining air conditioning unit and to verify the integrity of the other cooling systems in the facility.
We will continue to monitor the situation closely and will provide a final update once all systems have been fully restored and verified.
Monitoring
May 26, 2026 at 7:18 PM UTC
Technicians are on site actively working to repair the rooftop condenser and restore the facility to normal operating temperatures. We will continue to provide updates as the situation develops.
Identified
May 26, 2026 at 7:18 PM UTC
We have confirmed the source of the issue: a failed condenser unit on the roof of the data center, compounded by a separate fault with an air conditioning unit inside the facility. Combined with the ongoing heat wave, these failures have caused ambient temperatures to rise to levels that are triggering thermal throttling and, in some cases, thermal-related crashes on affected hardware.
Investigating
May 26, 2026 at 7:07 PM UTC
We are currently investigating hardware issues affecting our London data center. At this time, network connectivity is not impacted, though some hardware in this location is experiencing problems.
Our team has identified that ambient temperatures in the facility are running 15-20% above the recent average, which we suspect is related to the ongoing heat wave in the region. We are working closely with the data center's NOC to confirm the root cause and resolve the issue as quickly as possible.
We will provide further updates as more information becomes available. Thank you for your patience.

Apr 2026

Resolved
April 08, 2026 at 11:35 PM UTC
We suffered a dark fiber cut in London. Due to a policy issue in our routing configuration, failover to our secondary dark fiber route was not instantaneous. This has been resolved: traffic is now flowing over our secondary path and service has been restored. In the meantime, we will be repairing the primary path.
Identified
April 08, 2026 at 11:28 PM UTC
We have identified an issue with our primary dark fiber connection and provider. We are working to resolve it and are in the process of forcing a failover of traffic.
Investigating
April 08, 2026 at 11:15 PM UTC
We are currently investigating this incident. Our dark fiber connection between our network PoP and DC in London is unstable.

Postmortem
April 03, 2026 at 6:32 AM UTC
At approximately 1:25 AM EST, during a routine filter re-deployment (a hotfix to address concerns surfaced by our monitoring), the deployment process for our Lua script code failed unexpectedly. This was not due to any logic error or bug introduced in the code, as we redeployed the same hotfix moments later without issue. We believe it was an extremely unlucky software bug that cascaded through all PoPs because of the way the syncing system works. This was not a full outage, but we would classify it as a major outage, with more than 80% of traffic being dropped at the time.
The issue was 95% resolved within the first 10 minutes by bringing Ashburn, Dallas, London and Amsterdam back up without issue. Los Angeles took about 12 minutes longer because we needed to reboot the filtering appliances there and gradually shift traffic back online. Frankfurt was not affected during this time and its traffic was still flowing through the network.
We are currently investigating why this specific filter deployment failed, as we deployed code over 30 times in the month of March without a single issue. We recognize that this situation is a concern to our customers, and we are investigating it precisely because of how peculiar it was.
We also acknowledge that the 3 outages within the last 30 days are a significant cause for concern. We do not want to make excuses, but we do want to reiterate that all 3 outages were caused by things outside our control at the time, and we are now implementing ways to control them. We understand that to you, our customers, these are within our control, as you trust us to maintain the stability of the network you use. We therefore take full responsibility for the outages even if they weren't directly caused by us, and we are making it our goal to implement increased redundancy and resiliency.
We are currently in the process of shipping upgraded filtering hardware and software, upgraded routers, and upgraded router components to critical locations, not only to introduce further redundancy but, as previously mentioned, to improve resiliency overall.
We will have more to share on this in the coming weeks, but rest assured we are working on the issues, and we completely understand your frustrations as a customer. Do not hesitate to voice any concerns to us, and we will be glad to respond.
Please also take a look at the RFO for the outage in Dallas last week and our plans for the future:
https://status.as30456.net/cmn9oqp8405n49b3k562hrh6i
Resolved
April 03, 2026 at 6:09 AM UTC
This incident has been resolved.
Update
April 03, 2026 at 5:50 AM UTC
All PoPs have recovered except for LAX. We are working on restoring traffic to LAX.
Monitoring
April 03, 2026 at 5:41 AM UTC
We implemented a fix and are currently monitoring the result.
Investigating
April 03, 2026 at 5:32 AM UTC
We are currently investigating this incident.

Mar 2026

Postmortem
April 03, 2026 at 6:31 AM UTC
Our explanation:
Over the past few weeks, some of you may have experienced brief service interruptions across parts of our network. We want to be upfront about what happened, what we've learned, and most importantly what we've done about it.
Our edge routers have always been built with internal redundancy in mind - redundant supervisors, redundant power supplies, multiple line cards, and redundant fabric modules. That level of hardware resilience handles the vast majority of failure scenarios well.
However, the recent outages exposed a gap: when an issue affects the chassis itself, such as a software defect, a firmware upgrade that requires a full reload, or a software process crashing on a router (as has happened twice recently in London), there was no second device to immediately absorb the traffic. The router was redundant in every way except the one that mattered in these incidents.
What we're doing:
We're rolling out a dual router design across all six of our points of presence: Dallas, Ashburn, Los Angeles, London, Amsterdam, and Frankfurt. Once complete, every PoP will operate with two independent edge routers in an active/active configuration, with full BGP session redundancy to all upstream and peering partners. If an entire chassis needs to be taken offline for maintenance, a software upgrade, or an unexpected failure, traffic will automatically reconverge on the second device with no customer-facing impact.
Each router in the pair will run on independent power feeds with independent management and control planes. We're also using this as an opportunity to standardize failover testing procedures across all PoPs, so this architecture is validated continuously, not just at deployment. This also protects against cases where a configuration change (possibly involving human error) ends up dropping a large share of traffic. The investments for these changes were made during the last couple of weeks, so they were already in the works and unrelated to the incidents in March, but with summer right around the corner we wanted to let you know you'll be in good hands.
These changes will also allow for things like DDoS mitigation changes to be performed in a more controlled rollout (e.g. to parts of traffic only), zero-downtime maintenance windows for core networking equipment and a stronger foundation for the capacity expansions we have planned for the rest of 2026.
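
As a rough illustration of the controlled rollout idea mentioned above, the sketch below applies a change to one stage of targets at a time and stops as soon as a stage reports unhealthy. It is a minimal sketch under stated assumptions, not a description of our actual tooling: the function name, the router labels and the health check are all hypothetical.

```python
from typing import Callable, Iterable, List


def staged_rollout(
    stages: Iterable[List[str]],
    deploy: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Apply a change one stage at a time; stop after the first unhealthy stage.

    Returns the targets the change was applied to, in order.
    """
    applied: List[str] = []
    for stage in stages:
        for target in stage:
            deploy(target)
            applied.append(target)
        # Verify every target in this stage before touching the next stage.
        if not all(is_healthy(target) for target in stage):
            break
    return applied


# Hypothetical labels: one router of a pair goes first, the rest follow.
stages = [["lon-edge-a"], ["lon-edge-b", "ams-edge-a"]]
state = {}
result = staged_rollout(
    stages,
    deploy=lambda target: state.__setitem__(target, "new"),
    is_healthy=lambda target: target != "lon-edge-a",  # simulate a failing first stage
)
```

In this simulated run the first stage fails its health check, so the change never reaches the second stage and the remaining routers keep carrying traffic on the previous version.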
Resolved
March 28, 2026 at 2:03 PM UTC
This incident has been resolved.
Monitoring
March 28, 2026 at 4:43 AM UTC
The incident was resolved shortly after onset, and we have been monitoring since. A full RFO will be posted when this status is closed. In short, the root cause was a cascading failure triggered by a bug in the routing software itself — not by any action taken by our team. The issue was entirely outside our control, and we responded quickly to restore normal operation.
Identified
March 28, 2026 at 2:08 AM UTC
We identified the root cause, and have been working on resolution. Recovery efforts are showing progress as traffic is beginning to restore in Dallas.
Investigating
March 28, 2026 at 1:55 AM UTC
We are currently investigating this incident.

Resolved
March 26, 2026 at 9:11 PM UTC
This incident has been resolved.
Monitoring
March 26, 2026 at 7:31 PM UTC
We traced the problem to a private transport link between AMS and LON, which failed due to issues outside of our control. The resulting drop occurred as traffic shifted over. As a result of the private transport link failure, we have switched the mode of traffic for this transport and do not expect it to occur again; however, we are monitoring the situation in the event it does. We have also linked this private link to a few other recent small drops in the EU region (unrelated to the last status post, which was a hardware issue), and apologize for any impact from that.
With regard to network stability, our London PoP will be undergoing an upgrade to address a routing appliance constraint, and furthermore we will be implementing increased redundancy across all PoPs in the near future to increase resiliency during both planned and unplanned outages.
Investigating
March 26, 2026 at 6:37 PM UTC
We are currently investigating a brief but sizable drop in traffic that occurred for users routing through Amsterdam.

Resolved
March 23, 2026 at 4:46 PM UTC
Following emergency maintenance yesterday that required a reboot of a core router in our London facility, an Arista runtime software bug caused the router's ARP entries to gradually decay from active memory.

Although the router's configuration remained correct throughout, the hardware chip (ASIC) responsible for directing network traffic failed to correctly reload the address mappings after the reboot. These mappings are what tell the router how to reach a set of internal endpoints used for multicast traffic forwarding. With them missing from the hardware's active memory, traffic that should have been flowing through those paths was silently dropped.

Because the configuration itself was never corrupted, the root cause was not immediately obvious. A number of other potential causes were investigated before the true issue was identified — a desync between the router's stored configuration and what the hardware had actually loaded into memory.
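
To illustrate the kind of desync described above, the sketch below compares the address mappings the software expects against what the hardware reports as loaded and flags anything missing or mismatched. This is a minimal sketch only: the two tables are hypothetical in-memory dictionaries, and it does not reflect any real router API or the tooling our engineers used.

```python
from typing import Dict, Set


def find_desynced_entries(
    expected: Dict[str, str], hardware: Dict[str, str]
) -> Set[str]:
    """Return IPs whose expected MAC mapping is missing or different in hardware."""
    return {ip for ip, mac in expected.items() if hardware.get(ip) != mac}


# Hypothetical tables: IP address -> MAC address.
expected_table = {
    "10.0.0.1": "aa:bb:cc:00:00:01",
    "10.0.0.2": "aa:bb:cc:00:00:02",
    "10.0.0.3": "aa:bb:cc:00:00:03",
}
hardware_table = {
    "10.0.0.1": "aa:bb:cc:00:00:01",
    "10.0.0.3": "aa:bb:cc:00:00:ff",  # present but pointing at the wrong MAC
}

desynced = find_desynced_entries(expected_table, hardware_table)
```

A periodic comparison of this kind would surface the problem even when the stored configuration looks correct, which is what made this incident hard to spot.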

We sincerely apologize for the impact this had on your services and for the time it took to identify the root cause. We understand how frustrating extended investigations can be, and we appreciate your patience while our engineers worked methodically through the contributing factors to reach a definitive resolution.
Monitoring
March 23, 2026 at 3:39 PM UTC
We have implemented another round of fixes and connectivity is recovering. Please reach out to us if you are still having issues while we continue to monitor.
Thank you again for your patience in this matter. We will provide a full report when we confirm all is well.
Update
March 23, 2026 at 2:10 PM UTC
We are continuing to investigate TCP issues in the London PoP. We apologize for the continued problems today and are making progress toward a full resolution for this location.
Identified
March 23, 2026 at 11:47 AM UTC
We are continuing to monitor reports of elevated issues and are still working towards a permanent resolution.
Monitoring
March 23, 2026 at 10:40 AM UTC
We have rolled out a batch of fixes and are seeing connectivity recover. We are continuing to monitor the situation closely.
Identified
March 23, 2026 at 9:00 AM UTC
We have identified an issue with our filtering software in our London PoP and are working on a resolution as quickly as possible.

Quick maintenance to resolve packet loss problems in London

Completed
March 22, 2026 at 4:15 PM UTC
Maintenance has completed successfully.
Planned
March 22, 2026 at 4:00 PM UTC
Quick maintenance to resolve packet loss problems in London
In progress
March 22, 2026 at 3:15 PM UTC
Maintenance is now in progress.

Mar 2026 to May 2026

Cosmic Global - Notice history
