Cloudflare Attributes Recent Service Outage to Password Rotation Error


Mar 26, 2025 - 14:34

A credential rotation error led to widespread service disruptions across multiple Cloudflare products on March 21, 2025, affecting customers globally for over an hour. 

The company disclosed that 100% of write operations and approximately 35% of read operations to its R2 object storage service failed during the incident window, which lasted 1 hour and 7 minutes.

During a routine security rotation, engineers unintentionally deployed new authentication credentials to a development environment rather than production, causing the outage. 

When the old credentials were subsequently deleted from their storage infrastructure, the production R2 Gateway service—which serves as the API frontend—lost authentication access to backend systems.


“This incident happened because of human error and lasted longer than it should have because we didn’t have proper visibility into which credentials were being used by the Gateway Worker to authenticate with our storage infrastructure,” Cloudflare explained in their incident report.

Cloudflare Service Disruption

The technical error stemmed from the omission of a critical command-line parameter.

Engineers executing the wrangler secret put and wrangler deploy commands failed to include the --env production flag, causing credentials to be deployed to a non-production Worker instead of the intended production environment.

When the previous credentials were removed, authentication failures cascaded across services.
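The failure mode described above, a command that runs successfully against the wrong target because a flag was left out, can be caught by a small pre-flight check. The following is only a minimal sketch, not Cloudflare's tooling: the function names, the secret name R2_ACCESS_KEY, and the assumption that the subcommand follows `wrangler` directly are all hypothetical, and only the long `--env` form of the flag is recognised.

```python
import shlex

# Hypothetical policy: these wrangler subcommands must name their target
# environment explicitly instead of falling back to the default one.
ENV_REQUIRED = [("secret", "put"), ("deploy",)]


def explicit_env(argv):
    """Return the value given via --env, or None if the flag is absent."""
    for i, arg in enumerate(argv):
        if arg == "--env" and i + 1 < len(argv):
            return argv[i + 1]
        if arg.startswith("--env="):
            return arg.split("=", 1)[1]
    return None


def check_command(command, expected_env="production"):
    """Validate a command line without running it.

    Returns (ok, reason). Assumes the subcommand follows `wrangler` directly.
    """
    argv = shlex.split(command)
    if not argv or argv[0] != "wrangler":
        return True, "not a wrangler command"
    sub = tuple(argv[1:3])
    if not any(sub[: len(s)] == s for s in ENV_REQUIRED):
        return True, "subcommand does not require --env"
    env = explicit_env(argv)
    if env is None:
        return False, "missing --env; the default environment would be used"
    if env != expected_env:
        return False, f"targets '{env}', expected '{expected_env}'"
    return True, "ok"


if __name__ == "__main__":
    for cmd in (
        "wrangler secret put R2_ACCESS_KEY",  # hypothetical secret name
        "wrangler deploy",
        "wrangler deploy --env production",
    ):
        print(cmd, "->", check_command(cmd))
```

A check of this kind only inspects the command line; it does not confirm that the production Worker actually picked up the new credentials, which is a separate verification step.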

Beyond R2 object storage, the outage impacted numerous dependent services. Cache Reserve customers experienced increased origin traffic as cached objects became unavailable. 

Images and Stream services saw 100% failure rates for uploads, with delivery success rates dropping to approximately 25% and 94%, respectively.

Vectorize, Cloudflare’s vector database, experienced 75% query failure rates and complete failure for insert operations.

The ripple effects extended to Email Security, Billing, Key Transparency Auditor, and Log Delivery services, with the latter experiencing up to 70-minute delays in processing.

Cloudflare’s engineering team identified the root cause at 22:36 UTC, nearly an hour after the impact began, and restored service at 22:45 UTC by deploying credentials to the correct production Worker.
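The timeline figures quoted in this article can be cross-checked with a few lines of arithmetic. The sketch below assumes that the 1 hour and 7 minute incident window ends at the 22:45 UTC restoration; the article does not state the start time directly, so it is derived here rather than quoted.

```python
from datetime import datetime, timedelta, timezone

# Figures taken from the article; the window is assumed to end at restoration.
restored = datetime(2025, 3, 21, 22, 45, tzinfo=timezone.utc)
root_cause_found = datetime(2025, 3, 21, 22, 36, tzinfo=timezone.utc)
incident_length = timedelta(hours=1, minutes=7)

impact_began = restored - incident_length           # derived start of impact
time_to_root_cause = root_cause_found - impact_began
time_to_fix = restored - root_cause_found

print(impact_began.strftime("%H:%M UTC"))  # 21:38 UTC
print(time_to_root_cause)                  # 0:58:00
print(time_to_fix)                         # 0:09:00
```

Under that assumption, diagnosis accounts for 58 of the 67 minutes, which matches the "nearly an hour" figure and Cloudflare's statement that the incident lasted longer than it should have for lack of visibility into which credentials were in use.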

Ongoing Improvements

To prevent recurrence, Cloudflare has implemented several technical and procedural changes, including:

  • Enhanced logging that explicitly identifies credential IDs used for authentication
  • Procedural requirements to verify credential usage before decommissioning old credentials
  • Mandatory use of automated hotfix release tooling rather than manual command entry
  • Explicit requirement for dual-human validation during credential rotation processes
  • Development of closed-loop health checks to validate credential propagation before releases
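Two of the measures listed above, logging the credential ID used for each authentication and verifying usage before decommissioning, combine naturally into a single gate. The following is a minimal sketch under stated assumptions, not Cloudflare's implementation: the log record shape, the service name "r2-gateway", the credential IDs, and the function name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthLogEntry:
    """One authentication event (hypothetical log schema)."""
    service: str
    environment: str
    credential_id: str
    succeeded: bool


def safe_to_delete(old_id, new_id, entries, service,
                   environment="production", min_successes=1):
    """Decide whether the old credential can be removed.

    `entries` should cover a window after the rollout. Deletion is allowed
    only if the target service in the target environment no longer uses the
    old credential AND has authenticated successfully with the new one.
    """
    relevant = [e for e in entries
                if e.service == service and e.environment == environment]
    if any(e.credential_id == old_id for e in relevant):
        return False, "old credential still in use"
    successes = sum(1 for e in relevant
                    if e.credential_id == new_id and e.succeeded)
    if successes < min_successes:
        return False, "new credential not yet observed in use"
    return True, "new credential confirmed; old credential unused"


if __name__ == "__main__":
    # A situation resembling the incident: the new credential reached only
    # the development environment, while production still used the old one.
    entries = [
        AuthLogEntry("r2-gateway", "production", "cred-old", True),
        AuthLogEntry("r2-gateway", "development", "cred-new", True),
    ]
    print(safe_to_delete("cred-old", "cred-new", entries, "r2-gateway"))
```

Requiring positive evidence that the new credential is in use, rather than only the absence of errors, is what makes the check closed-loop: a rollout that silently landed in the wrong environment produces no such evidence and blocks the deletion.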

This incident follows another hour-long Cloudflare outage in February, when an employee mistakenly disabled the entire R2 Gateway service while attempting to block a phishing URL.

Security researchers who reported on the February event attributed that outage to a lack of controls and validation checks for high-impact operations.

The recurring nature of configuration-related outages underscores the industry-wide challenge of managing complex cloud infrastructure while maintaining rigorous security practices like credential rotation.


The post Cloudflare Attributes Recent Service Outage to Password Rotation Error appeared first on Cyber Security News.