
Arc XP's Content Delivery Network (CDN) Caching

This technical deep dive explains how Arc XP’s CDN caches and serves content securely through its global edge network.

If you’re not yet familiar with the high-level overview of Arc XP’s render stack and how a request flows from your reader to downstream services, see Caching within Arc XP.

This guide focuses specifically on the CDN layer within our render stack. It does not cover the underlying rendering process, PageBuilder cache, or the impact on downstream APIs.

Arc XP’s CDN is a global network of servers designed to efficiently cache and distribute content across more than 4,200 edge instances in 130 countries. The network is sophisticated, and its inner workings can be difficult to fully understand.

To illustrate how this works, let’s start with a simple example: a user requests a single page. We’ll follow that request as it flows through Arc XP’s CDN.

When a user makes a request, it is routed to the closest Edge node in their region. A region may have multiple Edge nodes, and one is selected to handle the request.

If a valid, non-expired cache is found on the selected Edge node, the page is served with very low latency. If the cache is not found, the Edge node queries its parent node.

Think of a parent node as the group leader for that specific region or location. If the parent node has a valid, non-expired cache, it is served to the user. As the cache is served, the Edge node stores the object in its cache for future requests. If another user makes the same request in that region and is assigned to the same Edge node, the cached object is used. The Edge node respects the original cache expiration time, ensuring the object expires at the same time on both the Edge node and its parent node.

If the parent node also does not have a valid, non-expired cache, the system must process a render. The request is routed through the Cloud Wrapper layer, which directs the request to the web origin and then to PageBuilder. When PageBuilder renders the page, the flow reverses, returning to the parent and Edge nodes. The rendered page is then cached at both the parent and Edge nodes.
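The lookup order described above (Edge node, then parent node, then a render at the origin) can be sketched in a few lines of Python. This is only an illustrative model, not Arc XP's implementation: the names `CacheNode`, `CachedObject`, and `handle_request`, the `render` callback standing in for Cloud Wrapper and PageBuilder, and the 120-second TTL are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple


@dataclass
class CachedObject:
    body: str
    expires_at: float  # absolute expiry time, in seconds


@dataclass
class CacheNode:
    """Hypothetical stand-in for an Edge node or a parent node."""
    name: str
    store: Dict[str, CachedObject] = field(default_factory=dict)

    def get_valid(self, path: str, now: float) -> Optional[CachedObject]:
        """Return the cached object only if it exists and has not expired."""
        obj = self.store.get(path)
        if obj is not None and now < obj.expires_at:
            return obj
        return None


def handle_request(
    path: str,
    edge: CacheNode,
    parent: CacheNode,
    render: Callable[[str], str],
    now: float,
    ttl: float = 120.0,  # assumed TTL; real values depend on configuration
) -> Tuple[str, str]:
    """Return (body, tier that answered the request)."""
    # 1. Valid cache on the Edge node: serve immediately.
    obj = edge.get_valid(path, now)
    if obj is not None:
        return obj.body, "edge"

    # 2. Valid cache on the parent: serve it and copy it to the Edge node.
    #    The copy keeps the original expiry, so both expire together.
    obj = parent.get_valid(path, now)
    if obj is not None:
        edge.store[path] = obj
        return obj.body, "parent"

    # 3. No valid cache anywhere: render, then cache at both tiers.
    obj = CachedObject(body=render(path), expires_at=now + ttl)
    parent.store[path] = obj
    edge.store[path] = obj
    return obj.body, "origin"


edge_a, edge_b, parent = CacheNode("edge-a"), CacheNode("edge-b"), CacheNode("parent")
render = lambda path: f"<html>{path}</html>"  # hypothetical renderer

print(handle_request("/news", edge_a, parent, render, now=0.0)[1])   # origin
print(handle_request("/news", edge_a, parent, render, now=10.0)[1])  # edge
print(handle_request("/news", edge_b, parent, render, now=20.0)[1])  # parent
```

In this sketch, the second Edge node never triggers a render: it picks up the parent's copy along with the parent's expiry time.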

At this point, any subsequent requests to the same Edge node do not require rendering. An Edge node can serve multiple users simultaneously. The Cloud Wrapper and parent nodes balance overall traffic across different nodes, considering the user’s proximity and the current load on each node. The key point is that a specific Edge node is highly efficient as long as its cache remains valid, because the same request does not trigger another render.

As you might expect, a region contains many Edge nodes, and each parent node controls multiple child Edge nodes to balance traffic. As long as the parent node has a valid cached object, each Edge node replicates it as its own copy until the object expires.

Zooming out further, the global network spans many regions, each following the same pattern.

Given the large number of regions and Edge nodes, the system efficiently manages render volumes and downstream API load.
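To get a rough feel for why the parent tier matters for render volume, the following sketch counts renders for one page during one cache window under a deliberately simplified model: every Edge node receives at least one request, and a node renders only when no valid cache is available to it. The function name and the node counts are hypothetical and are not Arc XP figures.

```python
def renders_per_cache_window(parents: int, edges_per_parent: int,
                             use_parent_tier: bool) -> int:
    """Count origin renders for one page in one cache window (simplified model)."""
    if use_parent_tier:
        # Each parent renders once; its child Edge nodes copy the result.
        return parents
    # Without a parent tier, every Edge node would have to render on its own.
    return parents * edges_per_parent


# Hypothetical topology: 10 parents, each with 20 child Edge nodes.
print(renders_per_cache_window(10, 20, use_parent_tier=True))   # 10
print(renders_per_cache_window(10, 20, use_parent_tier=False))  # 200
```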

Edge nodes and parent nodes can be swapped at any time. Sophisticated monitoring, health checks, and failover systems automatically manage these transitions. As a result, the path of a user’s request through the network is not predictable.

Arc XP CDN FAQs

If my render fails, what would the CDN serve to readers?

If a render fails, whether due to a bundle code issue, a downstream API error (such as a 429 rate limit), or another issue that prevents rendering, PageBuilder returns a 4xx or 5xx response.

Cloud Wrapper handles these errors with resiliency logic. In such cases, a cache back-off strategy is triggered, instructing the parent and Edge nodes to use the last valid cached object for the request. Your readers continue to get the last cached content even though underlying systems may face errors.

A common scenario is that the code in a global content source starts throwing errors, which causes the page render to fail completely. The root cause may be a recent code deployment that introduced a bug in production, or downstream APIs that start having issues such as rate limiting or authentication errors. Arc XP’s CDN masks these errors and continues to serve your readers previously cached content.

Why am I seeing a 500 error if the CDN covers my render failure?

In the previous FAQ, we explained that failures automatically fall back to the last valid cache object. However, if neither the Edge node nor the parent node has a valid cache object, the system returns the origin (PageBuilder) response to the user, which may result in a 500 error.

The example scenario in the previous FAQ assumes that an Edge node or its parent node previously cached the page. But what happens if your reader, or a crawler, requests an infrequently accessed page that was never cached in that region? Or if the CDN replaced both the parent and Edge node, for performance or other reasons, so that no cache exists? This is rare, but it can happen. In this case, the CDN has no alternative content to cover the source error, and it returns the downstream render error. This can be a 500 application error or another HTTP error code.
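The behavior described in the last two answers can be modeled with a small Python sketch: when the origin fails, a node falls back to the last cached object if it has one, and otherwise passes the error through. This is a simplified illustration, not the actual Cloud Wrapper logic. The `serve` function, the `origin` callback, the 120-second TTL, and the choice to report a 200 status when serving the fallback are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class CachedObject:
    body: str
    expires_at: float  # absolute expiry time, in seconds


def serve(
    path: str,
    cache: Dict[str, CachedObject],
    origin: Callable[[str], Tuple[int, str]],
    now: float,
    ttl: float = 120.0,  # assumed TTL
) -> Tuple[int, str]:
    """Return (status, body) for a request handled by one cache node."""
    cached = cache.get(path)

    # Fresh cache: no origin call needed.
    if cached is not None and now < cached.expires_at:
        return 200, cached.body

    status, body = origin(path)

    # Successful render: refresh the cache and serve the new page.
    if status < 400:
        cache[path] = CachedObject(body, now + ttl)
        return status, body

    # Render failed (4xx or 5xx) but an older copy exists: serve that copy.
    if cached is not None:
        return 200, cached.body

    # Render failed and nothing was ever cached: the error reaches the reader.
    return status, body


cache: Dict[str, CachedObject] = {}
healthy = lambda path: (200, "fresh page")     # hypothetical working origin
broken = lambda path: (500, "render failed")   # hypothetical failing origin

print(serve("/home", cache, healthy, now=0.0))      # (200, 'fresh page')
print(serve("/home", cache, broken, now=500.0))     # (200, 'fresh page')
print(serve("/archive", cache, broken, now=500.0))  # (500, 'render failed')
```

The last call corresponds to the rarely requested page: no node holds a copy, so there is nothing to fall back to.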

Why are requests from two devices on the same wifi network seeing different content on a website?

The CDN manages incoming requests by mapping them to different Edge nodes.

Even if two devices access the same page at the exact same time, or the same device requests it twice, the CDN may route the requests to different nodes, each with slightly different cache expiration times.

As a result, one node might have cached content from 30 seconds ago, while another node could serve content cached after the page was updated, potentially displaying different stories if new content was published in the interim.
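A small timeline makes this concrete. The sketch below assumes two nodes that cached the same page at different moments, a 120-second TTL, and a new story published at t=100 seconds. The class, the function, and all of the timings are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

TTL_SECONDS = 120.0  # assumed TTL


@dataclass
class EdgeNode:
    name: str
    body: Optional[str] = None
    expires_at: float = 0.0

    def serve(self, now: float, origin_body: str) -> str:
        """Serve from cache, refreshing from the origin only after expiry."""
        if self.body is None or now >= self.expires_at:
            self.body = origin_body
            self.expires_at = now + TTL_SECONDS
        return self.body


def headline_at(now: float) -> str:
    """Hypothetical origin: a new top story is published at t=100 seconds."""
    return "Story B" if now >= 100.0 else "Story A"


node_1 = EdgeNode("edge-1")
node_2 = EdgeNode("edge-2")

node_1.serve(0.0, headline_at(0.0))    # caches Story A until t=120
node_2.serve(90.0, headline_at(90.0))  # caches Story A until t=210

# Two devices on the same Wi-Fi request the page at t=130.
phone = node_1.serve(130.0, headline_at(130.0))   # cache expired, refreshed
laptop = node_2.serve(130.0, headline_at(130.0))  # cache still valid
print(phone, laptop)  # Story B Story A
```

Both nodes behave correctly in this model. They differ only because their caches were filled, and therefore expire, at different times.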