Content API Best Practices: Avoid Rate Limiting and Improve Performance

What is a rate limit?

A rate limit is a mechanism to prevent the overuse or abuse of a network resource. If the volume of incoming requests exceeds a set threshold in a given amount of time, an error is returned for any requests in excess of the limit. Rate limits in Arc are enforced at or near the edge of our infrastructure and are designed to keep the systems your organization depends on to serve end users and editorial staff from becoming overwhelmed.

What is your rate limit?

The rate limit for your organization is set based on discussions with Arc XP and reflects anticipated needs, such as the number of editorial users, publishing volume, and website traffic. If you aren’t sure what your rate limit is for a particular API endpoint, contact Arc XP Customer Support.

What can we do if we’re being rate-limited?

If you’re just wondering whether your rate limit can be increased: it can. But you’ll need to make sure you’ve followed all of the advice below first. If you need a rate limit increase, open a support ticket and let us know that you’ve already investigated or implemented each of the following and are still seeing issues.

There are two main factors that impact rate limiting:

  • Poor caching: If your queries are unnecessarily unique, exceed 1MB, or are set to cached: false, then cache performance will be negatively impacted. When writing PageBuilder Engine features and content sources, the primary design goal should be to use the Fusion content cache as effectively as possible (see the sketch below).

  • Slow queries: If a query takes more than 100ms to return, it is counted against the burst pool rather than the normal rate limit pool. This will trigger rate limit errors sooner rather than later, and is designed to prevent a large number of slow queries from piling up into a queue that might never empty.

If you’d like to get a report on either of these, open a support ticket and request it. For slow queries, it might be helpful to get a list of requests that took more than 70ms to return, for instance.
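
To make the caching point concrete, here is a minimal content-source sketch in TypeScript. It assumes the standard PageBuilder Engine content-source shape (resolve, params, ttl); the endpoint, website ID, and normalization rules are illustrative assumptions, not a drop-in implementation. The idea is that every caller who means the same thing produces the exact same request URI, so the Fusion cache sees one key instead of many.

```typescript
// Hypothetical content source: latest stories for a section.
// Cache friendliness means callers that mean the same thing
// must resolve to the exact same URI (one cache key).

interface Query {
  section?: string;
  size?: number | string;
}

const resolve = (query: Query): string => {
  // Normalize inputs so near-identical queries collapse into one
  // cache entry: lower-case the section, clamp the size.
  const section = (query.section || '/news').toLowerCase();
  const size = Math.min(Number(query.size) || 5, 20);

  return (
    '/content/v4/search/published' +
    '?website=my-site' + // hypothetical website ID
    `&q=${encodeURIComponent(`type:story AND taxonomy.sections._id:"${section}"`)}` +
    `&size=${size}` +
    '&sort=display_date:desc'
  );
};

export default {
  resolve,
  params: {
    section: 'text',
    size: 'number',
  },
  ttl: 600, // cache each distinct URI for 10 minutes
};
```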

Items to check

Below are a few of the most common items we see that lead to the above issues. Review this list carefully.

  • Reduce network data transfer. Content API provides the ability to exclude certain ANS fields, or to include only certain fields. Making use of this speeds up the time to transfer the data over the wire, and it has the positive side effect of reducing the amount of JSON returned to client-side PageBuilder Engine features. See _sourceExclude and _sourceInclude in the Content API documentation, and the first sketch after this list.

  • If you are sorting your search by a date field, also use a filter whenever possible. For example, if you’re requesting a list of stories from the Sports section sorted by publish_date, consider whether you need to search over the entire history of your organization, or if you could add a filter to fetch only stories from the past 30 days. Limiting the publish timeframe increases performance considerably.

  • Ensure that you aren’t retrieving many more stories than you need. If you have a feature that displays a list of the five latest stories for a section, make sure you aren’t requesting 40 stories. As with the first item, this not only speeds up the search request, but can reduce the size of the payload in PageBuilder Editor and possibly on the front end of your site as well.

  • Look to see if you have similar (nearly duplicate) content sources that return the same results. Consolidate those into a single content source to improve cache offload on PageBuilder fetches. For example, maybe you have the same query in two places, once returning three results and once returning five: consolidate on the five-result query and change the feature code (see the second sketch after this list). This may seem to contradict the third item in this list, but really it’s about balance.

  • Check the values of your content source TTLs and consider whether they could be increased. If no TTL is specified, the default is five minutes. If a given content source is unlikely to change within that period, or if it’s not critical that changes appear immediately, then there is no need to lower the TTL. The minimum TTL in PageBuilder Engine is two minutes, and it is common for organizations to always set the minimum allowable value. That’s understandable, but it usually has no real impact for end users, while guaranteeing a two-and-a-half-times increase in back-end requests (five minutes divided by two minutes is 2.5x as many cache refreshes).

  • Watch out for public-facing features that paginate or otherwise vary the number of items that can be requested. For example, if your sports section displays the 20 latest stories but allows a user to page to the next 20, that’s great. Perhaps you accomplish this by updating the URL to have a /2/, or you make a client-side call to /pf/fetch with a ?from= parameter. What happens when a bot crawls the site and walks through that pagination, requesting 20 stories at a time over and over? Consider adding an upper bound to these kinds of requests in the content source to prevent overuse (see the third sketch after this list).

  • Lastly, check your queries for raw text searching. Exact matching is always much faster and should be used whenever possible.
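
Several of the items above (trimming fields, filtering by date, requesting only what you display, a generous TTL, and exact matching) can be combined in a single content source. The sketch below is illustrative rather than definitive: the website ID, section value, and field list are assumptions you would replace with your own, and the query syntax should be checked against the Content API search documentation.

```typescript
// Hypothetical content source: the five latest Sports stories.
// Combines _sourceInclude, a publish_date filter, a small fixed
// size, exact field matching, and a deliberately generous TTL.

const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;

const resolve = (): string => {
  // Search the past 30 days instead of the full archive. Rounding
  // to the day keeps the cache key stable within a given day.
  const since = new Date(Date.now() - THIRTY_DAYS_MS)
    .toISOString()
    .slice(0, 10); // YYYY-MM-DD

  // Exact matches on structured fields, not raw text search.
  const q = `type:story AND taxonomy.sections._id:"/sports" AND publish_date:[${since} TO *]`;

  return (
    '/content/v4/search/published' +
    '?website=my-site' + // hypothetical website ID
    `&q=${encodeURIComponent(q)}` +
    '&size=5' + // exactly what the feature displays, not 40
    '&sort=publish_date:desc' +
    // Return only the fields the feature actually renders.
    `&_sourceInclude=${encodeURIComponent(
      '_id,headlines.basic,promo_items.basic,website_url,publish_date'
    )}`
  );
};

export default {
  resolve,
  params: {},
  ttl: 600, // a headline list can usually tolerate 10 minutes of staleness
};
```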
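For the consolidation item, the pattern is to keep one shared content source and trim in the feature code. A hedged example, assuming a hypothetical shared source named latest-sports-stories that returns five items:

```typescript
// Hypothetical feature: renders three stories, but reuses the same
// five-item content source (and therefore the same cache entry) as
// the feature that renders five.
import { useContent } from 'fusion:content';

const ThreeStoryRail = () => {
  const stories = useContent({
    source: 'latest-sports-stories', // hypothetical shared source
    query: {},
  });

  // Slice to three in the feature; the five-item response is already
  // cached for the other feature.
  const top3 = (stories?.content_elements || []).slice(0, 3);

  // ...render top3 here...
  return null;
};

export default ThreeStoryRail;
```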
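And for public-facing pagination, the content source itself can enforce an upper bound, no matter what arrives in the request. Again a sketch, with arbitrary example limits:

```typescript
// Hypothetical content source that clamps pagination inputs so a bot
// walking ?from=0,20,40,... cannot request unbounded amounts of data.

const MAX_SIZE = 20;
const MAX_FROM = 200; // never page deeper than 10 pages of 20

interface PagedQuery {
  from?: number | string;
  size?: number | string;
}

const resolve = (query: PagedQuery): string => {
  const size = Math.min(Math.max(Number(query.size) || MAX_SIZE, 1), MAX_SIZE);
  const from = Math.min(Math.max(Number(query.from) || 0, 0), MAX_FROM);

  return (
    '/content/v4/search/published' +
    '?website=my-site' + // hypothetical website ID
    `&q=${encodeURIComponent('type:story AND taxonomy.sections._id:"/sports"')}` +
    `&size=${size}&from=${from}` +
    '&sort=publish_date:desc'
  );
};

export default {
  resolve,
  params: { from: 'number', size: 'number' },
  ttl: 300,
};
```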