
Content Recommendations API Developer Guide

Introduction

This guide covers Arc XP’s Content Recommendations API, which delivers personalized content recommendations for your audience by learning from user behavior and your content catalog.

It is built to work with any Customer Data Platform (CDP) and any Content Management System (CMS), and deliver content recommendations on any content surface.

The Content Recommendations API is a personalization backend that connects your content catalog and audience behavior to a machine learning engine.

It provides two APIs:

| API | Purpose | Base Path |
| --- | --- | --- |
| Collector API | Send user behavior events and content updates | /collector/v1 |
| Recommendations API | Retrieve personalized content recommendations | /recommendations/v1 |

The Collector API accepts behavioral signals (page views, clicks, engagement) and content lifecycle events (publish, delete) from your applications (via your CDP) and CMS.

The Recommendations API is a read-only API that returns ranked, personalized content IDs for a given user. You can then fetch that content from your content management system, including Arc XP.

Each tenant (organization) operates with a fully isolated recommendation model. Your content, your audience data, and your trained model are never shared with other tenants.

How it works

[Figure: Content Recommendations API flow]

  1. You send content data in — Your CMS sends content updates via webhooks. For Arc XP customers, this is accomplished via IFX.
  2. Your web/mobile apps send user behavior events (page views, clicks, engagement) through a data collection partner, such as a CDP.
  3. The Content Recommendations API learns — The ML engine trains a personalization model on your content catalog and your audience’s behavior.
  4. You fetch recommendations out — Your applications call the Recommendations API to get ranked content for a specific user.

Authentication

Both the Collector API and Recommendations API require a Delivery API token. These are separate from Developer Center tokens.

See Provisioning tokens through Delivery API for the full provisioning flow, including how to generate a bootstrap Developer Center token, create Delivery API tokens via POST /v1/access/keys, and assign them to key collections.

Critical requirements

These rules are non-negotiable. Violating them can corrupt your recommendation model, leak sensitive data, or silently produce bad results.

  • Never send PII as the user identifier. user_id must be an anonymized token — not an email, name, phone number, or any directly identifying value. Hash or pseudonymize before sending. Arc XP does not sanitize user IDs on ingestion.

  • Keep the content catalog in sync. Only send published content via the Content endpoint, and send an action: "delete" payload the moment an item is unpublished or deleted in your CMS. Stale content in the model leads to recommendations pointing at dead pages.

  • Do not send synthetic or test traffic against real site_id values. Load tests, QA scripts, and bot traffic poison the training signal and degrade recommendation quality for real users. Scope any test activity to a dedicated, non-production site_id.

  • There is no separate Sandbox or Production instance for the Content Recommendations API — you get a single instance. Anything you send is training the one model that serves your live traffic. Plan your testing, seeding, and rollout accordingly, and use a dedicated test site_id to keep experimental data out of your production catalog.

  • Treat Delivery API tokens as secrets. Never commit them to source control or embed them in client-side code.

Quick Start (5 minutes)

The shortest path from zero to your first recommendation:

  1. Generate a Delivery API token — Follow Provisioning tokens through Delivery API to create and assign a token. Use this token as the Authorization header on every request below.

  2. Send content — POST /collector/v1/content with an action: "publish" payload for each item in your catalog. This seeds the model with what’s available to recommend.

    POST /collector/v1/content
    { "action": "publish", "item_id": "ARTICLE-001", "site_id": "my-site", "type": "article", "timestamp": "...", "title": "..." }
  3. Send events — POST /collector/v1/events as users interact with that content. At least a few events per user are needed before personalization kicks in.

    POST /collector/v1/events
    { "user_id": "user-1", "item_id": "ARTICLE-001", "event_type": "page_view", "timestamp": "..." }
  4. Fetch recommendations — GET /recommendations/v1/recommendations?site_id=my-site&user_id=user-1 returns a ranked list of item_ids you can then hydrate from your CMS.

Once this loop is in place, the remaining sections of this guide cover the full field reference, supported event types, and the item_id-anchored “more like this” mode.
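The three Quick Start calls can be sketched in Python as follows. This is a minimal illustration, not an official client: `BASE_URL` is a hypothetical placeholder host, the `Bearer` scheme on the Authorization header is an assumption, and the timestamps are sample values. The requests are only built here, not sent.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical placeholder; substitute the API host for your organization.
BASE_URL = "https://api.example.com"


def build_request(method, path, token, body=None, query=None):
    """Build (but do not send) an authenticated request."""
    url = BASE_URL + path
    if query:
        url += "?" + urllib.parse.urlencode(query)
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    # Assumption: the Delivery API token is sent with the Bearer scheme.
    req.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


def quick_start_requests(token):
    """Return the seed-content, first-event, and fetch requests, in order."""
    content = build_request("POST", "/collector/v1/content", token, body={
        "action": "publish",
        "item_id": "ARTICLE-001",
        "site_id": "my-site",
        "type": "article",
        "timestamp": "2026-04-08T09:00:00Z",  # sample value
        "title": "Example title",             # sample value
    })
    event = build_request("POST", "/collector/v1/events", token, body={
        "user_id": "user-1",
        "item_id": "ARTICLE-001",
        "event_type": "page_view",
        "timestamp": "2026-04-08T10:15:00Z",  # sample value
    })
    recs = build_request(
        "GET", "/recommendations/v1/recommendations", token,
        query={"site_id": "my-site", "user_id": "user-1"},
    )
    return [content, event, recs]
```

Each request can then be sent with `urllib.request.urlopen(req)`. Keep the token on the server side, as the critical requirements state.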

Collector API

Base path: /collector/v1

The Collector API accepts two types of data: user behavior events and content lifecycle events. Both are processed asynchronously — the API responds immediately with 202 Accepted and processes data in the background.

It exposes two endpoints: the Events endpoint for user interactions and the Content endpoint for content lifecycle updates.

Events endpoint

Send user interaction events to train the recommendation model. The more behavioral data you send, the better the recommendations become.

Endpoint: POST /collector/v1/events

Response: 202 Accepted (no body)

{
  "user_id": "abc-123",
  "item_id": "ZSGXFR2KNFCMPN3VHPWQR3BGCE",
  "event_type": "page_view",
  "timestamp": "2026-03-27T14:30:00+00:00",
  "session_id": null
}

Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| user_id | string | Yes | Anonymized identifier of the user who triggered the event. Can be an authenticated user ID or an ephemeral session ID for anonymous users, but it must be anonymized before sending to Arc XP. |
| item_id | string | Yes | The content item the user interacted with. Must match an item_id previously sent via the content endpoint. |
| event_type | string | Yes | The type of interaction. See Supported Event Types. |
| timestamp | string (ISO 8601) | Yes | When the interaction occurred. Must include timezone information (e.g., +00:00 or Z). |
| session_id | string | No | Session identifier for grouping interactions within a single user visit. |

Supported Event Types

| Event Type | Description |
| --- | --- |
| page_view | User viewed a content detail page, such as an article page. |
| click | User clicked a content link, including links in recommendations modules. |
| share | User shared content, for example by clicking a Share on Facebook button. |
| article_save | User saved the article to read later, for example by bookmarking it. |
| search | User performed an on-site search. Requires topic or keyword extraction to be relevant. |
| deepest_scroll | The maximum depth to which the user consumed a given piece of content, expressed as a number (0.0–1.0). Do not send multiple observations for the same piece of content within the same user session. |
| engaged_read | An organization-defined metric indicating that a reader was highly engaged with a piece of content, such as leaving a comment or spending a substantial amount of time on the page. |

Example: Page View

{
  "user_id": "user-7f3a9b",
  "item_id": "ARTICLE-2024-001",
  "event_type": "page_view",
  "timestamp": "2026-04-08T10:15:00Z"
}
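A client-side check can catch malformed events before they are posted. The sketch below is one possible helper, not part of the API: `build_event` is a hypothetical name, and it enforces only the two rules stated in the tables above (a supported event type and a timezone-aware timestamp).

```python
from datetime import datetime, timezone

# Event types listed under Supported Event Types.
SUPPORTED_EVENT_TYPES = {
    "page_view", "click", "share", "article_save",
    "search", "deepest_scroll", "engaged_read",
}


def build_event(user_id, item_id, event_type, occurred_at, session_id=None):
    """Return an Events endpoint payload, or raise ValueError if invalid."""
    if event_type not in SUPPORTED_EVENT_TYPES:
        raise ValueError(f"unsupported event_type: {event_type!r}")
    # The timestamp must carry timezone information.
    if occurred_at.tzinfo is None or occurred_at.utcoffset() is None:
        raise ValueError("timestamp must include timezone information")
    event = {
        "user_id": user_id,  # must already be anonymized by the caller
        "item_id": item_id,
        "event_type": event_type,
        "timestamp": occurred_at.isoformat(),
    }
    if session_id is not None:
        event["session_id"] = session_id
    return event
```

For example, `build_event("user-7f3a9b", "ARTICLE-2024-001", "page_view", datetime(2026, 4, 8, 10, 15, tzinfo=timezone.utc))` produces the same fields as the page view example, with the timestamp rendered as `2026-04-08T10:15:00+00:00`.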

Content endpoint

Keep your content catalog in sync with the Content Recommendations API. Content is sent as webhook payloads from your CMS whenever an item is published, updated, or deleted.

Endpoint: POST /collector/v1/content

Response: 202 Accepted (no body)

The request body uses a discriminated union on the action field — either "publish" (create or update) or "delete".

Publishing or Updating Content

Send this payload when a content item is first published or when it is updated.

{
  "action": "publish",
  "item_id": "ARTICLE-2024-001",
  "site_id": "my-news-site",
  "type": "article",
  "timestamp": "2026-04-08T09:00:00Z",
  "title": "Breaking: Major Policy Change Announced",
  "categories": ["Politics", "Government"],
  "tags": ["policy", "congress", "legislation"],
  "author": "Jane Reporter",
  "is_premium": false,
  "metadata": {}
}

Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| action | string | Yes | Discriminator indicating a create/update operation. Must be "publish". |
| item_id | string | Yes | Unique identifier for this content item, typically from your CMS. |
| site_id | string | Yes | Identifies which website or property this content belongs to. Used to partition recommendations by site. |
| type | string | Yes | Content type, such as "article" or "podcast". |
| timestamp | string (ISO 8601) | Yes | Publication date. Must include timezone. |
| title | string | Yes | Display title of the content. |
| categories | list of strings | No | High-level taxonomy labels (e.g., "Politics", "Sports"). Defaults to empty list. |
| tags | list of strings | No | Detailed keywords for the content. Defaults to empty list. |
| author | string | No | Content creator name. |
| is_premium | boolean | No | Whether this content is behind a paywall. Defaults to false. Used for subscription-tier filtering in recommendations. |
| metadata | object | No | Flexible key-value pairs for tenant-specific fields. Defaults to empty object. |

Deleting Content

Send this payload when content should be removed from recommendations.

{
  "action": "delete",
  "item_id": "ARTICLE-2024-001",
  "site_id": "my-news-site"
}

Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| action | "delete" | Yes | Discriminator indicating a delete operation. |
| item_id | string | Yes | The item to remove. |
| site_id | string | Yes | The site the item belongs to. |

Deletes are soft: the item is marked as deleted and excluded from future recommendations.
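The choice between the two payload shapes can be sketched as a single mapping function. The CMS record layout used here (`id`, `site`, `status`, `kind`, `published_at`, `headline`, `sections`, `keywords`, `byline`, `paywalled`) is entirely hypothetical, as is the function name; adapt both to whatever your CMS actually emits.

```python
def to_content_payload(record):
    """Map a (hypothetical) CMS record to a Content endpoint payload.

    Anything that is not currently published becomes a delete, so
    unpublished items drop out of recommendations.
    """
    base = {"item_id": record["id"], "site_id": record["site"]}
    if record["status"] != "published":
        return {"action": "delete", **base}

    payload = {
        "action": "publish",
        **base,
        "type": record["kind"],
        "timestamp": record["published_at"],  # ISO 8601 with timezone
        "title": record["headline"],
        "categories": record.get("sections", []),
        "tags": record.get("keywords", []),
        "is_premium": record.get("paywalled", False),
    }
    # author is optional, so include it only when the CMS provides one.
    if record.get("byline"):
        payload["author"] = record["byline"]
    return payload
```

Routing every CMS change through one function like this keeps the publish and delete paths from drifting apart.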

Recommendations API

Base path: /recommendations/v1

The Recommendations API returns personalized, ranked content for a given user.

Fetching Recommendations

Endpoint: GET /recommendations/v1/recommendations

Query Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| site_id | string | Yes | (none) | Scopes recommendations to a specific website or property. Must match site_id values used in content ingestion. |
| user_id | string | Yes | (none) | The user to personalize for. See Identifying Users for anonymous user handling. |
| item_id | string | No | null | Anchor item for “more like this” recommendations. When provided, the response contains items similar to this one rather than general personalized recommendations. |

Response

{
  "recommendations": [
    {
      "item_id": "ARTICLE-2024-042",
      "score": 0.934
    },
    {
      "item_id": "ARTICLE-2024-089",
      "score": 0.891
    },
    {
      "item_id": "ARTICLE-2024-107",
      "score": 0.847
    }
  ]
}

Fields

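| Field | Type | Description |
| --- | --- | --- |
| recommendations | array | Ordered list of recommended items, ranked by relevance. |
| recommendations[].item_id | string | The CMS item identifier. Use this to look up full content details from your CMS. |
| recommendations[].score | number | Relevance score for the item. In the example response above, items appear in descending score order. |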

Example: Basic Personalized Recommendations

GET /recommendations/v1/recommendations?site_id=my-news-site&user_id=user-7f3a9b&num_results=10

Example: “More Like This” Recommendations

GET /recommendations/v1/recommendations?site_id=my-news-site&user_id=user-7f3a9b&item_id=ARTICLE-2024-042&num_results=5
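Both request shapes differ only in their query string, so one helper can build either. This is a sketch with hypothetical function names; `num_results` is included because it appears in the examples above.

```python
import json
from urllib.parse import urlencode


def recommendations_path(site_id, user_id, item_id=None, num_results=None):
    """Build the path and query string for a recommendations request."""
    params = {"site_id": site_id, "user_id": user_id}
    if item_id is not None:
        # Anchors the request for "more like this" results.
        params["item_id"] = item_id
    if num_results is not None:
        params["num_results"] = num_results
    return "/recommendations/v1/recommendations?" + urlencode(params)


def ranked_item_ids(response_body):
    """Extract item_ids from a response body, preserving the ranked order."""
    payload = json.loads(response_body)
    return [rec["item_id"] for rec in payload.get("recommendations", [])]
```

Using `urlencode` rather than string concatenation ensures that identifiers containing reserved characters are escaped correctly.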

Personalization and Filtering

The Content Recommendations API applies multiple layers of intelligence to produce relevant recommendations:

  • ML Personalization — The recommendation model learns from your audience’s behavior (page views, clicks, engagement) and your content metadata (categories, tags, authors, recency). Each user receives a uniquely ranked set of results based on their interaction history.
  • Site Partitioning — The site_id parameter ensures recommendations are scoped to a specific website. Content published to one site will not appear in another site’s recommendations.

Cold-Start Behavior

The Content Recommendations API handles two cold-start scenarios automatically:

New or anonymous users — When a user has no interaction history, the system returns popularity-based recommendations drawn from your content catalog. Results reflect what is trending among your broader audience, weighted by recency and content metadata. As the user accumulates interactions, recommendations progressively become more personalized.

New content — Freshly published content with no engagement data is still eligible for recommendations. The model uses content metadata (categories, tags, author, recency) to place new items in front of relevant audiences immediately.

No special handling is required on your part for either scenario. The API response shape is identical whether results are fully personalized or cold-started.

Integration Guide

Identifying Users

The user_id parameter is required for both sending events and fetching recommendations. It is your responsibility to provide a consistent, anonymized identifier for each user.
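One common way to derive a stable, non-reversible identifier is a keyed hash. The sketch below uses HMAC-SHA256 from the Python standard library. It is one possible scheme and not a method prescribed by Arc XP; whether it satisfies your privacy obligations is your call. The function name is hypothetical, and the lowercasing step assumes your raw identifiers are case-insensitive.

```python
import hashlib
import hmac


def pseudonymize_user_id(raw_id, secret_key):
    """Return a stable pseudonymous token for a raw user identifier.

    secret_key is bytes and must stay server-side. Changing it changes
    every token, which would reset each user's accumulated history.
    """
    # Assumption: identifiers are case-insensitive, so normalize them so
    # that the same person always maps to the same token.
    normalized = raw_id.strip().lower()
    digest = hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()
```

Apply the same function and the same key when sending events and when fetching recommendations, so that both calls refer to the same user.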

Content Sync from Your CMS

The Content Recommendations API stays in sync with your content catalog through webhooks. Configure your CMS to send POST /collector/v1/content requests whenever content is published, updated, or deleted. This provides near-real-time sync.

Arc XP CMS customers: Contact your Technical Account Manager for access to the IFX Recipe, which wires up the content webhook end-to-end without custom code.

Other CMS customers: Build a direct integration from your CMS to the Content endpoint. At minimum, your CMS (or an intermediary service) must:

  • Listen for publish, update, and delete events in your CMS.
  • Transform each event into the appropriate action: "publish" or action: "delete" payload described under Content endpoint.
  • POST the payload to /collector/v1/content with your Delivery API token in the Authorization header.
  • Handle retries on transport-level failures (connection errors, 5xx responses). Do not retry on 202 Accepted — that means the payload was accepted for async processing.

Whichever path you take, the goal is the same: every publish, update, and unpublish in your CMS must reach the Content endpoint, ideally within seconds.
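The retry rule above can be sketched as a small wrapper. The transport is left abstract on purpose: `send` is a hypothetical callable that returns the HTTP status code or raises `OSError` on a connection failure. The attempt count and backoff values are illustrative defaults, not documented limits.

```python
import time


def post_with_retry(send, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call send() until it yields a non-5xx status or attempts run out.

    Connection errors and 5xx responses are retried with exponential
    backoff. 202 Accepted, and any other status below 500, is returned
    immediately without a retry.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            status = send()
        except OSError as exc:  # connection-level failure
            status, last_error = None, exc
        if status is not None and status < 500:
            return status
        if attempt < max_attempts:
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error
```

A 4xx response is returned rather than retried because resending the same payload would fail the same way. Log it and fix the payload instead.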

Displaying Recommendations

The Recommendations API returns item_id values and scores.

To display recommendations to users:

  1. Call GET /recommendations/v1/recommendations with the appropriate parameters.
  2. Use the returned item_id values to fetch full content details (title, thumbnail, URL, etc.) from your CMS.
  3. Display items in the order returned — the list is already ranked by relevance.
  4. Send a click event back to the Events endpoint when a user clicks on a recommendation. This feedback loop improves future recommendations.
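Steps 2 through 4 can be sketched as two small helpers. Both names are hypothetical, and `cms_lookup` stands in for whatever function fetches an item from your CMS (returning `None` when the item cannot be found).

```python
from datetime import datetime, timezone


def hydrate_in_rank_order(recommendations, cms_lookup):
    """Fetch content for each recommendation, keeping the API's ranking.

    Items the CMS cannot resolve are skipped rather than rendered as
    broken links.
    """
    hydrated = []
    for rec in recommendations:
        content = cms_lookup(rec["item_id"])
        if content is not None:
            hydrated.append(content)
    return hydrated


def click_event(user_id, item_id, clicked_at=None):
    """Build the click event to send when a recommendation is clicked."""
    clicked_at = clicked_at or datetime.now(timezone.utc)
    return {
        "user_id": user_id,  # the same anonymized ID used for the fetch
        "item_id": item_id,
        "event_type": "click",
        "timestamp": clicked_at.isoformat(),
    }
```

The payload returned by `click_event` goes to POST /collector/v1/events, closing the feedback loop described in step 4.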

Onboarding for good recommendations on day one

The Quick Start gets the plumbing working. This section covers how to avoid launching with a cold, low-signal model that produces generic results for your first real users.

A freshly provisioned model knows nothing about your catalog or your audience. If you flip on recommendations in a live surface the moment the Quick Start completes, readers will see popularity-based cold-start results — not personalization. The goal of onboarding is to front-load as much content and behavioral signal as possible before recommendations are user-visible.

Bulk-load your existing catalog

Before going live, send an action: "publish" payload for every published item in your catalog — not just new publishes going forward.

  • Pull your full content inventory from your CMS and stream it through POST /collector/v1/content.
  • Include accurate categories, tags, author, and timestamp values on every item. Metadata is what carries new content through cold-start, so sparse metadata here will hurt you later.
  • Once the backfill is done, wire up your live CMS webhook (or the IFX recipe for Arc XP customers) so ongoing publishes, updates, and deletes stay in sync automatically.

Backfill historical behavioral events

Events are the other half of the training signal. If your CDP or analytics system retains historical user interactions, replay them through POST /collector/v1/events before launch.

  • Target at least 30–90 days of historical events where available. More is better, but quality matters more than absolute volume: send the richer event types (click, article_save, deepest_scroll, engaged_read) in addition to page_view.
  • Preserve the original timestamp on each event rather than stamping everything with the backfill time. The model uses recency as a signal.
  • Keep the same anonymization scheme for historical user_ids as you’ll use in production, so users who exist in both the backfill and live traffic are recognized as the same person.
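A backfill converter that follows these three points might look like the sketch below. The export row layout (`raw_user`, `content_id`, `kind`, `ts`) is hypothetical, as is the assumption that timestamps without an offset are in UTC. `pseudonymize` should be the same function your production traffic uses.

```python
from datetime import datetime, timezone


def backfill_events(rows, pseudonymize):
    """Yield Events endpoint payloads from historical analytics rows.

    The original interaction time is preserved, never replaced with the
    time of the backfill run.
    """
    for row in rows:
        occurred_at = datetime.fromisoformat(row["ts"])
        if occurred_at.tzinfo is None:
            # Assumption: naive timestamps in the export are UTC.
            occurred_at = occurred_at.replace(tzinfo=timezone.utc)
        yield {
            "user_id": pseudonymize(row["raw_user"]),
            "item_id": row["content_id"],
            "event_type": row["kind"],
            "timestamp": occurred_at.isoformat(),
        }
```

Because the function is a generator, a large export can be streamed to the Events endpoint without loading it all into memory.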

Shadow-deploy the event collector first

Turn on event collection in your live applications well before you expose recommendations to users.

  • Instrument page views, clicks, and engagement events against the real production site_id as soon as you’re confident in your payload shape and anonymization.
  • Let the collector run for a meaningful window — typically a few weeks — so the model trains on real traffic patterns rather than only historical backfill.
  • During this window, you can call GET /recommendations/v1/recommendations internally (QA builds, staging surfaces, internal dashboards) to spot-check relevance without any reader-facing risk.

Validate before cutting over

Before exposing recommendations on a user-facing surface:

  • Sample recommendation responses for a diverse set of user_ids — new users, heavy readers, anonymous sessions — and manually review whether results look on-topic for each user’s history.
  • Confirm that the returned item_ids resolve cleanly in your CMS and do not point at deleted, unpublished, or non-existent content that your delivery layer would reject.
  • Consider a gradual rollout (a percentage of traffic, a single surface, or a single site) rather than flipping recommendations on everywhere at once. This gives you a real-world quality signal with a small blast radius.
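A percentage-based rollout is often implemented by hashing the user identifier into a fixed bucket. The sketch below shows one such approach. It runs in your application rather than in the Content Recommendations API, and the function name and salt value are hypothetical.

```python
import hashlib


def in_rollout(user_id, percent, salt="recs-rollout-v1"):
    """Return True if this user falls inside the rollout percentage.

    The bucket depends only on the salt and the user_id, so a given user
    gets the same answer on every request, and raising the percentage
    only ever adds users.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # 0..99
    return bucket < percent
```

Users outside the rollout keep seeing your existing module, while event collection continues for everyone so the model keeps training on full traffic.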

Troubleshooting

Use this section when recommendations aren’t behaving as expected. Most issues trace back to catalog sync, event volume, or cold-start behavior being misread as a bug.

No recommendations returned

If the response comes back with an empty recommendations array, the model has nothing to rank for that user and site.

  • Check content ingestion first. Query your CMS integration or webhook logs to confirm POST /collector/v1/content calls are succeeding. A model with no catalog cannot return anything.
  • Confirm site_id matches. The site_id on the recommendations request must exactly match the site_id used during content ingestion. A typo silently partitions your catalog into an empty sub-model.
  • Check for over-aggressive deletes. If your CMS sends action: "delete" for items that are still live, the model excludes them from results. Audit your unpublish / delete pipeline.

Recommendations look low-relevance or generic

If the API returns results but they feel random or generic, the model likely doesn’t have enough behavioral signal yet.

  • Check event volume. The model improves with more interactions. If you’re only sending page_view events — or only sending them for a small fraction of your traffic — personalization quality will be weak. Add richer event types (click, article_save, deepest_scroll, engaged_read) where appropriate.
  • Verify user_id consistency. If the same person shows up under different user_id values across sessions, the model can’t accumulate history on them. Confirm your anonymization scheme produces stable IDs per user.
  • Check content metadata quality. Missing categories, tags, or author values reduce what the model can reason about, especially for new content that has no engagement signal yet.

New user or new content looks “cold”

Cold-start is expected behavior, not a bug.

  • New or anonymous users receive popularity-based recommendations until they accumulate interaction history. As events stream in, results progressively personalize.
  • Newly published content is still eligible immediately — the model uses metadata (categories, tags, recency) to place it in front of relevant audiences even with zero engagement data.
  • If you’re evaluating the API with a brand-new tenant, expect the first wave of results to look generic. Seed the model with historical content and a representative volume of events before drawing conclusions about relevance.