Content Recommendations API Developer Guide
Introduction
This guide covers Arc XP’s Content Recommendations API, which delivers personalized content recommendations for your audience by learning from user behavior and your content catalog.
It is built to work with any Customer Data Platform (CDP) and any Content Management System (CMS), and to deliver recommendations on any content surface.
The Content Recommendations API is a personalization backend that connects your content catalog and audience behavior to a machine learning engine.
It provides two APIs:
| API | Purpose | Base Path |
|---|---|---|
| Collector API | Send user behavior events and content updates | /collector/v1 |
| Recommendations API | Retrieve personalized content recommendations | /recommendations/v1 |
The Collector API accepts behavioral signals (page views, clicks, engagement) and content lifecycle events (publish, delete) from your applications (via your CDP) and CMS.
The Recommendations API is a read-only API that returns ranked, personalized content IDs for a given user. You can then fetch that content from your content management system, including Arc XP.
Each tenant (organization) operates with a fully isolated recommendation model. Your content, your audience data, and your trained model are never shared with other tenants.
How it works

- You send data in: Your CMS sends content updates via webhooks (for Arc XP customers, this is accomplished via IFX), and your web/mobile apps send user behavior events (page views, clicks, engagement) through a data collection partner, such as a CDP.
- The Content Recommendations API learns: The ML engine trains a personalization model on your content catalog and your audience’s behavior.
- You fetch recommendations out: Your applications call the Recommendations API to get ranked content for a specific user.
Authentication
Both the Collector API and Recommendations API require a Delivery API token. These are separate from Developer Center tokens.
See Provisioning tokens through Delivery API for the full provisioning flow, including how to generate a bootstrap Developer Center token, create Delivery API tokens via POST /v1/access/keys, and assign them to key collections.
Critical requirements
These rules are non-negotiable. Violating them will corrupt your recommendation model, leak sensitive data, or silently produce bad results.
- Never send PII as the user identifier. user_id must be an anonymized token, not an email, name, phone number, or any directly identifying value. Hash or pseudonymize before sending. Arc XP does not sanitize user IDs on ingestion.
- Keep the content catalog in sync. Only send published content via the Content endpoint, and send an action: "delete" payload the moment an item is unpublished or deleted in your CMS. Stale content in the model leads to recommendations pointing at dead pages.
- Do not send synthetic or test traffic against real site_id values. Load tests, QA scripts, and bot traffic poison the training signal and degrade recommendation quality for real users. Scope any test activity to a dedicated, non-production site_id.
- There is no separate Sandbox or Production instance for the Content Recommendations API: you get a single instance. Anything you send is training the one model that serves your live traffic. Plan your testing, seeding, and rollout accordingly, and use a dedicated test site_id to keep experimental data out of your production catalog.
- Treat Delivery API tokens as secrets. Never commit them to source control or embed them in client-side code.
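The first rule above can be met in several ways. The following is a minimal sketch of one of them, a keyed hash (HMAC-SHA256) applied before any identifier leaves your systems. The function name, the normalization step, and the example key are assumptions for illustration and are not part of the Arc XP API.

```python
import hashlib
import hmac


def pseudonymize_user_id(raw_id: str, secret_key: bytes) -> str:
    """Return a stable, non-reversible token for a raw user identifier."""
    # Normalize so " Jane@Example.com " and "jane@example.com" map to one token.
    normalized = raw_id.strip().lower().encode("utf-8")
    # A keyed hash resists dictionary attacks on guessable inputs such as emails.
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()


# Hypothetical key for illustration only; load the real one from a secret store.
key = b"example-only-secret"
token = pseudonymize_user_id("jane@example.com", key)
```

Because the output depends only on the input and the key, the same person always maps to the same user_id, which is what lets the model accumulate history for them. Rotating the key changes every token, so treat rotation as a reset of user history.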
Quick Start (5 minutes)
The shortest path from zero to your first recommendation:
- Generate a Delivery API token: Follow Provisioning tokens through Delivery API to create and assign a token. Use this token as the Authorization header on every request below.
- Send content: POST /collector/v1/content with an action: "publish" payload for each item in your catalog. This seeds the model with what’s available to recommend.
  POST /collector/v1/content
  { "action": "publish", "item_id": "ARTICLE-001", "site_id": "my-site", "type": "article", "timestamp": "...", "title": "..." }
- Send events: POST /collector/v1/events as users interact with that content. At least a few events per user are needed before personalization kicks in.
  POST /collector/v1/events
  { "user_id": "user-1", "item_id": "ARTICLE-001", "event_type": "page_view", "timestamp": "..." }
- Fetch recommendations: GET /recommendations/v1/recommendations?site_id=my-site&user_id=user-1 returns a ranked list of item_ids you can then hydrate from your CMS.
Once this loop is in place, the remaining sections of this guide cover the full field reference, supported event types, and the item_id-anchored “more like this” mode.
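As a rough sketch of the two Collector calls in the Quick Start, the snippet below builds the requests with Python's standard library without sending them. The host name, the token placeholder, and the Bearer scheme in the Authorization header are assumptions; check the token provisioning guide for the exact header format.

```python
import json
import urllib.request
from datetime import datetime, timezone

BASE_URL = "https://api.example.com"  # hypothetical host, replace with your own
TOKEN = "YOUR_DELIVERY_API_TOKEN"     # placeholder, load from a secret store


def collector_post(path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the Collector API."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {TOKEN}",  # assumed scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


now = datetime.now(timezone.utc).isoformat()  # timezone-aware ISO 8601
content_req = collector_post("/collector/v1/content", {
    "action": "publish", "item_id": "ARTICLE-001", "site_id": "my-site",
    "type": "article", "timestamp": now, "title": "Example title",
})
event_req = collector_post("/collector/v1/events", {
    "user_id": "user-1", "item_id": "ARTICLE-001",
    "event_type": "page_view", "timestamp": now,
})
# To send for real: urllib.request.urlopen(content_req), then urlopen(event_req).
```

Remember that anything sent this way trains the live model, so point experiments like this at a dedicated test site_id.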
Collector API
Base path: /collector/v1
The Collector API accepts two types of data: user behavior events and content lifecycle events. Both are processed asynchronously — the API responds immediately with 202 Accepted and processes data in the background.
It exposes two endpoints: the Events endpoint for user interactions and the Content endpoint for content lifecycle updates.
Events endpoint
Send user interaction events to train the recommendation model. The more behavioral data you send, the better the recommendations become.
Endpoint: POST /collector/v1/events
Response: 202 Accepted (no body)
{ "user_id": "abc-123", "item_id": "ZSGXFR2KNFCMPN3VHPWQR3BGCE", "event_type": "page_view", "timestamp": "2026-03-27T14:30:00+00:00", "session_id": null }
Fields
| Field | Type | Required | Description |
|---|---|---|---|
| user_id | string | Yes | Anonymized identifier of the user who triggered the event. Can be an authenticated user ID or an ephemeral session ID for anonymous users, but it must be anonymized before sending to Arc XP. |
| item_id | string | Yes | The content item the user interacted with. Must match an item_id previously sent via the content endpoint. |
| event_type | string | Yes | The type of interaction. See Supported Event Types. |
| timestamp | string (ISO 8601) | Yes | When the interaction occurred. Must include timezone information (e.g., +00:00 or Z). |
| session_id | string | No | Session identifier for grouping interactions within a single user visit. |
Supported Event Types
| Event Type | Description |
|---|---|
| page_view | User viewed a content detail page, such as an article page. |
| click | User clicked on a content link, including links in recommendation modules. |
| share | User shared content, for example by clicking a Share on Facebook button. |
| article_save | User saved the article to read later, for example by bookmarking it. |
| search | User performed an on-site search. Requires topic or keyword extraction to be relevant. |
| deepest_scroll | The maximum depth to which the user consumed a given piece of content, expressed as a number (0.0–1.0). Do not send multiple observations for the same piece of content within the same user session. |
| engaged_read | An organization-defined metric indicating that a reader was highly engaged with a piece of content, such as leaving a comment or spending a substantial amount of time on the page. |
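A client-side check can catch malformed events before they reach the Collector, which matters because the endpoint answers 202 Accepted and processes the payload later. The sketch below is one possible helper, not an official client: the function name is hypothetical, and the validation rules are taken from the two tables above.

```python
from datetime import datetime, timezone
from typing import Optional

# Event types listed in the Supported Event Types table.
SUPPORTED_EVENT_TYPES = {
    "page_view", "click", "share", "article_save",
    "search", "deepest_scroll", "engaged_read",
}


def build_event(user_id: str, item_id: str, event_type: str,
                occurred_at: datetime,
                session_id: Optional[str] = None) -> dict:
    """Return an Events endpoint payload, rejecting obviously invalid input."""
    if event_type not in SUPPORTED_EVENT_TYPES:
        raise ValueError(f"unsupported event_type: {event_type!r}")
    # Naive datetimes carry no offset, so the required timezone would be lost.
    if occurred_at.tzinfo is None or occurred_at.utcoffset() is None:
        raise ValueError("timestamp must be timezone-aware")
    return {
        "user_id": user_id,  # must already be anonymized by the caller
        "item_id": item_id,
        "event_type": event_type,
        "timestamp": occurred_at.isoformat(),
        "session_id": session_id,
    }


event = build_event(
    "abc-123", "ZSGXFR2KNFCMPN3VHPWQR3BGCE", "page_view",
    datetime(2026, 3, 27, 14, 30, tzinfo=timezone.utc),
)
```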
Example: Page View
{ "user_id": "user-7f3a9b", "item_id": "ARTICLE-2024-001", "event_type": "page_view", "timestamp": "2026-04-08T10:15:00Z" }
Content endpoint
Keep your content catalog in sync with the Content Recommendations API. Content is sent as webhook payloads from your CMS whenever an item is published, updated, or deleted.
Endpoint: POST /collector/v1/content
Response: 202 Accepted (no body)
The request body uses a discriminated union on the action field — either "publish" (create or update) or "delete".
Publishing or Updating Content
Send this payload when a content item is first published or when it is updated.
{ "action": "publish", "item_id": "ARTICLE-2024-001", "site_id": "my-news-site", "type": "article", "timestamp": "2026-04-08T09:00:00Z", "title": "Breaking: Major Policy Change Announced", "categories": ["Politics", "Government"], "tags": ["policy", "congress", "legislation"], "author": "Jane Reporter", "is_premium": false, "metadata": {} }
Fields
| Field | Type | Required | Description |
|---|---|---|---|
| action | "publish" | Yes | Discriminator indicating a create/update operation. |
| item_id | string | Yes | Unique identifier for this content item, typically from your CMS. |
| site_id | string | Yes | Identifies which website or property this content belongs to. Used to partition recommendations by site. |
| type | string | Yes | Content type, such as "article" or "podcast". |
| timestamp | string (ISO 8601) | Yes | Publication date. Must include timezone. |
| title | string | Yes | Display title of the content. |
| categories | list of strings | No | High-level taxonomy labels (e.g., "Politics", "Sports"). Defaults to empty list. |
| tags | list of strings | No | Detailed keywords for the content. Defaults to empty list. |
| author | string | No | Content creator name. |
| is_premium | boolean | No | Whether this content is behind a paywall. Defaults to false. Used for subscription-tier filtering in recommendations. |
| metadata | object | No | Flexible key-value pairs for tenant-specific fields. Defaults to empty object. |
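One way to keep publish payloads consistent is to model the fields above as a typed record whose defaults mirror the table. The sketch below uses a dataclass for that. The class and method names are hypothetical, and dropping an unset author (rather than sending null) is an assumption about what the endpoint prefers.

```python
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class PublishPayload:
    # Required fields from the table above.
    item_id: str
    site_id: str
    type: str
    timestamp: str  # ISO 8601 with timezone
    title: str
    # Optional fields, with the documented defaults.
    categories: list = field(default_factory=list)
    tags: list = field(default_factory=list)
    author: Optional[str] = None
    is_premium: bool = False
    metadata: dict = field(default_factory=dict)
    action: str = "publish"  # discriminator for create/update

    def to_json_dict(self) -> dict:
        """Return a dict ready for json.dumps, omitting an unset author."""
        body = asdict(self)
        if body["author"] is None:
            del body["author"]
        return body


payload = PublishPayload(
    item_id="ARTICLE-2024-001",
    site_id="my-news-site",
    type="article",
    timestamp="2026-04-08T09:00:00Z",
    title="Breaking: Major Policy Change Announced",
).to_json_dict()
```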
Deleting Content
Send this payload when content should be removed from recommendations.
{ "action": "delete", "item_id": "ARTICLE-2024-001", "site_id": "my-news-site" }
Fields
| Field | Type | Required | Description |
|---|---|---|---|
| action | "delete" | Yes | Discriminator indicating a delete operation. |
| item_id | string | Yes | The item to remove. |
| site_id | string | Yes | The site the item belongs to. |
Deletes are soft: the item is marked as deleted and excluded from future recommendations.
Recommendations API
Base path: /recommendations/v1
The Recommendations API returns personalized, ranked content for a given user.
Fetching Recommendations
Endpoint: GET /recommendations/v1/recommendations
Query Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| site_id | string | Yes | — | Scopes recommendations to a specific website or property. Must match site_id values used in content ingestion. |
| user_id | string | Yes | — | The user to personalize for. See Identifying Users for anonymous user handling. |
| item_id | string | No | null | Anchor item for “more like this” recommendations. When provided, the response contains items similar to this one rather than general personalized recommendations. |
Response
{ "recommendations": [ { "item_id": "ARTICLE-2024-042", "score": 0.934 }, { "item_id": "ARTICLE-2024-089", "score": 0.891 }, { "item_id": "ARTICLE-2024-107", "score": 0.847 } ] }
Fields
| Field | Type | Description |
|---|---|---|
| recommendations | array | Ordered list of recommended items, ranked by relevance. |
| recommendations[].item_id | string | The CMS item identifier. Use this to look up full content details from your CMS. |
| recommendations[].score | number | Relevance score for this item. |
Example: Basic Personalized Recommendations
GET /recommendations/v1/recommendations?site_id=my-news-site&user_id=user-7f3a9b&num_results=10
Example: “More Like This” Recommendations
GET /recommendations/v1/recommendations?site_id=my-news-site&user_id=user-7f3a9b&item_id=ARTICLE-2024-042&num_results=5
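Building the query string by hand invites encoding mistakes, so a small helper can be useful. The sketch below assembles the two example URLs above with urllib.parse. The helper name and host are hypothetical. Note that num_results appears in the examples but not in the Query Parameters table, so it is treated here as an optional parameter whose exact semantics are an assumption.

```python
from typing import Optional
from urllib.parse import urlencode


def recommendations_url(base_url: str, site_id: str, user_id: str,
                        item_id: Optional[str] = None,
                        num_results: Optional[int] = None) -> str:
    """Build a GET URL for the Recommendations API."""
    params = {"site_id": site_id, "user_id": user_id}
    if item_id is not None:
        params["item_id"] = item_id          # switches to "more like this"
    if num_results is not None:
        params["num_results"] = num_results  # seen in examples only
    # urlencode percent-escapes values, so IDs with special characters are safe.
    return f"{base_url}/recommendations/v1/recommendations?{urlencode(params)}"


personalized = recommendations_url(
    "https://api.example.com", "my-news-site", "user-7f3a9b", num_results=10)
more_like_this = recommendations_url(
    "https://api.example.com", "my-news-site", "user-7f3a9b",
    item_id="ARTICLE-2024-042", num_results=5)
```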
Personalization and Filtering
The Content Recommendations API applies multiple layers of intelligence to produce relevant recommendations:
- ML Personalization: The recommendation model learns from your audience’s behavior (page views, clicks, engagement) and your content metadata (categories, tags, authors, recency). Each user receives a uniquely ranked set of results based on their interaction history.
- Site Partitioning: The site_id parameter ensures recommendations are scoped to a specific website. Content published to one site will not appear in another site’s recommendations.
Cold-Start Behavior
The Content Recommendations API handles two cold-start scenarios automatically:
New or anonymous users — When a user has no interaction history, the system returns popularity-based recommendations drawn from your content catalog. Results reflect what is trending among your broader audience, weighted by recency and content metadata. As the user accumulates interactions, recommendations progressively become more personalized.
New content — Freshly published content with no engagement data is still eligible for recommendations. The model uses content metadata (categories, tags, author, recency) to place new items in front of relevant audiences immediately.
No special handling is required on your part for either scenario. The API response shape is identical whether results are fully personalized or cold-started.
Integration Guide
Identifying Users
The user_id parameter is required for both sending events and fetching recommendations. It is your responsibility to provide a consistent, anonymized identifier for each user.
Content Sync from Your CMS
The Content Recommendations API stays in sync with your content catalog through webhooks. Configure your CMS to send POST /collector/v1/content requests whenever content is published, updated, or deleted. This provides near-real-time sync.
Arc XP CMS customers: Contact your Technical Account Manager for access to the IFX Recipe, which wires up the content webhook end-to-end without custom code.
Other CMS customers: Build a direct integration from your CMS to the Content endpoint. At minimum, your CMS (or an intermediary service) must:
- Listen for publish, update, and delete events in your CMS.
- Transform each event into the appropriate action: "publish" or action: "delete" payload described under Content endpoint.
- POST the payload to /collector/v1/content with your Delivery API token in the Authorization header.
- Handle retries on transport-level failures (connection errors, 5xx responses). Do not retry on 202 Accepted, which means the payload was accepted for async processing.
Whichever path you take, the goal is the same: every publish, update, and unpublish in your CMS must reach the Content endpoint, ideally within seconds.
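The transform and retry steps for a custom CMS integration might look roughly like the sketch below. The shape of the incoming CMS event (its kind and item keys) is invented for illustration, as are the function names and the exponential backoff schedule. The actual HTTP call is passed in as a callable so the retry policy can be read and tested on its own.

```python
import time
from typing import Callable


def to_collector_payload(cms_event: dict) -> dict:
    """Map a hypothetical CMS event to a Content endpoint payload."""
    kind, item = cms_event["kind"], cms_event["item"]
    if kind in ("publish", "update"):
        return {"action": "publish", **item}
    if kind in ("unpublish", "delete"):
        return {"action": "delete",
                "item_id": item["item_id"], "site_id": item["site_id"]}
    raise ValueError(f"unhandled CMS event kind: {kind!r}")


def send_with_retry(send: Callable[[], int], max_attempts: int = 4,
                    base_delay: float = 0.5,
                    sleep: Callable[[float], None] = time.sleep) -> int:
    """Call send() and retry only on connection errors and 5xx statuses."""
    for attempt in range(1, max_attempts + 1):
        try:
            status = send()
        except OSError:  # covers ConnectionError and urllib's URLError
            status = None
        if status is not None and status < 500:
            return status  # 202 Accepted, or a 4xx that a retry will not fix
        if attempt < max_attempts:
            sleep(base_delay * 2 ** (attempt - 1))  # 0.5s, 1s, 2s, ...
    raise RuntimeError(f"delivery failed after {max_attempts} attempts")
```

A 4xx status is returned to the caller rather than retried, on the assumption that it signals a payload or credential problem that needs a fix rather than another attempt.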
Displaying Recommendations
The Recommendations API returns item_id values and scores.
To display recommendations to users:
- Call GET /recommendations/v1/recommendations with the appropriate parameters.
- Use the returned item_id values to fetch full content details (title, thumbnail, URL, etc.) from your CMS.
- Display items in the order returned. The list is already ranked by relevance.
- Send a click event back to the Events endpoint when a user clicks on a recommendation. This feedback loop improves future recommendations.
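The hydrate-and-display steps above can be sketched as follows. The CMS lookup is represented by a plain dictionary keyed by item_id, which stands in for whatever batch fetch your CMS offers, and both function names are hypothetical.

```python
from datetime import datetime, timezone


def hydrate_in_rank_order(response: dict, cms_items: dict) -> list:
    """Return CMS records in the order the API ranked them.

    Items the CMS no longer knows about are skipped rather than rendered.
    """
    ordered = []
    for rec in response.get("recommendations", []):
        item = cms_items.get(rec["item_id"])
        if item is not None:
            ordered.append(item)
    return ordered


def click_event(user_id: str, item_id: str, clicked_at: datetime) -> dict:
    """Build the click event to send when a recommendation is clicked."""
    return {
        "user_id": user_id,  # same anonymized ID used to fetch recommendations
        "item_id": item_id,
        "event_type": "click",
        "timestamp": clicked_at.astimezone(timezone.utc).isoformat(),
    }
```

If items are skipped often, that may point to a catalog sync problem, since the API is recommending content your CMS cannot resolve.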
Onboarding for good recommendations on day one
The Quick Start gets the plumbing working. This section covers how to avoid launching with a cold, low-signal model that produces generic results for your first real users.
A freshly provisioned model knows nothing about your catalog or your audience. If you flip on recommendations in a live surface the moment the Quick Start completes, readers will see popularity-based cold-start results — not personalization. The goal of onboarding is to front-load as much content and behavioral signal as possible before recommendations are user-visible.
Bulk-load your existing catalog
Before going live, send an action: "publish" payload for every published item in your catalog — not just new publishes going forward.
- Pull your full content inventory from your CMS and stream it through POST /collector/v1/content.
- Include accurate categories, tags, author, and timestamp values on every item. Metadata is what carries new content through cold-start, so sparse metadata here will hurt you later.
- Once the backfill is done, wire up your live CMS webhook (or the IFX recipe for Arc XP customers) so ongoing publishes, updates, and deletes stay in sync automatically.
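A catalog backfill can be written as a generator that turns CMS export records into publish payloads one at a time, so the full inventory never has to sit in memory. In this sketch the export record keys (id, status, published_at, and so on) are assumptions about a generic CMS export rather than any specific product's schema.

```python
from typing import Iterable, Iterator


def backfill_payloads(inventory: Iterable[dict], site_id: str) -> Iterator[dict]:
    """Yield a publish payload for each published record in a CMS export."""
    for record in inventory:
        # Only published content belongs in the catalog.
        if record.get("status") != "published":
            continue
        payload = {
            "action": "publish",
            "item_id": record["id"],
            "site_id": site_id,
            "type": record["type"],
            "timestamp": record["published_at"],  # original publish date
            "title": record["title"],
            "categories": record.get("categories", []),
            "tags": record.get("tags", []),
        }
        if record.get("author"):
            payload["author"] = record["author"]
        yield payload
```

Each yielded payload would then be posted to /collector/v1/content, ideally at a steady, throttled rate.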
Backfill historical behavioral events
Events are the other half of the training signal. If your CDP or analytics system retains historical user interactions, replay them through POST /collector/v1/events before launch.
- Target at least 30–90 days of historical events where available. More is better, but quality matters more than absolute volume, so send the richer event types (click, article_save, deepest_scroll, engaged_read) in addition to page_view events.
- Preserve the original timestamp on each event rather than stamping everything with the backfill time. The model uses recency as a signal.
- Keep the same anonymization scheme for historical user_ids as you’ll use in production, so users who exist in both the backfill and live traffic are recognized as the same person.
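As an illustration of the last two points, the sketch below converts historical analytics rows into event payloads while keeping each row's original time and reusing the production anonymization function. The row keys (raw_user_id, epoch_seconds, and so on) are assumptions about a generic analytics export, and the pseudonymize argument is whatever function your live pipeline already uses.

```python
from datetime import datetime, timezone
from typing import Callable, Iterable, Iterator


def replay_events(rows: Iterable[dict],
                  pseudonymize: Callable[[str], str]) -> Iterator[dict]:
    """Yield Events endpoint payloads from historical analytics rows."""
    for row in rows:
        # Use the time the interaction happened, never the backfill time.
        occurred_at = datetime.fromtimestamp(row["epoch_seconds"], tz=timezone.utc)
        yield {
            # Same scheme as production, so returning users keep their history.
            "user_id": pseudonymize(row["raw_user_id"]),
            "item_id": row["item_id"],
            "event_type": row["event_type"],
            "timestamp": occurred_at.isoformat(),
        }
```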
Shadow-deploy the event collector first
Turn on event collection in your live applications well before you expose recommendations to users.
- Instrument page views, clicks, and engagement events against the real production site_id as soon as you’re confident in your payload shape and anonymization.
- Let the collector run for a meaningful window, typically a few weeks, so the model trains on real traffic patterns rather than only historical backfill.
- During this window, you can call GET /recommendations/v1/recommendations internally (QA builds, staging surfaces, internal dashboards) to spot-check relevance without any reader-facing risk.
Validate before cutting over
Before exposing recommendations on a user-facing surface:
- Sample recommendation responses for a diverse set of user_ids (new users, heavy readers, anonymous sessions) and manually review whether results look on-topic for each user’s history.
- Confirm the returned item_ids resolve cleanly in your CMS and aren’t pointing at deleted or unpublished (or non-existent) content your delivery layer would reject.
- Consider a gradual rollout (a percentage of traffic, a single surface, or a single site) rather than flipping recommendations on everywhere at once. This gives you a real-world quality signal with a small blast radius.
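A percentage rollout is usually done with deterministic bucketing, so a given user stays in or out of the experiment across visits. The sketch below shows one common approach, which hashes the anonymized user_id into one of 100 buckets. It is a client-side technique, not a feature of the Recommendations API, and the function name and salt are hypothetical.

```python
import hashlib


def in_rollout(user_id: str, percent: int, salt: str = "recs-rollout-v1") -> bool:
    """Return True if this user falls inside the rollout percentage."""
    # Hashing with a salt gives a stable bucket that is independent of
    # buckets used by other experiments.
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # 0..99
    return bucket < percent
```

Raising percent only adds users to the rollout. Nobody already included drops out, because each user's bucket never changes for a given salt.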
Troubleshooting
Use this section when recommendations aren’t behaving as expected. Most issues trace back to catalog sync, event volume, or cold-start behavior being misread as a bug.
No recommendations returned
If the response comes back with an empty recommendations array, the model has nothing to rank for that user and site.
- Check content ingestion first. Query your CMS integration or webhook logs to confirm POST /collector/v1/content calls are succeeding. A model with no catalog cannot return anything.
- Confirm site_id matches. The site_id on the recommendations request must exactly match the site_id used during content ingestion. A typo silently partitions your catalog into an empty sub-model.
- Check for over-aggressive deletes. If your CMS sends action: "delete" for items that are still live, the model excludes them from results. Audit your unpublish / delete pipeline.
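Since a site_id typo fails silently, a quick local check against the site_id values found in your own ingestion logs can save time. The sketch below is a hypothetical diagnostic helper that uses difflib from the standard library to suggest a near match. It assumes you can collect the set of ingested site_id values yourself, as the API is not described as exposing them.

```python
import difflib


def diagnose_site_id(requested: str, ingested_site_ids: set) -> str:
    """Compare a request's site_id with the values seen at ingestion time."""
    if requested in ingested_site_ids:
        return "ok"
    # Matching is exact and case-sensitive, so look for near misses.
    close = difflib.get_close_matches(requested, sorted(ingested_site_ids), n=1)
    if close:
        return f"no exact match; did you mean {close[0]!r}?"
    return "no content was ingested under this site_id"
```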
Recommendations look low-relevance or generic
If the API returns results but they feel random or generic, the model likely doesn’t have enough behavioral signal yet.
- Check event volume. The model improves with more interactions. If you’re only sending page_view events, or only sending them for a small fraction of your traffic, personalization quality will be weak. Add richer event types (click, article_save, deepest_scroll, engaged_read) where appropriate.
- Verify user_id consistency. If the same person shows up under different user_id values across sessions, the model can’t accumulate history on them. Confirm your anonymization scheme produces stable IDs per user.
- Check content metadata quality. Missing categories, tags, or author values reduce what the model can reason about, especially for new content that has no engagement signal yet.
New user or new content looks “cold”
Cold-start is expected behavior, not a bug.
- New or anonymous users receive popularity-based recommendations until they accumulate interaction history. As events stream in, results progressively personalize.
- Newly published content is still eligible immediately — the model uses metadata (categories, tags, recency) to place it in front of relevant audiences even with zero engagement data.
- If you’re evaluating the API with a brand-new tenant, expect the first wave of results to look generic. Seed the model with historical content and a representative volume of events before drawing conclusions about relevance.