
Content Recommendations Onboarding Checklist

A practical, phase-by-phase checklist for taking Content Recommendations from zero to live traffic without launching on a cold model.

About this checklist

This guide is a companion to the Content Recommendations API Developer Guide and the Content Recommendations User Experience Guide. The developer guide covers what the API does; this checklist covers when to do each step in a rollout.

The phases are sequential: finish each one before starting the next.

Pre-launch

The goal of this phase is to have a fully populated catalog, a meaningful behavioral signal, and a working integration.

Access and credentials

  • Identify the team or service that will own the integration end-to-end.
  • Generate a Delivery API token via Provisioning tokens through Delivery API.
  • Store the token in your secrets manager. Never commit it to source control or embed it in client-side code.
  • Confirm the token is assigned to the correct key collection for both the Collector and Recommendations APIs.
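The token-handling items above can be sketched in a few lines. This is a minimal example of reading the token from the environment at runtime rather than hard-coding it; the `DELIVERY_API_TOKEN` variable name and the `Bearer` scheme are assumptions, so match them to your secrets manager and the provisioning docs.

```python
import os

def delivery_api_headers():
    """Build request headers from a token stored in the environment,
    never committed to source control or shipped to clients.
    The env var name and Bearer scheme are assumptions; confirm
    them against the provisioning documentation."""
    token = os.environ.get("DELIVERY_API_TOKEN")
    if not token:
        raise RuntimeError(
            "DELIVERY_API_TOKEN is not set; fetch it from your secrets manager"
        )
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

In server-side code, inject the token via your deployment's secret store; in client-side code, never embed it at all.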

Identifiers and anonymization

  • Decide on your user_id scheme. It must be anonymized: never an email, name, or other PII.
  • Confirm the same user_id will be produced for the same person across sessions, devices (where possible), and between historical backfill and live traffic.
  • Decide on your item_id scheme. Use a stable CMS ID or URL slug, not a full URL. Arc XP customers should use the content’s ArcID.
  • Confirm the same item_id will be used in both content payloads and behavioral events. The model joins on this field.
  • Pick a dedicated test site_id to use for any QA or load testing. Never test against your production site_id.
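One way to satisfy the anonymization and stability requirements above is to derive the user_id from your internal account ID with a keyed hash: the same person always maps to the same user_id, but the value cannot be reversed to an email or name. This is one approach, not the required one; any stable one-way mapping works.

```python
import hashlib
import hmac

def anonymized_user_id(internal_id: str, secret: bytes) -> str:
    """Derive a stable, anonymized user_id from an internal account ID.
    HMAC with a server-side secret keeps the mapping consistent across
    sessions and backfill while preventing reversal to PII.
    Keep the secret fixed, or historical and live IDs will diverge."""
    return hmac.new(secret, internal_id.encode("utf-8"), hashlib.sha256).hexdigest()
```

Use the same secret for the historical backfill and live instrumentation so the model sees one reader, not two.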

Content catalog

  • Bulk-load every published item in your catalog via POST /collector/v1/content with action: "publish".
  • Include accurate categories, tags, author, and timestamp on every item. Sparse metadata hurts cold-start quality for new content later.
  • Wire up the live CMS webhook (or the IFX Recipe for Arc XP CMS customers) so ongoing publishes, updates, and deletes stay in sync automatically.
  • Verify your unpublish/delete pipeline sends action: "delete" payloads. Stale catalog entries lead to recommendations pointing at dead pages.
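As a sketch of the catalog payloads above: the `action` and `item_id` fields come straight from this checklist, but the remaining field names are illustrative, so check the API Developer Guide for the exact schema before wiring this up.

```python
def publish_payload(item_id, title, url, categories, tags, author, published_at):
    """Catalog payload for POST /collector/v1/content with action "publish".
    Field names beyond action/item_id are assumptions about the schema."""
    return {
        "action": "publish",
        "item_id": item_id,            # stable CMS ID, joined against events
        "title": title,
        "url": url,
        "categories": categories,      # sparse metadata hurts cold-start quality
        "tags": tags,
        "author": author,
        "published_at": published_at,  # ISO 8601
    }

def delete_payload(item_id):
    """Unpublish/delete payload so stale items stop being recommended."""
    return {"action": "delete", "item_id": item_id}
```

Drive both builders from the same CMS webhook handler so publishes and deletes cannot drift apart.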

Behavioral signal

  • Replay 30–90 days of historical events through POST /collector/v1/events if your CDP or analytics system retains them.
  • Preserve the original timestamp on each historical event; do not stamp them with the backfill time.
  • Send richer event types (click, article_save, deepest_scroll, engaged_read) in addition to page_view where you have them.
  • Filter internal employee traffic out of any event stream before sending. Editor and newsroom traffic skews the model away from real reader interest.
  • Filter known bot and crawler traffic out of any event stream before sending.
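The backfill and filtering items above can be combined into a small pre-send pipeline. The event field names beyond user_id/item_id are assumptions about the Events schema, and the bot markers shown are a starting list, not an exhaustive one.

```python
def backfill_event(user_id, item_id, event_type, original_ts):
    """Historical event for POST /collector/v1/events.
    The timestamp is the original occurrence time, never the backfill time."""
    return {
        "user_id": user_id,
        "item_id": item_id,
        "event_type": event_type,   # page_view, click, article_save, ...
        "timestamp": original_ts,   # preserved from the source system
    }

def is_reader_traffic(event, employee_ids, bot_markers):
    """Drop internal employees and known bots before sending.
    employee_ids is your own allow/deny set; bot_markers are lowercase
    substrings matched against the user agent."""
    if event["user_id"] in employee_ids:
        return False
    ua = event.get("user_agent", "").lower()
    return not any(marker in ua for marker in bot_markers)
```

Run the filter over both the historical replay and the live stream; a filter applied to only one of them biases the model.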

Shadow-deploy the collector

  • Instrument live event collection in your apps against the real production site_id while recommendations remain hidden from users.
  • Let the collector run for a meaningful window (typically a few weeks) so the model trains on real traffic patterns alongside the historical backfill.
  • Spot-check GET /recommendations/v1/recommendations from internal surfaces (QA builds, staging, internal dashboards) during this window.

Pre-launch validation

  • Sample recommendations for a diverse set of user_ids (new users, heavy readers, anonymous sessions) and manually review whether results look on-topic.
  • Confirm returned item_ids resolve cleanly in your CMS and aren’t pointing at deleted, unpublished, or non-existent content.
  • Stress-test your hydration layer. The API returns ranked IDs; your front-end is responsible for fetching the renderable content.
  • Decide on a rollout strategy: percentage of traffic, single surface, or single site first. Avoid flipping recommendations on everywhere at once.
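Because the API returns ranked IDs and your front-end owns hydration, the hydration layer is worth sketching explicitly. This is a minimal shape, assuming `fetch_from_cms` is your own lookup that returns `None` for missing items; track the failures rather than silently rendering fewer slots.

```python
def hydrate(recommended_ids, fetch_from_cms):
    """Resolve ranked item_ids into renderable content, dropping any ID
    the CMS cannot resolve instead of rendering a broken slot.
    fetch_from_cms is a caller-supplied lookup (None means unresolvable)."""
    rendered, failures = [], []
    for item_id in recommended_ids:
        content = fetch_from_cms(item_id)
        if content is None:
            failures.append(item_id)   # feed these into post-launch monitoring
        else:
            rendered.append(content)
    return rendered, failures
```

A persistent failure list here is exactly the delete-pipeline gap the validation step above is trying to catch before launch.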

Launch

This phase is the cutover itself. Keep it small, observable, and reversible.

Day-of cutover

  • Enable recommendations on the chosen launch surface and traffic slice only.
  • Confirm the front-end is sending click events back to the Events endpoint when readers click a recommended item. This feedback loop drives quality improvement.
  • Watch error rates and latency on both the Collector and Recommendations APIs in real time for the first hour.
  • Spot-check live recommendations for several real user sessions immediately after enabling.

Health checks during the rollout window

  • Confirm POST /collector/v1/events traffic volume matches expected page-view and click volume from your analytics. A large gap usually means an instrumentation bug.
  • Confirm POST /collector/v1/content is firing on every publish, update, and delete from your CMS.
  • Check that recommendation array lengths look reasonable across users; empty arrays for many users typically indicate a site_id mismatch.
  • Keep a rollback path warm. If anything looks wrong, you should be able to disable the surface or reduce the traffic slice in minutes, not hours.
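The volume-matching check above can be automated with a simple ratio comparison. The 15% tolerance below is an arbitrary starting point, not a product threshold; tune it to your traffic's normal variance.

```python
def volume_gap(collector_count, analytics_count, tolerance=0.15):
    """Compare event volume seen by the Collector API against the
    analytics baseline. Returns (ratio, ok). A ratio far from 1.0
    usually means an instrumentation bug, not a traffic change."""
    if analytics_count == 0:
        return 0.0, collector_count == 0
    ratio = collector_count / analytics_count
    return ratio, abs(1 - ratio) <= tolerance
```

Run it per event type as well as in aggregate; a missing click handler hides easily inside a healthy page_view total.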

Expand the rollout

  • Increase the traffic slice in stages once health checks are stable. Common ramps are 5% → 25% → 50% → 100%, with a hold period at each step.
  • Add additional surfaces (article-page rails, homepage modules, email, etc.) one at a time. See the User Experience Guide for surface-specific guidance.
  • Avoid launching multiple new surfaces in the same week unless you have separate observability for each.
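One common way to implement the staged ramp is deterministic bucketing: hash the user_id instead of sampling randomly per request, so each reader's experience stays stable as the percentage grows. This is a sketch of that technique, not a prescribed mechanism; a feature-flag service gives you the same property.

```python
import hashlib

def in_rollout(user_id: str, percent: int, salt: str = "recs-rollout") -> bool:
    """Deterministically assign a user to the rollout slice.
    The same user_id always lands in the same bucket, and raising
    percent (5 -> 25 -> 50 -> 100) only ever adds users, never flips
    anyone out of the experience."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent
```

Changing the salt reshuffles every bucket, so keep it fixed for the life of the ramp.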

Post-launch

Once recommendations are live for real readers, the work shifts from standing it up to keeping it healthy and making it better.

Ongoing data hygiene

  • Monitor that the CMS → Content endpoint pipeline stays in sync. A silent webhook failure for even a day or two will bias recommendations toward stale content.
  • Audit your delete payloads quarterly. Over-aggressive deletes silently shrink the recommendable catalog.
  • Re-confirm your bot and employee filters periodically. Both tend to drift as your traffic mix and internal tooling evolve.
  • Re-validate user_id stability. If session-scoped IDs leak in alongside authenticated IDs, the model can’t accumulate history per reader.

Quality monitoring

  • Track click-through rate on recommendation surfaces against your pre-launch baseline. Surface-level CTR is the cleanest read on whether the model is earning its keep.
  • Track hydration failure rate (recommended item_ids that fail to resolve in the CMS). A nonzero number indicates a delete-pipeline gap.
  • Sample recommendations for a rotating set of user_ids on a recurring cadence: the same review you did pre-launch, but now with real engagement data behind it.
  • Treat empty-recommendation responses as a signal to investigate, not a bug to suppress. Common causes are site_id mismatches and over-aggressive deletes; see the Troubleshooting section of the API Developer Guide.
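The two headline metrics above reduce to simple ratios; the point is to compute them continuously against your pre-launch baseline rather than ad hoc.

```python
def surface_ctr(clicks, impressions):
    """Click-through rate for a recommendation surface, compared
    against the pre-launch baseline for the same slot."""
    return clicks / impressions if impressions else 0.0

def hydration_failure_rate(failed_ids, total_ids):
    """Share of recommended item_ids that failed to resolve in the CMS.
    Anything persistently above zero points at a delete-pipeline gap."""
    return failed_ids / total_ids if total_ids else 0.0
```

Track both per surface; a healthy aggregate can hide one broken module.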

Iterate on surfaces and signals

  • Add richer event types over time if you launched with page_view and click only. deepest_scroll, article_save, and engaged_read materially improve personalization quality.
  • Expand to new surfaces from the User Experience Guide once an existing surface is healthy and instrumented.
  • Revisit content metadata coverage. Pages missing categories, tags, or author reduce what the model can reason about, especially for new content.
  • Schedule a recurring (quarterly is a reasonable default) review of recommendation quality with editorial and product stakeholders.
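A metadata coverage audit like the one suggested above can be a one-function report over a catalog export. The field names mirror this checklist; extend the tuple to whatever your payload schema actually requires.

```python
REQUIRED_FIELDS = ("categories", "tags", "author")

def metadata_coverage(items):
    """Fraction of catalog items carrying each required metadata field.
    items is an iterable of catalog payload dicts; empty lists and
    empty strings count as missing."""
    items = list(items)
    total = len(items)
    if total == 0:
        return {field: 1.0 for field in REQUIRED_FIELDS}
    return {
        field: sum(1 for item in items if item.get(field)) / total
        for field in REQUIRED_FIELDS
    }
```

Low coverage on any field is a concrete agenda item for the recurring editorial review.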

Token and access maintenance

  • Rotate Delivery API tokens on whatever cadence your security policy requires.
  • Remove tokens that are no longer in use via DELETE /v1/access/keys/{key_id}. The 30-token-per-environment limit is easy to hit if old tokens linger.
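A small audit helper keeps the rotation and cleanup items above honest. The 90-day default and the token listing shape (dicts with `key_id` and an ISO 8601 `created_at`) are assumptions; adapt both to your security policy and the actual key-management response.

```python
from datetime import datetime, timedelta, timezone

def stale_tokens(tokens, max_age_days=90, now=None):
    """Flag tokens older than the rotation window so they can be removed
    via DELETE /v1/access/keys/{key_id} before the 30-token-per-environment
    limit bites. tokens is an iterable of dicts with key_id and created_at."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [
        t["key_id"]
        for t in tokens
        if datetime.fromisoformat(t["created_at"]) < cutoff
    ]
```

Run it on the same cadence as rotation so cleanup never depends on someone remembering.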

Where to go next