
Content Recommendations Onboarding Checklist

A practical, phase-by-phase checklist for taking Content Recommendations from zero to live traffic without launching on a cold model.

About this checklist

This guide is a companion to the Content Recommendations API Developer Guide and the Content Recommendations User Experience Guide. The developer guide covers what the API does; this checklist covers when to do each step in a rollout.

The phases are sequential: finish each one before starting the next.

Pre-launch

The goal of this phase is to have a fully populated catalog, a meaningful behavioral signal, and a working integration.

Access and credentials

  • Identify the team or service that will own the integration end-to-end.
  • Generate a Delivery API token via Provisioning tokens through Delivery API.
  • Store the token in your secrets manager. Never commit it to source control or embed it in client-side code.
  • Confirm the token is assigned to the correct key collection for both the Collector and Recommendations APIs.
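The token-handling items above can be sketched in a few lines. This is a minimal example of reading the token from the environment at runtime rather than hard-coding it; the `DELIVERY_API_TOKEN` variable name and the `Bearer` scheme are assumptions, so match them to your secrets manager and the provisioning docs.

```python
import os

def delivery_api_headers():
    """Build request headers from a token stored in the environment,
    never committed to source control or shipped to clients.
    The env var name and Bearer scheme are assumptions; confirm
    them against the provisioning documentation."""
    token = os.environ.get("DELIVERY_API_TOKEN")
    if not token:
        raise RuntimeError(
            "DELIVERY_API_TOKEN is not set; fetch it from your secrets manager"
        )
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

In server-side code, inject the token via your deployment's secret store; in client-side code, never embed it at all.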

Identifiers and anonymization

  • Decide on your user_id scheme. It must be anonymized: never an email, name, or other PII.
  • Confirm the same user_id will be produced for the same person across sessions, devices (where possible), and between historical backfill and live traffic.
  • Decide on your item_id scheme. Use a stable CMS ID or URL slug, not a full URL. Arc XP customers should use the content’s ArcID.
  • Confirm the same item_id will be used in both content payloads and behavioral events. The model joins on this field.
  • Pick a dedicated test site_id to use for any QA or load testing. Never test against your production site_id.
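One way to satisfy the anonymization and stability requirements above is to derive the user_id from your internal account ID with a keyed hash: the same person always maps to the same user_id, but the value cannot be reversed to an email or name. This is one approach, not the required one; any stable one-way mapping works.

```python
import hashlib
import hmac

def anonymized_user_id(internal_id: str, secret: bytes) -> str:
    """Derive a stable, anonymized user_id from an internal account ID.
    HMAC with a server-side secret keeps the mapping consistent across
    sessions and backfill while preventing reversal to PII.
    Keep the secret fixed, or historical and live IDs will diverge."""
    return hmac.new(secret, internal_id.encode("utf-8"), hashlib.sha256).hexdigest()
```

Use the same secret for the historical backfill and live instrumentation so the model sees one reader, not two.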

Content catalog

  • Bulk-load every published item in your catalog via POST /collector/v1/content with action: "publish".
  • Include accurate categories, tags, author, and timestamp on every item. Sparse metadata hurts cold-start quality for new content later.
  • Wire up the live CMS webhook (or the IFX Recipe for Arc XP CMS customers) so ongoing publishes, updates, and deletes stay in sync automatically.
  • Verify your unpublish/delete pipeline sends action: "delete" payloads. Stale catalog entries lead to recommendations pointing at dead pages.
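As a sketch of the catalog payloads above: the `action` and `item_id` fields come straight from this checklist, but the remaining field names are illustrative, so check the API Developer Guide for the exact schema before wiring this up.

```python
def publish_payload(item_id, title, url, categories, tags, author, published_at):
    """Catalog payload for POST /collector/v1/content with action "publish".
    Field names beyond action/item_id are assumptions about the schema."""
    return {
        "action": "publish",
        "item_id": item_id,            # stable CMS ID, joined against events
        "title": title,
        "url": url,
        "categories": categories,      # sparse metadata hurts cold-start quality
        "tags": tags,
        "author": author,
        "published_at": published_at,  # ISO 8601
    }

def delete_payload(item_id):
    """Unpublish/delete payload so stale items stop being recommended."""
    return {"action": "delete", "item_id": item_id}
```

Drive both builders from the same CMS webhook handler so publishes and deletes cannot drift apart.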

Behavioral signal

  • Replay 30–90 days of historical events through POST /collector/v1/events if your CDP or analytics system retains them.
  • Preserve the original timestamp on each historical event; do not stamp them with the backfill time.
  • Send richer event types (click, article_save, deepest_scroll, engaged_read) in addition to page_view where you have them.
  • Filter internal employee traffic out of any event stream before sending. Editor and newsroom traffic skews the model away from real reader interest.
  • Filter known bot and crawler traffic out of any event stream before sending.
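The backfill and filtering items above can be combined into a small pre-send pipeline. The event field names beyond user_id/item_id are assumptions about the Events schema, and the bot markers shown are a starting list, not an exhaustive one.

```python
def backfill_event(user_id, item_id, event_type, original_ts):
    """Historical event for POST /collector/v1/events.
    The timestamp is the original occurrence time, never the backfill time."""
    return {
        "user_id": user_id,
        "item_id": item_id,
        "event_type": event_type,   # page_view, click, article_save, ...
        "timestamp": original_ts,   # preserved from the source system
    }

def is_reader_traffic(event, employee_ids, bot_markers):
    """Drop internal employees and known bots before sending.
    employee_ids is your own allow/deny set; bot_markers are lowercase
    substrings matched against the user agent."""
    if event["user_id"] in employee_ids:
        return False
    ua = event.get("user_agent", "").lower()
    return not any(marker in ua for marker in bot_markers)
```

Run the filter over both the historical replay and the live stream; a filter applied to only one of them biases the model.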

Shadow-deploy the collector

  • Instrument live event collection in your apps against the real production site_id while recommendations remain hidden from users.
  • Let the collector run for a meaningful window (typically a few weeks) so the model trains on real traffic patterns alongside the historical backfill.
  • Spot-check GET /recommendations/v1/recommendations from internal surfaces (QA builds, staging, internal dashboards) during this window.

Pre-launch validation

  • Sample recommendations for a diverse set of user_ids (new users, heavy readers, anonymous sessions) and manually review whether results look on-topic.
  • Confirm returned item_ids resolve cleanly in your CMS and aren’t pointing at deleted, unpublished, or non-existent content.
  • Stress-test your hydration layer. The API returns ranked IDs; your front-end is responsible for fetching the renderable content.
  • Decide on a rollout strategy: percentage of traffic, single surface, or single site first. Avoid flipping recommendations on everywhere at once.
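Because the API returns ranked IDs and your front-end owns hydration, the hydration layer is worth sketching explicitly. This is a minimal shape, assuming `fetch_from_cms` is your own lookup that returns `None` for missing items; track the failures rather than silently rendering fewer slots.

```python
def hydrate(recommended_ids, fetch_from_cms):
    """Resolve ranked item_ids into renderable content, dropping any ID
    the CMS cannot resolve instead of rendering a broken slot.
    fetch_from_cms is a caller-supplied lookup (None means unresolvable)."""
    rendered, failures = [], []
    for item_id in recommended_ids:
        content = fetch_from_cms(item_id)
        if content is None:
            failures.append(item_id)   # feed these into post-launch monitoring
        else:
            rendered.append(content)
    return rendered, failures
```

A persistent failure list here is exactly the delete-pipeline gap the validation step above is trying to catch before launch.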

Launch

This phase is the cutover itself. Keep it small, observable, and reversible.

Day-of cutover

  • Enable recommendations on the chosen launch surface and traffic slice only.
  • Confirm the front-end is sending click events back to the Events endpoint when readers click a recommended item. This feedback loop drives quality improvement.
  • Watch error rates and latency on both the Collector and Recommendations APIs in real time for the first hour.
  • Spot-check live recommendations for several real user sessions immediately after enabling.

Health checks during the rollout window

  • Confirm POST /collector/v1/events traffic volume matches expected page-view and click volume from your analytics. A large gap usually means an instrumentation bug.
  • Confirm POST /collector/v1/content is firing on every publish, update, and delete from your CMS.
  • Check that recommendation array lengths look reasonable across users; empty arrays for many users typically indicate a site_id mismatch.
  • Keep a rollback path warm. If anything looks wrong, you should be able to disable the surface or reduce the traffic slice in minutes, not hours.
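The volume-matching check above can be automated with a simple ratio comparison. The 15% tolerance below is an arbitrary starting point, not a product threshold; tune it to your traffic's normal variance.

```python
def volume_gap(collector_count, analytics_count, tolerance=0.15):
    """Compare event volume seen by the Collector API against the
    analytics baseline. Returns (ratio, ok). A ratio far from 1.0
    usually means an instrumentation bug, not a traffic change."""
    if analytics_count == 0:
        return 0.0, collector_count == 0
    ratio = collector_count / analytics_count
    return ratio, abs(1 - ratio) <= tolerance
```

Run it per event type as well as in aggregate; a missing click handler hides easily inside a healthy page_view total.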

Expand the rollout

  • Increase the traffic slice in stages once health checks are stable. Common ramps are 5% → 25% → 50% → 100%, with a hold period at each step.
  • Add additional surfaces (article-page rails, homepage modules, email, etc.) one at a time. See the User Experience Guide for surface-specific guidance.
  • Avoid launching multiple new surfaces in the same week unless you have separate observability for each.
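One common way to implement the staged ramp is deterministic bucketing: hash the user_id instead of sampling randomly per request, so each reader's experience stays stable as the percentage grows. This is a sketch of that technique, not a prescribed mechanism; a feature-flag service gives you the same property.

```python
import hashlib

def in_rollout(user_id: str, percent: int, salt: str = "recs-rollout") -> bool:
    """Deterministically assign a user to the rollout slice.
    The same user_id always lands in the same bucket, and raising
    percent (5 -> 25 -> 50 -> 100) only ever adds users, never flips
    anyone out of the experience."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent
```

Changing the salt reshuffles every bucket, so keep it fixed for the life of the ramp.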

Post-launch

Once recommendations are live for real readers, the work shifts from standing it up to keeping it healthy and making it better.

Ongoing data hygiene

  • Monitor that the CMS → Content endpoint pipeline stays in sync. A silent webhook failure for even a day or two will bias recommendations toward stale content.
  • Audit your delete payloads quarterly. Over-aggressive deletes silently shrink the recommendable catalog.
  • Re-confirm your bot and employee filters periodically. Both tend to drift as your traffic mix and internal tooling evolve.
  • Re-validate user_id stability. If session-scoped IDs leak in alongside authenticated IDs, the model can’t accumulate history per reader.

Quality monitoring

  • Track click-through rate on recommendation surfaces against your pre-launch baseline. Surface-level CTR is the cleanest read on whether the model is earning its keep.
  • Track hydration failure rate (recommended item_ids that fail to resolve in the CMS). A nonzero number indicates a delete-pipeline gap.
  • Sample recommendations for a rotating set of user_ids on a recurring cadence: the same review you did pre-launch, but now with real engagement data behind it.
  • Treat empty-recommendation responses as a signal to investigate, not a bug to suppress. Common causes are site_id mismatches and over-aggressive deletes; see the Troubleshooting section of the API Developer Guide.
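The two headline metrics above reduce to simple ratios; the point is to compute them continuously against your pre-launch baseline rather than ad hoc.

```python
def surface_ctr(clicks, impressions):
    """Click-through rate for a recommendation surface, compared
    against the pre-launch baseline for the same slot."""
    return clicks / impressions if impressions else 0.0

def hydration_failure_rate(failed_ids, total_ids):
    """Share of recommended item_ids that failed to resolve in the CMS.
    Anything persistently above zero points at a delete-pipeline gap."""
    return failed_ids / total_ids if total_ids else 0.0
```

Track both per surface; a healthy aggregate can hide one broken module.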

Iterate on surfaces and signals

  • Add richer event types over time if you launched with page_view and click only. deepest_scroll, article_save, and engaged_read materially improve personalization quality.
  • Expand to new surfaces from the User Experience Guide once an existing surface is healthy and instrumented.
  • Revisit content metadata coverage. Pages missing categories, tags, or author reduce what the model can reason about, especially for new content.
  • Schedule a recurring (quarterly is a reasonable default) review of recommendation quality with editorial and product stakeholders.
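A metadata coverage audit like the one suggested above can be a one-function report over a catalog export. The field names mirror this checklist; extend the tuple to whatever your payload schema actually requires.

```python
REQUIRED_FIELDS = ("categories", "tags", "author")

def metadata_coverage(items):
    """Fraction of catalog items carrying each required metadata field.
    items is an iterable of catalog payload dicts; empty lists and
    empty strings count as missing."""
    items = list(items)
    total = len(items)
    if total == 0:
        return {field: 1.0 for field in REQUIRED_FIELDS}
    return {
        field: sum(1 for item in items if item.get(field)) / total
        for field in REQUIRED_FIELDS
    }
```

Low coverage on any field is a concrete agenda item for the recurring editorial review.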

Token and access maintenance

  • Rotate Delivery API tokens on whatever cadence your security policy requires.
  • Remove tokens that are no longer in use via DELETE /v1/access/keys/{key_id}. The 30-token-per-environment limit is easy to hit if old tokens linger.
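A small audit helper keeps the rotation and cleanup items above honest. The 90-day default and the token listing shape (dicts with `key_id` and an ISO 8601 `created_at`) are assumptions; adapt both to your security policy and the actual key-management response.

```python
from datetime import datetime, timedelta, timezone

def stale_tokens(tokens, max_age_days=90, now=None):
    """Flag tokens older than the rotation window so they can be removed
    via DELETE /v1/access/keys/{key_id} before the 30-token-per-environment
    limit bites. tokens is an iterable of dicts with key_id and created_at."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [
        t["key_id"]
        for t in tokens
        if datetime.fromisoformat(t["created_at"]) < cutoff
    ]
```

Run it on the same cadence as rotation so cleanup never depends on someone remembering.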

Where to go next