Content Recommendations Onboarding Checklist
A practical, phase-by-phase checklist for taking Content Recommendations from zero to live traffic without launching on a cold model.
About this checklist
This guide is a companion to the Content Recommendations API Developer Guide and the Content Recommendations User Experience Guide. The developer guide covers what the API does; this checklist covers when to do each step in a rollout.
The phases are sequential.
Pre-launch
The goal of this phase is to have a fully populated catalog, a meaningful behavioral signal, and a working integration.
Access and credentials
- Identify the team or service that will own the integration end-to-end.
- Generate a Delivery API token via the "Provisioning tokens through Delivery API" flow.
- Store the token in your secrets manager. Never commit it to source control or embed it in client-side code.
- Confirm the token is assigned to the correct key collection for both the Collector and Recommendations APIs.
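The secrets-handling step above can be sketched as follows. This is a minimal illustration assuming the token is injected into the process environment by your secrets manager and that the API accepts a standard Bearer header; the environment variable name is an arbitrary choice, and you should confirm the exact auth scheme against the API Developer Guide.

```python
import os

def build_auth_headers(env_var: str = "CONTENT_RECS_DELIVERY_TOKEN") -> dict:
    """Read the Delivery API token from the environment (populated by your
    secrets manager) rather than from source control or client-side code."""
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"{env_var} is not set; fetch it from your secrets manager")
    # Assumption: the API accepts a standard Bearer token header.
    return {"Authorization": f"Bearer {token}"}
```

Failing fast when the variable is missing keeps a misconfigured deploy from silently sending unauthenticated requests.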
Identifiers and anonymization
- Decide on your `user_id` scheme. It must be anonymized: never an email, name, or other PII.
- Confirm the same `user_id` will be produced for the same person across sessions, devices (where possible), and between historical backfill and live traffic.
- Decide on your `item_id` scheme. Use a stable CMS ID or URL slug, not a full URL. Arc XP customers should use the content's ArcID.
- Confirm the same `item_id` will be used in both content payloads and behavioral events. The model joins on this field.
- Pick a dedicated test `site_id` to use for any QA or load testing. Never test against your production `site_id`.
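One illustrative way to satisfy the anonymization and stability requirements above is a keyed hash of your internal account ID. This is a sketch, not a mandated scheme: the salt handling and HMAC choice are assumptions, but any approach must be deterministic (same person, same ID) and non-reversible (no PII recoverable).

```python
import hashlib
import hmac

def anonymized_user_id(internal_id: str, salt: bytes) -> str:
    """Derive a stable, non-reversible user_id from an internal account ID.

    The same internal_id + salt always yields the same user_id, so the ID
    stays stable across sessions, devices, and backfill vs. live traffic.
    The salt is kept secret and must not be rotated casually, or every
    reader's history resets.
    """
    return hmac.new(salt, internal_id.encode("utf-8"), hashlib.sha256).hexdigest()
```

The HMAC keyed with a secret salt prevents trivial dictionary reversal of low-entropy inputs like email addresses, which a bare hash would not.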
Content catalog
- Bulk-load every published item in your catalog via `POST /collector/v1/content` with `action: "publish"`.
- Include accurate `categories`, `tags`, `author`, and `timestamp` on every item. Sparse metadata hurts cold-start quality for new content later.
- Wire up the live CMS webhook (or the IFX Recipe for Arc XP CMS customers) so ongoing publishes, updates, and deletes stay in sync automatically.
- Verify your unpublish/delete pipeline sends `action: "delete"` payloads. Stale catalog entries lead to recommendations pointing at dead pages.
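A catalog publish call might be shaped roughly as below. Only the fields named in this checklist (`action`, `item_id`, `categories`, `tags`, `author`, `timestamp`) are shown; confirm the full schema and any additional required fields against the API Developer Guide before relying on this.

```python
import datetime

def publish_payload(item_id, categories, tags, author, published_at):
    """Illustrative body for POST /collector/v1/content with action "publish".
    published_at is the item's real publication datetime, serialized as ISO 8601.
    Field names beyond those documented in the checklist are not assumed here."""
    return {
        "action": "publish",
        "item_id": item_id,          # stable CMS ID / slug (ArcID for Arc XP)
        "categories": categories,
        "tags": tags,
        "author": author,
        "timestamp": published_at.isoformat(),
    }
```

Sending accurate metadata here is what gives the model something to work with when a brand-new item has no behavioral history yet.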
Behavioral signal
- Replay 30-90 days of historical events through `POST /collector/v1/events` if your CDP or analytics system retains them.
- Preserve the original `timestamp` on each historical event; do not stamp them with the backfill time.
- Send richer event types (`click`, `article_save`, `deepest_scroll`, `engaged_read`) in addition to `page_view` where you have them.
- Filter internal employee traffic out of any event stream before sending. Editor and newsroom traffic skews the model away from real reader interest.
- Filter known bot and crawler traffic out of any event stream before sending.
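The filtering and timestamp-preservation steps above can be sketched as a single pass over the historical export. The event schema (`user_agent`, `client_ip`, etc.), the bot markers, and the internal IP prefixes are all illustrative assumptions; substitute your CDP's actual field names and your own block lists.

```python
BOT_MARKERS = ("bot", "crawler", "spider")   # extend with your known bot UAs
INTERNAL_NETWORKS = ("10.", "192.168.")      # example internal IP prefixes

def replayable_events(raw_events):
    """Filter bot and internal-employee traffic out of a historical export,
    keeping the ORIGINAL event timestamp rather than the backfill time."""
    for e in raw_events:
        ua = e.get("user_agent", "").lower()
        if any(marker in ua for marker in BOT_MARKERS):
            continue  # drop bots and crawlers
        if e.get("client_ip", "").startswith(INTERNAL_NETWORKS):
            continue  # drop employee / newsroom traffic
        yield {
            "user_id": e["user_id"],
            "item_id": e["item_id"],
            "event_type": e["event_type"],
            "timestamp": e["timestamp"],  # original time, not the replay time
        }
```

Running the same filter over both the backfill and the live stream keeps the two signals consistent, which matters because the model sees them as one history.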
Shadow-deploy the collector
- Instrument live event collection in your apps against the real production `site_id` while recommendations remain hidden from users.
- Let the collector run for a meaningful window (typically a few weeks) so the model trains on real traffic patterns alongside the historical backfill.
- Spot-check `GET /recommendations/v1/recommendations` from internal surfaces (QA builds, staging, internal dashboards) during this window.
Pre-launch validation
- Sample recommendations for a diverse set of `user_id`s (new users, heavy readers, anonymous sessions) and manually review whether results look on-topic.
- Confirm returned `item_id`s resolve cleanly in your CMS and aren't pointing at deleted, unpublished, or non-existent content.
- Stress-test your hydration layer. The API returns ranked IDs; your front-end is responsible for fetching the renderable content.
- Decide on a rollout strategy: percentage of traffic, single surface, or single site first. Avoid flipping recommendations on everywhere at once.
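The hydration validation above can be automated with a small report that splits resolvable from broken IDs. The `resolve` callable here is a stand-in for your real CMS client (assumed to return `None` for deleted or unpublished items); the report shape is an illustration, not an API contract.

```python
def hydration_report(recommended_ids, resolve):
    """Check that ranked item_ids coming back from the Recommendations API
    actually resolve to renderable content in the CMS."""
    resolved, broken = [], []
    for item_id in recommended_ids:
        content = resolve(item_id)  # your CMS lookup; None => not renderable
        (resolved if content is not None else broken).append(item_id)
    return {
        "resolved": resolved,
        "broken": broken,
        "failure_rate": len(broken) / max(len(recommended_ids), 1),
    }
```

Run this over samples from many `user_id`s before launch; a nonzero `failure_rate` points at the delete pipeline before readers ever see a dead link.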
Launch
This phase is the cutover itself. Keep it small, observable, and reversible.
Day-of cutover
- Enable recommendations on the chosen launch surface and traffic slice only.
- Confirm the front-end is sending `click` events back to the Events endpoint when readers click a recommended item. This feedback loop drives quality improvement.
- Watch error rates and latency on both the Collector and Recommendations APIs in real time for the first hour.
- Spot-check live recommendations for several real user sessions immediately after enabling.
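The click feedback event above might be shaped as below. Only fields this checklist already names (`site_id`, `user_id`, `item_id`, event type, `timestamp`) are used; treat the structure as a sketch and confirm the exact schema for `POST /collector/v1/events` against the API Developer Guide.

```python
def click_event(site_id, user_id, item_id, event_time_iso):
    """Illustrative body for the click-feedback event sent when a reader
    clicks a recommended item. item_id must match the ID the API returned."""
    return {
        "site_id": site_id,
        "user_id": user_id,
        "item_id": item_id,
        "event_type": "click",
        "timestamp": event_time_iso,  # the click time, in ISO 8601
    }
```

Closing this loop on day one matters: without click feedback the model cannot learn which recommendations actually worked.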
Health checks during the rollout window
- Confirm `POST /collector/v1/events` traffic volume matches expected page-view and click volume from your analytics. A large gap usually means an instrumentation bug.
- Confirm `POST /collector/v1/content` is firing on every publish, update, and delete from your CMS.
- Check that `recommendations` array lengths look reasonable across users; empty arrays for many users typically indicate a `site_id` mismatch.
- Keep a rollback path warm. If anything looks wrong, you should be able to disable the surface or reduce the traffic slice in minutes, not hours.
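The volume-comparison check above reduces to a simple relative-gap calculation. The 10% tolerance is an arbitrary starting point, not a documented threshold; tune it to your traffic's normal variance.

```python
def volume_gap(collector_count, analytics_count, tolerance=0.10):
    """Compare event volume seen by the collector against the analytics
    baseline for the same window. A gap beyond `tolerance` usually means
    an instrumentation bug (events dropped, double-fired, or misrouted)."""
    if analytics_count == 0:
        return {"gap": None, "ok": collector_count == 0}
    gap = abs(collector_count - analytics_count) / analytics_count
    return {"gap": gap, "ok": gap <= tolerance}
```

Run it per event type (`page_view` vs. `click`) rather than in aggregate, since a broken click handler can hide inside a healthy total.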
Expand the rollout
- Increase the traffic slice in stages once health checks are stable. Common ramps are 5% → 25% → 50% → 100%, with a hold period at each step.
- Add additional surfaces (article-page rails, homepage modules, email, etc.) one at a time. See the User Experience Guide for surface-specific guidance.
- Avoid launching multiple new surfaces in the same week unless you have separate observability for each.
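One common way to implement the staged ramp above is deterministic hash bucketing, sketched here under the assumption that you gate on `user_id`. The salt name is arbitrary; changing it reshuffles everyone's buckets, so keep it fixed per rollout.

```python
import hashlib

def in_rollout(user_id: str, percent: int, salt: str = "recs-rollout-v1") -> bool:
    """Deterministically bucket a user into the rollout slice. Hashing
    user_id plus a per-rollout salt yields a stable 0-99 bucket, so each
    reader stays in (or out) consistently as you ramp 5 -> 25 -> 50 -> 100."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent
```

Because buckets are stable, raising `percent` only ever adds users; nobody who already saw recommendations gets them yanked away mid-ramp, and rollback is just lowering the number.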
Post-launch
Once recommendations are live for real readers, the work shifts from standing it up to keeping it healthy and making it better.
Ongoing data hygiene
- Monitor that the CMS → Content endpoint pipeline stays in sync. A silent webhook failure for even a day or two will bias recommendations toward stale content.
- Audit your `delete` payloads quarterly. Over-aggressive deletes silently shrink the recommendable catalog.
- Re-confirm your bot and employee filters periodically. Both tend to drift as your traffic mix and internal tooling evolve.
- Re-validate `user_id` stability. If session-scoped IDs leak in alongside authenticated IDs, the model can't accumulate history per reader.
Quality monitoring
- Track click-through rate on recommendation surfaces against your pre-launch baseline. Surface-level CTR is the cleanest read on whether the model is earning its keep.
- Track hydration failure rate (recommended `item_id`s that fail to resolve in the CMS). A nonzero number indicates a delete-pipeline gap.
- Sample recommendations for a rotating set of `user_id`s on a recurring cadence: the same review you did pre-launch, but now with real engagement data behind it.
- Treat empty-recommendation responses as a signal to investigate, not a bug to suppress. Common causes are `site_id` mismatches and over-aggressive deletes; see the Troubleshooting section of the API Developer Guide.
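The CTR and empty-response monitoring above can be rolled into one snapshot. The metric names and report shape are illustrative; `responses` is assumed to be a list of the `recommendations` arrays your surfaces received over the monitoring window.

```python
def quality_snapshot(impressions, clicks, responses):
    """Roll up two post-launch quality signals: surface-level CTR, and the
    share of API responses that came back with an empty recommendations
    array (a prompt to investigate site_id mismatches or over-deletion)."""
    empty = sum(1 for r in responses if not r)
    return {
        "ctr": clicks / impressions if impressions else 0.0,
        "empty_response_rate": empty / len(responses) if responses else 0.0,
    }
```

Compare `ctr` against the pre-launch baseline of whatever module previously occupied the slot, and alert on `empty_response_rate` rather than silently rendering a fallback.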
Iterate on surfaces and signals
- Add richer event types over time if you launched with `page_view` and `click` only. `deepest_scroll`, `article_save`, and `engaged_read` materially improve personalization quality.
- Expand to new surfaces from the User Experience Guide once an existing surface is healthy and instrumented.
- Revisit content metadata coverage. Pages missing `categories`, `tags`, or `author` reduce what the model can reason about, especially for new content.
- Schedule a recurring (quarterly is a reasonable default) review of recommendation quality with editorial and product stakeholders.
Token and access maintenance
- Rotate Delivery API tokens on whatever cadence your security policy requires.
- Remove tokens that are no longer in use via `DELETE /v1/access/keys/{key_id}`. The 30-token-per-environment limit is easy to hit if old tokens linger.
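The token-removal call above can be sketched as a request description handed to whatever HTTP client you use. The base URL is a placeholder and the Bearer auth scheme is an assumption; only the `DELETE /v1/access/keys/{key_id}` path comes from this checklist.

```python
def delete_key_request(base_url: str, key_id: str, token: str) -> dict:
    """Describe the token-removal call (DELETE /v1/access/keys/{key_id})
    as method/url/headers for an HTTP client. base_url and the Bearer
    scheme are illustrative; confirm both against the developer guide."""
    return {
        "method": "DELETE",
        "url": f"{base_url}/v1/access/keys/{key_id}",
        "headers": {"Authorization": f"Bearer {token}"},
    }
```

Wiring this into the same automation that rotates tokens keeps stale keys from accumulating toward the 30-token limit.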
Where to go next
- Content Recommendations API Developer Guide: endpoint reference, authentication, field-level details, and troubleshooting.
- Content Recommendations User Experience Guide: surface-by-surface guidance for where to deploy recommendations and what each placement is good at.