Developer’s guide to setting up a Paywall

Arc XP Subscriptions includes a self-service Paywall that lets you manage how much content your readers can consume before they need to register or subscribe.

Your team can build Rules that target specific content and audience criteria, such as “Readers consuming sports galleries on their phone or tablet”, and set a content budget, such as “The reader gets 8 unique pieces of content before seeing our sports photography-focused paywall.” You can bundle multiple rules together as a Ruleset and deploy them as a Paywall for a specific timeframe.

The paywall is deployed as a JavaScript bundle that’s included on every page you want the paywall to evaluate. It isn’t a requirement that PageBuilder delivers this page.

The client-side p.js evaluator considers all available facts, such as the reader’s article consumption history and properties of the current story, to determine if any of the rules are currently exhausted. The paywall script also updates the content consumption audit trail persisted to the browser’s local storage. If any of the rules are exhausted, the evaluator will return the first exhausted rule. It’s up to you to design and build what happens next: the Arc XP Subscriptions paywall simply returns a summary based on the facts at hand, and you have complete control over the rest of the user experience.

Getting started

To quickly get a paywall deployed you will need two things:

The URL where your paywall lives (usually something like https://your.domain.com/arc/subs/p.js). This asset will be served by Akamai.
A loading script which asynchronously loads the above script (or a <script async> tag), configures it and runs it.

With respect to content, you must provide a few things that the paywall script will need. Since this script relies on data found on the page, you will need to configure it so that it can find what it needs. Most things are automatic, but the content’s type (e.g. ‘story’, ‘gallery’, or ‘video’) and section (e.g. ‘business’, ‘sports’, ‘finance’) need to be provided at configuration time.

To include this script in your PageBuilder instance, you have a few options:

Set up an object called window.ArcPOptions to feed to the ArcP.run() function and then simply load the script (good for async loading).
Load the script using a synchronous <script> tag, and run ArcP.run(options) manually.
Load the script asynchronously, check for the existence of window.ArcP (or use an onload attribute to determine when it’s ready), and run ArcP.run(options) manually.

The benefit of using option 1 is that you can load the script at any point in your page-load sequence. Once it’s loaded, if the evaluator detects a variable window.ArcPOptions, it will execute immediately by passing window.ArcPOptions to ArcP.run(). This is the best option to avoid impacting your page load times. Although the paywall script is relatively small (< 10 KB gzipped), some people will prefer bundling the script with their other JavaScript so that everything is fetched in a single HTTP request.

The benefit of using option 2 is that you have full control of when you want to run the paywall script. This works best if you need to wait until your data is loaded and ready before evaluating the rules. Furthermore, this keeps things simple with regard to determining when the ArcP library is ready to .run().

Option 3 is a bit like option 2, but depending on the browser it may be trickier to determine when the script is ready. You can use an onload attribute, or a setInterval call that checks for the existence of window.ArcP, and then call ArcP.run(options) manually.
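The polling variant of option 3 could be sketched roughly as follows. The helper name waitForArcP and its intervalMs and timeoutMs parameters are hypothetical and not part of p.js; the helper takes a window-like object so it can be exercised outside the browser.

```javascript
// Polls a window-like object until `ArcP` appears, then resolves with it.
// waitForArcP, intervalMs and timeoutMs are hypothetical names, not part of p.js.
function waitForArcP(win, intervalMs = 50, timeoutMs = 5000) {
  return new Promise((resolve, reject) => {
    if (win.ArcP) {
      resolve(win.ArcP);
      return;
    }
    const startedAt = Date.now();
    const timer = setInterval(() => {
      if (win.ArcP) {
        clearInterval(timer);
        resolve(win.ArcP);
      } else if (Date.now() - startedAt >= timeoutMs) {
        clearInterval(timer);
        reject(new Error('ArcP did not load within ' + timeoutMs + ' ms'));
      }
    }, intervalMs);
  });
}

// Usage in the browser, with `options` built as described in this guide:
// waitForArcP(window).then((ArcP) => ArcP.run(options));
```

Where an onload attribute is available it is usually simpler; polling is mainly a fallback for setups where you do not control the script tag.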

Here’s an example of how you would load the paywall script on www.washingtonpost.com (you can also minify this a bit to save a few bytes).

!(function() {
    var page = document.getElementById('page');

    // Configure the paywall's options; the script reads window.ArcPOptions when it loads
    window.ArcPOptions = {
        paywallFunction: function(campaign) {
            console.log('arc paywall fired!', campaign);
        },
        contentType: page.getAttribute('data-content-type'),
        section: page.getAttribute('data-section'),
        userName: '<SomeUUID or emailAddress>',
        jwt: "<some JSON Web Token>",
        resultsCallback: function(results) {
            console.dir(results);
        }
    };

    // Asynchronously load the script
    var doc = document;
    var script = doc.createElement('script');
    script.src = "https://your.domain.com/arc/subs/p.js";
    script.async = true;
    doc.body.appendChild(script);
}());

In this case, window.ArcPOptions will be set up before the script loads, which means that the script will auto-run as soon as it loads. By setting the resultsCallback option, you will be able to see the diagnostic information that gets returned each time the script is run. This contains useful information such as the time it took to evaluate the rules, the rule that triggered the paywall (if any), and other information, as seen below:

{
  "triggered": { // an object showing the rule id that popped the paywall and how many times it's evaluated to true.
    "id": 2,
    "rc": 5
  },
  "timeTaken": 100, // time in ms it took to evaluate the rules
  "paused": true | false, // true if paywall is paused
  "visited": true | false, // true if evaluator has already been run on this content
  "ruleCount": 10, // number of rules evaluated
  "private": true | false // if the browser is determined to be in private mode
}

Another way to get this data is by manually running ArcP.run(options), which returns a Promise resolving to the same data as above. Here’s a different way you could load and run the paywall script:

<script>
    function runPaywall() {
        ArcP.run({
            paywallFunction: () => alert('paywalled!'),
            customPageData: () => ({
                c: 'story',
                s: 'business',
                ci: 'https://www.your.domain.com/canonical/url'
            })
        }).then(results => console.log('Results from running paywall script: ', results))
          .catch(console.error);
    }
</script>
<script src="https://your.domain.com/arc/subs/p.js" async onload="runPaywall()"></script>

In the case above, we are assuming that Promises are available (via polyfill or built-in). We are loading the script asynchronously and then manually running ArcP.run() once the script loads. This is arguably a cleaner approach that gives you a bit more control, but how you load the script depends on your setup and the browsers you intend to support. The goal of the paywall-evaluator script is to provide flexibility so that it can fit with your current setup.

Caveats

Since this paywall script uses localStorage to store state, it is important to note that it won’t work as expected across subdomains. Work is currently underway to provide a way to address this limitation.
To keep the payload small in localStorage, we will clear any read counts which are not associated with the current ruleset. This means that if you remove a rule from the ruleset, that rule’s read count will be removed.

Overview

At a very high level, the paywall script is loaded and executed like this:

[Figure: Paywall High-Level View]

Below you will find details of each step in the above diagram and how each one of them will accomplish what it needs to.

User lands on page

This step is self-explanatory, but we must still determine how and where the script will be loaded. There are a number of assumptions:

The script will run in a PageBuilder rendered website.
The script will be loaded asynchronously with a simple script tag in the global header or footer of the page.
The global header or footer must provide three pieces of data: the content’s section (e.g. business, finance, politics, sports), the content’s type (e.g. video, gallery, article), and a canonical URL.

Fetch Ruleset + Evaluation script

To prevent multiple HTTP requests, the ruleset is bundled with the evaluation script in one file. To be clear, the structure of the file would be as follows:

// small ruleset
var ruleset=[{"c":[true,"article"],"s":[true,"business","finance"],"rt":2,"i": [false, "8.8.8.8"], "pw":"newsgazette.com/offers/finance"}];

// evaluation script (for illustration purposes only)
function run(paywallCallback) {
 if (typeof paywallCallback === 'function') {
   this.paywallCallback = paywallCallback;
 } else {
   console.error(`${LOG_PREFIX} No callback defined -- bailing out`)
 }

 this.facts = getSession();
 const start = performance.now();
 evaluateRuleset(ruleset, getSession(), paywallCallback);
// etc...

Evaluation script initializes

The paywall script will expose a variable on the browser’s window object called ArcP. To access its methods you will need to do something like this:

     ArcP.run({ paywallFunction: () => alert('paywalled!') });

The run() method will fire off a few different processes that will asynchronously take care of fetching all necessary data and then determine whether the paywall should be triggered or not. In the example above, the paywall is just an alert, but a client developer can customize this based on their desired user experience.

For the most part, this is the only way you will interact with the paywall script as a client developer. You can also reset the paywall’s state (e.g. when a user logs out, purchases a subscription or some other event) by calling the ArcP.reset() function.

Fetching necessary data

The data needed to determine if a reader is shown a paywall or not is divided into three main parts:

Entitlement data
Page data
Browser data

Below is a detailed description of how each part is retrieved.

Entitlement data

Entitlement is defined by two pieces of data: subscription status and registration status.

A call to the entitlement API (or an execution of customSubCheck) will happen whenever the following criteria are met:

The user is logged in.
A rule evaluates to true (i.e. the paywall is about to be executed).

The fetched entitlements are stored in localStorage and refreshed after 24 hours.

A registration check (or an execution of customRegCheck) will run each time ArcP.run() is executed and a jwt token is not passed in. Passing a jwt token as an option tells the paywall that this user is logged in.

Page data

As mentioned before, the page will have to deliver three pieces of data necessary for the paywall script to make a decision:

The Content’s section.
The Content’s type.
The page’s canonical URL or content ID.

This will need to be retrieved each time the script runs. The page’s unique identifier (either a canonical URL or a content ID) will be stored in localStorage, or cookies if localStorage is not available, so that we don’t count it as a read again.

Browser data

This data consists of everything that can be determined by looking at the reader’s browser, namely:

referrer
userAgent

The userAgent will be parsed and the deviceClass will be determined from it. This deviceClass will be stored in localStorage indefinitely and reused each time, so this operation will only happen once per browser. deviceClass is currently determined by the rules below:

if the UserAgent is determined to be that of a mobile device then:
- if the window’s width is less than or equal to 640px, it’s considered mobile
- if the window’s width is larger than 640px, it will be considered a tablet
otherwise it’s a desktop
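The device class rules above can be sketched as a small function. The isMobileUserAgent check is a simplified, hypothetical stand-in; this guide does not show how p.js actually parses the user agent.

```javascript
// Simplified, hypothetical user agent check; the real parsing in p.js is not shown in this guide.
function isMobileUserAgent(userAgent) {
  return /Mobi|Android|iPhone|iPad|iPod/i.test(userAgent);
}

// Applies the rules above: mobile user agents split on a 640px window width,
// everything else is treated as desktop.
function getDeviceClass(userAgent, windowWidth) {
  if (!isMobileUserAgent(userAgent)) {
    return 'desktop';
  }
  return windowWidth <= 640 ? 'mobile' : 'tablet';
}

// Usage in the browser:
// getDeviceClass(navigator.userAgent, window.innerWidth);
```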

Evaluate the ruleset against the gathered data

For the purposes of this document, the gathered data will be called the facts. The ruleset will be run through an evaluation script which will be detailed below.

Rulesets are a set of rules.
Rules are a set of conditions.
Conditions are compared to gathered facts to determine if the paywall should be triggered or not.
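As a rough illustration of how conditions might be compared to facts, the sketch below assumes that a list condition has the shape [include, ...values], as in the sample ruleset earlier in this guide, and that values are matched exactly. Both points are assumptions; the function names are hypothetical, and only the content type (c), section (s) and device class (d) keys are handled.

```javascript
// Assumption: a list condition is [include, ...values]. `true` means the fact must be
// one of the values (IN); `false` means it must not be (NOT IN).
function conditionMatches(condition, factValue) {
  const [include, ...values] = condition;
  const found = values.includes(factValue);
  return include ? found : !found;
}

// Only these keys are handled in this sketch; budgets (rt), entitlements and
// other conditions are left out.
const LIST_KEYS = ['c', 's', 'd'];

// A rule matches only if every list condition it defines matches the facts.
function ruleMatches(rule, facts) {
  return LIST_KEYS
    .filter((key) => Array.isArray(rule[key]))
    .every((key) => conditionMatches(rule[key], facts[key]));
}

// Returns the first rule whose conditions all match, or null if none do.
function firstMatchingRule(ruleset, facts) {
  return ruleset.find((rule) => ruleMatches(rule, facts)) || null;
}
```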

Read Counts

Once all facts have been fetched and a rule is found to evaluate to true, the appropriate read count will be incremented.

const readCounts = { // read counts data
  "1": { // rule id
    "c": 7, // i.e. ruleId: 1 evaluated to true 7 times
    "lastUpdated": 1536929746868 // last time the read counts were updated
  },
  "2": {
    "c": 3, // i.e. ruleId: 2 evaluated to true 3 times
    "lastUpdated": 1536929746868 // last time the read counts were updated
  }
  // ... etc.
};

Read counts will only be added to a rule if the entirety of the rule’s conditions evaluates to true. The read count data will be stored in localStorage, or cookies if localStorage is not available.
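A minimal sketch of this bookkeeping is shown below. It assumes a state object with an rc map (read counts per rule id) and a v array (content identifiers already visited), mirroring the field names used in the _facts example in this guide. The recordRead helper is hypothetical, and the real evaluator may differ in detail.

```javascript
// Hypothetical helper: increments the read count of each matched rule, unless this
// content identifier was already visited (so a repeat visit is not counted twice).
function recordRead(state, contentId, matchedRuleIds, now = Date.now()) {
  if (state.v.includes(contentId)) {
    return false; // already counted for this content
  }
  state.v.push(contentId);
  for (const ruleId of matchedRuleIds) {
    const entry = state.rc[ruleId] || { c: 0, lastUpdated: now };
    entry.c += 1;
    entry.lastUpdated = now;
    state.rc[ruleId] = entry;
  }
  return true;
}

// Usage: state would normally be loaded from and saved back to localStorage.
// const state = { rc: {}, v: [] };
// recordRead(state, 'https://www.your.domain.com/canonical/url', [1]);
```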

Rule Budget Types

Calendar-based

There are two calendar-based options. If the option selected is monthly, the reader’s history will be reset on the first of the month. If weekly, the reader’s history will be reset on a specific day of the week. The day of the week can be selected when creating the rule and setting the budget.
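One way to decide whether a calendar-based reset is due is to compare the time the counts were last updated with the most recent reset boundary. The sketch below works in UTC, which is an assumption (the evaluator may well use the reader’s local time), and the function names are hypothetical.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Monthly: reset if the last update happened in an earlier month than `now`.
// Assumption: boundaries are evaluated in UTC.
function shouldResetMonthly(lastUpdated, now) {
  const last = new Date(lastUpdated);
  const current = new Date(now);
  return last.getUTCFullYear() !== current.getUTCFullYear()
    || last.getUTCMonth() !== current.getUTCMonth();
}

// Weekly: reset if the last update happened before the most recent reset day.
// resetDay follows Date#getUTCDay numbering: 0 = Sunday ... 6 = Saturday.
function shouldResetWeekly(lastUpdated, now, resetDay) {
  const current = new Date(now);
  const daysSinceReset = (current.getUTCDay() - resetDay + 7) % 7;
  // Midnight (UTC) of the most recent reset day, today included.
  const lastReset = Date.UTC(
    current.getUTCFullYear(), current.getUTCMonth(), current.getUTCDate()
  ) - daysSinceReset * DAY_MS;
  return lastUpdated < lastReset;
}
```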

Rolling

If a rule has a budget type of rolling, the total read count will be based on the reader’s history within the rolling window. The rolling window is the number of rolling days selected for that particular rule. For any given day, the total read count for the user will be based on that day and the days prior that fit within the rolling window period. Users will get read counts back each day as the consumption at the beginning of the rolling period falls out of the window.

Rolling rule counts are tracked by rule id. Each rule’s r array contains one object per day within the rolling window, with date and count fields to track consumption. The length of the array will always be <= the number of rolling days. Each time a user hits an article with a rolling rule, we’ll clear expired days and either add a new day or increment the count of the existing day, depending on the time of visit. The total rule count, rc[ruleId].c, is the sum of all count fields in the array for that id.

"rc":
  {
    "1":
      {
        "r":
          [
            { "date": 1552373321234, "count": 1 },
            { "date": 1552459724296, "count": 1 }
          ],
        "c": 2,
        "lastUpdated": 1552459724296
      }
  }

Ex: A total budget of 10 articles over a rolling period of three days.

Note: Total Read Count and Budget Remaining are calculated at the end of the day in the chart below.

Day	Daily Read Count	Total Read Count	Budget Remaining
Monday	2	2	8
Tuesday	8	10	0
Wednesday	0	10	0
Thursday	1	9	1

On Tuesday after the 8th article or on Wednesday at any time, if a user attempted to read an article they would hit the paywall.
On Thursday, the reader gets 2 articles back because Monday falls out of the rolling window.
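The rolling bookkeeping can be sketched as follows and replayed against the table above. The helpers pruneRolling and recordRollingRead are hypothetical, and bucketing days by UTC date is an assumption; the entry shape (r, c, lastUpdated) follows the rc example shown earlier in this section.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Assumption: days are bucketed by UTC date; the real evaluator may bucket differently.
const dayIndex = (timestamp) => Math.floor(timestamp / MS_PER_DAY);

// Drops days that have fallen out of the window and recomputes the total `c`.
function pruneRolling(entry, now, rollingDays) {
  const oldestAllowed = dayIndex(now) - rollingDays + 1;
  entry.r = entry.r.filter((day) => dayIndex(day.date) >= oldestAllowed);
  entry.c = entry.r.reduce((sum, day) => sum + day.count, 0);
  return entry;
}

// Records one read: prune first, then increment today's bucket or add a new one.
function recordRollingRead(entry, now, rollingDays) {
  pruneRolling(entry, now, rollingDays);
  const today = entry.r.find((day) => dayIndex(day.date) === dayIndex(now));
  if (today) {
    today.count += 1;
  } else {
    entry.r.push({ date: now, count: 1 });
  }
  entry.c += 1;
  entry.lastUpdated = now;
  return entry;
}

// Replays the table above: daily reads of 2, 8, 0 and 1 with a 3-day rolling window.
const monday = Date.UTC(2019, 2, 11); // a Monday; any start day works
const entry = { r: [], c: 0, lastUpdated: monday };
const totals = [2, 8, 0, 1].map((reads, i) => {
  const now = monday + i * MS_PER_DAY;
  for (let n = 0; n < reads; n += 1) {
    recordRollingRead(entry, now, 3);
  }
  return pruneRolling(entry, now, 3).c; // total read count at the end of the day
});
// totals is [2, 10, 10, 9], matching the Total Read Count column
```

Subtracting each total from the budget of 10 gives the Budget Remaining column.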

Example

You can see all this in action by visiting this URL: https://codepen.io/merwan7/pen/EdbRag.

API Documentation

All methods described here can be executed by prefixing the function with “ArcP.”. Once the script is downloaded, the ArcP object can be found in the global (window) namespace. If the script detects an object window.ArcPOptions exists, it will immediately run as soon as the code is downloaded.

Methods

run(options)

Runs paywall script by fetching facts from various sources and then evaluating the ruleset against those facts. If one of the rules is over the threshold of allowed reads, the paywallFunction parameter will be executed by passing it the campaign url or code defined in the rule as its only argument.

Parameters

options - An options object (see more below)

Returns a promise with some useful data (e.g. for reporting or analytics purposes) if successful, or Promise.reject() if there’s an error. An example of returned data is shown below:

{
  "triggered": { // an object showing the rule id that popped the paywall and how many times it's evaluated to true.
    "id": 2, "rc": 5
  },
  "timeTaken": 100, // time in ms it took to evaluate the rules
  "paused": true | false, // true if paywall is paused
  "visited": true | false, // true if evaluator has already been run on this content
  "ruleCount": 10, // number of rules evaluated
  "private": true | false // if the browser is determined to be in private mode
}

The following options are available on the options object. They can either be passed directly to ArcP.run(options) or saved in an object called window.ArcPOptions, which will be used as the options as soon as the script is loaded.

paywallFunction - (required) - function - Function to call to show the paywall. Will be passed the campaign url or code associated with the rule.
contentType - (required) - string - The content type of the current page (story, video, gallery). This is not required if you use a customPageData function.
section - (required) - string - The section where current page belongs (e.g. politics, business, sports/nfl, sports/nhl, etc.). This is not required if you use a customPageData function.
contentRestriction - (required) - string - The picklist where the current page belongs. (e.g. always_free, subscriber_only, Premium, etc.). contentRestriction is populated with these values set in the composer picklist. This is not required if you use a customPageData function.
userName - (optional) - string - An identifier for the current user. If none is passed, data will be shared by all users who may use the same browser. This value can be any unique identifier for the user in question (email, uuid, username, etc.). When no userName is passed, the id used will be “anonymous”.
jwt - (optional) - string - A valid (non-expired) JWT associated with the currently logged-in user. If none is passed and the customRegCheck function is undefined, the user will be deemed logged out. If both customSubCheck AND jwt are undefined, the evaluator will assume that the JWT can be found in localStorage['ArcId.USER_INFO'].
apiOrigin - (optional) - string - The base domain used to query the entitlement service, e.g. “profile.website.com”. If none is passed, the API will be called as a relative path to the current URL.
Identity - (optional) - object - The Identity SDK. Passing it gives p.js access to the userName field (when options.userName is not passed) in cases where Identity is not available on the window. If the Identity parameter is not provided, window.Identity is used as the fallback.
customSubCheck - (optional) - function - A function used to check subscription status. Should return a Promise resolving to an object like this:

{
  "s": Boolean, // is the user subscribed (required)
  "p": [String, String, ...], // array of sku names (required)
  "timeTaken": Number, // time in milliseconds it took to fetch this info (optional)
  "updated": Number // Date in milliseconds from epoch (e.g. Date.now())
}

If no customSubCheck is passed, a default check will be used instead. The default check requires the jwt token, so the token either needs to be passed in as an option or needs to be stored in localStorage['ArcId.USER_INFO'], which is considered the default location where the JWT payload is stored.
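A custom subscription check could look roughly like the sketch below. Here fetchSkus is a hypothetical function that you would supply (for example, a call to your own subscription backend), and treating "holds at least one SKU" as subscribed is an assumption made for illustration.

```javascript
// Builds a customSubCheck function around `fetchSkus`, a hypothetical function you
// supply that resolves to an array of SKU names for the current user.
function makeCustomSubCheck(fetchSkus) {
  return function customSubCheck() {
    const startedAt = Date.now();
    return fetchSkus().then((skus) => ({
      s: skus.length > 0,                // assumption: any SKU means subscribed
      p: skus,                           // array of SKU names
      timeTaken: Date.now() - startedAt, // optional timing information
      updated: Date.now()                // milliseconds since epoch
    }));
  };
}

// Usage (mySkuLookup is hypothetical):
// ArcP.run({ paywallFunction: showPaywall, customSubCheck: makeCustomSubCheck(mySkuLookup) });
```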

customRegCheck - (optional) - function - a function used to check registration status. Should return a Promise resolving to an object like this:

{
  "l": Boolean, // is the user logged in? (required)
  "timeTaken": Number, // time in milliseconds it took to fetch this info (optional)
  "updated": Number // time in milliseconds when this was last updated
}

If no customRegCheck is passed and any token is passed in through the jwt option, the user will be seen as logged in.

customPageData - (optional) - function - a function used to retrieve page data. Using this function will have precedence over using the contentType and section options above. Should return an object like the one below:

{
  "s": String, // section (required)
  "c": String, // Content Type (required)
  "cr": String, // content restriction status (optional)
  "ci": (String | Number) // content identifier (required)
}

If none is passed, a default one will be used instead.
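As an illustration, a customPageData function might read its values from the page markup. The sketch below assumes a #page element carrying data-content-type and data-section attributes (as in the loading example earlier in this guide) and a canonical link tag; the makeCustomPageData wrapper is hypothetical and takes a document-like object so it can be tested outside the browser.

```javascript
// `doc` is any document-like object exposing querySelector; in the browser, pass `document`.
// The #page element and its data-* attributes are assumptions about your markup.
function makeCustomPageData(doc) {
  return function customPageData() {
    const page = doc.querySelector('#page');
    const canonical = doc.querySelector('link[rel="canonical"]');
    return {
      c: page.getAttribute('data-content-type'), // content type
      s: page.getAttribute('data-section'),      // section
      ci: canonical.getAttribute('href')         // content identifier
    };
  };
}

// Usage in the browser:
// ArcP.run({ paywallFunction: showPaywall, customPageData: makeCustomPageData(document) });
```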

longTermStore - (optional) - class - a class that implements getItem, setItem and removeItem (all of which return a promise), the key used to store things will be “ArcP”.
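A store satisfying this interface can be quite small. The in-memory sketch below is for illustration only: it does not persist across page loads, the class name MemoryStore is hypothetical, and whether p.js expects the class itself or an instance, and whether values are always strings, are not stated here and are assumptions.

```javascript
// In-memory store sketch: every method returns a promise, as the option requires.
class MemoryStore {
  constructor() {
    this.items = new Map();
  }

  // Resolves to the stored value, or null when the key is missing (as localStorage does).
  getItem(key) {
    return Promise.resolve(this.items.has(key) ? this.items.get(key) : null);
  }

  // Stores the value as a string (assumption: values are serialized by the caller).
  setItem(key, value) {
    this.items.set(key, String(value));
    return Promise.resolve();
  }

  removeItem(key) {
    this.items.delete(key);
    return Promise.resolve();
  }
}
```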

reset(longTermStore)

Resets the long-term data stored. If no long-term store is passed in, it will default to clearing localStorage. This function can be useful if no userName is passed in to run and you need to clear the paywall data after a user logs out.

longTermStore - (optional) - class - a class that implements getItem, setItem and removeItem (all of which return a promise), the key used to store things will be “ArcP”.

Properties

_version

Returns the version number of the evaluator.

_rules

Returns the array of rules being evaluated. These rules are compressed and slightly obfuscated so are a bit hard to read, but this property is here to help debug if needed.

For example:

[
  {
    "c": [
      true,
      "story"
    ],
    "d": [
      true,
      "desktop"
    ],
    "pw": "https://subscribe.washingtonpost.com/acqlite",
    "e": [false],
    "ent": [true, 123456],
    "rt": [
      ">",
      10
    ],
    "user_segment": [true, "high"],
    "id": 141
  }
]

In terms of the entitlement that will bypass the rule’s criteria:
e: [true] ent: [false] means the rule will be bypassed by a registered user
e: [true, 'SKU1'] ent: [false] means the rule will be bypassed by a user with “SKU1”
e: [false] ent: [true, 123] means the rule will be bypassed by a user with the entitlement 123
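The three cases above can be expressed as a small check. This is a sketch, not the evaluator’s actual logic: the bypassesRule helper is hypothetical, the fact fields (reg.l, sub.p, ent.e) follow the _facts example in this guide, and treating a rule with both e and ent enabled as "either one bypasses" is an assumption, since that combination is not described here.

```javascript
// Returns true if the user described by `facts` bypasses `rule`.
// Assumption: when both `e` and `ent` are enabled, satisfying either one is enough.
function bypassesRule(rule, facts) {
  const [eEnabled, ...skus] = rule.e || [false];
  const [entEnabled, ...entitlementIds] = rule.ent || [false];

  const loggedIn = Boolean(facts.reg && facts.reg.l);
  const ownedSkus = (facts.sub && facts.sub.p) || [];
  const heldEntitlements = (facts.ent && facts.ent.e) || [];

  // ent: [true, id, ...] is satisfied by holding any of the listed entitlement ids.
  const viaEntitlement = Boolean(entEnabled)
    && entitlementIds.some((id) => heldEntitlements.includes(id));

  // e: [true] is satisfied by any registered user; e: [true, sku, ...] by any listed SKU.
  const viaSku = Boolean(eEnabled)
    && (skus.length === 0 ? loggedIn : skus.some((sku) => ownedSkus.includes(sku)));

  return viaEntitlement || viaSku;
}
```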

_facts

Returns all the facts gathered by the paywall evaluation script. Here’s an example of what that might look like:

{
  "d": "desktop", // device class
  "r": "localhost", // referrer
  "pm": false, // private mode
  "s": "business", // section
  "c": "article", // content type
  "ci": 1537881520420, // content identifier
  "rc": { // read count data
    "11": { // ruleId:  <times rule was found to be true>
      "c": 2, // "counts",
      "lastUpdated": 1537881520438 // last time read counts were updated
    },
    "12": {
      "c": 1,
      "lastUpdated": 1537881520438 // last time read counts were updated
    }
  },
  "reg": { // registration info gathered
    "l": false, // is user logged in
    "timeTaken": 0, // time it took to gather info
    "updated": 1537881520419 // last time this was updated
  },
  "ent": { // subscription info gathered, in terms of Entitlements
    "entitlementsFailed": false, // indicates if the call to /sales/public/v2/subscriptions/entitlements failed
    "e": [123, 456], // array of EntitlementIDs
    "timeTaken": 0, // time it took to gather this info
    "updated": 1537881518696 // last time /sales/public/v2/subscriptions/entitlements was called
  },
  "sub": { // subscription info gathered, in terms of SKUs
    "entitlementsFailed": false, // indicates if the call to /sales/public/v1/entitlements failed
    "p": ["Basic Digital"], // array of SKUs
    "timeTaken": 0, // time it took to gather this info
    "updated": 1537881518696 // last time /sales/public/v1/entitlements was called
  },
  "userSegments": { // segments the user belongs to. This info is added to the user profile by calling the developer API PUT /identity/api/v2/identity/segmentation/{userID}
    "segments": {
      "id": 4634410301441521,
      "segment": "high"
    },
    "updated": 1702320804547 // last time /identity/public/v2/identity/segmentation was called
  },
  "v": [ // array of content identifiers where the paywall script has run
    "canonicalURL", // these URLs are fetched from `link[rel="canonical"]` by default
    "canonicalURL"
  ]
}

For Audience criteria rules based on user segments: IN rules do not take effect for anonymous users, since anonymous users do not belong to any segment. NOT IN rules apply to all users.

AMP Pages

The Arc Paywall can be run on AMP pages. A separate guide covers the setup steps for configuring and using an AMP page paywall.