Many clients come to the table with data that is already "rolled up" or "aggregated," which does not provide the event-level data Faraday requests. While this type of data is useful for business analysts and business leadership to understand trends, patterns, and summary attributes about an individual, Faraday already has an automated system in place to do just this.

Example of rollup data:

Faraday asks for event-specific data because our prediction modeling system is built on individuals entering certain cohorts, based on a date. Having only the first or last date of an event (as in the above screenshot) actually hinders our ability to model off of your customers.

Faraday will take in your data:

  • event-by-event, along with all other associated fields in the row

  • match these events to known identities using our algorithm

  • assess the meaningful, strong-signal patterns present in the data

Rollups, therefore, are Faraday's way of representing the aggregation of a single field in an event stream, computed over some window prior to, and relative to, the reference date provided in the rollup's definition. Rollups can be leveraged by cohort membership and joined directly to those individuals to enrich the outcome. Common examples of aggregation types include (but are not limited to):


  • SUM

  • UNION (distinct values)

  • MAX

  • MIN

  • windowed DAYS FROM

which may be translated into things like:

  • COUNT of orders from day 1 to day 90

  • SUM of policy payments received from day 30 to day 60

  • UNION of distinct browser types viewing pages in the last 7 days

  • MAX value of investments from day 284 to day 365

  • MIN value of payment (all time)

  • DAYS FROM first event to last event
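
The windowed aggregations above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Faraday's actual implementation: the `rollup` function, the `EVENTS` rows, and field names such as `person_id`, `amount`, and `browser` are all hypothetical, and each window is assumed to be measured in days before the reference date.

```python
from datetime import date

# Hypothetical event stream: one row per event, never pre-aggregated.
EVENTS = [
    {"person_id": "p1", "date": date(2024, 1, 5), "amount": 40.0, "browser": "chrome"},
    {"person_id": "p1", "date": date(2024, 2, 20), "amount": 25.0, "browser": "safari"},
    {"person_id": "p1", "date": date(2024, 3, 28), "amount": 60.0, "browser": "chrome"},
    {"person_id": "p2", "date": date(2024, 3, 30), "amount": 15.0, "browser": "firefox"},
]


def rollup(events, person_id, field, agg, reference_date, min_days=0, max_days=None):
    """Aggregate `field` over one person's events that occurred between
    `min_days` and `max_days` days before `reference_date`.
    `max_days=None` means there is no limit on how old an event may be."""
    values = []
    for event in events:
        if event["person_id"] != person_id:
            continue
        age_in_days = (reference_date - event["date"]).days
        if age_in_days < min_days:
            continue  # too recent for this window, or after the reference date
        if max_days is not None and age_in_days > max_days:
            continue  # too old for this window
        values.append(event[field])

    if agg == "count":
        return len(values)
    if agg == "sum":
        return sum(values)
    if agg == "union":
        return sorted(set(values))  # distinct values only
    if agg == "max":
        return max(values) if values else None
    if agg == "min":
        return min(values) if values else None
    if agg == "days_from":
        # Days between the first and last event in the window (field must be a date).
        return (max(values) - min(values)).days if values else None
    raise ValueError(f"unknown aggregation: {agg}")


reference = date(2024, 4, 1)  # assumed reference date for this example
features = {
    "order_count_0_90": rollup(EVENTS, "p1", "amount", "count", reference, 0, 90),
    "amount_sum_30_60": rollup(EVENTS, "p1", "amount", "sum", reference, 30, 60),
    "browsers_last_7": rollup(EVENTS, "p1", "browser", "union", reference, 0, 7),
    "amount_max_all_time": rollup(EVENTS, "p1", "amount", "max", reference),
    "amount_min_all_time": rollup(EVENTS, "p1", "amount", "min", reference),
    "days_first_to_last": rollup(EVENTS, "p1", "date", "days_from", reference),
}
print(features)
```

Because the input is event-level, the same rows can be re-aggregated for any reference date or window. A file that only carries a pre-computed first or last date cannot be re-windowed this way.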

The result of a particular rollup is a single feature for a household (specifically, an "individual") in either the training or scoring data. These features stand as important first-party data characteristics that can be used to tune and/or cue your model to provide a greater level of specificity on behaviors you may not have even known existed within the data!
