Companion table to Predictive modeling - FAQ.

Term

Definition

Training set/data

Data used to build the predictive model. It also illustrates the features the model will use to make its predictions.

Positive class

Examples you want more of when building a model

Negative class

Examples you want fewer of when building a model

Lift

A measure of model performance, usually expressed as how much better a predicted result is than a random result
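
One common way to put a number on lift is to compare the positive rate among the top-scoring individuals with the positive rate across everyone. The sketch below assumes that definition; the function name `lift_at_top_fraction`, the `fraction` parameter, and the sample scores and labels are all made up for illustration and are not taken from any particular product.

```python
def lift_at_top_fraction(scores, labels, fraction=0.1):
    """Positive rate in the top-scoring fraction divided by the overall positive rate."""
    if not scores or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and the same length")
    # Rank individuals from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, int(len(ranked) * fraction))
    top_rate = sum(label for _, label in ranked[:top_n]) / top_n
    # A random selection would, on average, match the overall positive rate.
    base_rate = sum(labels) / len(labels)
    if base_rate == 0:
        raise ValueError("no positive examples; lift is undefined")
    return top_rate / base_rate


# Hypothetical data: 1 marks a positive example, 0 a negative one.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
lift = lift_at_top_fraction(scores, labels, fraction=0.2)
print(round(lift, 2))
```

A lift above 1 means the top-scoring group contains a higher share of positives than a random group of the same size would be expected to.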

Baseline

The general population of eligible individuals

ROC Curve

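A plot of a classifier's true positive rate against its false positive rate across decision thresholds. It shows how many false positives the classifier must accept to capture more true positives.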
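
To make the idea concrete, the sketch below computes the points of an ROC curve by sweeping a threshold over the scores and recording the false positive rate and true positive rate at each step. It also estimates the area under the curve with the trapezoidal rule. The function names `roc_points` and `auc` and the sample data are illustrative assumptions, not part of any specific tool.

```python
def roc_points(scores, labels):
    """Return (false_positive_rate, true_positive_rate) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")
    points = [(0.0, 0.0)]
    # Use each distinct score as a threshold, from strictest to most lenient.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve, estimated with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical data: 1 marks a positive example, 0 a negative one.
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
points = roc_points(scores, labels)
print(points, auc(points))
```

An area of 0.5 corresponds to random ranking, and an area of 1.0 to a classifier that scores every positive above every negative.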

Features of importance

A measure of how much impact a particular feature has on the model’s predictions. Highly important features are used more often and separate the training set more, while unimportant features are used infrequently.

Candidate pool

A group of individuals who meet the criteria necessary to achieve an outcome

Example: Homeownership is a criterion required to consider purchasing rooftop solar

Eligibility

The eligible audience represents the effective marketable population

Complex model

A model that uses first-party data in its decision process: the first-party data is used when the model is applied, not only when it is trained

Geonormalization

A process used to adjust for geographic bias in training data.

First-party data

Your customer data

Outcome

A quantifiable business goal that can be represented by a group of people who have achieved it

Example: You want to find more customers

Strategy

The way we achieve the outcome

Example: You want to find more customers, so we create a strategy that trains a model to look at historical examples of when your customers became customers

Model

A method of ranking every eligible individual from most to least likely to convert

Example: The model is the output of the strategy that supports the overall outcome of finding more customers. It scores eligible individuals to identify those who are likely to become customers.
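
The ranking idea can be sketched in a few lines. The example below assumes each individual already has a model score and an eligibility flag; the field names `id`, `eligible`, and `score`, the function name `rank_eligible`, and the sample records are hypothetical.

```python
def rank_eligible(individuals):
    """Keep eligible individuals and sort them from most to least likely to convert."""
    eligible = [person for person in individuals if person["eligible"]]
    return sorted(eligible, key=lambda person: person["score"], reverse=True)


# Hypothetical records: score is the model's estimate of how likely a person is to convert.
individuals = [
    {"id": "person_a", "eligible": True, "score": 0.12},
    {"id": "person_b", "eligible": True, "score": 0.87},
    {"id": "person_c", "eligible": False, "score": 0.95},
    {"id": "person_d", "eligible": True, "score": 0.45},
]
ranking = [person["id"] for person in rank_eligible(individuals)]
print(ranking)
```

Note that `person_c` has the highest score but does not appear in the ranking, because only eligible individuals are ranked.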
