My Einstein Discovery Toolkit: 3 Tools to Get Started with Einstein Discovery

Note: you must have Salesforce login credentials to access some of the links within this blog.

Do you want to create your first Einstein Discovery predictive model? Having trouble getting started? You’re in luck! We’ve created a Toolkit that will equip you with the skills needed to build your first predictive model with Einstein Discovery. The kit contains three tools, which are available in downloadable format here. To use the Toolkit, visit the link and create a copy of the Slides for yourself. The Toolkit will guide you through every step of creating your predictive model, from figuring out what you’ll predict through deploying your model.

Before we go any further, we need to set the stage for how this toolkit should be used. It’s intended for a beginner audience with an introductory knowledge of Einstein Discovery and predictive analytics; check out this Trailhead module if you need help getting up to speed! The toolkit is designed for the scenario of deploying a predictive model to a Salesforce object. That certainly isn’t a requirement of Einstein Discovery, but it’s a good place to get started.

As an example of how to use the Toolkit, we’ll put ourselves in the shoes of Cloudy’s Computing Co., a SaaS company that sells licenses on a per user per month basis. We’ll go through the journey of how Cloudy uses the Toolkit to create its first predictive model.

Tool 1: Prediction Storybook

The first step in creating your predictive model is to determine what you’ll predict. Before thinking about any of the technical components, we need to think about what problems our business is facing and how predictive analytics can help. Our first tool, the Prediction Storybook, will help with that. Similar to a Mad Libs book you may remember from your youth, this Storybook gives you complete control over your predictive model. All you need to do is fill in the blanks!

[Image: Prediction Storybook template]

These are the key concepts the Storybook covers:

  • Set Persona: Who will use the prediction? What is their role at your company?
  • Identify Pain Point: What problem is your business facing? What pain point is it causing? Every prediction should overcome a pain point, so make sure you identify one before continuing.
  • Set Prediction: This is what the Einstein Discovery model is going to predict.
  • Benefit Realized: This is the benefit the prediction will bring to your company, overcoming the pain point identified earlier.
  • Action Statement: All predictions should drive action. Consider what your co-workers would do if they were armed with the knowledge of the prediction. The action statement will help you set up automation in a later step of this toolkit.

Let’s take a look at Cloudy’s Storybook; recall that they are a SaaS company that sells licenses on a per user per month basis.

[Image: Cloudy’s completed Prediction Storybook]

Revisiting our key concepts, let’s look at how Cloudy used the Storybook to set up their prediction.

  • Set Persona: Sales Reps will use the prediction.
  • Identify Pain Point: Customers are churning, leading to lost revenue.
  • Set Prediction: The prediction will measure churn likelihood.
  • Action Statement: By knowing a customer’s likelihood to churn, Cloudy will be able to take action by offering discounts to customers at high risk of churn.
  • Benefit Realized: Through customer-saving actions like offering discounts, Cloudy’s customer churn will decrease, ultimately increasing revenue.

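If it helps to keep the Storybook’s answers somewhere structured while you build, here is a minimal sketch of the five fill-in-the-blank fields as a simple data structure, populated with Cloudy’s answers from above. The class and field names are purely illustrative and are not part of the official template.

```python
from dataclasses import dataclass

@dataclass
class PredictionStorybook:
    """Illustrative container for the five fill-in-the-blank Storybook fields."""
    persona: str           # Who will use the prediction?
    pain_point: str        # What problem is the business facing?
    prediction: str        # What the Einstein Discovery model will predict
    action_statement: str  # What users will do with the prediction
    benefit_realized: str  # How acting on the prediction overcomes the pain point

# Cloudy's Computing Co., filled in from the example above
cloudy_storybook = PredictionStorybook(
    persona="Sales Reps",
    pain_point="Customers are churning, leading to lost revenue",
    prediction="Churn likelihood",
    action_statement="Offer discounts to customers at high risk of churn",
    benefit_realized="Lower churn, ultimately increasing revenue",
)
```
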
By completing the Storybook, Cloudy has learned they will use Einstein Discovery to predict Churn Likelihood. Now, Cloudy is ready to use the next tool, the Building Blocks, which will help them set up the predictive model in Einstein Discovery.

Tool 2: Building Blocks

After deciding what you will predict, your next step is to create your training dataset. Our next tool, the Building Blocks, will help you determine the structure of your training dataset. This tool has five questions you need to answer to determine how your training dataset is built.

[Image: Building Blocks template]

These are the concepts that the Building Blocks cover:

  • Object you’re predicting on: This is copied from ‘Salesforce Object’ in the Prediction Storybook. It sets the grain (otherwise known as the lowest level of detail) of your dataset.
  • Definition of prediction variable: Using the predicted measure you identified in the first tool, define it in the dataset. For linear regression models, this is the name of your measure; for logistic regression models, it’s the logic that determines whether the outcome is true or false.
  • Data sources to include: Tip: Try to think of five variables you expect to have a high impact on the prediction. For example, days to renewal probably has a high impact on churn. Then, identify the data source for each of these variables. These data sources need to be included in your training dataset. Remember that you can always add data sources in future iterations of the model, so don’t overdo the number of data sources in the initial iteration.
  • Segmentation: Segment your dataset if there are characteristics within the global dataset that behave completely differently, or don’t apply to the prediction. For example, in a global dataset of all Accounts, you wouldn’t want to include Accounts that are vendors rather than customers in your training dataset.
  • Date periods to include: Determines how many years of data to include. This could potentially vary by data source.

Let’s take a look at how Cloudy completes the Building Blocks:

[Image: Cloudy’s completed Building Blocks]

Let’s summarize what Cloudy has learned about their training dataset. The dataset will be account-based, spanning three data sources. Data will date back to 2019 and include only Customer accounts. Cloudy will take this information to create their training dataset. Check out these resources if you need help with creating your training dataset.
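
To make the Building Blocks concrete, here is a minimal sketch of how a training dataset like Cloudy’s could be assembled, assuming three hypothetical CSV extracts (accounts, subscriptions, and support cases) and illustrative column names; in practice you would build this with whatever data prep tool your org uses, such as a CRM Analytics recipe, before bringing it into Einstein Discovery.

```python
import pandas as pd

# Hypothetical CSV extracts from Cloudy's three data sources (all names are illustrative).
accounts = pd.read_csv("accounts.csv")            # one row per Account -- the grain of the dataset
subscriptions = pd.read_csv("subscriptions.csv")  # license counts, renewal dates, subscription status
cases = pd.read_csv("support_cases.csv")          # support activity, many rows per Account

# Aggregate child sources up to the Account grain before joining.
case_counts = (
    cases.groupby("account_id").size().rename("open_case_count").reset_index()
)

training = (
    accounts
    .merge(subscriptions, on="account_id", how="left")
    .merge(case_counts, on="account_id", how="left")
)

# Segmentation: keep only Customer accounts, excluding vendors.
training = training[training["account_type"] == "Customer"]

# Date period: only include data from 2019 onward.
training["created_date"] = pd.to_datetime(training["created_date"])
training = training[training["created_date"] >= "2019-01-01"]

# Prediction variable: a binary churn label, suitable for a logistic regression model.
training["churned"] = training["subscription_status"] == "Cancelled"

training.to_csv("churn_training_dataset.csv", index=False)
```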

Tool 3: Rocket Ship

By this stage, you’ve created your training dataset and are ready to create your Einstein Discovery Story (predictive model). Check out this Trailhead module if you need help with that step. After you’ve created the Story, the next major step is to deploy the model. Although Stories have many useful and insightful components (outlined here), the objective of this toolkit is to guide you towards deploying a model back to Salesforce, so this blog won’t cover Stories in further detail. Our final tool, the Rocket Ship, will launch you towards model deployment.

Before letting your model launch toward deployment, there are two hurdles we need to clear: ensuring the model is accurate enough for deployment, and ensuring it has actionable outputs that are intuitive to the user.

[Image: Rocket Ship template]

Part I: Model Accuracy

Simply fill in the model accuracy (found in the Model section of the Einstein Discovery Story). There is no definitive rule regarding the accuracy needed for deployment, but as a general rule of thumb, a score between 75 and 95 is within the threshold to deploy your model, while a score between 65 and 75 may be good enough; it will ultimately be up to you to decide whether your company is comfortable deploying a model at its stated accuracy. Anything outside the 65–95 range should probably not be deployed. Be aware that models with accuracy scores greater than 95 are likely overfit, as they rely on variables too closely aligned with the outcome; for example, including the Opportunity Stage in a model that tries to predict Win Likelihood.

If your accuracy isn’t in the green range, don’t fret; models often take multiple iterations to reach a point where they can be deployed. Revisit the second tool, the Building Blocks, and consider adding a data source, adding or changing the segment, or changing the date period within the training dataset. Changes like these can help improve your model accuracy.
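
If you want a quick way to internalize the rule of thumb above, here is a minimal sketch that buckets an accuracy score into the bands described in this post; the helper function itself is just illustrative.

```python
def accuracy_recommendation(score: float) -> str:
    """Bucket an Einstein Discovery model accuracy score using the rule-of-thumb bands above."""
    if score > 95:
        return "Likely overfit -- look for variables too closely aligned with the outcome"
    if score >= 75:
        return "Within the threshold to deploy"
    if score >= 65:
        return "May be good enough -- depends on your company's comfort level"
    return "Probably should not be deployed -- iterate on the training dataset"

print(accuracy_recommendation(83))  # Cloudy's score: "Within the threshold to deploy"
```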

Part II: Actionable Variables Table

You’ve reached the final countdown before the launch to deployment! The last step is to identify three actionable variables and state an action a user can take to improve the outcome of each. Although selecting actionable variables isn’t required prior to deployment, predictive models provide the greatest benefit when they guide users to smarter decision making. We recommend choosing the three actionable variables with the highest correlation to the predicted outcome.

For each actionable variable, state an action a user can take to improve it. State this in a manner that is intuitive to your users and will guide them to the steps they need to take to follow through on the action. This exercise serves two purposes. First, it can help set the translation of the improvement that is exposed to your end users; by default, the improvement displayed to the user may not make it clear what they need to do to follow through on the recommendation, so custom text can steer them in a clearer direction. Second, you can use it to set up automation in your org.
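
As a small illustration of the Actionable Variables Table, here is a sketch that pairs three hypothetical actionable variables with the custom action text a user would see; the variable names are invented for this example and are not from Cloudy’s actual model.

```python
# Hypothetical actionable variables mapped to the custom action text shown to users.
actionable_variables = {
    "Days Since Last Login": "Schedule a product walkthrough with the customer",
    "Open Support Cases": "Escalate unresolved cases to the support manager",
    "License Utilization": "Offer a right-sized plan or a discount on unused licenses",
}

for variable, action in actionable_variables.items():
    print(f"{variable}: {action}")
```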

[Image: Cloudy’s completed Rocket Ship]

Here is how Cloudy completes the Rocket Ship tool. Their model accuracy is 83%, within the range to move on to the next step. Next, they identify the three actionable variables with the highest correlation. They state actions a user can take to improve each of these three variables. They use customized translations to guide users on the steps they need to take to improve the outcome, and are even starting to roll out automation.

Congratulations! You’ve just launched your first model to deployment. By using our toolkit, you’ve created a model that is accurate, actionable, and provides a clear benefit to your users. The best part is that you can revisit our toolkit to do things like add additional data sources, create segmented models, build additional automation, or even develop your second predictive model.