Since Salesforce’s acquisition of Tableau, one of the most common questions we get asked is, “How can we embed Salesforce’s powerful point-and-click predictive modeling into Tableau’s best-in-class visualizations?” This blog answers that question and walks you through the steps of embedding your Einstein Discovery prediction into Tableau. We’ll be creating a predictive What-If analysis visualization in Tableau, where we update variable values on the fly to get prediction scores, explanations of those scores, and recommendations on how to improve them.
To help illustrate the steps, we’re going to walk you through a real-world example. Let’s put ourselves in the shoes of a pharmaceutical company that is sponsoring a clinical trial measuring the effects of a drug on COVID-19 patients. Since the COVID-19 crisis has drastically affected how clinical trials are run and withdrawal rates have increased, we want to predict the likelihood of an enrolled patient dropping out of the clinical trial. Since we’re no longer doing in-person check-ins with patients, we want to be able to get withdrawal likelihood updates as we meet patients virtually.
To embed Einstein Predictions into Tableau, we need two things: a deployed Einstein Discovery model and a Tableau extension that connects to it. We’ll walk through both below.
Use Cases for Embedding Einstein Predictions into Tableau
Before we dive into the step-by-step process of embedding an Einstein Prediction into Tableau, we first need to answer the “Why?” Why should I embed the prediction into Tableau, rather than just stay within Salesforce?
The tool in which you embed the Einstein Prediction depends on where the data lives and where the users live. If the data is external to Salesforce, or the users live outside of Salesforce, it is better to embed the predictions in Tableau. On the other hand, if the data is in Salesforce and the users have Salesforce licenses, embed the predictions in Salesforce, since users will be able to take action from the prediction directly in Salesforce. If you’ve got further questions on where you should embed your prediction, use our flowchart from a previous blog.
Let’s visit our clinical trials use case to see where we should embed our prediction. We use Salesforce’s Sales, Service, and Health Clouds, but data for the clinical trials we’re sponsoring is in an external system. Additionally, clinical trial technicians do not have Salesforce licenses. Therefore, we’re going to embed the prediction into Tableau.
Getting the Training Dataset into Einstein Discovery
Now that we’ve decided on which tool to embed the predictive model into, we need to get the training dataset into Einstein. The training dataset will be used by Einstein to train your model on how to make predictions. Since we’re working with external data, we can manually upload a CSV file extracted from the external system, make a direct connection to a cloud-based data store (such as Snowflake or BigQuery) using Einstein’s external connectors, or use an ETL tool such as Mulesoft to push the data into Einstein. Once we have our training dataset in Einstein, we can then create the predictive model, called a Story in Einstein Discovery.
Let’s think about this through the eyes of our clinical trial use case. We’ve captured data across hundreds of clinical trials since 2017 that identifies whether patients stayed enrolled for the duration of the clinical trial or withdrew early, as well as independent variables such as how far the patient lived from the clinical trial site and whether they missed any doctor’s checkpoints. We’ve got this dataset as a CSV and load it into Einstein Analytics.
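Before uploading, it can help to sanity-check the CSV locally so Einstein’s data validation has less to flag. A minimal sketch in Python with pandas, using hypothetical column names (`Status`, `Distance from Clinic`, `Platelet Count`) drawn from this use case:

```python
import pandas as pd

def check_training_data(df: pd.DataFrame) -> pd.DataFrame:
    """Basic sanity checks on a training dataset before uploading to Einstein."""
    # The outcome column must contain only the two classes we expect.
    assert set(df["Status"].unique()) <= {"Enrolled", "Withdrawn"}
    # Drop exact duplicate rows up front.
    df = df.drop_duplicates()
    # Rows missing the outcome can't be used for training.
    df = df.dropna(subset=["Status"])
    return df

patients = pd.DataFrame({
    "Status": ["Enrolled", "Withdrawn", "Enrolled", "Enrolled"],
    "Distance from Clinic": [12.5, 80.0, 3.2, 3.2],
    "Platelet Count": [250_000, 90_000, 310_000, 310_000],
})
clean = check_training_data(patients)
clean.to_csv("clinical_trial_training.csv", index=False)
```

This is just a local hygiene pass; Einstein performs its own validation (duplicates, outliers, bias) once the dataset is loaded, as we’ll see in the Model Metrics step.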
Creating the Einstein Discovery Story
After creating the training dataset, create the Einstein Discovery Story that will expose relevant facts, themes, and statistical correlations in your data. We need to set the Story Goal by maximizing or minimizing a boolean (true/false) or numeric variable. We can manually select the explanatory variables we want to include in the story, or let Einstein pick the most relevant factors automatically.
Going back to our clinical trials use case, we set the Story Goal to Status = Withdrawn, which is identified via a dimension that simply states whether the patient is enrolled or withdrawn.
Checking Model Metrics
After creating the Einstein Discovery Story, we can navigate it to explore insights ranging from descriptive to prescriptive. Since we’re going to be embedding the predictive model into Tableau, we won’t spend much time on this (refer to the Help article for more details). At this stage, we need to check Model Metrics to evaluate the model’s ability to predict an outcome. There are two components to check in Model Metrics: Data Validation and Model Performance.
Data Validation detects possible improvements in your data. For example, it checks for instances of Duplicates, Outliers, Bias, and Recommended Buckets. We can accept any improvements we want to implement in our story, which can help improve the performance of the model.
Model Performance gives insight into the performance of the model, showing statistical metrics such as AUC, R^2, and a confusion matrix. If we are happy with the outcome of these metrics, we can move forward with embedding the model in Tableau. If not, we’ll need to make updates to the model, such as adding or removing certain explanatory variables.
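To make these metrics concrete, here’s an illustrative sketch (not Einstein’s internals) of how AUC and a confusion matrix are computed from a model’s predicted probabilities, using scikit-learn on toy labels:

```python
from sklearn.metrics import roc_auc_score, confusion_matrix

# Toy data: 1 = withdrawn, 0 = stayed enrolled, with predicted probabilities.
y_true = [0, 0, 1, 1, 0, 1]
y_prob = [0.1, 0.4, 0.8, 0.35, 0.2, 0.9]

# AUC: the probability the model ranks a random withdrawn patient
# above a random enrolled one. 1.0 is perfect, 0.5 is chance.
auc = roc_auc_score(y_true, y_prob)

# A confusion matrix needs hard labels, so threshold the probabilities.
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"AUC={auc:.3f}, TP={tp}, FP={fp}, FN={fn}, TN={tn}")
```

The same intuition applies when reading Einstein’s Model Performance screen: a higher AUC means the model separates withdrawers from non-withdrawers more reliably, and the confusion matrix shows where misclassifications fall at a given threshold.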
Revisiting our clinical trials use case, the only recommended update we get is to change the buckets for the field Platelet Count. We specifically broke Platelet Count into unsafe-level and safe-level numeric buckets, so we decline the recommendation. Model Performance tells us the AUC is 82.8, a satisfactory level of classification performance for us. We’re ready to move to the next step!
Deploying the Model
The last step in Salesforce is to deploy the model, which allows us to embed it in Tableau. We can deploy the model directly from the Einstein Discovery Story. Make sure to take note of the model name, as we’ll use it in a later step. When asked how we want to deploy the model, select the option to deploy without connecting to a Salesforce object, since the data doesn’t live in Salesforce. At the next step, select the actionable variables, because these drive the recommendations for improvement that the embedded model delivers.
For our clinical trials use case, we set our actionable variables as Distance from Clinic and Platelet Count. In today’s times, we can move a patient who lives far from the clinic to virtual check-ins. Also, if a patient has an unsafe platelet count, we have diets we can put them on.
Embedding into Tableau
We’re now ready to embed the model into Tableau, which we can achieve by loading a Tableau extension into the visualization. John Hegele, a Solution Engineer at Tableau, has built a point-and-click extension, which we can download as a .trex file here and save to our desktop (to learn more about the extension file, visit John’s detailed explanation). Before loading the extension into Tableau, we need to add Parameters to the visualization that match the explanatory variables in the Einstein Discovery model. Open a new Worksheet and add Parameters for all of the explanatory variables in the predictive model, making sure each Data Type matches the data type in Einstein.
After adding Parameters, we can add the extension we saved as a .trex file and authenticate to Salesforce. After authenticating, select the Einstein Prediction Definition created when we deployed the model in the previous step to build the What-If analysis on. Next, match the Einstein model parameters to the Tableau Parameters.
Next, add filters to the visualization. Filters will allow us to slice and dice on the prediction to get updated scores, explanations, and actions based on what is selected. This is incredibly powerful because we can see how changes to explanatory variables affect the prediction. If we see a low predicted score, we can see prescriptive actions that, if followed through on, can help improve the outcome.
To add filters, add the Parameters sheet from an earlier step to the visualization. Click on the dropdown next to the sheet, then Parameters, then add all of your parameters. Delete the Sheet once finished.
We can now use the filters to update our prediction on the fly. Use the Explanation tab to get insight into why the score landed at its value, and the Action tab to get recommendations on how to improve the predicted score. The recommendations shown tie back to the actionable variables selected in the Deploying the Model step.
As the last step, improve the look and feel of the visualization to make it more usable. We can give the prediction card a background color, add a title and header section to the visualization, add title text to the filters to make their intent clear, and change the size of the visualization. After completing this last step, we’re ready to roll the visualization out to users!
Returning to our clinical trial use case, we’ve finished the setup steps and rolled the visualization out to our clinical trial technicians. Since the patients have moved to virtual check-ins, the technicians love using this on the fly when they’re meeting virtually with patients to see a patient’s withdrawal likelihood and gain insight into how they can minimize it.
Art of the Possible
By embedding the What-If Analysis into Tableau, we’re able to make updates to explanatory variables on the fly to get updated prediction scores. This is incredibly useful during the analysis of different possible scenarios, and in today’s climate can be used live during virtual meetings. As a next step, we may want to score individual records in a Tableau visualization. For example, if we’re bringing data into Tableau from an external source, we want to score those records every time the data is refreshed. That is also possible with the Einstein Prediction Services API!
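As a sketch of what calling that API could look like, here is a small Python helper that builds a scoring request. The endpoint path, API version, payload shape, and the prediction definition ID are all assumptions to verify against the current Einstein Prediction Service documentation before relying on this:

```python
import json
from urllib.request import Request, urlopen  # urlopen(req) would send the call

def build_predict_request(instance_url: str, token: str,
                          prediction_definition_id: str,
                          column_names: list, rows: list) -> Request:
    """Build a scoring request for the Einstein Prediction Service.

    The endpoint path and payload shape below follow the
    smartdatadiscovery/predict REST resource pattern; confirm both
    against Salesforce's API documentation for your org's version.
    """
    payload = {
        "predictionDefinition": prediction_definition_id,
        "type": "RawData",
        "columnNames": column_names,
        "rows": rows,
    }
    return Request(
        f"{instance_url}/services/data/v52.0/smartdatadiscovery/predict",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )

req = build_predict_request(
    "https://example.my.salesforce.com", "ACCESS_TOKEN",
    "0ORxx0000000001",  # hypothetical prediction definition ID
    ["Distance from Clinic", "Platelet Count"],
    [[12.5, 250000], [80.0, 90000]],
)
```

A scheduled job could run a request like this after each weekly data refresh and write the returned scores back alongside the patient records that Tableau visualizes.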
Returning to our clinical trials use case for the last time, we bring data about our patients into Tableau, refreshed every week. We want to score each individual patient and update the scores every time the data is refreshed.