How to Add Predictions to Datasets Using Tableau Prep Builder
Happy post-Tableau Conference(ish), everyone! One of the most exciting announcements at the conference was the upcoming AI Interoperability of Einstein Discovery in Tableau. In the short time since the conference, we’ve already received many questions from our customers about how to achieve this. While this won’t officially be part of Tableau until next year, the good news is that some solutions are already available today.
In Part II of this blog series, we covered how to use a Dashboard Extension to generate powerful AI-driven What-If Analysis Visualizations, powered by Einstein. This blog post will cover how to use Tableau Prep to add predictions to your datasets. We can use this tool to deliver AI-powered insights for Tableau Visualizations we’re powering with Prep Builder.
Please note that the solutions we cover in our blog series are not officially associated with Tableau or Salesforce, but can be used to gain the powerful AI capabilities of Einstein in Tableau using point-and-click.
To automate the scoring of predicted records, we need 5 things:
- An Einstein Analytics Plus or Einstein Predictions license from Salesforce. If your company org doesn’t have this, download an analytics-enabled Dev Org.
- Tableau Prep Builder. Sign up for a free trial.
- Python. Python is a programming language that lets you work quickly and integrate systems more effectively. Note that you don’t need knowledge of Python to implement this solution.
- TabPy. TabPy (the Tableau Python Server) is an Analytics Extension implementation which expands Tableau’s capabilities for users to execute Python scripts and saved functions via Tableau’s table calculations.
- AIOHTTP Python library. Asynchronous HTTP Client/Server for Python.
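Before moving on, it can help to confirm that the Python pieces are installed. The following is a minimal sketch, not part of TabStein or Tableau; it assumes the packages are importable under the names tabpy and aiohttp, which are the usual PyPI names.

```python
import importlib.util
import sys

# Assumption: these are the import names of the TabPy and AIOHTTP packages.
REQUIRED_MODULES = ("tabpy", "aiohttp")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = missing_modules()
    if missing:
        # Suggest a pip command for whatever is missing.
        print("Missing:", ", ".join(missing))
        print("Try: pip install " + " ".join(missing))
    else:
        print("TabPy and AIOHTTP are importable.")
```

Run it with the same Python interpreter that will run TabPy, since packages installed for one interpreter are not visible to another.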
Building and Deploying the Model in Salesforce Einstein
To automate the scoring of predicted records in Tableau, we’ll start by completing some tasks in Salesforce. We covered these in detail in our prior blog and won’t repeat them here; please refer back to it for step-by-step instructions. As a refresher, the tasks we’ll need to complete are:
- Get the training dataset into Einstein Discovery
- Create the Einstein Discovery Story
- Check Model Metrics
- Deploy the Story
Creating a Connected App
There is one additional step we need to complete in Salesforce that is unique to this process: creating a Connected App. The Connected App gives us a Consumer Key and Consumer Secret, which we’ll need in a later step.
Create a Connected App by navigating to App Manager from Setup, then clicking New Connected App. Most of the settings can be left at their defaults, but there are a couple of important things to note. Check Enable OAuth Settings, and set the access scope to Full Access. After clicking Save, copy and save the Consumer Key and Consumer Secret, as we’ll need both in a later step.
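To illustrate why the Consumer Key and Consumer Secret matter, here is a minimal sketch of a Salesforce OAuth 2.0 username-password token request. It is an assumption that the integration uses this particular flow; the function name build_token_request is hypothetical, and the request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Salesforce's standard production login host (sandboxes use test.salesforce.com).
TOKEN_URL = "https://login.salesforce.com/services/oauth2/token"


def build_token_request(consumer_key, consumer_secret, username, password):
    """Build (but do not send) an OAuth 2.0 username-password token request."""
    form = urlencode({
        "grant_type": "password",
        "client_id": consumer_key,        # Consumer Key from the Connected App
        "client_secret": consumer_secret,  # Consumer Secret from the Connected App
        "username": username,
        "password": password,
    }).encode("utf-8")
    return Request(
        TOKEN_URL,
        data=form,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

Passing the returned request to urllib.request.urlopen would send it; on success Salesforce responds with JSON that includes an access token. Keep real credentials out of source files.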
Preparing the Dataset in Tableau Prep Builder
After deploying the model in Salesforce, we need to prep the dataflow for automated scoring in Tableau Prep Builder. For those unfamiliar with Prep Builder, it is a tool to build dataflows, similar to the tools offered in Einstein Data Manager. With Tableau Prep Builder, we can connect to files, numerous servers, and databases directly. Our team has already written blogs on Prep Builder, so for the sake of this blog we’re going to assume you’ve already built your dataflow.
From the finalized dataflow, add a Clean Step. Then, click Create Calculated Field. This field is just a placeholder for the time being; simply enter the formula as 0.0. This calculated field will ultimately hold the predicted scores, and its name should match the outcome field from the Einstein Discovery Story created earlier. Lastly, take note of the name of the calculated field and the other dimensions in the dataflow, as we’ll use these in a later step.
Use TabStein to Configure Integration
With the model deployed in Salesforce and the dataflow prepared, the next step is to use TabStein to configure the integration that will allow for automated scoring. TabStein is an application written by Tableau Solution Engineer John Hegele and available for download on GitHub. Read more about it here.
After downloading TabStein, launch it and select the directory where TabStein should save the output files it generates. In the next step, we can optionally run dependency checks; this is a good idea when using the application for the first time, but the step can be skipped. Next, we can Set Basic Options: the Port for TabPy to listen on, the Timeout length, Log Details, and Max Request Size. All come with default selections, which we can stick with.
Then, we’re asked for the Consumer Key, Consumer Secret, Username, and Password from Salesforce. Paste the Consumer Key and Consumer Secret from the Connected App created in an earlier step. Lastly, log in with your Salesforce Username and Password.
In the next step, start by selecting the Discovery Model you’d like to integrate. This step maps the columns from the Einstein Discovery Story to Tableau. Make sure that the column names of the Einstein Discovery Story match the label names of the columns in the Tableau Prep dataset.
Lastly, set the custom server configuration. While we can keep the default settings, note that you may run into issues if you keep the row count at 2000, depending on the dataset size. Our team had more success after decreasing the row count to 200. This is the last step in the TabStein configuration. After clicking Done, two files will be saved to the directory you chose earlier.
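To see what a smaller row count means in practice, here is a minimal sketch of splitting rows into batches. It assumes the row count setting caps how many rows go into each scoring request; how TabStein actually batches rows is not covered here, and the helper name batches is hypothetical.

```python
def batches(rows, batch_size=200):
    """Yield consecutive slices of `rows`, each holding at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


# Example: 450 illustrative rows split with the smaller row count of 200.
rows = [{"id": i} for i in range(450)]
sizes = [len(chunk) for chunk in batches(rows)]
print(sizes)  # [200, 200, 50]
```

Smaller batches mean more requests, but each request carries less data and is less likely to hit a timeout or a maximum request size.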
Score Records in Prep Builder
For the next step, navigate back to Tableau Prep Builder. As a refresher, in a previous step, we added a Clean Step with a placeholder calculated field. Now, start by adding a Script step.
As a prerequisite to the next step, start TabPy in your terminal by typing tabpy; recall that installing TabPy was listed as a prerequisite at the beginning of the blog. In Prep Builder, select Tableau Python (TabPy) Server as the Connection type. Click on Connect to Tableau Python (TabPy) Server. In the popup, type localhost into the Server textbox and 9004 into the Port textbox. We can keep the Username and Password blank.
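If the connection fails, a quick check from Python can confirm whether TabPy is actually listening. The sketch below queries TabPy's /info endpoint on the host and port entered above; the helper names are hypothetical, and it assumes TabPy is running without authentication, as in this walkthrough.

```python
import json
from urllib.request import urlopen


def tabpy_info_url(host="localhost", port=9004):
    """Return the URL of TabPy's /info endpoint."""
    return f"http://{host}:{port}/info"


def tabpy_is_up(host="localhost", port=9004, timeout=3.0):
    """Return True if a server answers /info with JSON, False otherwise."""
    try:
        with urlopen(tabpy_info_url(host, port), timeout=timeout) as resp:
            json.load(resp)  # raises ValueError if the body is not JSON
            return True
    except (OSError, ValueError):
        # OSError covers refused connections, timeouts, and HTTP errors.
        return False
```

Calling tabpy_is_up() while TabPy is running in another terminal should return True; if it returns False, check that the port matches the one chosen in TabStein.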
Next, add the prep.py file that was generated in the TabStein step. Lastly, in the Function Name text box, type einstein. After hitting Enter, Prep Builder will generate scores for the rows in your dataset!
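For readers curious about what Prep Builder expects from a script, here is a minimal sketch of the general shape: a function that receives the flow's rows as a pandas DataFrame and returns a DataFrame. This is not the prep.py file that TabStein generates; the column name and the score_rows helper are hypothetical, and the scores are fake stand-ins for the real Einstein call.

```python
import pandas as pd

# Hypothetical name; use the name of your placeholder calculated field.
PREDICTION_COLUMN = "Predicted Outcome"


def score_rows(df: pd.DataFrame) -> list:
    """Stand-in for the real scoring call; returns one fake score per row."""
    return [0.5] * len(df)


def einstein(df: pd.DataFrame) -> pd.DataFrame:
    """Overwrite the placeholder column with scores and return the new rows."""
    out = df.copy()  # leave the input DataFrame untouched
    out[PREDICTION_COLUMN] = score_rows(out)
    return out
```

The function name typed into the Function Name text box must match a function defined in the script, which is why einstein is entered there.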
After generating scores for the dataset, we’re not quite done yet. To use the scored predicted records in a visualization, publish the dataset by creating an Output Step. We recommend publishing as a datasource, so you can use the dataset in Tableau and start creating visualizations. Select your Server, save it to a Project, and give it a Name. Click Run Flow to finish. The dataset will publish to Tableau Online, and can be used to build a Visualization!
Scheduling Dataset Refresh
At this stage, we can build a Tableau Visualization with Einstein Predictive Scoring. Depending on the use case, we may also want to schedule a dataset refresh at a regular interval. For example, the dataset source may be updated once per week, so we also want to update the Einstein predictions every week. Essentially, the goal is to run the flow we’ve created at a regular interval.
To set this up, start by ensuring the dataset is accessible to Tableau Server, either in an on-premises or cloud database or by publishing the dataset to Tableau Server. For this to work, note that Tableau Prep Conductor is a prerequisite, and that TabPy must be installed somewhere accessible by the Tableau Server. Once the Flow is published in Tableau Server, press + Create new task below Schedule, and select your refresh schedule. The flow will now run on the selected schedule, with Einstein generating new predicted scores on every refresh.
Power of Einstein AI in Tableau
Congratulations! We’ve just completed automating the predictive scoring of records using Prep Builder. We can build a Visualization with Einstein’s predictive scoring, with the dataset being refreshed at a regular interval. Recall that in our last blog, we learned how to build powerful AI-driven What-If Analysis visualizations, powered by Einstein.
By combining these two solutions, we can build an extremely powerful AI Solution in Tableau. We gain insight into the AI predicted scores of records today, which we can take action on, and can better prepare for tomorrow by going through the outcomes of different possible scenarios. The marriage of Einstein and Tableau is truly one to behold!