Case classification using the Einstein Intent API

This is a quick 30-minute tutorial showcasing the Einstein Intent API. By the end of this exercise, you will have a good understanding of how to use the Intent API and how to integrate it into Salesforce use cases.

Use case

Using the Intent API, we will classify cases based on their descriptions. You can use the sample below to build a labeled dataset.

For example:

Case Description                                                               Case Type
ac is making too much noise in the meeting room                                HotCold
I saw mice in the kitchen. We need to remove them.                             PestControl
wooden molding on the right side of wall mounted tile has been pulled loose    InteriorRepair
the hand activated soap dispenser in the restroom is not working               Janitorial
remove minor weeds in yard                                                     Landscaping
security gate in front entrance is not opening                                 ParkingLot
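
The dataset is uploaded as a .csv file in which each row holds the case description followed by its label. A minimal sketch of what case_demo.csv could look like (descriptions containing commas must be quoted):

"ac is making too much noise in the meeting room",HotCold
"I saw mice in the kitchen. We need to remove them.",PestControl
"remove minor weeds in yard",Landscaping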

Uploading the Dataset

Customers using the Case object provide a lot of detail in the case description. Imagine the Intent API understanding that context and routing each case to the right support agent. The first step is to upload the dataset. Follow the instructions here to generate an access token.
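
If you already have an Einstein Platform account and private key, the token request is a JWT bearer grant against the oauth2 endpoint. A minimal sketch, where <SIGNED_JWT> stands in for an assertion signed with your private key:

curl -X POST \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "grant_type=urn:ietf:params:oauth:grant-type:jwt-bearer&assertion=<SIGNED_JWT>" \
  https://api.einstein.ai/v2/oauth2/token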

For this exercise, we used a case dataset with 9K records classified into the six buckets above.

Request:

curl -X POST \
  -H "Authorization: Bearer 7YYRSIOR6A67W6JYMC7U3676IVGONH3TLR5XCSCY4DAZ23BDCH5I7WMG6S4MZL6ZOYOUQXIT2QQJDL2563D55NGKFBKYNHG7VFJEKVY" \
  -H "Cache-Control: no-cache" \
  -H "Content-Type: multipart/form-data" \
  -F "path=https://path/case_demo.csv" \
  -F "type=text-intent" \
  https://api.einstein.ai/v2/language/datasets/upload

Response:

{
  "id": 1019732,
  "name": "case_demo.csv",
  "createdAt": "2017-11-13T22:38:37.000+0000",
  "updatedAt": "2017-11-13T22:38:37.000+0000",
  "labelSummary": {
    "labels": []
  },
  "totalExamples": 0,
  "available": false,
  "statusMsg": "UPLOADING",
  "type": "text-intent",
  "object": "dataset"
}

The data load takes only a few seconds, and the status then changes to SUCCEEDED. The label summary will also display the label values and example counts.

Request:

curl -X GET \
  -H "Authorization: Bearer 7YYRSIOR6A67W6JYMC7U3676IVGONH3TLR5XCSCY4DAZ23BDCH5I7WMG6S4MZL6ZOYOUQXIT2QQJDL2563D55NGKFBKYNHG7VFJEKVY" \
  -H "Cache-Control: no-cache" \
  https://api.einstein.ai/v2/language/datasets/1019732

Response:

{
  "id": 1019732,
  "name": "case_demo.csv",
  "createdAt": "2017-11-13T22:38:37.000+0000",
  "updatedAt": "2017-11-13T22:39:07.000+0000",
  "labelSummary": {
    "labels": [
      {
        "id": 174375,
        "datasetId": 1019732,
        "name": "PestControl",
        "numExamples": 388
      },
      {
        "id": 174376,
        "datasetId": 1019732,
        "name": "Janitorial",
        "numExamples": 1237
      },
      {
        "id": 174377,
        "datasetId": 1019732,
        "name": "ParkingLot",
        "numExamples": 579
      },
      {
        "id": 174378,
        "datasetId": 1019732,
        "name": "Landscaping",
        "numExamples": 534
      },
      {
        "id": 174379,
        "datasetId": 1019732,
        "name": "InteriorRepair",
        "numExamples": 4415
      },
      {
        "id": 174380,
        "datasetId": 1019732,
        "name": "HotCold",
        "numExamples": 2037
      }
    ]
  },
  "totalExamples": 9190,
  "totalLabels": 6,
  "available": true,
  "statusMsg": "SUCCEEDED",
  "type": "text-intent",
  "object": "dataset"
}

Train the model

Once the dataset is uploaded, we can start training an intent model on it.

Request:

curl -X POST \
  -H "Authorization: Bearer 7YYRSIOR6A67W6JYMC7U3676IVGONH3TLR5XCSCY4DAZ23BDCH5I7WMG6S4MZL6ZOYOUQXIT2QQJDL2563D55NGKFBKYNHG7VFJEKVY" \
  -H "Cache-Control: no-cache" \
  -H "Content-Type: multipart/form-data" \
  -F "name=SML Case" \
  -F "datasetId=1019732" \
  https://api.einstein.ai/v2/language/train

Response:

{
  "datasetId": 1019732,
  "datasetVersionId": 0,
  "name": "SML Case",
  "status": "QUEUED",
  "progress": 0,
  "createdAt": "2017-11-13T22:52:18.000+0000",
  "updatedAt": "2017-11-13T22:52:18.000+0000",
  "learningRate": 0,
  "epochs": 0,
  "queuePosition": 1,
  "object": "training",
  "modelId": "JUZJKVI5LCAD3IPQSABP5PACJ4",
  "trainParams": null,
  "trainStats": null,
  "modelType": "text-intent"
}

Check the status of training

Request:

curl -X GET \
  -H "Authorization: Bearer 7YYRSIOR6A67W6JYMC7U3676IVGONH3TLR5XCSCY4DAZ23BDCH5I7WMG6S4MZL6ZOYOUQXIT2QQJDL2563D55NGKFBKYNHG7VFJEKVY" \
  -H "Cache-Control: no-cache" \
  https://api.einstein.ai/v2/language/train/JUZJKVI5LCAD3IPQSABP5PACJ4

Response:

{
  "datasetId": 1019732,
  "datasetVersionId": 12671,
  "name": "SML Case",
  "status": "RUNNING",
  "progress": 0.01,
  "createdAt": "2017-11-13T22:52:18.000+0000",
  "updatedAt": "2017-11-13T22:54:21.000+0000",
  "learningRate": 0,
  "epochs": 1000,
  "object": "training",
  "modelId": "JUZJKVI5LCAD3IPQSABP5PACJ4",
  "trainParams": null,
  "trainStats": null,
  "modelType": "text-intent"
}

Model training takes time; in our test, the 9K-record dataset reported a totalTime of about 26 minutes (see the stats below). Once training completes, the status changes to SUCCEEDED.

{
  "datasetId": 1019732,
  "datasetVersionId": 12671,
  "name": "SML Case",
  "status": "SUCCEEDED",
  "progress": 1,
  "createdAt": "2017-11-13T22:52:18.000+0000",
  "updatedAt": "2017-11-13T23:18:28.000+0000",
  "learningRate": 0,
  "epochs": 1000,
  "object": "training",
  "modelId": "JUZJKVI5LCAD3IPQSABP5PACJ4",
  "trainParams": null,
  "trainStats": {
    "labels": 6,
    "examples": 9190,
    "totalTime": "00:26:08:444",
    "transforms": null,
    "trainingTime": "00:25:52:412",
    "earlyStopping": true,
    "lastEpochDone": 62,
    "modelSaveTime": "00:00:01:387",
    "testSplitSize": 1872,
    "trainSplitSize": 7318,
    "datasetLoadTime": "00:00:16:032",
    "preProcessStats": null,
    "postProcessStats": null
  },
  "modelType": "text-intent"
}
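
Since training can run for many minutes, it is convenient to poll the status endpoint until it leaves the QUEUED/RUNNING states. A minimal shell sketch, assuming a TOKEN environment variable holds a valid access token and jq is installed:

while true; do
  # Ask the train endpoint for the current status of this training run
  status=$(curl -s -H "Authorization: Bearer $TOKEN" \
    https://api.einstein.ai/v2/language/train/JUZJKVI5LCAD3IPQSABP5PACJ4 | jq -r '.status')
  echo "training status: $status"
  # Stop polling once training has finished, one way or the other
  [ "$status" = "SUCCEEDED" ] || [ "$status" = "FAILED" ] && break
  sleep 60
done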

Review Model Metrics

Request:

curl -X GET \
  -H "Authorization: Bearer 7YYRSIOR6A67W6JYMC7U3676IVGONH3TLR5XCSCY4DAZ23BDCH5I7WMG6S4MZL6ZOYOUQXIT2QQJDL2563D55NGKFBKYNHG7VFJEKVY" \
  -H "Cache-Control: no-cache" \
  https://api.einstein.ai/v2/language/models/JUZJKVI5LCAD3IPQSABP5PACJ4

Response:

{
  "createdAt": "2017-11-13T23:18:28.000+0000",
  "metricsData": {
    "f1": [
      0.9433962264150942,
      0.7991967871485944,
      0.7936507936507935,
      0.6878306878306878,
      0.931434599156118,
      0.9667896678966788
    ],
    "labels": [
      "PestControl",
      "Janitorial",
      "ParkingLot",
      "Landscaping",
      "InteriorRepair",
      "HotCold"
    ],
    "testAccuracy": 0.9027777910232544,
    "trainingLoss": 0.14035213409878983,
    "trainingAccuracy": 0.949576407508076
  },
  ...
}

The f1 array lines up index by index with the labels array, so PestControl scores about 0.94 while Landscaping trails at about 0.69; weaker labels like Landscaping are good candidates for more training examples or further feature engineering.

Testing the Model

Once the model is trained, we can quickly test it and review the results.

Request:

curl -X POST \
  -H "Authorization: Bearer XVOUXYGXYC4OVR2Z23ZPGKC6MT7E4KQVT5HR5WHCI3ZIBQ5DYYVG3KKUEJMUSAH4SOVE2JO57S2QNWQVBR44U25F3C5VP4QAUZVJCHA" \
  -H "Cache-Control: no-cache" \
  -H "Content-Type: multipart/form-data" \
  -F "modelId=JUZJKVI5LCAD3IPQSABP5PACJ4" \
  -F "document=Meeting room need to be painted" \
  https://api.einstein.ai/v2/language/intent

Response:

{
  "probabilities": [
    {
      "label": "InteriorRepair",
      "probability": 0.99917185
    },
    {
      "label": "Janitorial",
      "probability": 0.00042642586
    },
    {
      "label": "PestControl",
      "probability": 0.00013708863
    },
    {
      "label": "Landscaping",
      "probability": 0.00010684224
    },
    {
      "label": "ParkingLot",
      "probability": 0.000100775476
    }
  ],
  "object": "predictresponse"
}
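
The probabilities come back sorted from most to least likely, so a script can grab the winning label directly. A small sketch using jq, again assuming TOKEN holds a valid access token:

curl -s -X POST \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: multipart/form-data" \
  -F "modelId=JUZJKVI5LCAD3IPQSABP5PACJ4" \
  -F "document=Meeting room need to be painted" \
  https://api.einstein.ai/v2/language/intent | jq -r '.probabilities[0].label'

For the request above, this prints InteriorRepair.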

Integration with Force.com

Once the API is working, the easiest path is to use the Salesforce Einstein Apex wrapper to make the API calls.
The same Apex class can then be invoked from a trigger so that when a case is created, the case type is updated automatically.

Things to consider:

  1. We highly recommend performing feature engineering to remove stop words. Keeping the sentences relevant to the classification task results in higher model accuracy.
  2. Upload large datasets in batches. Here are the details for file size considerations.
  3. Advanced users can pass train params to change the ratio used to split the data into training and test sets, as shown in the sketch after this list.
  4. Use the model metrics to understand accuracy. Based on the results, apply appropriate feature engineering, upload the new dataset, and retrain the model.
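
For example, the train call accepts a trainParams JSON object whose trainSplitRatio field controls the train/test split. A minimal sketch requesting a 90/10 split, assuming TOKEN holds a valid access token:

curl -X POST \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: multipart/form-data" \
  -F "name=SML Case 90-10" \
  -F "datasetId=1019732" \
  -F 'trainParams={"trainSplitRatio": 0.9}' \
  https://api.einstein.ai/v2/language/train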