Automated Visual Inspection Using TensorFlow

Visual inspection is a time-consuming and tedious task for manufacturing quality assurance teams. Given the limited resources of a manufacturing facility, anything that can be automated cuts cost and time and frees the operations team to focus on other tasks. What if we could train an algorithm to do this task for us with a reasonable degree of accuracy?


In one of our most recent projects, we were asked to identify dents on cans moving at high speed on a conveyor belt. We chose to use object detection, searching for dents in the video frame by frame.

Technology Stack

Coding Language:

  • Python

Python Packages:  

  • OpenCV
  • TensorFlow

Dent Detection Process

A high-speed camera was used to capture dents at high resolution and a high frame rate. Without a high-speed camera, dents were blurry and indistinguishable. Below is an example of annotating a blurry image with labelImg.
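labelImg saves each annotated image as a Pascal VOC XML file. Below is a minimal sketch of reading those boxes back out for training; the sample annotation and the filename `can_001.jpg` are hypothetical, but the XML structure matches what labelImg writes.

```python
import xml.etree.ElementTree as ET

def parse_voc_annotation(xml_text):
    """Parse a Pascal VOC XML string (as saved by labelImg) into
    a list of (label, xmin, ymin, xmax, ymax) boxes."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        bb = obj.find("bndbox")
        boxes.append((
            label,
            int(bb.findtext("xmin")),
            int(bb.findtext("ymin")),
            int(bb.findtext("xmax")),
            int(bb.findtext("ymax")),
        ))
    return boxes

# Tiny hypothetical annotation in the format labelImg writes:
sample = """
<annotation>
  <filename>can_001.jpg</filename>
  <object>
    <name>dent</name>
    <bndbox><xmin>48</xmin><ymin>102</ymin><xmax>96</xmax><ymax>150</ymax></bndbox>
  </object>
</annotation>
"""
print(parse_voc_annotation(sample))  # [('dent', 48, 102, 96, 150)]
```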

The high-speed camera footage our client was experimenting with made our job a lot easier. Converting the images to grayscale also helped, making the dents easier for a trained algorithm to detect.

high speed camera footage
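In practice the grayscale conversion is a one-liner with OpenCV, `cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)`. The sketch below implements the same ITU-R BT.601 luma conversion in NumPy so the idea is visible without an OpenCV install:

```python
import numpy as np

def to_grayscale(bgr):
    """Convert a BGR uint8 image to single-channel grayscale using
    the ITU-R BT.601 luma weights (the weights cv2.cvtColor uses)."""
    b = bgr[..., 0].astype(np.float32)
    g = bgr[..., 1].astype(np.float32)
    r = bgr[..., 2].astype(np.float32)
    gray = 0.114 * b + 0.587 * g + 0.299 * r
    return np.clip(gray, 0, 255).astype(np.uint8)

frame = np.zeros((4, 4, 3), dtype=np.uint8)
frame[..., 2] = 255  # pure red in BGR channel order
gray = to_grayscale(frame)
print(gray.shape, gray[0, 0])  # (4, 4) 76
```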

Here are some models that we tried:

Faster RCNN ResNet 101

Using a pretrained model checkpoint from the Model Zoo, Faster RCNN was initially used to train the model because of its high mAP score and its accuracy in identifying objects, even at different sizes, thanks to the Region Proposal Network.


  • Highly accurate and can pick up dents even if they are small


  • Inference speed was significantly slower for detection


Single Shot Detector (SSD)

Single Shot Detector was tested next for its speed, so the data was retrained on a different pretrained checkpoint.


  • Detection speed increased greatly, processing over 10 frames per second


  • Lacked the accuracy of Faster RCNN; smaller dents were more difficult to detect.


In the course of testing, we measured inference time for various models and input types, using FPS (frames per second) as our metric. Our initial model (SSD) running on an RTSP stream from LineSpex:

  • Stream without displaying output: 15.6419 FPS
  • Stream displaying output: 15.5600 FPS
  • Model running on stream (not displaying output): 15.4988 FPS
  • Model running on stream and displaying output: 15.4329 FPS

We identified an action item to add benchmarks for recorded video.
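The benchmarks above can be reproduced with a simple timing loop. Here is a minimal sketch; `fake_detect` is a hypothetical stub standing in for the real SSD/Faster RCNN inference call:

```python
import time

def measure_fps(process_frame, frames):
    """Run process_frame over every frame and return frames per second."""
    start = time.perf_counter()
    for frame in frames:
        process_frame(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed

# Hypothetical stand-in for model inference; real code would run the
# detection graph on the frame and return boxes with scores.
def fake_detect(frame):
    time.sleep(0.001)
    return []

fps = measure_fps(fake_detect, [None] * 50)
print(f"{fps:.1f} FPS")
```

The same harness can be pointed at a recorded video (the action item above) by feeding it frames read from a file instead of a live stream.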

Below is our work applied to a video segment using Faster RCNN.

We drew detection lines: when a good can passed by, they stayed green. When a dent was detected, the lines flashed a different color to indicate that a dent was found.

An important thing to notice is not only that object detection was able to capture the dents, but also the lack of false positives: the lines did not flash on good cans, so good cans would not be pulled off the factory line.
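The flashing-line behavior reduces to a small decision rule over the detector's output. A sketch, assuming detections arrive as (label, score) pairs and using a hypothetical score threshold of 0.5:

```python
def line_color(detections, threshold=0.5):
    """Return the detection-line color: green for a good can, red when
    any 'dent' detection exceeds the score threshold."""
    GREEN, RED = (0, 255, 0), (0, 0, 255)  # BGR order, as OpenCV expects
    for label, score in detections:
        if label == "dent" and score >= threshold:
            return RED
    return GREEN

print(line_color([("dent", 0.91)]))  # confident dent -> red flash
print(line_color([("dent", 0.12)]))  # low-confidence hit -> stays green
print(line_color([]))                # good can -> green
```

Raising the threshold trades missed dents for fewer false positives, which is the balance the flashing lines make visible on the factory line.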

Problems faced:

  • Glare from the can sometimes interfered with detection. We made sure we had just the right lighting to mitigate that.
  • Dents come in all shapes and sizes, so we scratched up a bunch of cans and took pictures of them at different distances.

Live Feed

We also experimented with a live feed to get as close to production settings as possible, using a laptop's webcam to demo the idea. With a GPU-enabled laptop, we reached speeds that nearly kept up with the incoming frames.

Live Feed 1
Live Feed 2
Live Feed 3
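The live-feed loop itself is small. In production the frames would come from `cv2.VideoCapture(0)` (webcam) or an RTSP URL; the sketch below runs the same loop over an in-memory frame list so it is testable without a camera, and `dummy_detect` is a hypothetical stand-in for the trained model:

```python
import numpy as np

def process_stream(frames, detect_fn):
    """Run detection over an iterable of frames, yielding
    (frame_index, detections) pairs."""
    for i, frame in enumerate(frames):
        yield i, detect_fn(frame)

# Hypothetical stand-in for model inference: "detect" a dent on
# bright frames only, to exercise both branches.
def dummy_detect(frame):
    return [("dent", 0.9)] if frame.mean() > 128 else []

frames = [np.zeros((8, 8), np.uint8), np.full((8, 8), 200, np.uint8)]
results = list(process_stream(frames, dummy_detect))
print(results)  # [(0, []), (1, [('dent', 0.9)])]
```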


At this point of the project, we prioritized accuracy over speed. We wanted to generalize and detect dents beyond the initial video footage given to us by the customer, so we generated our own footage and labeled it for training. We emulated factory settings as closely as possible, matching the camera-to-can distance and the lighting.

Retraining 1
Retraining 2
Retraining 3

Afterwards, we generated about 800 images and around 2,600 annotations to create a more generalized model. After 10k iterations, we created some metrics to see how we were doing. As a result, the model was better able to handle different situations resembling what we would see on the manufacturing line.
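One common building block for those metrics is intersection-over-union (IoU) between a predicted box and a hand-labeled box; a detection typically counts as correct when IoU clears a threshold such as 0.5. A minimal sketch:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (xmin, ymin, xmax, ymax) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union else 0.0

print(iou((0, 0, 10, 10), (0, 0, 10, 10)))  # identical boxes -> 1.0
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # 50 / (100 + 100 - 50) = 1/3
```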

Proposed Future Improvements

  • Differentiate between a dent and a cut. Creating a separate class for each, instead of lumping them together into one, may make things easier: the train/test split can then keep equal representation of the two classes.
  • This would require re-annotation, but we could use the Mask RCNN code from Matterport, as in this example written by my colleague. The hope is that the polygon lines would conform to the shape of the dent, whereas bounding boxes might capture extraneous features that we do not want.
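The equal-representation idea above amounts to a stratified train/test split: split each class separately so both keep the same proportions. A sketch, assuming (label, filename) pairs with hypothetical filenames:

```python
import random

def stratified_split(items, test_frac=0.2, seed=0):
    """Split (label, filename) pairs so each class keeps roughly
    the same train/test proportion."""
    rng = random.Random(seed)
    by_class = {}
    for label, name in items:
        by_class.setdefault(label, []).append((label, name))
    train, test = [], []
    for group in by_class.values():
        rng.shuffle(group)
        n_test = max(1, round(len(group) * test_frac))
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test

items = ([("dent", f"dent_{i}.jpg") for i in range(10)]
         + [("cut", f"cut_{i}.jpg") for i in range(10)])
train, test = stratified_split(items)
print(len(train), len(test))  # 16 4, with 2 test images per class
```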

Current Project Results

SpringML has been able to accurately identify dents on cans on a moving high-speed conveyor belt. As the project continues, we will keep experimenting with lighting, distance, camera feed, camera orientation, and video quality.