Finding visual defects using Difflow

Written by Zach Hawtof

How do you ensure that the CSS on your site is rendering properly? We spend so much time testing the functionality of our applications that we sometimes forget to check the user interface. Today, tools like Google’s Dpxdt and Facebook’s Huxley can visually compare images. However, none of these tools fit our workflow for new releases. We needed something that could run against our existing C# tests and accurately display multiple ‘diffed’ images throughout the entire test run.

Using Sauce Labs’ cloud-based Selenium integration for testing across multiple browsers and platforms, we started an open source project called Difflow that helps identify visual defects in our products. Once the server is running, Difflow pulls screenshots from functional tests run on Sauce and compares them perceptually to previous baseline images to create a diffed image.

I built Difflow in Node.js using the web application framework Express and the templating language Jade. In addition to these packages, which are installed via the Node Package Manager (npm), Difflow relies heavily on GraphicsMagick for image processing and pixel comparison. MongoDB is used to persist data.

A snippet from the Jade file for the test index shows how each test is rendered as a successful or failed visual test. The threshold set by the user determines how sensitive the comparison between the baseline and the test image is.
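To make the role of the threshold concrete, here is a minimal sketch of a pass/fail decision. It is not Difflow's code: Difflow delegates the comparison to GraphicsMagick, while this example uses a plain pixel-by-pixel count on raw RGBA buffers. The names `diffRatio`, `stepPasses`, and `tolerance` are hypothetical.

```javascript
// Hypothetical helper: fraction of pixels that differ between two RGBA buffers.
// `tolerance` is the per-channel difference (0-255) that is ignored.
function diffRatio(baseline, test, tolerance = 0) {
  if (baseline.length !== test.length || baseline.length % 4 !== 0) {
    throw new Error("images must be RGBA buffers of the same size");
  }
  const pixelCount = baseline.length / 4;
  if (pixelCount === 0) return 0;
  let changed = 0;
  for (let i = 0; i < baseline.length; i += 4) {
    for (let c = 0; c < 4; c++) {
      if (Math.abs(baseline[i + c] - test[i + c]) > tolerance) {
        changed++;
        break; // count each pixel at most once
      }
    }
  }
  return changed / pixelCount;
}

// A step passes when the share of changed pixels stays at or below the threshold.
function stepPasses(baseline, test, threshold) {
  return diffRatio(baseline, test) <= threshold;
}

// Two 2x1 images: the second pixel differs in the red channel.
const baseline = Uint8Array.from([255, 255, 255, 255, 0, 0, 0, 255]);
const test = Uint8Array.from([255, 255, 255, 255, 200, 0, 0, 255]);
console.log(diffRatio(baseline, test));       // 0.5
console.log(stepPasses(baseline, test, 0.1)); // false
console.log(stepPasses(baseline, test, 0.5)); // true
```

A lower threshold makes the check stricter, so small rendering differences fail the step. A higher one tolerates more change before a step is flagged.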

The central object within the system is the timeline, which stores a collection of screenshots meeting the following criteria: the same test, the same step within the test, and the same browser configuration in Sauce. This screenshot timeline is then updated with future screenshots of a specific step within a test.
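The grouping rule for timelines can be sketched as a small in-memory store. This is an illustration only: Difflow persists its data in MongoDB, and the class name, field names, and the choice to treat the first screenshot as the baseline are all assumptions made for this example.

```javascript
// Hypothetical in-memory store; Difflow itself persists data in MongoDB.
class TimelineStore {
  constructor() {
    this.timelines = new Map();
  }

  // Screenshots share a timeline when test, step, and browser configuration all match.
  static keyFor({ test, step, browser }) {
    return JSON.stringify([test, step, browser]);
  }

  addScreenshot(shot) {
    const key = TimelineStore.keyFor(shot);
    let timeline = this.timelines.get(key);
    if (!timeline) {
      // Assumption: the first screenshot seen for a key becomes the baseline.
      timeline = {
        test: shot.test,
        step: shot.step,
        browser: shot.browser,
        baseline: shot.image,
        screenshots: [],
      };
      this.timelines.set(key, timeline);
    }
    timeline.screenshots.push(shot.image);
    return timeline;
  }
}

const store = new TimelineStore();
store.addScreenshot({ test: "Login", step: 1, browser: "chrome-win", image: "run1.png" });
store.addScreenshot({ test: "Login", step: 1, browser: "chrome-win", image: "run2.png" });
store.addScreenshot({ test: "Login", step: 1, browser: "firefox-mac", image: "run1-ff.png" });
console.log(store.timelines.size); // 2
```

The same step captured in a different browser configuration lands in its own timeline, so rendering differences between browsers are never compared against each other.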

If a user changes part of the UI, the test image rather than the baseline becomes the correct view of the website. In that case, a POST request can be made to replace the timeline’s baseline screenshot with the test image screenshot. A button in the GUI does the same thing.
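The baseline update itself is a small operation. The sketch below shows it as a plain function that a POST handler or the GUI button could call. The timeline shape and the function name `promoteLatestToBaseline` are hypothetical, and taking the most recent screenshot as the test image is an assumption.

```javascript
// Hypothetical timeline shape: { baseline, screenshots: [...] }, newest screenshot last.
function promoteLatestToBaseline(timeline) {
  if (timeline.screenshots.length === 0) {
    throw new Error("timeline has no screenshots to promote");
  }
  const latest = timeline.screenshots[timeline.screenshots.length - 1];
  // Return a new object so the caller can persist the change explicitly.
  return { ...timeline, baseline: latest };
}

const timeline = {
  baseline: "login-v1.png",
  screenshots: ["login-v1.png", "login-v2.png"],
};
const updated = promoteLatestToBaseline(timeline);
console.log(updated.baseline);  // "login-v2.png"
console.log(timeline.baseline); // "login-v1.png" (the input is left unchanged)
```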

For ease of use, a basic GUI was developed to manage the workflow. A user can view the passing and failing steps within the tests, and can upload their own pairs of images for ad hoc comparison.

The homepage for Difflow.

Side-by-side comparison of the test image, the baseline, and the diffed image for an individual screenshot.

A diffed image showing differences between the baseline and test image.

Visual regression testing is becoming more and more of a necessity. It’s essential that companies keep their user experience professional and operational, which means incorrectly rendered CSS is no longer acceptable. A broken webpage, discolored image, or out-of-place menu item can leave a very damaging impression on the user if not caught ahead of time.

This new process eliminates human error when testing our site visually and improves the efficiency, accuracy, and speed of our test engineering team. Difflow is nowhere near finished, and we’ll continue to improve it.