Edge Detection

For many of us, it is easy to pick out both the objects in a picture and their boundaries. Given a clear line of sight, humans are very good at identifying what something is and where it is. But what if we need to find the boundaries of many objects across an enormously large number of photos? That is far too much work for one person, so why not have a computer find the edges for us instead? In computer vision, this general problem is known as edge detection.

Formally, edge detection is the process of finding the boundaries in an image across which brightness or intensity changes sharply. One reason for doing edge detection is that it reduces a complex image down to a simpler one where only the edges are marked. Because of this, edge detection often acts as a preliminary step to other computer vision processes.

Image Representation

Before jumping into edge detection, let's go over how computers represent images. Computers generally store images as grids of values, or pixel arrays. A color image, for instance, can be decomposed into three pixel arrays storing the red, green, and blue (RGB) color channels. While RGB is convenient for representing color, edge detection is concerned only with intensity values. Fortunately for us, intensity can be computed directly from RGB values, giving us a pixel array of grayscale intensities to start working with.
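As a toy illustration (the 2x2 image here is hypothetical, not one of the article's demo images), here is a sketch of how a color image stored as a pixel array might be split into its three channel arrays:

```python
# A tiny 2x2 RGB image: each pixel is an (R, G, B) triple in [0, 255].
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (128, 128, 128)],
]

def split_channels(image):
    # Decompose the pixel array into three single-channel arrays.
    red   = [[px[0] for px in row] for row in image]
    green = [[px[1] for px in row] for row in image]
    blue  = [[px[2] for px in row] for row in image]
    return red, green, blue

red, green, blue = split_channels(image)
```

Each of the three resulting grids has the same dimensions as the original image, with one number per pixel.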

From RGB to Intensity

The conversion from RGB to intensity is a somewhat involved process that delves into color science and human perception. While RGB is a helpful model, in reality, humans do not perceive color linearly or weigh the three channels equally. To convert RGB to intensity, each RGB value must first be mapped into a linear color space by undoing the gamma encoding. The linear values are then combined in a weighted sum and converted back to our non-linear RGB color space for display.
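A minimal sketch of this conversion, assuming sRGB-encoded input and the standard Rec. 709 luminance weights (both assumptions on my part; the article does not specify which color space its demos use):

```python
def srgb_to_linear(c):
    # Undo the sRGB gamma curve; c is a channel value scaled to [0, 1].
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def intensity(r, g, b):
    # Map each 8-bit channel into linear space, then take the
    # Rec. 709 weighted sum (green dominates human brightness perception).
    lin_r, lin_g, lin_b = (srgb_to_linear(v / 255) for v in (r, g, b))
    return 0.2126 * lin_r + 0.7152 * lin_g + 0.0722 * lin_b
```

The result is a single intensity in [0, 1] per pixel; converting back to the non-linear space is only needed when displaying the grayscale image.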

Finding Edges

How do we begin searching for edges? We need some natural intuition for what an edge is. To build that intuition, let us take a step back and re-imagine our problem. Suppose our pixel array is lifted into 3D space as a topographic map, where the intensity of each pixel becomes a height. Looking at the example below, it becomes clear that edges are analogous to sharp drops and inclines on this map.

How do we locate areas with these sharp changes in height? Let's pick some point on this map. Knowing our point's height alone is relatively unhelpful: it tells us how high up the point is, but nothing about how the height changes around it. What if, in addition, we also considered the heights of points a small radius away from our chosen point? Ah-hah! Now by comparing these nearby heights, we can get a measurement of the height changes around our point!

Elevation comparison example

This intuition is the driving principle behind a well-known edge detector called the Sobel operator. The operator works by dragging two kernels, or filters, called Sobel X and Sobel Y across an image to calculate horizontal and vertical intensity changes respectively. For each target pixel in the image, the kernel selects a neighborhood of surrounding pixels and compares their intensities to compute the change in intensity at the target pixel. To demonstrate this, below is an interactive example using the Sobel X and Sobel Y filters. Drag the filters around and see how different areas reflect different intensity changes.
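Concretely, the two Sobel kernels are small 3x3 grids of weights, and applying one at a target pixel means taking a weighted sum over its 3x3 neighborhood. A minimal sketch (the `apply_kernel` helper is my own illustration, not from the article's demo):

```python
# The standard Sobel kernels: X responds to left-to-right intensity
# change, Y to top-to-bottom change.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [ 0,  0,  0],
           [ 1,  2,  1]]

def apply_kernel(image, kernel, row, col):
    # Weighted sum of the 3x3 neighborhood centered at (row, col).
    total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            total += kernel[dr + 1][dc + 1] * image[row + dr][col + dc]
    return total

# A vertical edge: dark on the left, bright on the right.
img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 0, 1, 1]]
```

On this image, Sobel X responds strongly at the edge while Sobel Y reports no vertical change at all, which is exactly the horizontal/vertical split described above.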

The Math Behind the Filter

Under the hood, filters are grids of numerical weights that are matched up to each neighborhood of pixels. The filter weights and the pixel intensities are used to calculate a weighted sum that represents the change at each target pixel. This process of dragging a grid across a larger grid is called cross-correlation. Doing the same thing with a filter that has been flipped 180 degrees is a slightly different process called convolution. In general, we prefer convolutions because of their nicer properties, such as the ability to compose a series of convolutions into a single convolution.
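The relationship between the two operations can be sketched in a few lines. This is a simplified illustration of my own, assuming a 3x3 kernel and ignoring the image border (no padding):

```python
def cross_correlate(image, kernel):
    # Slide the 3x3 kernel over every interior pixel and take the
    # weighted sum at each position.
    h, w = len(image), len(image[0])
    return [[sum(kernel[i][j] * image[r - 1 + i][c - 1 + j]
                 for i in range(3) for j in range(3))
             for c in range(1, w - 1)]
            for r in range(1, h - 1)]

def convolve(image, kernel):
    # Convolution is cross-correlation with a 180-degree-rotated kernel.
    flipped = [row[::-1] for row in kernel[::-1]]
    return cross_correlate(image, flipped)

sobel_x = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
edge = [[0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 0, 1, 1]]
```

Rotating Sobel X by 180 degrees simply negates it, so convolving with it gives the same responses as cross-correlating, with the sign flipped.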

Change as a Vector

The end result of applying both the Sobel X and Sobel Y filters to our image is two new pixel arrays containing values that represent horizontal and vertical changes respectively. These values can also be thought of as the X and Y components for a grid of 2D arrows, or vectors. The vectors obtained using the Sobel operator are known as gradients. For a given pixel, the magnitude of the gradient tells us how much overall intensity change occurs at that pixel, and its angle tells us the general direction of that change.
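Turning the two Sobel responses at a pixel into a magnitude and an angle is ordinary vector math. A minimal sketch:

```python
import math

def gradient(gx, gy):
    # gx, gy: the Sobel X and Sobel Y responses at one pixel.
    magnitude = math.hypot(gx, gy)   # overall strength of intensity change
    angle = math.atan2(gy, gx)       # direction of change, in radians
    return magnitude, angle
```

For example, a purely horizontal change (gy = 0) yields an angle of 0, while equal horizontal and vertical responses point at 45 degrees.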

Below is a demonstration that illustrates these gradients. Draw edges or select one of the preset images and observe how the direction and magnitude of the gradient vectors change in response.

So far, our algorithm consists of taking a color image, converting it to grayscale intensities, and applying the Sobel operator to it to find the image's gradients. Observing the magnitudes of these gradients at the end of the pipeline below, we can already begin to see edges manifest as brighter pixels where intensity change is strong.
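The whole pipeline so far can be sketched end to end. To keep the sketch short I use the luminance weights directly on the 8-bit values and skip the gamma-linearization step described earlier, and I ignore the image border; both are simplifications of my own:

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def to_grayscale(rgb_image):
    # Simplified intensity: weighted sum of R, G, B (gamma handling omitted).
    return [[0.2126 * r + 0.7152 * g + 0.0722 * b for (r, g, b) in row]
            for row in rgb_image]

def filter_at(image, kernel, r, c):
    # Weighted sum of the 3x3 neighborhood centered at (r, c).
    return sum(kernel[i][j] * image[r - 1 + i][c - 1 + j]
               for i in range(3) for j in range(3))

def gradient_magnitudes(rgb_image):
    # Color image -> grayscale -> Sobel X/Y -> per-pixel gradient magnitude.
    gray = to_grayscale(rgb_image)
    h, w = len(gray), len(gray[0])
    return [[math.hypot(filter_at(gray, SOBEL_X, r, c),
                        filter_at(gray, SOBEL_Y, r, c))
             for c in range(1, w - 1)]
            for r in range(1, h - 1)]

# A vertical edge: black on the left, white on the right.
rgb = [[(0, 0, 0), (0, 0, 0), (255, 255, 255), (255, 255, 255)]] * 3
mags = gradient_magnitudes(rgb)
```

Rendering `mags` as an image gives exactly the bright-pixels-at-edges picture described above.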

Problems and Shortcomings

Running images through this algorithm, you may have already noticed that Sobel by itself does not perform very well. Our edge detection is not very good at producing clean, precise edges, nor is it very good at dealing with grainy or noisy images.

To make edge detection robust, several pre- and post-processing steps are required alongside Sobel: smoothing the image with a Gaussian blur to suppress noise, thinning the detected edges to single-pixel width with non-maximum suppression, and filtering out weak responses with double thresholding and hysteresis. In fact, our current algorithm combined with these additional steps was formulated in 1986 by John Canny and is what we know as Canny edge detection. The full Canny edge detection pipeline can be seen below.

Since the advent of Canny edge detection, many more methods of edge detection have been proposed and experimented with, such as second-derivative approaches to measuring intensity change or physics-based approaches, to name a few.

Summary

Edge detection is the problem of locating the boundaries of objects in a scene. The core principle behind locating edges is to look for areas of sharp intensity change, which can be measured with the Sobel operator. While this is a start to finding edges, several more steps are involved in producing clean and reliable edges. The culmination of these steps is an early edge detection algorithm called Canny edge detection, which remains in popular use today.