About Dithering
In my previous post, I quickly went over how I converted images for use with a 6-color screen using dithering, but didn’t get into the details. Now, I want to look deeper into this technique, understand what dithering is, and compare different techniques.
This post is accompanied by some Python code that I use to experiment; you can find the full repo on Codeberg.
Displaying an image with only a few colors
Before speaking about image dithering, we have to speak about image quantization. In general terms, quantization is the process of mapping values from a large or continuous set of possible values to a reduced one.
For images, this means converting each pixel’s color from one color space to another, usually with fewer colors. This is useful in many cases, for example printing, conversion to a palette-based format like GIF, or working with a color-limited display.
There are many ways to do color quantization, but for this post, I will only be working from standard RGB images (256 values per channel), reduced to predetermined palettes.
Implementing basic quantization
Simple quantization can be done by selecting, for each pixel of the image, the closest color in the target palette. A simple Euclidean distance is enough for experimenting.
```python
from typing import NewType

from PIL import Image

Color = NewType("Color", tuple[int, int, int])

def closest(color, palette: list[Color]) -> Color:
    # Pick the palette entry with the smallest squared Euclidean distance.
    return min(
        palette,
        key=lambda c: (color[0] - c[0]) ** 2
        + (color[1] - c[1]) ** 2
        + (color[2] - c[2]) ** 2,
    )

def quantize(image: Image.Image, palette: list[Color]) -> Image.Image:
    # Replace every pixel by its closest palette color.
    data = list(image.getdata())
    data = [closest(c, palette) for c in data]
    converted = Image.new("RGB", image.size)
    converted.putdata(data)
    return converted
```
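As a usage sketch (the file names are hypothetical, and the repo may build its palettes differently), the 216-color web safe palette can be generated with six evenly spaced levels per channel, and the black and white palette only needs two entries:

```python
# 6 levels per channel (0, 51, ..., 255) gives the 216-color web safe palette.
web_safe = [
    Color((r, g, b))
    for r in range(0, 256, 51)
    for g in range(0, 256, 51)
    for b in range(0, 256, 51)
]
black_and_white = [Color((0, 0, 0)), Color((255, 255, 255))]

image = Image.open("illustration.png")  # hypothetical input file
quantize(image, web_safe).save("illustration_web_safe.png")
quantize(image, black_and_white).save("illustration_bw.png")
```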
Here are some images converted with this method, to a 216-color “web safe” palette and to a 2-color black and white palette.
Original | Web safe | Black and white |
---|---|---|
![]() | ![]() | ![]() |
![]() | ![]() | |
![]() | ![]() | ![]() |
The issue with just doing this is quickly visible: loss of details and gradients, and color banding. The web safe palette still gives acceptable results for a less artificial illustration, but some details are still lost.
However, when using the reduced, 6-color palette of my previous project, the results are terrible.

The palette used for the next conversions
![]() | ![]() | ![]() |
---|---|---|

This is why we need dithering
Dithering is a way to reduce artifacts caused by quantization by adding some sort of noise to the image. Noise is usually something to avoid in signal processing, but in this case, we can use it to avoid color banding and create the illusion of more colors in the image than are available.
Gray image | Dithered gray image |
---|---|
![]() | ![]() |
The right image is built using a simple checkerboard pattern, which appears similar to the gray image on the left when viewed at full size. Zooming in on the image reveals the pattern.

Zooming in on the image shows the pattern
This is the basic mechanism that makes dithering possible: when looking at an image from sufficiently far away, individual pixels seem to merge and become a uniform color. Using this, we can create different values of gray with only black and white pixels.
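As a quick illustration (a minimal sketch, not taken from the accompanying repo), here is how such a 50% checkerboard can be generated with Pillow:

```python
from PIL import Image

def checkerboard(width: int, height: int) -> Image.Image:
    # Alternate black and white pixels; viewed from far enough away,
    # this reads as a uniform 50% gray.
    img = Image.new("RGB", (width, height))
    img.putdata([
        (255, 255, 255) if (x + y) % 2 == 0 else (0, 0, 0)
        for y in range(height)
        for x in range(width)
    ])
    return img

checkerboard(256, 256).save("gray_50.png")  # hypothetical output file
```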
25% gray | 50% gray | 75% gray |
---|---|---|
![]() | ![]() | ![]() |
![]() | ![]() | ![]() |
A note on dithering and antialiasing
I am more familiar with antialiasing, another way to reduce artifacts in images. Where antialiasing helps with artifacts introduced when changing the scale of an image by introducing new colors not present in the original, dithering helps with artifacts introduced when changing the colors of an image by introducing details not present in the original.
In a more formal way, while antialiasing reduces high-frequency signals in the image when limiting the resolution (or time) domain, dithering introduces high-frequency signals when reducing the value domain.

Noise based dithering
When I say that dithering is adding noise to an image, it is quite literal in this first dithering method: we can add random noise to the pixel colors before looking for the closest one in the palette, which can sometimes lead to other, less close colors being selected.
```python
import random

def noise_dither(image: Image.Image, palette: list[Color], strength: int) -> Image.Image:
    data = list(image.getdata())

    def add_noise(color: Color) -> Color:
        # Offset each channel by a random amount before picking the closest color.
        return Color(
            (
                color[0] + random.randrange(-strength, strength),
                color[1] + random.randrange(-strength, strength),
                color[2] + random.randrange(-strength, strength),
            )
        )

    data = [closest(add_noise(c), palette) for c in data]
    converted = Image.new("RGB", image.size)
    converted.putdata(data)
    return converted
```
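A run targeting the black and white palette might look like this (a sketch with hypothetical file names, using the strengths compared below):

```python
bw = [Color((0, 0, 0)), Color((255, 255, 255))]
gradient = Image.open("gradient.png")  # hypothetical input file
for strength in (50, 100, 150):
    noise_dither(gradient, bw, strength).save(f"gradient_noise_{strength}.png")
```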
In this case, selecting the correct noise strength is important, and depends on the target palette and the source image.
Testing in black and white
Looking at the black and white gradient, converted to only two colors, gives an intuition of why this works: the closer the colors are to gray, the more chance there is for the closest color to switch from white to black and vice versa, while pure black or white is guaranteed not to switch except with extreme noise. However, a good effect can be achieved with a lower noise value when targeting the web-safe palette.
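To make this intuition a bit more concrete, here is a small experiment (a rough sketch, not from the repo, treating a pixel as a single gray value) that estimates how often noise flips the chosen color for the black and white palette:

```python
import random

def flip_rate(value: int, strength: int, trials: int = 10_000) -> float:
    # With the black and white palette, white is the closest color when the
    # noisy value lands at 128 or above; count how often noise flips the
    # choice made without noise.
    base_is_white = value >= 128
    flips = sum(
        ((value + random.randrange(-strength, strength)) >= 128) != base_is_white
        for _ in range(trials)
    )
    return flips / trials

# A mid gray flips about half the time; pure black or white never flips
# at this strength.
print(flip_rate(128, 100), flip_rate(0, 100), flip_rate(255, 100))
```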
Noise 50 | Noise 100 | Noise 150 |
---|---|---|
![]() | ![]() | ![]() |
![]() | ![]() | ![]() |
We can even go as low as a strength of 20 for this gradient while still having a good quality result, on the largest palette.
Testing in color
The method also works with colors and with more detailed images, at least for the larger palette.
Noise 20 | Noise 50 | Noise 100 |
---|---|---|
![]() | ![]() | ![]() |
![]() | ![]() | ![]() |
We can see that the result depends a lot on the source image. The color gradient is quite recognizable at noise level 50, while the illustration is already too noisy. Reducing the noise even further for the illustration, compared to no noise:
Original | No dithering | Noise dithering |
---|---|---|
![]() | ![]() | ![]() |

And the 6-color palette?
Testing with the 6-color palette, however, the results are… not good.
Noise 20 | Noise 50 | Noise 100 |
---|---|---|
![]() | ![]() | ![]() |
Using a high level of noise does give a sense of the original colors, but the added noise also destroys a lot of the details at the same time. Low noise levels, however, do not change the colors often enough to give good results either.
We need to find something else.
Error diffusion dithering
Until now, we only considered a single pixel at a time when dithering our images. Maybe we could take neighboring pixels into account? Enter the error diffusion method.
This method introduces an additional error map to the process, and a diffusion matrix, in order to “spread” the difference between the source and target colors over multiple pixels. This error is accumulated into the error map, and added to the following pixels being processed.
The result is a more controlled changing of colors, with flat colors being replaced by a pattern of available target colors.
However, this method requires a bit more code, and is not without issues, as we will see. But first, the code.
```python
import numpy as np

def error_diffusion(image: Image.Image, palette: list[Color],
                    diffusion_matrix_choice: ErrorDiffusionMatrix) -> Image.Image:
    data = np.array(image, dtype=np.int32)
    error_map = np.zeros(data.shape)
    # The diffusion matrix is centered on the pixel currently being processed.
    diffusion_matrix = diffusion_matrix_choice.matrix()
    diffusion_width = diffusion_matrix.shape[0]
    diffusion_height = diffusion_matrix.shape[1]
    for i in range(data.shape[0]):
        for j in range(data.shape[1]):
            # Quantize the pixel after adding the error accumulated so far.
            closest_color = closest(data[i, j] + error_map[i, j], palette)
            error = data[i, j] + error_map[i, j] - closest_color
            # Spread the remaining error over the neighboring, unprocessed pixels.
            for k in range(diffusion_width):
                for l in range(diffusion_height):
                    x = i + k - diffusion_width // 2
                    y = j + l - diffusion_height // 2
                    if 0 <= x < data.shape[0] and 0 <= y < data.shape[1]:
                        error_map[x, y] += diffusion_matrix[k, l] * error
            data[i, j] = [closest_color[0], closest_color[1], closest_color[2]]
    converted = Image.fromarray(data.astype('uint8'), 'RGB')
    return converted
```
We are required to work with 2D matrices in this case, so I converted the image into a NumPy array. The diffusion matrix is centered on the target pixel, but this is only for ease of coding here. Because we are converting pixels from the top left to the bottom right, the top of the matrix will always be zero, as we cannot accumulate error for pixels that have already been converted.
There are many diffusion matrices that can be used, the most famous ones being Floyd-Steinberg and Atkinson. But first, I will use a basic matrix to explore the method.
$$ \begin{bmatrix} 0 & 0 & 0 \\ 0 & * & \frac{2}{5} \\ 0 & \frac{2}{5} & \frac{1}{5} \end{bmatrix} $$
This basic matrix adds two fifths of the difference to the pixels to the right of and below the target, and one fifth to the pixel to the bottom right.
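Written out as a NumPy array centered on the current pixel (the `*` position), it looks like this. This is only a sketch: the code above expects an object exposing a `.matrix()` method (the `ErrorDiffusionMatrix` type from the repo), so the exact plumbing may differ.

```python
import numpy as np

# Basic matrix from above, centered on the current pixel; only pixels to the
# right and below, which have not been processed yet, receive a share of the error.
BASIC_MATRIX = np.array([
    [0, 0,     0    ],
    [0, 0,     2 / 5],
    [0, 2 / 5, 1 / 5],
])
```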
Running on our images, we get the following results:
Web Safe | Black and white | 6 colors |
---|---|---|
![]() | ![]() | ![]() |
![]() | ![]() | ![]() |
![]() | ![]() | ![]() |
This seems to be the best method so far, across all the tested palettes. By adding the error to the neighboring pixels, the algorithm is able to compensate for missing colors quite well, even using a mix of red, green, and blue to approximate the grays needed for the black-to-white gradient.

Of course, this is just a basic matrix, built by myself without much thought. Properly designed matrices give better results.
Floyd-Steinberg matrix
$$ \begin{bmatrix} & & \\ & * & \frac{7}{16} \\ \frac{3}{16} & \frac{5}{16} & \frac{1}{16} \\ \end{bmatrix} $$
Published by Robert W. Floyd and Louis Steinberg in 1976, this matrix is designed so that “a region of desired density 0.5 come out as a checkerboard pattern”.
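For reference, written as a centered array for the error_diffusion function above, it could look like this (a sketch; the accompanying repo may define it differently). The weights sum to 1, so all of the quantization error is diffused.

```python
import numpy as np

# Floyd-Steinberg matrix, centered on the current pixel.
FLOYD_STEINBERG_MATRIX = np.array([
    [0,      0,      0     ],
    [0,      0,      7 / 16],
    [3 / 16, 5 / 16, 1 / 16],
])
```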
Let’s try!

Not a perfect checkerboard
This is not quite perfect, and I am not sure if it is caused by my implementation, as I also tried with ImageMagick and got similar results.
Still, we are not here to look at gray images only. What about our test images?
Web Safe | Black and white | 6 colors |
---|---|---|
![]() | ![]() | ![]() |
![]() | ![]() | ![]() |
![]() | ![]() | ![]() |
I think it looks more pleasant, with fewer blocks of similar colors. The bottom-left diffusion helps quite a lot to spread the error more smoothly.
Atkinson matrix
$$ \begin{bmatrix} & * & \frac{1}{8} & \frac{1}{8} \\ \frac{1}{8} & \frac{1}{8} & \frac{1}{8} & \\ & \frac{1}{8} & & \\ \end{bmatrix} $$
Designed by Bill Atkinson and used on the original Macintosh computer, this dithering is emblematic of that system, with its deep blacks and whites.
Contrary to the previous matrices we looked at, this one only diffuses 3⁄4 of the error, which results in this effect.
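Padded into an array centered on the current pixel for the error_diffusion function above, it could be written as follows (a sketch, not necessarily how the repo stores it). Note that the weights sum to 6⁄8, so a quarter of the error is simply dropped.

```python
import numpy as np

# Atkinson matrix, centered on the current pixel.
ATKINSON_MATRIX = np.array([
    [0, 0,     0,     0,     0    ],
    [0, 0,     0,     0,     0    ],
    [0, 0,     0,     1 / 8, 1 / 8],
    [0, 1 / 8, 1 / 8, 1 / 8, 0    ],
    [0, 0,     1 / 8, 0,     0    ],
])
```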
Web Safe | Black and white | 6 colors |
---|---|---|
![]() | ![]() | ![]() |
![]() | ![]() | ![]() |
![]() | ![]() | ![]() |
This one seems to struggle the most with the 6-color palette, but it may still look good on the target display. Since my test image is quite dark, a lot of detail is also lost when converting to black and white, as could be expected.
There are many different dithering matrices, with their pros and cons, which I will not look at here. However, many of them are covered by Tanner Helland in his article Image Dithering: Eleven algorithms and source code.
Issues with error diffusion
Error diffusion dithering, while giving great results visually, is not without issues. The first one we have seen is that the error can spread far into the image, causing ghosting in some cases.
It also cannot be parallelized, as previous pixels influence later ones. This also makes it unsuitable for animations, because a single pixel change can ripple out to the whole image:
Original | Dithered |
---|---|
![]() | ![]() |
What’s next?
If we had a method that could work on each pixel independently while still preserving details in the resulting image, we would have the best of both worlds. Also, none of these methods can handle animation without introducing a lot of changes each frame, which is not suitable.
Still, noise dithering is a good basic method for dithering images without high-frequency details, while error diffusion dithering is quite good for photos or illustrations.
The next step I will look at, however, gives much more creative control over the result to the algorithm, allowing us to add specific effects to the result.
I am speaking, of course, about ordered dithering, which I will explore in details in another post.
More reading
My main source of knowledge about dithering is Teo Fumagalli’s exceptional article A Visual Introduction to Dithering, which contains great interactive illustrations of the algorithms. I highly recommend checking it out!
I also referenced Bart Wronski’s Dithering mini-series and the already mentioned Image Dithering: Eleven algorithms and source code by Tanner Helland.
The gradient images were generated by ImageMagick, while the logo illustration was created by The Orthrus, a pair of amazing artists!