If you want to follow along, download this post and run it as a Jupyter notebook.
While working on a post about implementing interpolation from scratch, I wanted to check my 2D bicubic interpolation against some existing libraries. Surprisingly, they don't agree!
I haven't gotten to the bottom of why, but I thought it worthwhile to document with an example. Let's import the tools we need:
%config InlineBackend.figure_formats = ['retina']
%config InlineBackend.print_figure_kwargs = {'facecolor': (1.0, 1.0, 1.0, 0.0)}
import numpy as np
import matplotlib.pyplot as plt
plt.rc('image', cmap='PuOr', origin='lower')
plt.rc('figure', figsize=(5, 2))
import doodads as dd # my personal utility library, used only for plotting
Let's construct a signal that should interpolate fairly well. We're going to use sinusoids well within the band-limit of the image, so it's theoretically possible to interpolate and upsample without introducing aliasing*.
* Under some assumptions like infinite periodicity and using sinc interpolation... we're not doing that. But we should get close!
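For a periodic, band-limited signal, ideal sinc interpolation is equivalent to zero-padding the spectrum in the frequency domain. Here's a minimal 1-D sketch of that idea (my own illustration; none of the libraries compared below work this way):

```python
import numpy as np

def fft_upsample(x, factor):
    """Ideal (periodic sinc) interpolation: zero-pad the spectrum and inverse-transform."""
    n = len(x)
    spectrum = np.fft.rfft(x)
    padded = np.zeros(n * factor // 2 + 1, dtype=complex)
    padded[:len(spectrum)] = spectrum
    if n % 2 == 0:
        padded[n // 2] *= 0.5  # split the Nyquist bin so the interpolant stays real
    return np.fft.irfft(padded, n * factor) * factor

row = 2 * np.cos(np.arange(10) + np.pi / 6)  # 1-D stand-in for one image row
fine = fft_upsample(row, 100)
assert np.allclose(fine[::100], row)  # the interpolant passes through the original samples
```

Even here the reconstruction between samples is only exact if the signal really is periodic over the window, which ours isn't quite.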
def f(xx, yy):
phase = np.pi/6
return 2 * np.cos(xx + phase) * np.cos(yy / 2) * np.cos(yy)
samples = 10
xmin, xmax = 0, samples - 1
dx = 1
img_extent = [xmin - dx/2, xmax + dx/2, xmin - dx/2, xmax + dx/2]
coords = np.arange(samples)
xx, yy = np.meshgrid(coords, coords)
img = f(xx, yy)
plt.imshow(img, extent=img_extent)
Here we've evaluated $f(x, y) = 2 \cos(x + \pi/6) \cos(y / 2) \cos(y)$ at the centers of pixels $0 \le x \le 9$ and $0 \le y \le 9$. (Note the extent= argument to imshow, which makes the tick marks land in the right spot by accounting for the pixel extent from $x - 0.5$ to $x + 0.5$.)
Since it's analytic, we can make a "truth" signal that's sampled 100x more finely in $x$ and $y$.
upsample = 100
upsample_coords = np.arange(upsample * samples) / upsample - dx / 2 # subtract half a pixel to account for the finite pixel size
Here we plot the coarsely-sampled signal on the left and the more finely sampled "truth" signal on the right.
xx, yy = np.meshgrid(upsample_coords, upsample_coords)
upsampled_truth = f(xx, yy)
plt.subplot(121)
plt.imshow(img)
plt.subplot(122)
plt.imshow(upsampled_truth, extent=img_extent)
We can also pull out one row of the coarsely-sampled signal and plot it signal-processing style:
plt.stem(coords, img[0])
The first row of the image (i.e. $y=0$) should be row 50 in our upsampled image indexing:
np.argwhere(upsample_coords == 0)
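More generally, with these coordinates, original pixel $r$ lands at fine index upsample // 2 + r * upsample (the same mapping used in the comparison plots below). A quick self-contained check:

```python
import numpy as np

upsample, samples = 100, 10
upsample_coords = np.arange(upsample * samples) / upsample - 0.5
# pixel center r sits half a fine pixel past r * upsample
for r in range(samples):
    assert upsample_coords[upsample // 2 + r * upsample] == r
```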
We can overplot the "truth" signal and make sure it passes through the original points, as a check:
plt.stem(coords, img[0])
plt.plot(upsample_coords, upsampled_truth[50])
Now, let's upsample the original coarsely sampled image using interpolation and see if it agrees with the truth! The new image size is just the original size multiplied by upsample.
dsize = (img.shape[0] * upsample, img.shape[1] * upsample)
scikit-image interpolation
Scikit-image requires some additional options to get it to behave as (I) expected. Unlike linear interpolation, cubic interpolation can produce values outside the range of the input; that is well known. However, the scikit-image interpolation functions require you to pass clip=False to get the expected behavior.
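To see that overshoot in isolation, here's a 1-D sketch using scipy's CubicSpline (just a convenient cubic interpolant, not what scikit-image uses internally): interpolating a step produces values below 0 and above 1.

```python
import numpy as np
from scipy.interpolate import CubicSpline

x = np.arange(6)
y = np.array([0., 0., 0., 1., 1., 1.])  # a step
spline = CubicSpline(x, y)
values = spline(np.linspace(0, 5, 501))
# cubic interpolation rings near the step, exceeding the input range on both sides
assert values.max() > 1 and values.min() < 0
```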
First let's look at the default behavior (clip=True):
# scikit-image
from skimage import transform
upsampled_skimage_bad = transform.resize(img, dsize, order=3)
dd.show_diff(upsampled_skimage_bad, upsampled_truth,
as_percent=False, extent=img_extent, vmax=0.25, colorbar=True)
print(f"{np.min(img)=:1.3}\n{np.max(img)=:1.3}")
print(f"{np.min(upsampled_truth)=:1.3}\n{np.max(upsampled_truth)=:1.3}")
print(f"{np.min(upsampled_skimage_bad)=:1.3}\n{np.max(upsampled_skimage_bad)=:1.3}")
Above, you can see that the upsampled_skimage_bad array has minimum and maximum pixel values identical to those of the original image. The $f(x, y)$ we defined has extrema that don't perfectly align with a sample location, so imposing this clip=True constraint means we'll definitely disagree with the "truth" image.
The plot is a difference image, subtracting the "ground truth" from the interpolation. You can see a couple of little "pimples" where the clipping has moved a predicted value and made it a worse approximation. For our purposes, we want clip=False.
With that prerequisite dealt with, we can resize the image and compare the interpolation to the ground truth.
upsampled_skimage = transform.resize(img, dsize, order=3, clip=False)
Now, we plot the interpolated image and truth image on identical color scales, as well as the difference image. Below, values from row 0 of the interpolated image (orange) are overlaid on the original and finely-sampled truth image.
def comparison_plot(title, orig, truth, upsampled, row=0):
fig, axs = plt.subplot_mosaic('abc\nddd', figsize=(8, 4))
axs['d'].stem(coords, orig[row])
upsampled_idx = upsample // 2 + row * upsample
axs['d'].plot(upsample_coords, truth[upsampled_idx])
axs['d'].plot(upsample_coords, upsampled[upsampled_idx])
axs['d'].set(
xlabel='original pixel column index'
)
dd.three_panel_diff_plot(
upsampled, truth,
as_percent=False,
extent=img_extent,
colorbar=True,
ax_a=axs['a'], title_a=title,
ax_b=axs['b'], title_b='ground truth',
ax_aminusb=axs['c'],
diff_kwargs=dict(vmax=0.25),
)
comparison_plot('scikit-image', img, upsampled_truth, upsampled_skimage)
Performance at the edges is worse, as might be expected. Bicubic interpolation needs a $4 \times 4$ neighborhood of pixel values, and edge pixels only have that on one or two sides.
Of course, choosing the mode= argument differently will give you different behavior at the hairy edges. None of these options is more "right" than the others, but you should know the argument exists in case you need it.
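These mode names follow numpy.pad's conventions, so a quick 1-D look at numpy.pad shows what each one does past the edge:

```python
import numpy as np

a = np.array([1, 2, 3, 4])
# how each boundary mode extends the signal two samples past each edge
assert np.pad(a, 2, mode='reflect').tolist()   == [3, 2, 1, 2, 3, 4, 3, 2]
assert np.pad(a, 2, mode='symmetric').tolist() == [2, 1, 1, 2, 3, 4, 4, 3]
assert np.pad(a, 2, mode='constant').tolist()  == [0, 0, 1, 2, 3, 4, 0, 0]
```

Note that 'reflect' mirrors about the edge sample without repeating it, while 'symmetric' repeats the edge sample.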
upsampled_skimage_symm = transform.resize(img, dsize, order=3, clip=False, mode='symmetric')
upsampled_skimage_const = transform.resize(img, dsize, order=3, clip=False, mode='constant')
fig, axs = plt.subplots(ncols=3, figsize=(10, 2))
dd.show_diff(upsampled_truth, upsampled_skimage, as_percent=False, extent=img_extent, vmax=0.25, colorbar=True, ax=axs[0])
dd.show_diff(upsampled_truth, upsampled_skimage_symm, as_percent=False, extent=img_extent, vmax=0.25, colorbar=True, ax=axs[1])
dd.show_diff(upsampled_truth, upsampled_skimage_const, as_percent=False, extent=img_extent, vmax=0.25, colorbar=True, ax=axs[2])
axs[0].set_title("reflect")
axs[1].set_title("symmetric")
axs[2].set_title("constant")
Aside: scikit-image and data types
There's another footgun to be aware of in scikit-image when working with camera images, which are not floating-point by default. Since our sample signal is floating-point, illustrating this requires defining a new sample image mini_img with dtype=np.uint16 (like a 10- or 12-bit camera might produce):
mini_img = np.array([
[2, 3, 4, 1],
[4, 1, 3, 2],
[3, 4, 2, 1],
[2, 1, 1, 3],
], dtype=np.uint16)
plt.colorbar(plt.imshow(mini_img, vmin=0))
mini_img_upsample = transform.resize(mini_img, (32, 32), order=3, clip=False)
mini_img_upsample_preserve = transform.resize(mini_img, (32, 32), order=3, preserve_range=True, clip=False)
fig, axs = plt.subplots(ncols=2, figsize=(7, 2))
axs[0].set(title='preserve_range=False')
cb1 = fig.colorbar(axs[0].imshow(mini_img_upsample))
cb1.formatter.set_scientific(False)
axs[1].set(title='preserve_range=True')
fig.colorbar(axs[1].imshow(mini_img_upsample_preserve))
The image on the left uses the default behavior and the image on the right uses the preserve_range=True option. The one on the left has dramatically different values because it has been automatically rescaled from the dynamic range of a 16-bit integer into the interval [0, 1] of a floating-point representation. In astronomical image processing this is not usually what you want.
So, we said above that we don't want to blindly preserve the range of the inputs by clipping, but now we must supply preserve_range=True to actually get the desired behavior.
This is somewhat confusing.
In any case, the official scikit-image documentation even says not to use floating-point images with values outside the $[-1, 1]$ interval, so maybe this library is best left to the computer vision people.
OpenCV
Speaking of computer vision people, that's what the "CV" in OpenCV stands for. It implements the core algorithms of computer vision as a compiled C++ library with Python bindings, so it's quite fast.
My friend Dr. Logan Pearce relies on their implementation of interpolation, so let's see how it stacks up.
# OpenCV
import cv2
upsampled_cv = cv2.resize(img, dsize, interpolation=cv2.INTER_CUBIC)
comparison_plot('OpenCV', img, upsampled_truth, upsampled_cv)