Fun with Filters and Frequencies

Overview

This project explores how manipulating image frequencies can modify images and produce compelling visual effects. Starting with filters that blur and sharpen features, it goes on to create hybrid and blended images by combining different frequency bands using Gaussian and Laplacian stacks.

Finite Difference Operator

An edge detection algorithm was implemented by using the finite difference operators $D_x$ and $D_y$.

$$ D_x = \begin{bmatrix} 1 & -1 \end{bmatrix}$$ $$ D_y = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$$

$D_x$ calculates changes in intensity between adjacent pixels in the horizontal direction, while its transpose, $D_y$, calculates changes in the vertical direction. Thus, convolving the image with $D_x$ and $D_y$ produces the partial derivatives in $x$ and $y$ respectively. These partial derivatives can then be combined to compute the gradient magnitude $G_m$, which shows the strength of the changes in intensity.

$$\frac{\partial Img}{\partial x} = Img * D_x $$ $$\frac{\partial Img}{\partial y} = Img * D_y $$ $$G_m = \sqrt{(\frac{\partial Img}{\partial x})^2 + (\frac{\partial Img}{\partial y})^2}$$

This gradient magnitude image is then binarized to produce the edge image.
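The steps above can be sketched in a few lines. This is a minimal illustration rather than the project's actual code: it assumes a grayscale image stored as a 2D float array, uses `scipy.signal.convolve2d`, and picks symmetric boundary handling so that the image border does not register as an edge. The function name `edge_image` is hypothetical.

```python
import numpy as np
from scipy.signal import convolve2d

# Finite difference operators as defined above.
D_x = np.array([[1.0, -1.0]])    # 1x2 row: horizontal differences
D_y = np.array([[1.0], [-1.0]])  # 2x1 column: vertical differences


def edge_image(img, threshold):
    """Return (dx, dy, gradient magnitude, binary edges) for a 2D float image."""
    # mode="same" keeps the output the same size as the input;
    # boundary="symm" mirrors the border (an assumption of this sketch).
    dx = convolve2d(img, D_x, mode="same", boundary="symm")
    dy = convolve2d(img, D_y, mode="same", boundary="symm")
    grad_mag = np.sqrt(dx**2 + dy**2)
    # Binarize: keep only pixels whose gradient magnitude exceeds the threshold.
    return dx, dy, grad_mag, grad_mag > threshold


# Synthetic test image: dark left half, bright right half.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
dx, dy, grad_mag, edges = edge_image(img, threshold=0.25)
```

On this synthetic image the only intensity change is the vertical boundary between the two halves, so the edge image marks a single column and the $y$ derivative is zero everywhere.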

Figure 1: Original image cameraman.png.

Figure 2: Partial $x$ derivative.

Figure 3: Partial $y$ derivative.

Figure 4: Gradient magnitude.

Figure 5: Binarized edge image with threshold $= 0.25$.

Derivative of Gaussian (DoG) Filter

While the finite difference operators are effective in detecting changes in pixel intensity, they can be sensitive to noise. To reduce noise, the function gaussian_blur was created to blur the image using a 2D Gaussian filter. By first blurring the image, and then applying the finite difference operators, smoother results were produced. For these images, the Gaussian kernel was created with radius $r=3$ and standard deviation $\sigma=1$.
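One possible shape for such a blur function is sketched below. Only the name gaussian_blur and the values $r=3$, $\sigma=1$ come from the text above; the body is an assumption. In particular, the kernel size of $2r+1$, the normalization, the helper name `gaussian_kernel`, and the symmetric boundary mode are illustrative choices.

```python
import numpy as np
from scipy.signal import convolve2d


def gaussian_kernel(radius, sigma):
    """2D Gaussian kernel of size (2*radius + 1) squared, normalized to sum to 1."""
    ax = np.arange(-radius, radius + 1)
    g1d = np.exp(-(ax**2) / (2.0 * sigma**2))
    g1d /= g1d.sum()
    # The 2D Gaussian is separable, so it is the outer product of two 1D Gaussians.
    return np.outer(g1d, g1d)


def gaussian_blur(img, radius=3, sigma=1.0):
    """Low-pass filter a 2D float image with a Gaussian kernel."""
    kernel = gaussian_kernel(radius, sigma)
    return convolve2d(img, kernel, mode="same", boundary="symm")


# Noisy flat image: blurring should reduce the pixel-to-pixel variation.
rng = np.random.default_rng(0)
noisy = 0.5 + 0.1 * rng.standard_normal((32, 32))
blurred = gaussian_blur(noisy)
```

Normalizing the kernel to sum to 1 keeps the overall brightness of the image unchanged, which matters when the blurred result is later thresholded.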

Figure 6: Partial $x$ derivative
after Gaussian blur.

Figure 7: Partial $y$ derivative
after Gaussian blur.

Figure 8: Gradient magnitude after convolving with a Gaussian filter.

Figure 9: Binarized edge image with threshold $= 0.1$.

Compared to the initial result without filtering, the filtered edge image has smoother, clearer lines and fewer dots of noise.

Alternatively, because convolution is associative, the same effect can be achieved with a single convolution per direction by using derivative of Gaussian (DoG) filters. To this end, the Gaussian kernel was convolved with $D_x$ and $D_y$ to produce the DoG filters (Figure 10 and Figure 11), and the image was then convolved with these DoG filters to obtain the partial derivatives.
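The equivalence of the two approaches can be checked numerically. The sketch below is illustrative only: the helper `gaussian_kernel` and the random test image are assumptions, and full-mode convolution with zero padding is used so that the two results match exactly rather than only away from the borders.

```python
import numpy as np
from scipy.signal import convolve2d


def gaussian_kernel(radius, sigma):
    """2D Gaussian kernel of size (2*radius + 1) squared, normalized to sum to 1."""
    ax = np.arange(-radius, radius + 1)
    g1d = np.exp(-(ax**2) / (2.0 * sigma**2))
    g1d /= g1d.sum()
    return np.outer(g1d, g1d)


D_x = np.array([[1.0, -1.0]])
D_y = np.array([[1.0], [-1.0]])
g = gaussian_kernel(radius=3, sigma=1.0)

# Derivative of Gaussian filters: differentiate the kernel once, up front.
dog_x = convolve2d(g, D_x)  # default mode="full" keeps the whole filter
dog_y = convolve2d(g, D_y)

rng = np.random.default_rng(0)
img = rng.random((16, 16))

# Two passes (blur, then differentiate) versus one pass with the DoG filter.
two_pass = convolve2d(convolve2d(img, g), D_x)
one_pass = convolve2d(img, dog_x)
max_diff = np.abs(two_pass - one_pass).max()  # differs only by rounding error
```

With `mode="same"` and a mirrored boundary the two results can differ slightly near the image border, which may explain small differences in a suitable threshold between the two approaches.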

Figure 10: $D_x$ after convolving with the Gaussian kernel.

Figure 11: $D_y$ after convolving with the Gaussian kernel.

Figure 12: Partial $x$ derivative using the derivative of the Gaussian.

Figure 13: Partial $y$ derivative using the derivative of the Gaussian.

Figure 14: Gradient magnitude after convolving with the derivative of the Gaussian filter.

Figure 15: Binarized edge image with threshold $= 0.097$.

Image Sharpening

To sharpen an image $f$, an unsharp mask filter can be used. Convolving the image with a Gaussian filter blurs it, retaining only its low frequencies. Subtracting the blurred image from the original therefore extracts the high frequencies. The high frequencies are then multiplied by the sharpening factor $\alpha$ and added back to the original image:

$$f_{\text{sharp}} = f + \alpha (f - f * g) = (1 + \alpha) f - \alpha f * g = f * ((1 + \alpha)e - \alpha g)$$

$f =$ original image
$f_{\text{sharp}} =$ sharpened image
$g =$ Gaussian filter
$\alpha =$ sharpening factor that controls the strength of the high-frequency details added back to the original image
$e =$ unit impulse filter

For these images, the Gaussian kernel was created with radius $r=3$ and standard deviation $\sigma=1$.
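The single-filter form $(1 + \alpha)e - \alpha g$ can be built directly, as in the sketch below. The function names `unsharp_filter` and `sharpen` are hypothetical, and the code assumes a single-channel float image with intensities in $[0, 1]$; a color image would need the filter applied to each channel separately.

```python
import numpy as np
from scipy.signal import convolve2d


def gaussian_kernel(radius, sigma):
    """2D Gaussian kernel of size (2*radius + 1) squared, normalized to sum to 1."""
    ax = np.arange(-radius, radius + 1)
    g1d = np.exp(-(ax**2) / (2.0 * sigma**2))
    g1d /= g1d.sum()
    return np.outer(g1d, g1d)


def unsharp_filter(alpha, radius=3, sigma=1.0):
    """Build the unsharp mask filter (1 + alpha) * e - alpha * g."""
    g = gaussian_kernel(radius, sigma)
    e = np.zeros_like(g)
    e[radius, radius] = 1.0  # unit impulse at the kernel center
    return (1.0 + alpha) * e - alpha * g


def sharpen(img, alpha, radius=3, sigma=1.0):
    """Sharpen a 2D float image with one convolution, then clip to [0, 1]."""
    kernel = unsharp_filter(alpha, radius, sigma)
    out = convolve2d(img, kernel, mode="same", boundary="symm")
    return np.clip(out, 0.0, 1.0)


rng = np.random.default_rng(1)
f = rng.random((12, 12))
f_sharp = sharpen(f, alpha=2.0)
```

Because the filter sums to 1, flat regions keep their brightness while intensity changes are exaggerated. Clipping is needed because the amplified high frequencies can push values outside the valid range.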

Figure 16: Original image
taj.jpg.

Figure 17: Sharpened taj.jpg
with $\alpha=2$.

Figure 18: Sharpened taj.jpg
with $\alpha=5$.

Figure 19: Original image
montmartre.jpg.

Figure 20: Sharpened montmartre.jpg
with $\alpha=2$.

Figure 21: Sharpened montmartre.jpg
with $\alpha=5$.

Increasing $\alpha$ strengthens the high-frequency details in the sharpened image.

Re-sharpening a Blurred Image

An image of my cat Luna, luna_small.jpg, was used to test whether sharpening can fully restore the high-frequency details of a blurred image. The original image was first blurred by applying a Gaussian filter and then sharpened using the unsharp mask filter. Sharpening can only amplify the frequencies that survive the blur; it cannot recover those the blur removed. As the following images show, although the details are stronger, the re-sharpened images lose certain subtleties seen in the original, such as the gradients in the shadows and the fine lines in her fur.

Figure 22: Original image
luna_small.jpg.

Figure 23: Blurred image
luna_small_blurred.jpg.

Figure 24: Blurred then sharpened with $\alpha=2$.

Figure 25: Blurred then sharpened with $\alpha=5$.

Hybrid Images

High-pass and low-pass filters can be used to create hybrid images by combining the low-frequency components of one image, lo_im, with the high-frequency components of another, hi_im. When the two are overlaid, hi_im is prominent when viewed up close, while lo_im dominates from a distance, where fine high-frequency detail can no longer be resolved.
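A minimal sketch of this combination is shown below, assuming aligned grayscale images of equal size with intensities in $[0, 1]$. The names lo_im and hi_im come from the text; the function `hybrid_image`, its helpers, and the default parameters are assumptions. The defaults borrow the $r$ and $\sigma$ values from the captions that follow, but which value was used for the low-pass and which for the high-pass filter is a guess.

```python
import numpy as np
from scipy.signal import convolve2d


def gaussian_kernel(radius, sigma):
    """2D Gaussian kernel of size (2*radius + 1) squared, normalized to sum to 1."""
    ax = np.arange(-radius, radius + 1)
    g1d = np.exp(-(ax**2) / (2.0 * sigma**2))
    g1d /= g1d.sum()
    return np.outer(g1d, g1d)


def blur(img, radius, sigma):
    """Gaussian low-pass filter with mirrored borders."""
    return convolve2d(img, gaussian_kernel(radius, sigma), mode="same", boundary="symm")


def hybrid_image(lo_im, hi_im, lo_params=(10, 9.0), hi_params=(10, 3.0)):
    """Add the low frequencies of lo_im to the high frequencies of hi_im."""
    low = blur(lo_im, *lo_params)           # low-pass: keep coarse structure
    high = hi_im - blur(hi_im, *hi_params)  # high-pass: original minus its blur
    return np.clip(low + high, 0.0, 1.0)


rng = np.random.default_rng(2)
lo_im = rng.random((32, 32))
hi_im = rng.random((32, 32))
hybrid = hybrid_image(lo_im, hi_im)
```

A larger $\sigma$ for the low-pass filter removes more detail from lo_im, while a smaller $\sigma$ for the high-pass filter keeps only the finest detail of hi_im. Tuning the two cutoffs independently is what makes the two interpretations separate cleanly.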

Derek & Nutmeg

Figure 26: Original image
DerekPicture.jpg
$r = 30, \sigma=9$.

Figure 27: Original image
nutmeg.jpg
$r = 10, \sigma=3$.

Figure 28: Grayscale hybrid image of Derek & Nutmeg.

Figure 29: Colored hybrid image of Derek & Nutmeg.

Real-life Elana & Elana as a drawing

Figure 30: Elana at La Note
$r = 10, \sigma=9$.

Figure 31: Elana's self-portrait drawn using Procreate
$r = 10, \sigma=3$.

Figure 32: Grayscale hybrid image of Elana.

Figure 33: Colored hybrid image of Elana.

Real-life motorcycle & drawing of a motorcycle

Figure 34: Real 2011 BMW R1200GS
$r = 10, \sigma=9$.

Figure 35: Drawing of the 2011 BMW R1200GS (graphite on paper)
$r = 10, \sigma=3$.

Figure 36: Grayscale hybrid image of the motorcycle.

Figure 37: Colored hybrid image of the motorcycle.

Luna & Tiger (Failure Case)

Although Luna is a very cute cat, this hybrid image of her and a tiger did not turn out well; these were the best results that could be produced from the given input images. The images could not be properly aligned, and the low-frequency components overwhelmed the high-frequency features. When adding the filtered images together, the low-frequency image was multiplied by a weight to lessen its intensity, at the cost of darkening the resulting image.

Figure 38: Luna yawning
$r = 30, \sigma=15$.

Figure 39: Tiger yawning
$r = 10, \sigma=3$.

Figure 40: Grayscale hybrid image of the cats.

Figure 41: Colored hybrid image of the cats.

Prof. Efros & Prof. Ng

Figure 42: Computer vision professor Alexei Efros efros.jpg $r = 10, \sigma=9$.

Figure 43: Computer graphics professor Ren Ng yirenng.jpg $r = 10, \sigma=3$.

Figure 44: Grayscale hybrid image of the two professors.

Figure 45: Colored hybrid image of the two professors.

Figure 46: Frequency analysis of efros.jpg.

Figure 47: Frequency analysis of yirenng.jpg.

Figure 48: Frequency analysis of efros.jpg after low-pass filtering.

Figure 49: Frequency analysis of yirenng.jpg after high-pass filtering.

Figure 50: Frequency analysis of the professors' hybrid image.

Gaussian & Laplacian Stacks

Building on the techniques used to devise low-pass and high-pass filters, band-pass filters can be created to blend images smoothly. To blend one image, im1, with another, im2, Gaussian and Laplacian stacks are implemented. Unlike the images in a Gaussian pyramid, the images in a Gaussian stack are not downsampled; instead, the Gaussian filter is applied successively at each level, preserving the image dimensions. Each successive level in the Gaussian stack therefore retains only increasingly low frequencies. The Gaussian stack is then used to build the Laplacian stack. Each level $i$ in the Laplacian stack $L$ is computed from images in the Gaussian stack $G$ as follows:

$$L_i = G_i - G_{i + 1}$$

This computation effectively applies a band-pass filter to the image, allowing important details to be isolated in each band. Having built Gaussian and Laplacian stacks for im1 and im2, the stacks can then be utilized for multiresolutional blending.
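The stack construction can be sketched as follows. This is an illustration under stated assumptions, not the project's code: level 0 of the Gaussian stack is taken to be the unblurred image, and the coarsest Gaussian level is appended as the final Laplacian entry so that the Laplacian levels sum back to the original image. The function names are hypothetical.

```python
import numpy as np
from scipy.signal import convolve2d


def gaussian_kernel(radius, sigma):
    """2D Gaussian kernel of size (2*radius + 1) squared, normalized to sum to 1."""
    ax = np.arange(-radius, radius + 1)
    g1d = np.exp(-(ax**2) / (2.0 * sigma**2))
    g1d /= g1d.sum()
    return np.outer(g1d, g1d)


def gaussian_stack(img, levels, radius=3, sigma=1.0):
    """Return levels + 1 images: the original plus successively blurred copies."""
    kernel = gaussian_kernel(radius, sigma)
    stack = [img]
    for _ in range(levels):
        # No downsampling: every level keeps the input dimensions.
        stack.append(convolve2d(stack[-1], kernel, mode="same", boundary="symm"))
    return stack


def laplacian_stack(g_stack):
    """L_i = G_i - G_{i+1}, with the coarsest Gaussian level kept as the last entry."""
    lap = [g_stack[i] - g_stack[i + 1] for i in range(len(g_stack) - 1)]
    lap.append(g_stack[-1])  # assumption: low-frequency residual closes the stack
    return lap


rng = np.random.default_rng(3)
img = rng.random((24, 24))
g_stack = gaussian_stack(img, levels=4)
l_stack = laplacian_stack(g_stack)
reconstructed = sum(l_stack)  # telescoping sum recovers the original image
```

Keeping the residual as the last level makes the decomposition lossless, which is what allows a blended image to be rebuilt by summing its blended levels.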

Given the input images apple.jpeg and orange.jpeg, the images are first converted to grayscale (by taking only one of the color channels) and then filtered to build the Gaussian and Laplacian stacks.

Figure 51: Original image apple.jpeg.

Figure 52: Grayscale apple.jpeg.

Figure 53: apple.jpeg at level 0 in the Laplacian stack.

Figure 54: apple.jpeg at level 4 in the Laplacian stack.

Figure 55: apple.jpeg at level 8 in the Laplacian stack.

Figure 56: apple.jpeg at level 12 in the Laplacian stack.

Figure 57: Original image orange.jpeg.

Figure 58: Grayscale orange.jpeg.

Figure 59: orange.jpeg at level 0 in the Laplacian stack.

Figure 60: orange.jpeg at level 4 in the Laplacian stack.

Figure 61: orange.jpeg at level 8 in the Laplacian stack.

Figure 62: orange.jpeg at level 12 in the Laplacian stack.

Multiresolutional Blending

To blend two images seamlessly, the multiresolution spline technique is used. By gently distorting im1 and im2, the images can then be joined together in a smooth seam called an image spline. At each band of image frequencies, multiresolutional blending computes this seam between the two images separately, allowing for the gradual blending of features.

First, a mask is created to control how the two images are combined, defining which areas of each image are visible in the result. To blend apple.jpeg and orange.jpeg, a mask is used to divide the image in half vertically (Figure 65).

A Gaussian stack $M$ of the mask is created. To blend the images, the corresponding levels of the Laplacian stacks of both original images are combined as follows:

$$F_i = L^A_i \cdot M_i + L^B_i \cdot (1 - M_i) $$

$F_i =$ blended result of the two images at level $i$
$L^A_i =$ level $i$ in the Laplacian stack for im1
$L^B_i =$ level $i$ in the Laplacian stack for im2
$M_i =$ level $i$ in the Gaussian stack for the mask
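The per-level formula can be sketched end to end as below. The sketch assumes grayscale float images in $[0, 1]$ and a mask of the same size. It also assumes that each Laplacian stack ends with the coarsest Gaussian level, so that summing the blended levels $F_i$ gives the final image. The function names and the number of levels are hypothetical; the default $r$ and $\sigma$ values are taken from the captions for the mask and the blended result.

```python
import numpy as np
from scipy.signal import convolve2d


def gaussian_kernel(radius, sigma):
    """2D Gaussian kernel of size (2*radius + 1) squared, normalized to sum to 1."""
    ax = np.arange(-radius, radius + 1)
    g1d = np.exp(-(ax**2) / (2.0 * sigma**2))
    g1d /= g1d.sum()
    return np.outer(g1d, g1d)


def gaussian_stack(img, levels, radius, sigma):
    """Original image plus `levels` successively blurred copies, all the same size."""
    kernel = gaussian_kernel(radius, sigma)
    stack = [img]
    for _ in range(levels):
        stack.append(convolve2d(stack[-1], kernel, mode="same", boundary="symm"))
    return stack


def laplacian_stack(g_stack):
    """L_i = G_i - G_{i+1}; the coarsest Gaussian level is kept as the last entry."""
    return [a - b for a, b in zip(g_stack[:-1], g_stack[1:])] + [g_stack[-1]]


def blend(im1, im2, mask, levels=4, im_params=(3, 1.0), mask_params=(20, 10.0)):
    """Multiresolution blend: F_i = L^A_i * M_i + L^B_i * (1 - M_i), summed over i."""
    lap_a = laplacian_stack(gaussian_stack(im1, levels, *im_params))
    lap_b = laplacian_stack(gaussian_stack(im2, levels, *im_params))
    m_stack = gaussian_stack(mask, levels, *mask_params)
    blended = [la * m + lb * (1.0 - m) for la, lb, m in zip(lap_a, lap_b, m_stack)]
    return np.clip(sum(blended), 0.0, 1.0)


# Vertical mask: 1 on the left half (im1 visible), 0 on the right half (im2 visible).
mask = np.zeros((64, 64))
mask[:, :32] = 1.0
bright = np.ones((64, 64))
dark = np.zeros((64, 64))
result = blend(bright, dark, mask)
```

Because the mask is blurred more at each level, low-frequency bands are mixed over a wide transition region while high-frequency bands switch over more abruptly, which hides the seam without smearing fine detail.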

Finally, summing the blended levels $F_i$ over all levels produces the blended oraple:

Figure 63: Original image
apple.jpeg.

Figure 64: Original image
orange.jpeg.

Figure 65: Vertical mask
$r = 20, \sigma=10$.

Figure 67: Oraple
$r = 3, \sigma=1$.

None pizza with left beef

Figure 68: Original image
pizza.jpg.

Figure 69: Original image
none.jpg.

Figure 70: Vertical mask
$r = 20, \sigma=10$.

Figure 71: None pizza with left beef
$r = 3, \sigma=1$.

Jianguo of the cliffs

Figure 72: Original image
jianguo_couch.jpg.

Figure 73: Original image
cliff.jpg.

Figure 74: Irregular mask for Jianguo made using Photoshop
$r = 3, \sigma=1$.

Figure 75: Jianguo of the cliffs
$r = 3, \sigma=1$.

Lunazilla

Figure 76: Original image
luna_night.jpg.

Figure 77: Original image
city.jpg.

Figure 78: Irregular mask for Luna made using Photoshop
$r = 3, \sigma=1$.

Figure 79: Lunazilla
$r = 3, \sigma=1$.

Conclusion

I greatly enjoyed working on this project and gained valuable insights, particularly in how the manipulation of image frequencies can be creatively applied in visual art.