Fall 2024 CS 180 Project 2

Fun with Filters and Frequencies!

By Yiyu Chen

In this project, I applied various image processing techniques, starting with the exploration of 2D convolutions through finite difference operators to enhance edge detection in images. I then used Gaussian filters to smooth images before subtracting these from the original to sharpen image details. Further, I experimented with creating hybrid images that appear differently at varying distances and concluded with multi-resolution blending to integrate different images seamlessly. This report presents the methods I employed, the results obtained, and the insights I gained from these experiments.

Part 1: Fun with Filters

In this part, I used the \(D_x\) and \(D_y\) finite difference operators, implemented as convolution kernels, to detect edges by highlighting changes in pixel intensity along the horizontal and vertical axes, respectively. These operators approximate the first derivative in each direction, which is what reveals the boundaries where pixel intensities change significantly.

\(D_x\) kernel: This kernel, typically represented as \( \begin{bmatrix} -1 & 1 \end{bmatrix} \), is applied horizontally across the image. It computes the difference in intensity between adjacent pixels along the x-direction. When convolved with the image, this kernel emphasizes edges that are predominantly vertical by highlighting horizontal gradients.
\(D_y\) kernel: Similarly, this kernel, usually displayed as \( \begin{bmatrix} 1 \\ -1 \end{bmatrix} \) , is applied vertically. It measures the intensity difference between adjacent pixels along the y-direction, thus accentuating edges that are primarily horizontal by capturing vertical gradients.
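The behavior of the two kernels can be checked on a tiny synthetic image. The snippet below is a minimal NumPy sketch, not the code used for the report: the helper `conv2d_valid`, the 6x6 test image with a single vertical edge, and the gradient-magnitude step are all illustrative assumptions.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """True 2D convolution (kernel flipped), keeping only fully overlapping positions."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    flipped = kernel[::-1, ::-1]
    out = np.zeros((oh, ow))
    for i in range(kh):
        for j in range(kw):
            out += flipped[i, j] * image[i:i + oh, j:j + ow]
    return out

# Finite difference kernels as written in the text.
D_x = np.array([[-1.0, 1.0]])    # 1x2, differences along x
D_y = np.array([[1.0], [-1.0]])  # 2x1, differences along y

# Hypothetical test image: dark left half, bright right half (one vertical edge).
img = np.zeros((6, 6))
img[:, 3:] = 1.0

gx = conv2d_valid(img, D_x)  # shape (6, 5): nonzero only at the vertical edge
gy = conv2d_valid(img, D_y)  # shape (5, 6): zero everywhere, no horizontal edge

# Crop both responses to a common 5x5 grid before combining them.
grad_mag = np.sqrt(gx[:-1, :] ** 2 + gy[:, :-1] ** 2)
```

Only `gx` responds here, which matches the description above: a vertical edge is a horizontal change in intensity. The sign of the response depends on the kernel orientation and on whether convolution or correlation is used, so the magnitude is the more useful quantity for an edge map.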

Why this method is effective for edge detection: It directly targets changes in brightness, which define edges within an image. Using separate horizontal and vertical kernels detects edges in both orientations, which makes the approach versatile across many kinds of images. By emphasizing areas where adjacent pixel values shift significantly, the technique delineates boundaries and shapes, which is crucial for tasks such as object recognition and scene understanding.

cameraman after gradient — The image after applying gradient

However, the result is quite noisy: it looks like a collection of white dots rather than the edges we want. To solve this problem, I applied a Gaussian filter before taking the gradient.

cameraman_blur — The image after Gaussian blur

cameraman_blur_gradient — The image applied both Gaussian and gradient

Compared with the simple finite difference operator, the most significant difference when using the Derivative of Gaussian (DoG) filter is that far fewer isolated noise dots appear. The Gaussian filter acts as a smoothing operator that suppresses high-frequency noise before the derivative is taken. The resulting edges are smoother and less fragmented than those produced by the finite difference operators alone. The DoG approach captures the important edge details while suppressing unnecessary noise, which makes it the better choice when clean edge detection is crucial.
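The noise-suppression effect can be measured on a synthetic example. The sketch below is an illustration under stated assumptions rather than the report's code: the noisy step image, the 7x7 kernel with sigma 1.5, and the helpers `conv2d_valid` and `gaussian_kernel` are all hypothetical choices.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """True 2D convolution (kernel flipped), keeping only fully overlapping positions."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    flipped = kernel[::-1, ::-1]
    out = np.zeros((oh, ow))
    for i in range(kh):
        for j in range(kw):
            out += flipped[i, j] * image[i:i + oh, j:j + ow]
    return out

def gaussian_kernel(size, sigma):
    """2D Gaussian kernel built as the outer product of a normalized 1D Gaussian."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    g /= g.sum()
    return np.outer(g, g)

# Hypothetical input: a vertical step edge plus Gaussian noise.
rng = np.random.default_rng(0)
img = np.zeros((32, 32))
img[:, 16:] = 1.0
noisy = img + rng.normal(0.0, 0.1, img.shape)

D_x = np.array([[-1.0, 1.0]])
G = gaussian_kernel(7, 1.5)

raw_gx = conv2d_valid(noisy, D_x)                      # derivative of the noisy image
smooth_gx = conv2d_valid(conv2d_valid(noisy, G), D_x)  # blur first, then derivative

# Response in a flat region far from the edge: ideally zero, so any spread is noise.
raw_noise = raw_gx[:, :8].std()
smooth_noise = smooth_gx[:, :5].std()
print(raw_noise, smooth_noise)
```

In the flat region the true gradient is zero, so the standard deviation there is a direct measure of how much noise leaks into the edge map. Blurring first makes that spread much smaller, while the edge itself still produces a clear response.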

After this, instead of doing three consecutive convolutions, I applied a single convolution with a kernel that is itself the convolution of the three kernels. Because the image is much larger than the kernels, each pass over the image is the expensive step, so combining the small kernels first and convolving the image only once improves efficiency. Mathematically, the associative property of convolution guarantees that the grouping of the operations does not affect the final result: convolving the image with a Gaussian and then with a derivative filter is equivalent to convolving it once with the combined DoG filter. This simplifies the process while maintaining accuracy in edge detection. I verified this by visually comparing the results, which confirmed that the DoG approach matches the sequential application of the Gaussian and derivative filters.
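The equivalence can also be checked numerically rather than only by eye. The following is a minimal sketch, assuming full (zero-padded) convolution, for which associativity holds exactly. It combines only two kernels, a Gaussian and \(D_x\), and the helper names, kernel size, and random test image are illustrative assumptions.

```python
import numpy as np

def conv2d_full(image, kernel):
    """Full 2D convolution with implicit zero padding."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih + kh - 1, iw + kw - 1))
    for i in range(kh):
        for j in range(kw):
            out[i:i + ih, j:j + iw] += kernel[i, j] * image
    return out

def gaussian_kernel(size, sigma):
    """2D Gaussian kernel built as the outer product of a normalized 1D Gaussian."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    g /= g.sum()
    return np.outer(g, g)

rng = np.random.default_rng(0)
img = rng.random((20, 20))  # hypothetical stand-in for a grayscale image
G = gaussian_kernel(5, 1.0)
D_x = np.array([[-1.0, 1.0]])

# Two passes over the image: blur, then differentiate.
two_pass = conv2d_full(conv2d_full(img, G), D_x)

# One pass over the image: combine the small kernels first.
dog_x = conv2d_full(G, D_x)  # 5x6 derivative-of-Gaussian kernel
one_pass = conv2d_full(img, dog_x)

max_diff = np.abs(two_pass - one_pass).max()
print(max_diff)
```

The kernel-with-kernel convolution touches only a handful of values, so the cost is dominated by the single pass over the image. With other boundary modes (for example cropping to the input size after each step) the two results can differ near the borders, which is worth keeping in mind when comparing outputs.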

Exploring the Effects of \(D_x\) and \(D_y\) Kernels:

\(D_x\) kernel: This kernel emphasizes changes in intensity along the horizontal axis, so it detects vertical edges, where pixel values have a significant horizontal gradient. Applied to an image, it highlights vertical structures, responding strongly to vertical lines that differ sharply in brightness from their surroundings.
\(D_y\) kernel: Conversely, this kernel detects changes along the vertical axis and highlights horizontal edges. It is sensitive to vertical discontinuities in brightness, which makes it well suited to bringing out horizontal structures in the image.

cameraman_Dx — The image only applied \(D_x\)

cameraman_Dy — The image only applied \(D_y\)

The core functionality of these kernels can be likened to recognizing patterns in the image that match their structure. \(D_x\) and \(D_y\) act as templates that respond to specific edge orientations: vertical and horizontal, respectively. When the kernels are convolved with the image, areas that closely resemble a kernel's pattern produce larger responses, which identifies the edges matching that kernel's orientation.

More Examples:

elevator_gradient — The image only applied gradient

elevator_Dx_Dy — The image applied both \(D_x\) and \(D_y\)

elevator_Dx — The image only applied \(D_x\)

elevator_Dy — The image only applied \(D_y\)

Part 2: Fun with Frequencies

Part 2.1: Image "Sharpening"

In this section, I implemented the unsharp masking technique to enhance image sharpness. The process subtracts a Gaussian-blurred version of the image from the original to isolate the high-frequency details, then adds those details back to the original so that textures and edges stand out. The result is a visually sharper image.
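Unsharp masking can be written in a few lines. This is a minimal sketch under assumptions, not the report's implementation: the function names, the edge padding, the kernel radius of three sigma, and the strength parameter `alpha` are all hypothetical choices. Only sigma = 1 comes from the figure caption.

```python
import numpy as np

def gaussian_blur(image, sigma):
    """Separable Gaussian blur with edge padding; the output keeps the input shape."""
    radius = int(np.ceil(3 * sigma))
    ax = np.arange(-radius, radius + 1)
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    g /= g.sum()
    padded = np.pad(image, radius, mode="edge")
    # Filter along rows, then along columns (the 2D Gaussian is separable).
    rows = np.apply_along_axis(lambda r: np.convolve(r, g, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, g, mode="valid"), 0, rows)

def unsharp_mask(image, sigma=1.0, alpha=1.0):
    """Add alpha times the high-frequency residual back onto the image."""
    high_freq = image - gaussian_blur(image, sigma)
    return image + alpha * high_freq

# Hypothetical test image: a vertical step edge with values in [0, 1].
img = np.zeros((16, 32))
img[:, 16:] = 1.0
sharp = unsharp_mask(img, sigma=1.0, alpha=1.5)
```

Flat regions are left unchanged, while values on either side of the edge overshoot the original range, which is what makes the edge look crisper. For display, the result would typically be clipped back to the valid intensity range, for example with `np.clip(sharp, 0, 1)`.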

taj with Gaussian Kernel — Sharpen with Gaussian Kernel (Sigma = 1)

For the evaluation, I selected a sharply detailed image of the ceiling of the Paris Opera House, applied a Gaussian blur to simulate a loss of sharpness, and then used the unsharp masking technique to attempt to restore its clarity.

Original vs. Sharpened Image: The sharpened image shows better edge definition and texture clarity than the blurred version. However, it does not fully regain the sharpness of the original, which is particularly evident in the delicate, finely detailed areas of the image: the blur discards fine detail, and sharpening can only amplify what remains. The sharpening process also introduced some artifacts and amplified noise.

ceiling-blur-sharp — Sharpen Blurred Image

Part 2.2: Hybrid Images

In this section, I explore hybrid images, whose interpretation changes with viewing distance. A hybrid image combines the high-frequency details of one image with the low-frequency components of another, producing a composite with two interpretations. Due to the uneven distribution of photoreceptor cells in the human retina, a viewer up close perceives the high-frequency component, while from a distance the low-frequency component dominates. Creating one requires careful frequency separation and blending: a low-pass filter smooths one image, a high-pass filter accentuates the details of the other, and the two results are then combined.
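The low-pass, high-pass, and combine steps can be sketched as follows. This is an illustration under assumptions, not the report's code: the function names, the two cutoff parameters `sigma_low` and `sigma_high`, and the synthetic ramp and stripe images are hypothetical, and the high-pass filter is assumed to be the image minus its Gaussian blur.

```python
import numpy as np

def gaussian_blur(image, sigma):
    """Separable Gaussian blur with edge padding; the output keeps the input shape."""
    radius = int(np.ceil(3 * sigma))
    ax = np.arange(-radius, radius + 1)
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    g /= g.sum()
    padded = np.pad(image, radius, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, g, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, g, mode="valid"), 0, rows)

def hybrid_image(im_low, im_high, sigma_low, sigma_high):
    """Low frequencies of im_low plus high frequencies of im_high (same-shape grayscale)."""
    low = gaussian_blur(im_low, sigma_low)                # low-pass
    high = im_high - gaussian_blur(im_high, sigma_high)   # high-pass as a residual
    return low + high

# Hypothetical inputs: a slow horizontal ramp and fine alternating stripes.
y, x = np.mgrid[0:32, 0:32]
ramp = x / 31.0
stripes = (x % 2).astype(float)

hybrid = hybrid_image(ramp, stripes, sigma_low=4.0, sigma_high=2.0)
```

The two sigma values act as the cutoff frequencies, so they usually need to be tuned per image pair. For display, the sum may need clipping or rescaling because the high-pass residual contains negative values.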

Mona Lisa & Girl with a Pearl Earring

The leftmost image is zoomed out to simulate viewing from a distance.

I created a hybrid image from two iconic artworks: "Mona Lisa," which was low-pass filtered to retain its low-frequency components, and "Girl with a Pearl Earring," from which the high-frequency details were extracted. The FFT (Fast Fourier Transform) images visualize the frequency content of each image by converting it from the spatial domain to the frequency domain. The observations below come from the FFT at each stage of processing.

FFT of Input Image 1 & 2: Since both inputs are oil paintings, their FFTs look very similar. Each shows the typical pattern of a bright center with energy spreading radially outward, which indicates strong low frequencies from the smoother areas mixed with higher frequencies from the finer details and textures.
FFT of Filtered Image 1 (The Girl with a Pearl Earring - High Pass Filtered): The third FFT shows a significant presence of high frequencies compared to the one above it, indicative of the removal of low-frequency components.
FFT of Filtered Image 2 (Mona Lisa - Low Pass Filtered): In contrast to the third image, this FFT shows pronounced central brightness with a diminished radial spread, indicating that the low-pass filter retained the low-frequency content while suppressing the higher frequencies.
FFT of Hybrid Image: The final FFT image represents the direct combination of the low and high-frequency components from the filtered images.
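A common way to produce FFT images like the ones discussed above is to plot the log-magnitude of the centered 2D spectrum. The sketch below assumes that convention. The function name `log_spectrum`, the small epsilon, and the random test image are illustrative assumptions, not details taken from the report.

```python
import numpy as np

def log_spectrum(gray):
    """Log-magnitude of the 2D FFT with the zero frequency shifted to the center."""
    spectrum = np.fft.fftshift(np.fft.fft2(gray))
    return np.log(np.abs(spectrum) + 1e-8)  # epsilon avoids log(0)

# Hypothetical grayscale image with non-negative intensities.
rng = np.random.default_rng(0)
img = rng.random((32, 32))

spec = log_spectrum(img)
peak = np.unravel_index(np.argmax(spec), spec.shape)
print(peak)
```

For an image with non-negative intensities the brightest point is the zero-frequency term, which `fftshift` places at the center. That is the bright center described above, and a low-pass filtered image concentrates even more of its energy around it.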

Color Image

When I moved from processing grayscale images to combining color ones, I found that the key challenge is managing the frequency components effectively, because color conveys richer information. In particular, the Gaussian blur's sigma parameter needs careful adjustment. A higher sigma is often required to suppress high-frequency details that would otherwise become overly pronounced and disrupt the visual coherence of the color result. This adjustment ensures that adding color enhances rather than obscures the image's underlying structures, maintaining clarity while giving smooth transitions between color tones. In addition, the image that supplies the high-frequency details tends to carry less color information and mostly edges. Because it is obtained by subtraction, it also needs a higher sigma so that more information is retained.

Image selection is also crucial. My initial idea was to keep the low-frequency part of Girl with a Pearl Earring and use the high-frequency part of Mona Lisa, because the blue in Girl with a Pearl Earring is very distinctive. In practice, however, this blue strongly interferes with the visibility of the high-frequency part. The high brightness contrast between the figure and the background in that painting also has a large impact. Even after converting Girl with a Pearl Earring to grayscale, the hybrid is still not as good as when Mona Lisa serves as the low-frequency part. Another important factor is that the high-frequency information in Mona Lisa is relatively bland and not very prominent.

The two images below are the best results I obtained after trying many different cutoff-frequency combinations, but the high-frequency part (Mona Lisa) still does not come through well.