Fall 2024 CS 180 Project 1

Images of the Russian Empire: Colorizing the Prokudin-Gorskii photo collection

Example Image: Church
Displacement: green:(0, 25), red:(-4, 58)

In this project, I implemented image alignment using an image pyramid combined with Normalized Cross-Correlation (NCC) as the evaluation metric. The algorithm is recursive: each level of the pyramid is downsampled from the previous level, halving both the width and the height so that each layer has one-quarter the pixels of the layer before it.
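The two building blocks described above can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes the zero-mean form of NCC and a simple 2×2 block average for downsampling, and the names `ncc` and `downscale_half` are hypothetical.

```python
import numpy as np

def ncc(a, b):
    """Zero-mean normalized cross-correlation of two same-shape arrays.

    Returns a value in [-1, 1]; 1 means the arrays match up to a
    positive gain and an offset. (Zero-mean form is an assumption.)
    """
    a = a.astype(np.float64) - a.mean()
    b = b.astype(np.float64) - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def downscale_half(img):
    """Halve width and height by averaging each 2x2 block of pixels."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2  # drop an odd last row/column
    img = img[:h, :w].astype(np.float64)
    return (img[0::2, 0::2] + img[1::2, 0::2] +
            img[0::2, 1::2] + img[1::2, 1::2]) / 4.0
```

Averaging before subsampling acts as a crude low-pass filter; a Gaussian blur followed by subsampling would be another reasonable choice.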

The alignment process begins at the coarsest level of the pyramid, where the images have been downscaled to less than 200×200 pixels. At this level, an exhaustive search over a 31×31 window of candidate displacements takes very little time. NCC is used at each level to determine the displacement vector that best aligns the two images. Once the best vector is found, it is doubled and passed to the next finer level. At each successive level, the algorithm refines the search within a window centered on the vector passed from the previous layer, with a search radius of ±r (I used r = 4). This recursive refinement continues until the finest level is reached, where the final alignment vector is determined. On average, each image was processed in about 28 seconds, including the two alignments of different color channels. This approach balances computational efficiency against alignment precision.
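The coarse-to-fine search described above might look roughly like the sketch below. It is written under several assumptions that are not stated in the report: displacements are (dx, dy) tuples, shifts are applied with `np.roll` (which wraps pixels around the border instead of cropping them), and the function and parameter names are hypothetical. The default radii of 15 and 4 correspond to the 31×31 coarse window and the ±4 refinement window.

```python
import numpy as np

def ncc(a, b):
    """Zero-mean normalized cross-correlation (assumed form of NCC)."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def half(img):
    """Halve width and height by averaging each 2x2 block."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2] +
            img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def search(moving, fixed, center, radius):
    """Score every shift within +-radius of center; return the best (dx, dy)."""
    best, best_score = center, -np.inf
    for dy in range(center[1] - radius, center[1] + radius + 1):
        for dx in range(center[0] - radius, center[0] + radius + 1):
            # np.roll wraps around the border; cropping the borders
            # before scoring is a common alternative.
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            score = ncc(shifted, fixed)
            if score > best_score:
                best, best_score = (dx, dy), score
    return best

def pyramid_align(moving, fixed, min_size=200, coarse_radius=15, fine_radius=4):
    """Return the (dx, dy) shift of `moving` that best matches `fixed`."""
    moving = np.asarray(moving, dtype=np.float64)
    fixed = np.asarray(fixed, dtype=np.float64)
    if max(moving.shape) <= min_size:
        # Coarsest level: exhaustive search around zero displacement.
        return search(moving, fixed, (0, 0), coarse_radius)
    # Solve the half-resolution problem first, then double its answer
    # and refine it in a small window at the current resolution.
    dx, dy = pyramid_align(half(moving), half(fixed),
                           min_size, coarse_radius, fine_radius)
    return search(moving, fixed, (2 * dx, 2 * dy), fine_radius)
```

Under these assumptions, a call such as `pyramid_align(green, blue)` would return the displacement to apply to the green channel. Because every level above the coarsest one evaluates only (2r + 1)² = 81 candidates, the cost of the full-resolution search stays bounded regardless of how large the true displacement is.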

Project Outcome

The recursive alignment algorithm using the image pyramid and NCC was successfully implemented and tested on a diverse set of images. Most images in the dataset were successfully aligned, demonstrating the robustness of this method. However, one image (emir.tif) posed a challenge where the alignment did not converge as expected.

Overall, the project demonstrated that the recursive image pyramid approach combined with NCC provides a powerful and efficient solution for automatic image alignment. It worked effectively in most cases and offered opportunities for refinement in more challenging scenarios.

A Step Further

Apply Feature Detection

The result of edge detection at each layer during pyramid processing
(left: red channel; right: blue channel)

While alignment with Normalized Cross-Correlation (NCC) worked well for most images, the “emir” image presented a challenge due to its unique characteristics. The main subject's clothing is an almost pure blue with intricate patterns, which produces significant differences between the red and blue channels: the clothing is bright in the blue channel but dark in the red channel. When relying solely on NCC to compare brightness, these regions appear nearly inverted between the two channels, making alignment difficult.

To address this issue, I shifted my focus from brightness-based alignment to feature-based alignment. I introduced feature extraction techniques to handle such cases, particularly focusing on aligning the red and blue channels based on image features rather than just intensity values. For this purpose, I incorporated the cv2.Canny function to extract edges in the image.

Initially, I applied edge extraction to the entire image, but the results were unsatisfactory due to the large size and complexity of the image. The edge detection was not precise enough to yield meaningful alignment results. To improve the performance, I integrated edge extraction into each layer of the image pyramid. During the visualization of the pyramid processing, I noticed that edge extraction started to have a positive impact on the alignment, especially in capturing the key features of the main subject.

However, I observed that edges from the surrounding environment were affecting the results. To address this, I adjusted the two thresholds in the Canny edge detection function and cropped parts of the surrounding environment to focus the algorithm on the main subject. This adjustment allowed the algorithm to concentrate more on the key features of the subject, reducing the impact of irrelevant background details.
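The crop-then-extract-edges step can be illustrated with the sketch below. Note that it does not call cv2.Canny: to keep the example dependent on NumPy only, it swaps in a plain gradient-magnitude threshold, which is a much cruder edge detector (single threshold, no non-maximum suppression or hysteresis). The function names, the `keep` fraction, and the `threshold` value are all hypothetical.

```python
import numpy as np

def center_crop(img, keep=0.6):
    """Keep the central `keep` fraction of rows and columns,
    discarding the surrounding environment near the borders."""
    h, w = img.shape
    mh, mw = int(h * (1 - keep) / 2), int(w * (1 - keep) / 2)
    return img[mh:h - mh, mw:w - mw]

def edge_map(img, threshold=0.2):
    """Binary edge map from gradient magnitude.

    A simple stand-in for cv2.Canny, NOT Canny itself: a pixel is
    marked as an edge when its gradient magnitude is at least
    `threshold` times the strongest gradient in the image.
    """
    gy, gx = np.gradient(img.astype(np.float64))  # derivatives along rows, columns
    mag = np.hypot(gx, gy)
    peak = mag.max()
    if peak == 0:
        return np.zeros(img.shape, dtype=np.float64)
    return (mag >= threshold * peak).astype(np.float64)
```

In a pipeline like the one described, both channels would be cropped and converted to edge maps at each pyramid level, and the alignment score would then be computed on the edge maps instead of on raw intensities. Edges mark where brightness changes, not whether a region is bright or dark, so they are less sensitive to a region being bright in one channel and dark in the other.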

Results and Insights

Although this feature-based approach did not result in a perfect solution, it did show some promise. The red and blue channels were better aligned, demonstrating that edge-based feature extraction is a viable direction for further exploration. However, when applying this method to other images, the results were inconsistent, suggesting that this edge-based alignment algorithm still requires significant refinement to achieve reliable performance across different images.

Emir with application of edge detection
Displacement: green:(24, 49), red:(281, 278)
Emir with original NCC
Displacement: green:(24, 49), red:(-200, 0)

White Balance Adjustment

To implement white balance adjustment, I used the minimum and maximum values of each color channel and rescaled the channels to a common range, balancing the overall image.

First, I obtain the maximum and minimum values of each channel. The target range runs from the largest of the per-channel minima to the smallest of the per-channel maxima. I then calculate scaling and offset factors that linearly map the values of each channel onto the target range.
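The rescaling described above could be written along these lines. This is a sketch based only on the description, assuming the image is an H×W×3 array; the function name `white_balance` and the handling of constant channels are my own assumptions.

```python
import numpy as np

def white_balance(img):
    """Linearly map every channel of an (H, W, 3) image onto a shared range.

    The target range runs from the largest per-channel minimum to the
    smallest per-channel maximum.
    """
    img = img.astype(np.float64)
    lo = img.min(axis=(0, 1))          # per-channel minimum, shape (3,)
    hi = img.max(axis=(0, 1))          # per-channel maximum, shape (3,)
    target_lo, target_hi = lo.max(), hi.min()
    # Guard against a constant channel (hi == lo) to avoid dividing by zero.
    span = np.where(hi > lo, hi - lo, 1.0)
    scale = (target_hi - target_lo) / span
    offset = target_lo - lo * scale
    return img * scale + offset        # broadcasts over the channel axis
```

After this mapping, every non-constant channel spans the same interval [target_lo, target_hi], so no channel can dominate the image merely by having a wider intensity range.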

By applying this white balance adjustment, I was able to produce a final output with more visually appealing and balanced colors.

Original Church
Church after Applying White Balance Adjustment
Original Lady
Lady after Applying White Balance Adjustment