Homework 4: Blending

Introduction

In this homework assignment, we will put together a pyramid blending pipeline that turns the following three images into a blended image. (You will also be required to create a blend using your own images!) This assignment is to be done individually. You may ask others for help on Ed Discussions, but you may NOT use other people's code apart from the code we have provided to you.

Consider the following two images and a 0/1 vertical mask:

This is the output of blending the above two images with the mask.

Before we get started, download the assignment archive from Canvas > Files > Homeworks. It contains the files you will be working with.

Once you have extracted the files, you are ready to get started.

Above you can see the sample images, which are provided for you in the images/source/sample directory. Every time you run the code, it will look inside each folder under images/ and attempt to find a folder that has images with filenames that contain 'white', 'black' and 'mask'. Once it finds a folder that contains three images with those respective names, it will apply a blending procedure to them and save the output to images/output/. Along with the output image, it will create visualizations of the Gaussian and Laplacian pyramids used in the process.

The blending procedure takes two images and the mask and splits them into their red, green, and blue channels. It then blends each channel separately. You do not have to worry about dealing with three channels; you can assume your functions take in grayscale images (as we have always done).
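The per-channel handling is done for you by the provided code, but the idea can be sketched roughly as follows. The names apply_per_channel and blend_gray are hypothetical and are not taken from the assignment files; the real driver code may be organized differently.

```python
import numpy as np

def apply_per_channel(white, black, mask, blend_gray):
    """Blend two H x W x 3 images one channel at a time.

    blend_gray is a hypothetical callable that takes three 2-D
    (grayscale) arrays and returns a 2-D array.
    """
    channels = []
    for c in range(3):
        channels.append(blend_gray(white[:, :, c], black[:, :, c], mask[:, :, c]))
    # Stack the three blended channels back into an H x W x 3 image.
    return np.dstack(channels)

# Toy usage: a stand-in "blend" that simply averages the two images.
white = np.full((4, 4, 3), 200.0)
black = np.full((4, 4, 3), 100.0)
mask = np.zeros((4, 4, 3))
result = apply_per_channel(white, black, mask, lambda w, b, m: (w + b) / 2)
```

Because each call only ever sees 2-D arrays, the functions you write never need to know that the original images were in color.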

The code will construct a Laplacian pyramid for each of the two images. It will then scale the mask image to the range [0,1] and construct a Gaussian pyramid for it. Finally, it will blend the two Laplacian pyramids and collapse the result into the output image.

Pixel values of 255 in the mask image are scaled to the value 1 in the mask pyramid, and assign the strongest weight to the image labeled 'white' during blending. Pixel values of 0 in the mask image are scaled to the value 0 in the mask pyramid, and assign the strongest weight to the image labeled 'black' during blending.
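As a small illustration of the mask scaling described above, the sketch below builds a toy 0/255 mask and converts it to blend weights. The toy mask and the variable names are assumptions made for illustration; the provided code performs this step for you.

```python
import numpy as np

# A toy 0/255 vertical mask: left half white (255), right half black (0).
mask = np.zeros((4, 6), dtype=np.uint8)
mask[:, :3] = 255

# Scale to floats in [0, 1] so the values can be used as blend weights.
mask_weights = mask.astype(float) / 255.0
```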

In order to facilitate this process, you will be providing six (6) key functions throughout this programming assignment that are the building blocks of blending two images.

Part 0: Reduce and Expand functions.

See Module 04-03 on Pyramids. Alternatively, you can refer to the original Burt & Adelson paper, "The Laplacian Pyramid as a Compact Image Code".

As with previous assignments, running assignment4_test.py directly will apply a unit test to your code and print out helpful feedback. You can use this to debug your functions.

reduce

This function takes an image, convolves it, and then subsamples it down to a quarter of the size (dividing the height and width by two). Note that we say subsample here. We recommend you look into numpy indexing techniques to accomplish the subsampling (essentially you want to index every other row and every other column).
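The subsampling step on its own can be illustrated with basic numpy slicing. This is only a sketch of the indexing technique on a toy array. It omits the convolution, and starting at index 0 is an assumption, so check it against the unit tests.

```python
import numpy as np

image = np.arange(36, dtype=float).reshape(6, 6)

# Basic slice syntax is start:stop:step. A step of 2 keeps every other
# row and every other column, starting from index 0.
subsampled = image[::2, ::2]
```

Here a 6x6 array becomes a 3x3 array, and subsampled[1, 1] is the original image[2, 2].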

Within the code, you are provided with a generating_kernel(a) function. This function takes a floating point number a, and returns a 5x5 generating kernel. For the reduce and expand functions, you should use a = 0.4. This parameter will generate a kernel that is a close approximation of a Gaussian.
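The provided generating_kernel is not reproduced in this handout. As a rough idea of what such a function can look like, the sketch below follows the separable 5-tap form from the Burt & Adelson paper. The name generating_kernel_sketch is hypothetical, and the provided implementation may differ in detail, so always call the provided function in your solution.

```python
import numpy as np

def generating_kernel_sketch(a):
    # 1-D weights from Burt & Adelson: [1/4 - a/2, 1/4, a, 1/4, 1/4 - a/2].
    w = np.array([0.25 - a / 2.0, 0.25, a, 0.25, 0.25 - a / 2.0])
    # The 5x5 kernel is the outer product of the 1-D weights with themselves.
    return np.outer(w, w)

kernel = generating_kernel_sketch(0.4)
```

With a = 0.4 the 1-D weights are [0.05, 0.25, 0.4, 0.25, 0.05]. They sum to 1, so the 5x5 kernel also sums to 1 and smoothing does not change the overall brightness away from the borders.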

Since we have already covered how convolution works in class, for this assignment we allow you to use the scipy library implementation of convolve.

scipy.signal.convolve2d(image, kernel, 'same')

This call will convolve the image and kernel, and return an array of the 'same' size as image.
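The behavior of the 'same' mode can be checked on a toy input. The sketch below uses a stand-in 5x5 averaging kernel rather than the generating kernel, purely to keep the numbers easy to read.

```python
import numpy as np
import scipy.signal

image = np.ones((6, 8))
kernel = np.full((5, 5), 1.0 / 25.0)  # stand-in 5x5 averaging kernel

# mode='same' returns an array with the same shape as the first argument.
# With the default boundary handling, pixels outside the image count as 0.
smoothed = scipy.signal.convolve2d(image, kernel, 'same')
```

Interior pixels stay at 1.0, while pixels near the border come out darker because part of the kernel falls on the zero padding.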

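Alternatively, as our kernel is symmetrical, you may instead use the cv2.filter2D function, as long as you set the border mode to constant (cv2.BORDER_CONSTANT). filter2D computes a correlation rather than a convolution, but the two give the same result for a symmetric kernel.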

expand

This function takes an image and supersamples it to four times the area (multiplying the height and width by two (2)). After increasing the size, we have to interpolate the missing values by using the same convolution we used in reduce, and lastly we scale our output by 4.

Some tips on how to supersample an image:

Create an image that is twice the size of the input.
Review your reduce code. Look at what you did to get your output.
Instead of choosing every other row / col in reduce for your output, can you assign every other row / col in expand for your output?
As stressed above, look into numpy indexing. A key thing to note is the basic slice syntax, think about how you can use that to your advantage.
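The slice-assignment idea from the tips above can be sketched on a toy 2x2 array. This shows only the placement step; the convolution and the scaling are left out, and writing to the even indices is an assumption you should verify against the unit tests.

```python
import numpy as np

image = np.array([[1.0, 2.0],
                  [3.0, 4.0]])

# Allocate an output with twice the height and width, filled with zeros.
upsampled = np.zeros((2 * image.shape[0], 2 * image.shape[1]))

# Slice assignment: write the input into every other row and column.
upsampled[::2, ::2] = image
```

The same slice expression that reads values in reduce is used here on the left-hand side of an assignment to write them.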

For this part of the assignment, please use the generating kernel with a = 0.4 and the convolve2d function from the reduce function discussion above.

Part 1: Gaussian and Laplacian Pyramids

In this part of the assignment, you will be implementing functions that create Gaussian and Laplacian pyramids. As usual, use assignment4_test.py to test your code. In addition, assignment4_test.py defines the functions viz_gauss_pyramid and viz_lapl_pyramid, which take a pyramid as input and return an image visualization. You might want to use these functions to visualize your pyramids while debugging -- we use these at the end of the testing to output your results!

gaussPyramid

This function takes an image and builds a pyramid out of it. The first layer of this pyramid is the original image, and each subsequent layer of the pyramid is the reduced form of the previous layer. Put simply, you are iteratively calling the reduce function on the output of the previous call, with the first call simply being the input image.

Please use the reduce function that you implemented in the previous part in order to write this function.
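The iterative structure described above can be sketched as a short loop. The function name, the levels parameter, and the reduce_fn argument are assumptions made for illustration; follow the signature and doc string in assignment4.py for the real function. The toy reduce used here only subsamples and does no smoothing.

```python
import numpy as np

def gauss_pyramid_sketch(image, levels, reduce_fn):
    """Return a list of levels + 1 arrays, largest first."""
    pyramid = [image.astype(float)]
    for _ in range(levels):
        # Each new layer is the reduced form of the previous layer.
        pyramid.append(reduce_fn(pyramid[-1]))
    return pyramid

# Stand-in for reduce: subsampling only, no smoothing (illustration only).
toy_reduce = lambda img: img[::2, ::2]
pyramid = gauss_pyramid_sketch(np.ones((16, 16)), 3, toy_reduce)
shapes = [layer.shape for layer in pyramid]
```

For a 16x16 input and three levels, the layer shapes are 16x16, 8x8, 4x4, and 2x2.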

laplPyramid

This function takes a Gaussian pyramid constructed by the previous function, and turns it into a Laplacian pyramid. The doc string contains further information about the operations you should perform for each layer.

Like with Gaussian pyramids, Laplacian pyramids are represented as lists of numpy arrays in the code.

Please use the expand function that you implemented in the previous part of the code in order to write this function.
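The doc string is the authoritative description of this function. As a rough sketch of the usual Burt & Adelson construction, each Laplacian layer is the Gaussian layer minus the expanded next-smaller Gaussian layer, and the smallest layer is copied unchanged. The function name, the expand_fn argument, and the cropping step are assumptions, and the toy expand below simply repeats pixels rather than interpolating.

```python
import numpy as np

def lapl_pyramid_sketch(gauss_pyr, expand_fn):
    lapl = []
    for i in range(len(gauss_pyr) - 1):
        expanded = expand_fn(gauss_pyr[i + 1])
        # Crop in case expanding overshoots an odd-sized layer by one row/column.
        h, w = gauss_pyr[i].shape
        lapl.append(gauss_pyr[i] - expanded[:h, :w])
    # The smallest layer is copied over unchanged.
    lapl.append(gauss_pyr[-1])
    return lapl

# Stand-in for expand: pixel repetition (illustration only).
toy_expand = lambda img: np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)
gauss_pyr = [np.full((5, 5), 7.0), np.full((3, 3), 7.0), np.full((2, 2), 7.0)]
lapl_pyr = lapl_pyramid_sketch(gauss_pyr, toy_expand)
```

For this constant toy image, every difference layer is zero and only the smallest layer keeps the value 7, which is a handy sanity check when debugging.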

Part 2: Writing the blend and collapse functions.

In this part, you will be completing the pipeline by writing the actual blend function, and creating a collapse function that will allow us to convert our Laplacian pyramid into an output image.

As always, you can use assignment4_test.py to test your code.

blend

This function takes three pyramids:

white - a Laplacian pyramid of an image.
black - a Laplacian pyramid of another image.
mask - a Gaussian pyramid of a mask image.

It should perform an alpha-blend of the two Laplacian pyramids according to the mask pyramid. So you will be blending each pair of layers together using the mask of that layer as the weight.

As described in the doc string, pixels where the mask is 1 should be taken from the white image, pixels where the mask is 0 should be taken from the black image. Pixels with value 0.5 in the mask should be an equal blend of the white and black images. This is mathematically described in the assignment comments.

You may assume that all of the provided pyramids are of the same dimensions, and have dtype float. You may further assume that the mask pyramid has values in the range [0,1].

Your output pyramid should be of the same dimensions as all the inputs, and dtype float.
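The weighting described above is consistent with a standard per-pixel alpha blend. The sketch below applies it to a single layer; the function name is hypothetical, and the formula in the assignment comments remains the reference.

```python
import numpy as np

def blend_layer_sketch(white, black, mask):
    # mask = 1 -> white, mask = 0 -> black, mask = 0.5 -> equal mix.
    return mask * white + (1.0 - mask) * black

white_layer = np.full((2, 3), 10.0)
black_layer = np.full((2, 3), 20.0)
mask_layer = np.array([[1.0, 0.5, 0.0],
                       [1.0, 0.5, 0.0]])
blended = blend_layer_sketch(white_layer, black_layer, mask_layer)
```

Each row of the result is [10, 15, 20]: pure white where the mask is 1, an even mix where it is 0.5, and pure black where it is 0. Repeating this for every layer yields the blended pyramid.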

collapse

This function is given a Laplacian pyramid and is expected to 'flatten' it to an image.

We need to take the bottom (smallest) layer, expand it, and then add it to the next layer. We continue this process until we reach the top of the pyramid. Once you add the expanded second layer to the top layer, the result is your output. (A common mistake is to add the top layer twice; if your values look about twice as large as the expected output, that is what is happening.)

This function should return a numpy array of the same shape as the top (largest) layer of the pyramid, and dtype float.
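The expand-and-add loop described above can be sketched as follows. The function name, the expand_fn argument, and the cropping step are assumptions, and the toy expand simply repeats pixels, so this illustrates the control flow rather than the assignment's exact arithmetic.

```python
import numpy as np

def collapse_sketch(lapl_pyr, expand_fn):
    # Start from the smallest layer and work up to the largest.
    result = lapl_pyr[-1]
    for layer in reversed(lapl_pyr[:-1]):
        expanded = expand_fn(result)
        # Crop so the expanded image matches the next (possibly odd-sized) layer.
        h, w = layer.shape
        result = expanded[:h, :w] + layer
    return result

# Stand-in for expand: pixel repetition (illustration only).
toy_expand = lambda img: np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)
lapl_pyr = [np.zeros((5, 5)), np.zeros((3, 3)), np.full((2, 2), 7.0)]
image = collapse_sketch(lapl_pyr, toy_expand)
```

Each layer is added exactly once inside the loop, which avoids the double-counting mistake mentioned above. For this toy pyramid the result is a 5x5 image filled with 7.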

The Writeup

This is what we want you to do for the PDF.

Choose two images (black/white) that you would like to blend.
Create a unique mask (don't use the one we provide) and explain how you created it.
Demonstrate these three images in the PDF.
Explain what you did to tackle each function. Note: For the expand function, we also want you to explicitly state why you have to scale your output by 4.
Add anything else you wish to note. Feel free to include the pyramid outputs if you found them interesting.

What to turn in:

assignment4.py - Your code.
assignment4.pdf - See above for the writeup, don't forget to include your images!

How we will grade your submission

Part 0:
- reduce: 10%
- expand: 10%
Part 1:
- gaussPyramid -> laplPyramid: 30%
Part 2:
- blend and collapse: 40%
  - finished image correctly blended
  - if no actual blend occurs 40 points will be deducted
Writeup: 10%
- Why'd you choose your images?
- How'd you make your mask?
- For each function explain what you're doing