CSC 320 - Visual Computing Assignment 4
In this assignment you will implement the PatchMatch algorithm. This algorithm is described in a paper by Barnes et al. and will be discussed in tutorial this week. Your specific task is to complete the technique’s implementation in the starter code. The starter code is based on OpenCV and is supposed to be executed from the command line, not via a Kivy GUI.
Goals: The goals of the assignment are to (1) get you familiar with reading and understanding a research paper and (partially) implementing the technique it describes; (2) learn how to implement a more advanced algorithm for efficient image processing; and (3) understand how the inefficiencies you experienced in matching patches in the previous assignment can be overcome with a randomized technique. This algorithm has since become the workhorse of a number of image manipulation operations and was a key component of Adobe’s Content-Aware Fill feature.
Bonus components: I am not including any explicit bonus components. There are a number of possibilities for extending your implementation, however, that I can discuss with interested students on a one-on-one basis. This includes an iOS/Android implementation, a GPU implementation, or even implementation of editing tools described in Barnes et al’s paper. Please email Kyros if interested. Bonus points will be awarded depending on complexity of the implementation, etc.
Important: As in the previous assignments, you are advised to start immediately by reading the paper (see below). The next step is to run the starter code, and compare it to the output of the reference implementation. Unlike Assignment 3 where you implemented a couple of small functions called by an already-implemented algorithmic main loop, here your task will be to implement the algorithm itself. This requires a much more detailed understanding of the algorithm as well as pitfalls that can affect correctness, efficiency, etc. Expect to spend a fair amount of time implementing after you’ve understood what you have to do.
Testing your implementation: Unlike Assignment 3, it is possible to develop and run your code remotely from Teaching Lab machines.
Starter code & the reference solution
Use the following sequence of commands to unpack the starter code and to display its input arguments:
> cd ~
> tar xvfz patchmatch.tar.gz
> rm patchmatch.tar.gz
> cd patchmatch/code
> python viscomp.py --help
Consult the file patchmatch/code/README 1st.txt for details on how the code is structured and for guidelines about how to navigate it. Its structure should be familiar by now as it is similar to previous assignments. This code was lasted tested and compiled on the teach.cs server using python=3.11. In addition to the starter code, I am providing the output of the reference solution for a pair of test images, along with input parameters and timings.
I am also providing a fully-featured reference solution in an encrypted format (with sourcedefender) to see how your own implementation should behave, and to make sure that your implementation produces the correct output. That being said, you should not expect your implementation to produce exactly the same output as the reference solution as tiny differences in implementation might lead to slightly different results. This is not a concern, however, and the TAs will be looking at your code as well as its output to make sure what you are doing is reasonable.
As I have explained in previous assignments, you should not expect your implementation to produce exactly the same output as the reference solution; tiny differences in implementation might lead to slightly different results. This is not a concern, however, and the TAs will be looking at your code as well as its output to make sure what you are doing is reasonable.
patchmatch/CHECKLIST.txt: Please read this form carefully. You will need to complete this form
prior to submission, and this will be the first file markers look at when grading your assignment.
The PatchMatch Algorithm (100 Marks)
The technique is described in full detail in the following paper (available here):
C. Barnes, E. Shechtman, A. Finkelstein and D. B. Goldman, “PatchMatch: A Randomized Algo-
rithm for Structural Image Editing,” Proc. SIGGRAPH 2009.
You should read Section 1 of the paper right away to get a general idea of the principles behind the method. The problem it tries to address should be familiar to you given that the algorithm you worked with in A3 relied on a “nearest-neighbor search” procedure for identifying similar patches for inpainting. In fact, Criminisi et al’s inpainting algorithm is cited in Section 2 of the paper as a motivation for PatchMatch. You should read Section 2 as well, mainly for context and background.
The algorithm you are asked to implement is described in full in Section 3. The algorithm’s ini- tialization, described in Section 3.1, has already been implemented. Your task is to implement the algorithm’s basic iteration as described in Section 3.2 up to, but not including, paragraph Halting criteria. The starter code uses the terminology of Eq. (1) to make it easier for you to follow along.
For those of you who are more theoretically minded and/or have an interest in computer science theory, it is worth reading Section 3.3. This section is not required for implementation but it does help explain why the algorithm works as well as it does.
Sections 4 and 5 of the paper describe more advanced editing tools that use PatchMatch as a key
component. They are not required for your implementation, and Section 4 in particular requires a
fair amount of background that you currently don’t have. Read these sections if you are interested
in finding out all the cool things that you can do with PatchMatch.
Part 1. Programming Component (90 Marks)
You need to complete to implement the two functions detailed below. A skeleton of both is included in file patchmatch/code/algorithm.py. This file is where your entire implementation will reside.
In addition to these functions, you will need to copy a few lines of code from your A3 imple-
mentation for image reading and writing that are not provided in the starter code. See the file patchmatch/code/README 1st.txt for details.
Part 1.1. The propagation and random search() function (65 Marks)
This function takes as input a source image, a target image, and a nearest-neighbor field f that assigns to each patch in the source the best-matching patch in the target. This field is initially quite poor, i.e., the target patch it assigns to each source patch is definitely not the most similar patch in the target. The goal of the function is to return a new nearest-neighbor field that improves these patch-to-patch correspondences. The function accepts a number of additional parameters that control the algorithm’s behavior. Details about them can be found in the starter code and in the paper itself.
As explained in the paper, the algorithm involves two interleaved procedures, one called random search (50 marks) and the other called propagation (15 marks). You must implement both, within the same function. You are welcome to use helper functions in your implementation but this is not necessary (the reference implementation does not).
The starter code provides two flags that allow you to disable propagation or random search in this
function. As you develop your implementation, you can use these flags for debugging purposes, to
isolate problems related to one or the other procedure.
Part 1.2. The reconstruct source from target() function (15 Marks)
This function re-creates the source image by copying pixels from the target image, as prescribed by the supplied nearest-neighbor field. If this field is of high quality, then the copied pixels will be almost identical to those of the source; if not, the reconstructed source image will contain artifacts. Thus, comparing the reconstructed source to the original source gives you an idea of how good a nearest-neighbor field is.
Details of the function’s input and output parameters are in the starter code.
Part 1.3. Efficiency considerations (10 Marks)
You should pay attention to the efficiency of the code you write. Explicit loops cannot be completely avoided in propagation and random search() but their use should be kept to an absolute minimum. This is necessary to keep the running time of your code to a reasonable level: the input images you are supplied are quite large and it will take a very long time to process them if you use too many loops. No explicit loops are needed in reconstruct source from target().
Solutions that are no more than 50% slower than the reference implementation will receive full marks
for efficiency. Less efficient implementations will have some of those points deducted depending on
how much they deviate from this baseline.
Part 2. Report and Experimental evaluation (10 Marks)
Your task here is to put the PatchMatch to the test by conducting your own experiments. Try it on a variety of pairs of photos; on two adjacent frames of video; on “stereo” image pairs taken by capturing a photo of a static scene and then adjusting your viewpoint slightly (e.g., a few centimeters) to capture a second photo from a different point of view. Basically, run it on enough image pairs to understand when it works well and when it doesn’t.
At the very least, you must show the results of running the algorithm on the supplied source/targetimage pairs, using command-line arguments like those specified in the file test images/jaguar2/README.txt.
咨询 Alpha 小助手,获取更多课业帮助