This is a series of class projects I completed for the UC Berkeley class CS 194-26 (now called CS 180). These projects took considerable time and effort -- reimplementing ideas from papers, long debugging sessions, and generally intensive work. Below are links to some of them, along with an overview of what each project consisted of.
Sergey Prokudin-Gorskii traveled the Russian countryside in the early 20th century, photographing scenes through color filters onto glass plates. While he did not live to see the negatives aligned into color images, we can align them digitally by minimizing the sum of squared differences between channels. To keep the search tractable, we align at increasing image resolutions via "image pyramids": estimate the shift on a low-resolution approximation, then refine it as the resolution increases.
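As a rough illustration, here is a minimal sketch of that coarse-to-fine search, assuming single-channel numpy arrays; the function names, search radii, and pyramid depth are illustrative choices rather than the project's actual parameters.

```python
import numpy as np

def ssd(a, b):
    return np.sum((a - b) ** 2)

def align(channel, reference, search=15):
    """Exhaustively try displacements in [-search, search]^2 and keep the
    shift of `channel` that minimizes SSD against `reference`."""
    best, best_shift = np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            score = ssd(np.roll(channel, (dy, dx), axis=(0, 1)), reference)
            if score < best:
                best, best_shift = score, (dy, dx)
    return best_shift

def pyramid_align(channel, reference, levels=4):
    """Estimate the shift at a coarse resolution, then fine-tune it as the
    resolution doubles -- the "image pyramid" idea described above."""
    if levels == 0:
        return align(channel, reference)
    # Downsample by 2 (simple striding; blurring first would be safer).
    coarse = pyramid_align(channel[::2, ::2], reference[::2, ::2], levels - 1)
    dy, dx = 2 * coarse[0], 2 * coarse[1]
    shifted = np.roll(channel, (dy, dx), axis=(0, 1))
    fy, fx = align(shifted, reference, search=2)  # small local refinement
    return dy + fy, dx + fx
```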
The basics of signal processing also apply to images, which are just multi-dimensional signals. Low-pass filtering an image blurs it, since finer details such as edges and other rapid color changes live at higher frequencies. This project explores this concept on several images, and also blends pairs of images together at different frequency bands.
The image looks like Ranade from afar, but one can see Sahai up close.
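Both effects come from the same low/high frequency decomposition. Below is a minimal sketch of each, assuming single-channel float images in [0, 1]; the sigma values and the two-band simplification of the blend are my assumptions, not the project's exact setup.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid(far_img, near_img, sigma=6.0):
    """Low frequencies of one image plus high frequencies of another:
    the low-pass content dominates from afar, the high-pass up close."""
    low = gaussian_filter(far_img, sigma)
    high = near_img - gaussian_filter(near_img, sigma)
    return np.clip(low + high, 0, 1)

def two_band_blend(im1, im2, mask, sigma=5.0):
    """Blend low frequencies with a softened mask and high frequencies with
    the hard mask, keeping edges crisp while hiding the seam."""
    low1, low2 = gaussian_filter(im1, sigma), gaussian_filter(im2, sigma)
    soft = gaussian_filter(mask, sigma)
    low = soft * low1 + (1 - soft) * low2
    high = mask * (im1 - low1) + (1 - mask) * (im2 - low2)
    return np.clip(low + high, 0, 1)
```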
By using a common set of annotated keypoints on human faces, one can construct a set of corresponding triangles and warp each triangle from one face toward the other, using a weighted average of the triangle colors to produce an animation similar to the morph effects seen in 1990s-era CGI.
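Here is a minimal sketch of one such morph step for a single triangle pair, using scikit-image for the affine warps; the images are assumed to be grayscale floats, and the triangle arrays are assumed to hold (x, y) keypoint coordinates from the annotation step.

```python
import numpy as np
from skimage.draw import polygon
from skimage.transform import AffineTransform, warp

def morph_triangle(im1, im2, tri1, tri2, t, out):
    """Warp one triangle pair to its shape at time t in [0, 1], then
    cross-dissolve the two warped colors with weight t."""
    tri_t = (1 - t) * tri1 + t * tri2              # intermediate triangle
    rr, cc = polygon(tri_t[:, 1], tri_t[:, 0], out.shape)
    # Affine maps from the intermediate triangle back into each source image
    # (skimage's warp takes the output-to-input map directly).
    to1, to2 = AffineTransform(), AffineTransform()
    to1.estimate(tri_t, tri1)
    to2.estimate(tri_t, tri2)
    w1 = warp(im1, to1, output_shape=out.shape)
    w2 = warp(im2, to2, output_shape=out.shape)
    out[rr, cc] = (1 - t) * w1[rr, cc] + t * w2[rr, cc]
```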
In the first half of this project, we use manually chosen keypoints to merge images into panoramas via homographies. In the second half, we find corresponding keypoints automatically and filter out a useful subset with adaptive non-maximal suppression and RANSAC.
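Here is a minimal numpy sketch of the RANSAC step over four-point homographies, assuming `src` and `dst` are (N, 2) arrays of matched keypoints; the threshold and iteration count are illustrative.

```python
import numpy as np

def fit_homography(src, dst):
    """Direct linear transform: solve for H with dst ~ H @ src (homogeneous)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.array(rows))
    return vt[-1].reshape(3, 3)                  # null-space vector as H

def ransac_homography(src, dst, iters=1000, thresh=2.0):
    """Repeatedly fit H to 4 random matches and keep the H that makes the
    most matches project within `thresh` pixels of their partners."""
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        idx = np.random.choice(len(src), 4, replace=False)
        H = fit_homography(src[idx], dst[idx])
        proj = (H @ np.column_stack([src, np.ones(len(src))]).T).T
        proj = proj[:, :2] / proj[:, 2:3]        # back to inhomogeneous coords
        inliers = np.linalg.norm(proj - dst, axis=1) < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on all inliers of the best model.
    return fit_homography(src[best_inliers], dst[best_inliers]), best_inliers
```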
Keeping with the idea of automatically detecting keypoints, we train a convolutional neural network in PyTorch to predict a set of facial keypoints.
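A minimal PyTorch sketch of such a keypoint-regression network follows; the layer sizes, input resolution (assumed 1x224x224 grayscale), and keypoint count (assumed 68) are illustrative rather than the project's actual architecture.

```python
import torch
import torch.nn as nn

class KeypointNet(nn.Module):
    def __init__(self, num_keypoints=68):
        super().__init__()
        self.num_keypoints = num_keypoints
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 28 * 28, 256), nn.ReLU(),
            nn.Linear(256, num_keypoints * 2),   # (x, y) per keypoint
        )

    def forward(self, x):                        # x: (N, 1, 224, 224)
        return self.head(self.features(x)).view(-1, self.num_keypoints, 2)

# One training step on a hypothetical batch, regressing coordinates with MSE.
model = KeypointNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
images, targets = torch.randn(8, 1, 224, 224), torch.rand(8, 68, 2)
loss = nn.MSELoss()(model(images), targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```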
Using an origami cube, we can easily label keypoints with ginput and track them across frames with CSRT. With these keypoints and homographies, we can project a rudimentary synthetic cube into the video.
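A minimal sketch of that labeling-and-tracking loop, assuming a hypothetical input video `cube.mp4` and an illustrative box size around each clicked point (CSRT requires opencv-contrib-python).

```python
import cv2
import matplotlib.pyplot as plt

cap = cv2.VideoCapture("cube.mp4")               # hypothetical input video
ok, frame = cap.read()

# Click once on a cube corner in the first frame.
plt.imshow(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
x, y = plt.ginput(1)[0]
plt.close()

# CSRT tracks a small box centered on the clicked point.
half = 15                                        # illustrative box half-width
tracker = cv2.TrackerCSRT_create()
tracker.init(frame, (int(x) - half, int(y) - half, 2 * half, 2 * half))

points = []                                      # tracked keypoint per frame
while True:
    ok, frame = cap.read()
    if not ok:
        break
    ok, (bx, by, bw, bh) = tracker.update(frame)
    if ok:
        points.append((bx + bw / 2, by + bh / 2))
cap.release()
```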
We select a back wall and a vanishing point in an image, and use this information to construct a 3D model of the room from the flat photograph.
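One core step is carving the photo into planes by extending rays from the vanishing point through the back wall's corners; below is a hedged sketch for the floor plane only, assuming image coordinates with y increasing downward and rays that exit through the bottom edge.

```python
def extend_to_bottom(vp, corner, height):
    """Follow the ray from the vanishing point through a back-wall corner
    until it hits the bottom image edge (y = height - 1)."""
    (vx, vy), (cx, cy) = vp, corner
    t = (height - 1 - vy) / (cy - vy)            # ray parameter at the edge
    return (vx + t * (cx - vx), height - 1.0)

def floor_quad(vp, bottom_left, bottom_right, height):
    """The floor is the trapezoid between the back wall's bottom edge and
    the points where its corner rays leave the image."""
    return [bottom_left, bottom_right,
            extend_to_bottom(vp, bottom_right, height),
            extend_to_bottom(vp, bottom_left, height)]
```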
From this 3D model, I used pyrender and skvideo to create dynamic camera tours of the constructed room. I had no prior experience with 3D rendering, so this had a steep learning curve (I forgot that positive Y is up in many graphics systems, for example).
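A minimal sketch of such a camera tour, rendering a placeholder trimesh box rather than the reconstructed room; the resolution, path, and frame count are assumptions.

```python
import numpy as np
import trimesh
import pyrender
import skvideo.io

scene = pyrender.Scene()
scene.add(pyrender.Mesh.from_trimesh(trimesh.creation.box(extents=(2, 2, 2))))
scene.add(pyrender.DirectionalLight(intensity=3.0))

camera = pyrender.PerspectiveCamera(yfov=np.pi / 3)
cam_node = scene.add(camera, pose=np.eye(4))

renderer = pyrender.OffscreenRenderer(640, 480)
writer = skvideo.io.FFmpegWriter("tour.mp4")
for z in np.linspace(6.0, 3.0, 90):              # dolly toward the mesh
    pose = np.eye(4)
    pose[:3, 3] = [0.0, 0.0, z]                  # pyrender: +Y up, camera looks down -Z
    scene.set_pose(cam_node, pose)
    color, _ = renderer.render(scene)
    writer.writeFrame(color)
writer.close()
renderer.delete()
```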