CS5187 Vision and Image
Time allowed : Two hours
1. This paper consists of 16 questions.
2. Answer ALL questions.
This is an open-book examination.
Students are allowed to use the following materials/aids:
Materials/aids other than those stated above are not permitted. Students will be subject to disciplinary action if any unauthorized materials or aids are found on them.
Academic Honesty
I pledge that the answers in this examination are my own and that I will not seek or obtain an unfair advantage in producing these answers. Specifically,
❖ I will not plagiarize (copy without citation) from any source;
❖ I will not communicate or attempt to communicate with any other person during the examination; neither will I give or attempt to give assistance to another student taking the examination; and
❖ I will use only approved devices (e.g., calculators) and/or approved device models.
❖ I understand that any act of academic dishonesty can lead to disciplinary action.
I pledge to follow the Rules on Academic Honesty and understand that violations may lead to severe penalties.
Answer ALL the questions below
Section A (short questions)
Five marks for each question (5%)
Q1: Please describe two image processing methods that require image interpolation.
Q2: Please explain how the SIFT feature achieves rotation invariance.
Q3: Assume that the physical size of the virtual image plane is unchanged, but the number of pixels used to acquire the photos is reduced to one fourth (1/4). In particular, the number of pixels is halved in both the horizontal and vertical directions. Please explain, with specific reasons, whether the camera intrinsic parameters, which project a scene point in the camera coordinate system to a pixel in the image plane, will change.
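As an illustration of the setting in Q3, the sketch below builds a pinhole intrinsic matrix from a physical focal length, a physical sensor size, and a pixel count. The focal length (8 mm), the sensor size (16 mm × 12 mm), the resolutions, and the choice of the image centre as principal point are all hypothetical values chosen for this example; they do not come from the question.

```python
def intrinsics(focal_mm, sensor_w_mm, sensor_h_mm, width_px, height_px):
    """Pinhole intrinsic matrix K, assuming square axes and no skew.

    The focal length in pixel units is the physical focal length divided by
    the physical pixel size (sensor size / pixel count).
    """
    fx = focal_mm * width_px / sensor_w_mm
    fy = focal_mm * height_px / sensor_h_mm
    # Assumption: the principal point sits at the image centre.
    cx, cy = width_px / 2, height_px / 2
    return [[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]]

# Hypothetical camera: same physical sensor, pixel count halved per axis.
k_full = intrinsics(8.0, 16.0, 12.0, 1600, 1200)
k_half = intrinsics(8.0, 16.0, 12.0, 800, 600)
print(k_full)
print(k_half)
```

Under these assumptions, the entries of K that are expressed in pixel units scale with the pixel count, while the physical focal length itself does not change.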
Q5: Please show that the combination of scale and translation is still an affine transformation.
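A numerical check is not a proof, but it can make the claim in Q5 concrete. The sketch below composes a scaling and a translation as 3×3 homogeneous matrices and inspects the last row of the product. The helper names and the example values (scale by 2 and 3, translate by 4 and 5) are assumptions made for illustration.

```python
def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def scale(sx, sy):
    """Homogeneous 2D scaling matrix."""
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]

def translate(tx, ty):
    """Homogeneous 2D translation matrix."""
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]

# Scale first, then translate (the rightmost matrix is applied first).
combined = matmul3(translate(4, 5), scale(2, 3))

# An affine transformation keeps the last row equal to [0, 0, 1].
is_affine = combined[2] == [0, 0, 1]
print(combined, is_affine)
```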
Q7: Please explain how we can enhance the sharpness of an image.
Q8: What is the minimal number of pixels needed to measure the optical flow using the Lucas-Kanade equation?
Q9: Assume that in homogeneous coordinates (x, y, w), a point with w = 0 is infinitely far away. Show that an affine transformation maps a point at infinity to a point at infinity.
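The statement in Q9 can be checked numerically on one example. The affine matrix and the point below are arbitrary, hypothetical values; the general argument rests on the last row of an affine matrix being [0, 0, 1].

```python
def apply_homogeneous(matrix, point):
    """Apply a 3x3 matrix to a homogeneous 2D point (x, y, w)."""
    return [sum(matrix[i][k] * point[k] for k in range(3)) for i in range(3)]

# Hypothetical affine matrix: arbitrary 2x2 linear part and translation,
# with the last row fixed to [0, 0, 1].
affine = [[2.0, 0.5, 7.0],
          [-1.0, 3.0, -4.0],
          [0.0, 0.0, 1.0]]

point_at_infinity = [1.0, 2.0, 0.0]  # w = 0
mapped = apply_homogeneous(affine, point_at_infinity)
print(mapped)  # the last coordinate is 0 * x + 0 * y + 1 * w = w
```

The translation column is multiplied by w = 0, so it has no effect on the mapped point in this example.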
Q10: Regarding structure from motion, assume that the input consists of 30 images and that 2000 scene points need to be reconstructed. Please describe the unknown variables that must be solved for.
Q11: What is the output with the following inputs and functions:
1) ReLU with the input as 5;
2) Average pooling with inputs as 4, 6, 6, 8.
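For reference, the two functions named in Q11 can be sketched with their usual definitions: ReLU returns max(0, x), and average pooling returns the mean of the values in the pooling window. The function names below are illustrative, and a single pooling window covering all four inputs is an assumption.

```python
def relu(x):
    """Rectified linear unit: negative inputs become 0."""
    return max(0.0, x)

def average_pool(window):
    """Average pooling over one window, given as a flat list of values."""
    return sum(window) / len(window)

relu_out = relu(5)
pool_out = average_pool([4, 6, 6, 8])
print(relu_out, pool_out)
```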
Q12: The following question concerns in-network upsampling. Given the max pooling shown below, please use the corresponding positions in the pooling layer to fill in the table with the results of max unpooling. After the positions for unpooling have been identified, all other positions are padded with zeros.
The max pooling process is as follows:
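The tables for this question are not reproduced in this text, so the sketch below uses a hypothetical 4×4 input and hypothetical values to unpool. It assumes 2×2 max pooling with stride 2 that records where each maximum came from, followed by max unpooling that writes each value back to its recorded position and pads everything else with zeros.

```python
def max_pool_2x2(img):
    """2x2 max pooling with stride 2; also returns the position of each max."""
    pooled, positions = [], []
    for r in range(0, len(img), 2):
        row_vals, row_pos = [], []
        for c in range(0, len(img[0]), 2):
            window = [(img[r + dr][c + dc], (r + dr, c + dc))
                      for dr in range(2) for dc in range(2)]
            value, pos = max(window, key=lambda item: item[0])
            row_vals.append(value)
            row_pos.append(pos)
        pooled.append(row_vals)
        positions.append(row_pos)
    return pooled, positions

def max_unpool_2x2(values, positions, height, width):
    """Place each value at its recorded max position; pad the rest with 0."""
    out = [[0] * width for _ in range(height)]
    for row_vals, row_pos in zip(values, positions):
        for value, (r, c) in zip(row_vals, row_pos):
            out[r][c] = value
    return out

# Hypothetical input (not the table from the exam).
image = [[1, 2, 6, 3],
         [3, 5, 2, 1],
         [1, 2, 2, 1],
         [7, 3, 4, 8]]
pooled, positions = max_pool_2x2(image)

# Hypothetical feature map arriving at the unpooling layer.
to_unpool = [[10, 20], [30, 40]]
unpooled = max_unpool_2x2(to_unpool, positions, 4, 4)
for row in unpooled:
    print(row)
```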
Section B (long questions)
(a) Please explain why stereo image rectification is performed before inferring the depth map. (2 marks)
(b) After image rectification, where are the epipoles? (2 marks)
(a) Please compute the LBP for the central pixel p, given its value and the values of its 3×3 neighbouring pixels. In particular, please create the 8-bit number b1b2b3b4b5b6b7b8, where bi = 0 if neighbour i has a value less than or equal to p’s value and bi = 1 otherwise. The upper-left neighbour has index 1 (first), and the order is clockwise. (3 marks)
(b) When the neighbouring pixels are rotated as follows, please compute the 8-bit LBP again, following the rule in (a). (2 marks)
(c) The current LBP is not robust to rotation. Please design an algorithm based upon LBP that makes it robust to rotation variations. Please explain in detail how the algorithm works. (5 marks)
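The pixel values for parts (a) and (b) are not reproduced in this text, so the sketch below uses a hypothetical 3×3 patch and its 90° clockwise rotation. It applies the bit rule from (a) and then shows one common way to obtain rotation robustness: taking the minimum over all circular shifts of the bit string. This is one possible design among several, not necessarily the intended answer.

```python
# Neighbour order from (a): start at the upper left, proceed clockwise.
CLOCKWISE = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]

def lbp_bits(patch):
    """8-bit LBP string: '0' if neighbour <= centre, '1' otherwise."""
    centre = patch[1][1]
    return "".join("1" if patch[r][c] > centre else "0" for r, c in CLOCKWISE)

def rotation_invariant_lbp(bits):
    """Smallest code among all circular shifts of the bit string."""
    return min(bits[i:] + bits[:i] for i in range(len(bits)))

# Hypothetical patch and the same patch rotated by 90 degrees clockwise.
patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
rotated = [[9, 7, 6],
           [8, 6, 5],
           [7, 1, 2]]

code = lbp_bits(patch)
code_rotated = lbp_bits(rotated)
print(code, code_rotated)
print(rotation_invariant_lbp(code), rotation_invariant_lbp(code_rotated))
```

Because all shifted strings have the same length, the lexicographic minimum is also the numeric minimum, so both patches map to the same code even though their plain LBP codes differ.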
(a) Please elaborate on how the Lucas-Kanade method improves the performance of optical flow estimation when the small motion assumption is violated. (3 marks)
(b) Let S(x,y,t) denote an image sequence, and assume that there is an affine change in intensities between neighbouring frames due to illumination variations,
S(x+u,y+v,t+1) = α·S(x,y,t) +β
Herein, (u,v) is the motion vector. α and β, which depend on the location (x,y), are photometric parameters that change the pixel intensity.
First, based upon the first-order Taylor expansion of S(x+u,y+v,t+1), please derive the linear system of equations for estimating the unknown parameters (u,v,α,β). (5 marks)
Second, please indicate the minimal window size to estimate these parameters, based upon the assumption of spatial coherence. (2 marks)
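One possible way to set up the derivation is sketched below. It assumes the expansion is taken around (x, y, t), writes S for S(x,y,t), and writes S_x, S_y, S_t for the partial derivatives, none of which are named in the question. It also assumes that (u, v, α, β) are constant over the window, which is the spatial coherence assumption.

```latex
% First-order Taylor expansion around (x, y, t):
S(x+u,\, y+v,\, t+1) \approx S + S_x u + S_y v + S_t

% Substituting into S(x+u, y+v, t+1) = \alpha S + \beta and rearranging:
S_x u + S_y v - S\,\alpha - \beta = -(S + S_t)

% Stacking the pixels i = 1, ..., n of one window:
\begin{bmatrix}
S_x^{(1)} & S_y^{(1)} & -S^{(1)} & -1 \\
\vdots    & \vdots    & \vdots   & \vdots \\
S_x^{(n)} & S_y^{(n)} & -S^{(n)} & -1
\end{bmatrix}
\begin{bmatrix} u \\ v \\ \alpha \\ \beta \end{bmatrix}
=
-\begin{bmatrix} S^{(1)} + S_t^{(1)} \\ \vdots \\ S^{(n)} + S_t^{(n)} \end{bmatrix}
```

Each pixel contributes one equation in four unknowns, so under these assumptions the window must supply at least four independent equations.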
(a) In the alignment, assume that the transformation is only translation. Given three matches in I and I′,
(x1,y1) = (5, 10), (x′1,y′1) = (7, 12)
(x2,y2) = (5, 10), (x′2,y′2) = (8, 23)
(x3,y3) = (5, 10), (x′3,y′3) = (32, 26)
Please provide the translation vector (from I to I′) that gives the least-squares solution, that is, the one minimizing the sum of squared residuals. Herein, the residual is defined as the difference between the position estimated with the derived translation vector and the true position for (x′1,y′1), (x′2,y′2), and (x′3,y′3). (4 marks)
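For a pure translation, setting the derivative of the sum of squared residuals to zero shows that the least-squares estimate is the mean of the per-match displacements. The sketch below applies that to the three matches listed above; the function name is illustrative.

```python
def least_squares_translation(src, dst):
    """Translation minimizing the sum of squared residuals.

    For a translation-only model, the minimizer is the mean displacement.
    """
    n = len(src)
    tx = sum(xd - xs for (xs, _), (xd, _) in zip(src, dst)) / n
    ty = sum(yd - ys for (_, ys), (_, yd) in zip(src, dst)) / n
    return tx, ty

# Matches as given in the question (points in I, then points in I').
src = [(5, 10), (5, 10), (5, 10)]
dst = [(7, 12), (8, 23), (32, 26)]

tx, ty = least_squares_translation(src, dst)
print(tx, ty)  # mean of the displacements (2, 2), (3, 13), (27, 16)
```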
(b) Assume that the transformation is affine, so at least 3 matches are needed to obtain its parameters. For two images, we find 1000 matches, of which 50% are inliers. Assuming that RANSAC runs for 2 iterations, please estimate the probability that we obtain the correct transformation. Show the detailed steps. (6 marks)
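The usual RANSAC estimate can be sketched as follows. It assumes that each match in a sample is an inlier independently with probability equal to the inlier ratio (sampling with replacement, a close approximation for 1000 matches), and that an iteration yields the correct transformation exactly when all sampled matches are inliers. The function name is illustrative.

```python
def ransac_success_probability(inlier_ratio, sample_size, iterations):
    """Probability that at least one iteration draws an all-inlier sample."""
    # Probability that a single sample contains only inliers.
    p_good_sample = inlier_ratio ** sample_size
    # Probability that every iteration fails, then its complement.
    p_all_fail = (1.0 - p_good_sample) ** iterations
    return 1.0 - p_all_fail

# Values from the question: 50% inliers, 3 matches per sample, 2 iterations.
p_success = ransac_success_probability(0.5, 3, 2)
print(p_success)  # 1 - (1 - 0.5**3)**2
```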