====== Estimating Vehicle Ego-Motion and Piecewise Planar Scene Structure from Optical Flow in a Continuous Framework ======
{{:research:hflow:neufeld_gcpr2015.pdf|GCPR 2015 slides}}
We estimate the 3D scene structure and the vehicle ego-motion given the forward and backward optical flow fields between two consecutive frames.
Given a rotation $R \in \text{SO}(3)$, a translation $t \in S^2$ and a 3D point $X$, the point $X'$ relative to the second camera is given by $X' = R^\top (X - t)$. The translation is restricted to the unit sphere, since its norm cannot be estimated without additional information.
Let $\pi$ denote the projection onto the image plane,
$$ \pi\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \frac{1}{x_3} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}. $$
Let $z(x)$ denote the depth of pixel $x$ in the image plane, so that $X = z(x)\, x$. Depth can be computed from the plane parameters as $z(x,v) = (v^\top x)^{-1}$.
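As a concrete illustration of this camera model, the following NumPy sketch back-projects a pixel using the plane parameters, maps the resulting 3D point into the second camera frame via $X' = R^\top(X - t)$, and reprojects it with $\pi$. The function names and the toy values are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def project(X):
    """Projection pi onto the image plane z = 1 (keeps all three components)."""
    return X / X[2]

def transform(R, t, X):
    """Point X expressed in the second camera frame: X' = R^T (X - t)."""
    return R.T @ (X - t)

def depth_from_plane(v, x):
    """Depth of homogeneous pixel x under plane parameters v: z(x, v) = 1 / (v^T x)."""
    return 1.0 / (v @ x)

# Toy example (hypothetical values): identity rotation, unit translation along z.
R = np.eye(3)
t = np.array([0.0, 0.0, 1.0])   # ||t|| = 1; only the direction is observable
v = np.array([0.0, 0.0, 0.5])   # plane parameters
x = np.array([0.1, -0.2, 1.0])  # homogeneous pixel coordinates

z = depth_from_plane(v, x)      # depth z(x, v) = 1 / (v^T x) = 2.0
X = z * x                       # back-projected 3D point
x2 = project(transform(R, t, X))  # corresponding pixel in the second frame
```

The difference $x_2 - x$ (restricted to the first two components) is exactly the optical flow this model predicts for the pixel.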
Optical flow is given by
The proposed energy function is non-convex in the rotation $R \in \text{SO}(3)$, the translation $t \in S^2$ and the planes $v_i \in \mathbb{R}^3$. Note that the energy can be written in the form
$$E(R,t,v) = \sum_j \left( f_j(R,t,v) \right)^2 = \lVert f(R,t,v) \rVert_2^2,$$
a nonlinear least-squares problem, which we optimize with the Levenberg-Marquardt algorithm. We apply a damping factor, which is updated after each iteration.
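The Levenberg-Marquardt scheme with an adaptive damping factor can be sketched as follows. This is a generic textbook implementation on a toy residual, not the paper's optimizer; the update factors (halving/doubling the damping) are common defaults, chosen here as an assumption:

```python
import numpy as np

def levenberg_marquardt(f, jac, theta, lam=1e-2, iters=50):
    """Minimize E(theta) = ||f(theta)||^2.

    Each step solves the damped normal equations
        (J^T J + lam * I) delta = -J^T f,
    and lam (the damping factor) is updated after each iteration.
    """
    for _ in range(iters):
        r, J = f(theta), jac(theta)
        delta = np.linalg.solve(J.T @ J + lam * np.eye(theta.size), -J.T @ r)
        if np.sum(f(theta + delta) ** 2) < np.sum(r ** 2):
            theta = theta + delta
            lam *= 0.5   # step accepted: move toward Gauss-Newton
        else:
            lam *= 2.0   # step rejected: move toward gradient descent
    return theta

# Toy residual f(theta) = A @ theta - b (the real residual is non-convex in R, t, v).
A = np.array([[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.array([2.0, 1.0, 2.0])
theta = levenberg_marquardt(lambda th: A @ th - b, lambda th: A, np.zeros(2))
# converges to the least-squares solution, here theta = [1, 1]
```

Increasing the damping factor shortens the step and biases it toward the gradient direction, which is what makes the method robust on non-convex energies such as the one above.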
===== Results =====
We demonstrate the output of our algorithm on a few examples here; the paper contains a quantitative evaluation. From top to bottom, each example shows the reference frame, the estimated depth and normals, and the flow error as measured in the paper (click for a full-size view). Depth and depth error (in pixels) are encoded as follows; the scale is derived from ground truth data.
{{ :research:hflow:depth2.png?nolink&200 |}}
The normal is color-encoded based on the approximately equidistant LAB color space. The next image shows the encoding after applying the Hammer-Aitoff projection to a sphere, and a simple ray-traced scene.
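The normal encoding described above can be sketched as follows; this is an illustrative reconstruction, not the paper's code. A unit normal is converted to longitude/latitude, projected to the plane with the Hammer-Aitoff projection, and the planar coordinates are used as the chroma channels $(a, b)$ of a LAB color. The `scale` and lightness `L` values are arbitrary assumptions:

```python
import math

def hammer(lon, lat):
    """Hammer-Aitoff projection of a direction (longitude, latitude) to the plane."""
    d = math.sqrt(1.0 + math.cos(lat) * math.cos(lon / 2.0))
    x = 2.0 * math.sqrt(2.0) * math.cos(lat) * math.sin(lon / 2.0) / d
    y = math.sqrt(2.0) * math.sin(lat) / d
    return x, y

def normal_to_lab(n, scale=40.0, L=70.0):
    """Map a unit normal to a LAB color (L, a, b).

    The Hammer-Aitoff coordinates serve as the chroma channels, so nearby
    normals get perceptually similar colors in the near-equidistant LAB space.
    scale and L are illustrative choices, not taken from the paper.
    """
    lon = math.atan2(n[1], n[0])
    lat = math.asin(max(-1.0, min(1.0, n[2])))
    x, y = hammer(lon, lat)
    return (L, scale * x, scale * y)

lab = normal_to_lab((0.0, 0.0, 1.0))  # an upward-facing normal
```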