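In the pinhole model, points in space are projected onto the image plane (of the idealized camera sensor) by a linear transformation, provided we use homogeneous coordinates. We need the 3x4 camera matrix C, which is obtained by combining the camera intrinsics and extrinsics. The formula for the projection is
$$ \begin{pmatrix} x' \\ y' \\ w \end{pmatrix} = C \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix} $$
To get the non-homogeneous coordinates on the image plane, we simply rescale everything by $1/w$:
$$ x = \frac{x'}{w}, \qquad y = \frac{y'}{w} $$
Since only these ratios matter, the camera matrix is defined up to a scale factor $k$. We can use this fact to rewrite the projection as
$$ \begin{pmatrix} x' \\ y' \\ w \end{pmatrix} = k \begin{pmatrix} c_{11} & c_{12} & c_{13} & c_{14} \\ c_{21} & c_{22} & c_{23} & c_{24} \\ c_{31} & c_{32} & c_{33} & 1 \end{pmatrix} \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix} $$
Here we chose to have the last parameter of the camera matrix equal to 1. Notice that we end up with the same $x$ and $y$ ($k$ cancels out), as you can see:
$$ x = \frac{k\,(c_{11} X + c_{12} Y + c_{13} Z + c_{14})}{k\,(c_{31} X + c_{32} Y + c_{33} Z + 1)}, \qquad y = \frac{k\,(c_{21} X + c_{22} Y + c_{23} Z + c_{24})}{k\,(c_{31} X + c_{32} Y + c_{33} Z + 1)} $$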
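To make the projection concrete, here is a minimal NumPy sketch. The intrinsics `K`, the extrinsics `Rt` and the helper `project` are hypothetical, chosen only for illustration. It projects one 3D point and then checks that rescaling the camera matrix so that its last entry is 1 does not move the image point.

```python
import numpy as np


def project(C, point_3d):
    """Project a 3D point with a 3x4 camera matrix and dehomogenize."""
    X = np.append(np.asarray(point_3d, dtype=float), 1.0)  # (X, Y, Z, 1)
    xh, yh, w = C @ X                                      # homogeneous image point
    return np.array([xh / w, yh / w])                      # rescale by 1/w


# Hypothetical intrinsics K and extrinsics [R | t], for illustration only.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
Rt = np.hstack([np.eye(3), np.array([[0.1], [-0.2], [2.0]])])
C = K @ Rt

point = [0.5, 0.25, 3.0]
pixel = project(C, point)

# Rescale C so that its last entry is 1: the scale factor cancels out
# in the division by w, so the image point stays the same.
C_normalized = C / C[2, 3]
pixel_normalized = project(C_normalized, point)
```

Both `pixel` and `pixel_normalized` hold the same image coordinates, which is why we are free to fix the last parameter to 1.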
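Now consider a plane in world space, and imagine we align the reference system so that Z = 0 for all points on this plane.
For all the points lying on the plane, we can use a simplified version of the projection. The third column of the camera matrix becomes irrelevant (it is multiplied by Z = 0). So we can simplify and write
$$ \begin{pmatrix} x' \\ y' \\ w \end{pmatrix} = \begin{pmatrix} c_{11} & c_{12} & c_{14} \\ c_{21} & c_{22} & c_{24} \\ c_{31} & c_{32} & 1 \end{pmatrix} \begin{pmatrix} X \\ Y \\ 1 \end{pmatrix} $$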
The projection from a given plane in world space to the image plane of the camera is called a homography. It can be expressed by a 3x3 matrix with 8 degrees of freedom (the last parameter can always be set to one, due to scale invariance).
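As a quick check of this simplification, the sketch below uses a hypothetical camera matrix `C` (already normalized so that its last entry is 1) and compares the full 3x4 projection of a point with Z = 0 against the 3x3 matrix obtained by dropping the third column.

```python
import numpy as np

# Hypothetical 3x4 camera matrix, normalized so that its last entry is 1.
C = np.array([[400.0, 0.0, 160.0, 40.0],
              [0.0, 400.0, 120.0, -80.0],
              [0.1, 0.2, 0.5, 1.0]])

# Drop the third column (the one multiplied by Z) to get the 3x3 homography.
H = np.delete(C, 2, axis=1)


def dehomogenize(v):
    """Divide the first two components by the third one."""
    return v[:2] / v[2]


X, Y = 0.3, -0.7  # a point on the world plane Z = 0
full = dehomogenize(C @ np.array([X, Y, 0.0, 1.0]))
planar = dehomogenize(H @ np.array([X, Y, 1.0]))
```

`full` and `planar` are the same image point, so for points on the plane the 3x3 matrix carries all the information we need.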
Now imagine a second camera, looking at the points on the same designated plane.
We could do something like the following: from the image plane of the first camera we apply an inverse projection to get the points on the world plane, then project them onto the image plane of the second camera by using its own camera matrix. In other words, we can combine the two transformations and obtain a single homography H, which maps co-planar points (in world space) from one image to the other. Mind that this change of perspective works only for points on that same plane, as the general case would need a 4x4 matrix M.
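The mapping equation for the homography, taking a point $(x_1, y_1)$ in the first image to the point $(x_2, y_2) = (x_2'/w,\ y_2'/w)$ in the second one, will be
$$ \begin{pmatrix} x_2' \\ y_2' \\ w \end{pmatrix} = H \begin{pmatrix} x_1 \\ y_1 \\ 1 \end{pmatrix} $$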
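The composition described above can be sketched in a few lines of NumPy. The two plane-to-image homographies `H1` and `H2` are hypothetical values, and `apply_homography` is a helper introduced only for this example. We go from image 1 back to the world plane with the inverse of `H1`, then forward to image 2 with `H2`.

```python
import numpy as np

# Hypothetical plane-to-image homographies of the two cameras.
H1 = np.array([[1.2, 0.1, 30.0],
               [-0.05, 1.1, 12.0],
               [1e-4, 2e-4, 1.0]])
H2 = np.array([[0.9, -0.2, 5.0],
               [0.15, 1.0, -8.0],
               [-2e-4, 1e-4, 1.0]])

# image 1 -> world plane -> image 2
H = H2 @ np.linalg.inv(H1)
H = H / H[2, 2]  # fix the free scale so that the last entry is 1


def apply_homography(M, point):
    """Map a 2D point through a 3x3 homography and dehomogenize."""
    x, y, w = M @ np.array([point[0], point[1], 1.0])
    return np.array([x / w, y / w])


world_point = (2.0, 3.0)  # (X, Y) on the plane Z = 0
in_image_1 = apply_homography(H1, world_point)
in_image_2 = apply_homography(H2, world_point)
mapped = apply_homography(H, in_image_1)
```

`mapped` coincides with `in_image_2`: the single matrix `H` takes us directly from one image to the other without passing through world coordinates.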
Even if H is the result of combining two camera matrices, you don’t need to know them to calculate H. There are 8 degrees of freedom, so you need just 8 equations for the 8 unknowns. Each pair of corresponding points gives two equations (one for x and one for y), so those equations can be written by using 4 pairs of corresponding points in the two image planes.
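How do we obtain those equations? We have the homography mapping for points on a plane in world space. Written explicitly:
$$ \begin{pmatrix} x_2' \\ y_2' \\ w \end{pmatrix} = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ y_1 \\ 1 \end{pmatrix} $$
Taking into account the fact that the non-homogeneous coordinates are
$$ x_2 = \frac{x_2'}{w}, \qquad y_2 = \frac{y_2'}{w} $$
then we can write
$$ x_2 = \frac{h_{11} x_1 + h_{12} y_1 + h_{13}}{h_{31} x_1 + h_{32} y_1 + 1}, \qquad y_2 = \frac{h_{21} x_1 + h_{22} y_1 + h_{23}}{h_{31} x_1 + h_{32} y_1 + 1} $$
Rearranging the above we get
$$ h_{11} x_1 + h_{12} y_1 + h_{13} - h_{31} x_1 x_2 - h_{32} y_1 x_2 = x_2 $$
$$ h_{21} x_1 + h_{22} y_1 + h_{23} - h_{31} x_1 y_2 - h_{32} y_1 y_2 = y_2 $$
Now, if we define the following three vectors
$$ h = (h_{11}, h_{12}, h_{13}, h_{21}, h_{22}, h_{23}, h_{31}, h_{32})^T $$
$$ a_x = (x_1, y_1, 1, 0, 0, 0, -x_1 x_2, -y_1 x_2)^T $$
$$ a_y = (0, 0, 0, x_1, y_1, 1, -x_1 y_2, -y_1 y_2)^T $$
Then the above equations become simply
$$ a_x^T h = x_2, \qquad a_y^T h = y_2 $$
If we stack 8 equations, using 4 pairs of corresponding points (or more equations, if more pairs are available), then we can form the following linear system of equations, where the rows of $A$ are the vectors $a_x^T$ and $a_y^T$ of each pair and $b$ collects the matching $x_2$ and $y_2$ values:
$$ A h = b $$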
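Before moving to TensorFlow, the stacking step can be tried out with plain NumPy. This is only a sketch under stated assumptions: the helper `build_system` and the point correspondences are hypothetical, and `np.linalg.lstsq` stands in for the least squares solver.

```python
import numpy as np


def build_system(src, dst):
    """Stack the two equations of every correspondence into A h = b."""
    rows, rhs = [], []
    for (x1, y1), (x2, y2) in zip(src, dst):
        rows.append([x1, y1, 1, 0, 0, 0, -x1 * x2, -y1 * x2])  # x equation
        rows.append([0, 0, 0, x1, y1, 1, -x1 * y2, -y1 * y2])  # y equation
        rhs.extend([x2, y2])
    return np.array(rows, dtype=float), np.array(rhs, dtype=float)


# Hypothetical correspondences: the unit square and where it lands in image 2.
src = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
dst = [(10.0, 20.0), (110.0, 25.0), (105.0, 130.0), (5.0, 120.0)]

A, b = build_system(src, dst)              # A is 8x8, b has 8 entries
h, *_ = np.linalg.lstsq(A, b, rcond=None)  # also works with more than 4 pairs
H = np.append(h, 1.0).reshape(3, 3)        # put back the fixed last parameter
```

Applying `H` to each point of `src` and dividing by the third component gives back the matching point of `dst`.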
Well, we are basically done, as we can use tf.matrix_solve_ls to solve the linear system. Here is the code:
```python
import tensorflow as tf


def ax(p, q):
    # row of A for the x equation of the correspondence p -> q
    return [p[0], p[1], 1, 0, 0, 0, -p[0] * q[0], -p[1] * q[0]]


def ay(p, q):
    # row of A for the y equation of the correspondence p -> q
    return [0, 0, 0, p[0], p[1], 1, -p[0] * q[1], -p[1] * q[1]]


def homography(x1s, x2s):
    # we build matrix A by using only 4 point correspondences. The linear
    # system is solved with the least squares method, so here
    # we could even pass more correspondences
    p = []
    for i in range(4):
        p.append(ax(x1s[i], x2s[i]))
        p.append(ay(x1s[i], x2s[i]))

    # A is 8x8
    A = tf.stack(p, axis=0)

    # P is 8x1: the x and y coordinates of the 4 target points
    m = [[x2s[i][j] for i in range(4) for j in range(2)]]
    P = tf.transpose(tf.stack(m, axis=0))

    # here we solve the linear system
    # we transpose the result for convenience (it becomes 1x8)
    # note: tf.matrix_solve_ls is the TensorFlow 1.x name,
    # in TensorFlow 2.x the same function is tf.linalg.lstsq
    return tf.transpose(tf.matrix_solve_ls(A, P, fast=True))
```
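The function returns the 8 solved parameters as a 1x8 row, not a 3x3 matrix. Once that tensor has been evaluated to a NumPy array, a small sketch like the following can rebuild H and apply it to a point. The helpers `to_matrix` and `warp_point` and the sample values in `h` are hypothetical, used only to illustrate the layout of the result.

```python
import numpy as np


def to_matrix(h):
    """Turn the 8 solved parameters into a 3x3 homography (last entry = 1)."""
    return np.append(np.asarray(h, dtype=float).ravel(), 1.0).reshape(3, 3)


def warp_point(H, point):
    """Map a 2D point through H and dehomogenize."""
    x, y, w = H @ np.array([point[0], point[1], 1.0])
    return x / w, y / w


# Hypothetical 1x8 result: a pure translation by (5, -3).
h = [[1.0, 0.0, 5.0, 0.0, 1.0, -3.0, 0.0, 0.0]]
H = to_matrix(h)
moved = warp_point(H, (2.0, 2.0))
```

With these sample values the point (2, 2) ends up at (7, -1), as expected for a translation by (5, -3).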
I also created a Jupyter notebook on GitHub as an example.
That’s all folks.
Stefano software developer...