i’m writing a projector calibration routine for displaying on physical 3d models, like a lamp or a mannequin. the procedure is the dual of the gold standard camera calibration algorithm outlined in multiple view geometry by hartley and zisserman.

the brief outline is that a number of points on the physical model are registered with their locations in image space (the coordinate plane of the projector’s lcd, say), and from this correspondence a 3x4 matrix can be derived. this matrix, the ‘camera matrix’, takes homogeneous points in 3d all the way to homogeneous coordinates in 2d, effectively transforming from model space to window space. going all the way to window space is a no-no in current graphics apis, i think. usually we only go as far as clip space and then ask the hardware to do everything else. i can derive a transformation to clip space by transforming the image-space points to clip space before solving the correspondence.
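the solve itself looks something like this in numpy (a sketch of just the linear DLT step, no hartley normalization and no nonlinear refinement, so not the full gold standard algorithm; `world_pts` and `image_pts` are illustrative (n,3) and (n,2) arrays):

```python
import numpy as np

def solve_camera_matrix(world_pts, image_pts):
    """direct linear transform: recover the 3x4 camera matrix from n >= 6
    3d <-> 2d correspondences. linear step only: no hartley normalization,
    no nonlinear refinement, so not the full gold standard algorithm."""
    rows = []
    for (X, Y, Z), (u, v) in zip(world_pts, image_pts):
        Xh = [X, Y, Z, 1.0]                       # homogeneous 3d point
        rows.append([0.0] * 4 + [-c for c in Xh] + [v * c for c in Xh])
        rows.append(Xh + [0.0] * 4 + [-u * c for c in Xh])
    # the null vector of this system is the flattened camera matrix;
    # take the right singular vector with the smallest singular value
    _, _, vt = np.linalg.svd(np.array(rows))
    return vt[-1].reshape(3, 4)
```

the result is only defined up to a nonzero scale, which is fine for a homogeneous transform.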

as far as i can tell, the matrix ‘works’. that is, if i push the world-space calibration points through the matrix, the image-space points that come back are accurate to 3 decimal places.
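the check itself is trivial; something like this (names illustrative again):

```python
import numpy as np

def reprojection_error(P, world_pts, image_pts):
    """max absolute image-space residual of a 3x4 camera matrix P
    over a set of 3d/2d correspondences."""
    X = np.c_[np.asarray(world_pts), np.ones(len(world_pts))]
    proj = X @ P.T                      # homogeneous image points
    uv = proj[:, :2] / proj[:, 2:3]     # perspective divide
    return np.max(np.abs(uv - np.asarray(image_pts)))
```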

the matrix can be decomposed to give a translation vector and rotation matrix representing the extrinsic parameters of the camera. the rotation matrix can be unwound to determine the camera’s hpr and this is nice. if i apply(1) these extrinsic parameters to base.camera, and display a virtual model of the object that i’ve used to calibrate, things are in pretty good shape. everything seems perfect except the world is offset by a few pixels in each direction.

ok, fine. the camera matrix is aware of this. the way we found the rotation matrix was by taking the first three columns of the camera matrix and performing an RQ decomposition (via Givens rotations). the matrix Q is the rotation matrix, and the matrix R is a right (upper) triangular matrix, the camera’s intrinsic (calibration) matrix. the vector (R[0,2],R[1,2]) is the principal point. typically, this is the center of the image ((width/2,height/2) for corner origins or (0,0) for a center origin). my matrix has non-zero values in these entries, and they correspond to the offsets i’m seeing when i use the standard PerspectiveLens’s projection matrix. but i don’t know how to tell panda to use this offset(1).
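in numpy terms the whole decomposition is something like the following (RQ built from numpy’s QR on a reversed matrix rather than explicit Givens rotations, which gives the same result up to signs):

```python
import numpy as np

def decompose(P):
    """split a 3x4 camera matrix P = K[R|t] into the intrinsic matrix K,
    rotation R, and camera center C (so that P @ [C, 1] == 0)."""
    M, p4 = P[:, :3], P[:, 3]
    # RQ decomposition of M, built from numpy's QR on a reversed matrix
    E = np.flipud(np.eye(3))            # row-reversal (exchange) matrix
    q, r = np.linalg.qr((E @ M).T)
    K = E @ r.T @ E                     # upper triangular
    R = E @ q.T                         # orthogonal
    # RQ is only unique up to signs: force K's diagonal positive
    D = np.diag(np.sign(np.diag(K)))
    K, R = K @ D, D @ R                 # D @ D == I, so K @ R is unchanged
    # note: det(R) can still be -1 if P carries a negative overall scale
    C = -np.linalg.solve(M, p4)         # camera center
    return K / K[2, 2], R, C
```

after the division by K[2,2] that the function already does, the principal point is just (K[0,2], K[1,2]).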

(1) the big problem here is that i’m having a very hard time coping with panda’s coordinate system/transformation matrices:

my derivation of the translation vector went just right: (x/w,y/w,z/w) == camera.getPos() == camera.getMat().getRow(3). but why is the translation vector in the last row instead of the last column? oh. because Mat4.xform() multiplies v*M rather than M*v. ok.
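i.e. the two conventions are just transposes of each other; a tiny numpy check makes the relationship concrete:

```python
import numpy as np

# column-vector convention: M @ v, translation in the last column.
# panda's row-vector convention (v * M) uses the transpose of the same
# transform, so the translation lands in the last row instead.
M = np.eye(4)
M[:3, 3] = [10.0, 20.0, 30.0]               # translate by (10, 20, 30)
v = np.array([1.0, 2.0, 3.0, 1.0])

assert np.allclose(M @ v, v @ M.T)          # identical results
assert np.allclose((M.T)[3, :3], M[:3, 3])  # translation row vs column
```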

the rotation matrix was more problematic. to get from my matrix Q to the upper 3x3 of camera.getMat() i had to 1) swap rows 1 and 2, 2) transpose, and 3) scale by -1. the transpose is consistent with my experience with the translation vector. i assume the row (now column) swapping has to do with the difference between y-up and z-up coordinate spaces? and i lost my will before exploring the negation.
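a guess at the negation: swapping two rows flips the determinant of a rotation matrix from +1 to -1, so the swapped matrix is a reflection, not a proper rotation; negating the whole 3x3 flips the determinant back (det(-M) = -det(M) in odd dimension). numerically:

```python
import numpy as np

a = 0.7
Rz = np.array([[np.cos(a), -np.sin(a), 0],
               [np.sin(a),  np.cos(a), 0],
               [0,          0,         1]])    # a proper rotation, det = +1

swapped = Rz[[0, 2, 1], :]                     # swap rows 1 and 2 (y-up <-> z-up)
assert np.isclose(np.linalg.det(swapped), -1)  # now improper: a reflection
assert np.isclose(np.linalg.det(-swapped), 1)  # negating all 3 rows fixes it
```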

there’s a similar problem with the projection matrix. camera.getChild(0).node().getLens().getProjectionMat() has rows 1 and 2 swapped relative to opengl or direct3d projection matrices. i assume that only the ratio of the focal lengths matters, right? not the focal length values themselves? also, my intrinsic matrix, R, is 3x3, and i’m not sure how to promote it to a 4x4 homogeneous projection matrix. it also contains parameters that are typically not part of a projection matrix (this principal-point offset, for instance), and i’m not sure whether i should try to incorporate them into the virtual camera’s projection matrix or not. i’m not interested in the camera’s near and far planes. would turning the depth test off disable clipping? finally, the principal point is really a homogeneous point, the third column of R, so those offsets are only correct after dividing by R[2,2].
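on promoting the 3x3 to homogeneous coordinates: the usual recipe looks like this (written in the column-vector, vision-style convention: camera looks down +z, pixel origin at the image corner, no y flip, so it would still need transposing and axis-swapping before panda would accept it; names and conventions are mine, not panda’s):

```python
import numpy as np

def intrinsics_to_clip(K, width, height, near, far):
    """promote a 3x3 intrinsic matrix K = [[fx, s, cx], [0, fy, cy], [0, 0, 1]]
    to a 4x4 clip-space projection matrix. column-vector (M @ v) convention,
    camera looks down +z, pixel origin at the image corner, no y flip;
    depth maps [near, far] -> [-1, 1]."""
    fx, s, cx = K[0, 0], K[0, 1], K[0, 2]
    fy, cy = K[1, 1], K[1, 2]
    return np.array([
        [2 * fx / width, 2 * s / width, 2 * cx / width - 1, 0],
        [0, 2 * fy / height, 2 * cy / height - 1, 0],
        [0, 0, (far + near) / (far - near), -2 * far * near / (far - near)],
        [0, 0, 1, 0],
    ])
```

note that the principal-point offset rides along in the third column, so nothing extra has to be bolted on afterwards.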

basically, nothing i’ve done has given me the results i really want. here’s what i’ve tried:

-manually setting the pos and hpr of the camera (as described above). this got me all the way to the offset problem

-moving the camera to the origin and setting a MatrixLens’s userMat with my camera matrix as follows:

```
camera.setPos(Vec3(0))
camera.setHpr(Vec3(0))
lensMat = Mat4()
lensMat.setRow(0,camMat.getRow(0))
lensMat.setRow(1,camMat.getRow(1))
lensMat.setRow(3,camMat.getRow(2))
lensMat.setRow(2,lensMat.getRow(3)) #wbuffer: copy the w row into z, *after* setting row 3
newLens = MatrixLens()
newLens.setUserMat(lensMat)
camera.getChild(0).node().setLens(newLens)
```

this failed in a big way when my matrix was transforming to window space. i have not tried this trick again with a camera matrix that transforms to clip space.

-passing the camera matrix into a shader program and using it in place of the modelviewprojection matrix (i replace the model matrix as well as the view and projection because i’m assuming the object i’m projecting on is at render’s origin). this didn’t work, but i’m not confident that i passed the parameter appropriately. i’ve tried:

`uniform float4x4 k_camera_mat`

with

```
myCam=NodePath('cam')
myCam.setMat(lensMat) #or myCam.setMat(camMat)
render.setShaderInput('camera_mat',myCam)
```

i didn’t try to use a trans_x_to_y_z parameter because i think that’ll take me back to the problems i was having before with cameras.

-lastly, i spent a bunch of time in interactive mode trying to multiply transforms (the camera NodePath transform, the projection matrix, etc) together to get a transform that performs the same action as the camera matrix, without success. here’s a fundamental question: if Ax = a and Bx = b where a and b are equivalent modulo perspective division, what is the relationship between A and B?
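(i suspect the answer, if the equivalence holds for every x, is that B can only differ from A by a nonzero scalar, since a uniform scale on a homogeneous quantity is invisible after the divide. a quick numerical check is at least consistent with that:)

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
B = -2.5 * A                          # any nonzero scalar multiple of A
x = np.array([1.0, 2.0, 3.0, 1.0])

a, b = A @ x, B @ x
# after perspective division the scalar cancels
assert np.allclose(a[:3] / a[3], b[:3] / b[3])
```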

my experiments have not been exhaustive, though they have exhausted me. the next thing i will try is to return to manually setting the camera NodePath’s pos and hpr and then feeding the offset vector from the intrinsic matrix R into a vertex program that will adjust the vertex coordinates uniformly by that offset after the standard modelviewprojection transform. i’m hopeful that this hack will get me good results for the time being, but i’d really like to know what the right way to deal with this data is so i can do things in The Right Way.
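for the record, the planned hack has one subtlety: if the offset is applied in clip space, before the hardware divide, it has to be scaled by w to produce a uniform shift in ndc (pos.xy += offset * pos.w in the vertex program). a numpy check of the equivalence (illustrative numbers):

```python
import numpy as np

offset = np.array([0.03, -0.01])          # desired ndc-space shift
clip = np.array([2.0, 1.0, 0.5, 4.0])     # some post-mvp clip-space vertex

# shift applied after the divide, in ndc
ndc_shifted = clip[:2] / clip[3] + offset
# the same shift applied before the divide must be scaled by w
clip2 = clip.copy()
clip2[:2] += offset * clip2[3]
assert np.allclose(clip2[:2] / clip2[3], ndc_shifted)
```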

jeremy