AR OpenCV ArUco - How to apply pose estimation to the Panda camera?

Hi y’all,

I’m working on an augmented reality Python app that uses the OpenCV ArUco module to detect markers and estimate the pose of the user’s camera.
Pose estimation works fine, but I’m having trouble correctly converting the pose data from ArUco coordinate space to Panda coordinate space.

Here is a video where you can see the issue (rotation on a single axis looks correct, but when rotations on multiple axes are combined the outcome is wrong):

I’ve made a GitHub repository with a working Python app (the camera video is included; just run ‘main.py’ if interested):

Here is an excerpt of the code where the coordinate conversion happens (the complete code is on GitHub in ‘poseEstimation.py’):

# Brief description:
# ArUco pose estimation gives us the relative pose of the marker and the
# camera: rotation vector 'rvecs' and translation vector 'tvecs'.
# They represent the transform from the marker frame to the camera frame.
#
# I want to convert the pose data to get the position of the camera in the
# Panda world (for now we assume the marker is at position 'x=0, y=0, z=0'
# in the Panda world, to keep things simple), so I can apply the pose to the
# Panda camera (to mimic the pose of the real-world user camera).

# Pose estimation (ArUco board):
retval, self.rvecs, self.tvecs = aruco.estimatePoseBoard( corners, ids, self.board0, self.cameraMatrix, self.distCoeffs, None, None )

# Convert rotation vector to a rotation matrix:
mat, jacobian = cv.Rodrigues(self.rvecs) # cv.Rodrigues docs: https://docs.opencv.org/master/d9/d0c/group__calib3d.html#ga61585db663d9da06b68e70cfbf6a1eac
# Transpose the matrix (following approach found at stackoverflow):
mat = cv.transpose(mat) # cv.transpose docs: https://docs.opencv.org/master/d2/de8/group__core__array.html#ga46630ed6c0ea6254a35f447289bd7404
# Invert the matrix (following approach found at stackoverflow, supposed to convert pose data from marker coordinate space to camera coordinate space): 
retval, mat = cv.invert(mat) # cv.invert docs: https://docs.opencv.org/master/d2/de8/group__core__array.html#gad278044679d4ecf20f7622cc151aaaa2

# Create panda matrix so we can apply the data to a node via '.setMat()':
mat3 = Mat3() 
mat3.set(mat[0][0], mat[0][1], mat[0][2], mat[1][0], mat[1][1], mat[1][2], mat[2][0], mat[2][1], mat[2][2] )
mat4 = Mat4(mat3)

# From here on, pretty much all the values are assigned by trial-and-error, to see what works and what doesn't.

# ROTATION:
# Apply pose estimation rotation matrix to a dummy node:
panda.matrixNode.setMat( mat4 ) 
# Apply the matrixNode HPR to another dummy node with some trial-and-error modifications:
# H=zRot, P=xRot, R=yRot
panda.transNode.setH( - panda.matrixNode.getR() ) # zRot
panda.transNode.setP( panda.matrixNode.getP() + 180 ) # xRot
panda.transNode.setR( - panda.matrixNode.getH() ) # yRot

# TRANSLATION:
# Place dummy node to a marker position in panda world coordinates (marker is at 0,0,0)
panda.transNode.setPos(0, 0, 0) 
# Use translation data from pose estimation:
xTrans = self.tvecs[0][0]
yTrans = self.tvecs[1][0]
zTrans = self.tvecs[2][0]
# Apply translation with negative values, seems to work (trial-and-error):
panda.transNode.setPos(panda.transNode, -xTrans, -zTrans, yTrans)

# Assign values to be applied to panda camera position:
camX = panda.transNode.getX()
camY = panda.transNode.getY()
camZ = panda.transNode.getZ()
camH = panda.transNode.getH()
camP = panda.transNode.getP()
camR = panda.transNode.getR()
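(Side note on the transpose-then-invert step above: for a proper rotation matrix the transpose already *is* the inverse, so calling `cv.transpose()` and then `cv.invert()` just restores the original Rodrigues matrix. A minimal NumPy sketch illustrating this; the `rodrigues` helper here is a hand-rolled stand-in for `cv.Rodrigues`, so the snippet runs without OpenCV:)

```python
import numpy as np

def rodrigues(rvec):
    """Rotation vector -> 3x3 rotation matrix (stand-in for cv.Rodrigues)."""
    rvec = np.asarray(rvec, dtype=float).ravel()
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    kx, ky, kz = rvec / theta
    K = np.array([[0.0, -kz,  ky],
                  [ kz, 0.0, -kx],
                  [-ky,  kx, 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

R = rodrigues([0.3, -0.5, 0.1])
# The transpose of a rotation matrix is already its inverse...
print(np.allclose(R.T, np.linalg.inv(R)))   # True
# ...so transposing and then inverting returns the original matrix:
print(np.allclose(np.linalg.inv(R.T), R))   # True
```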

I’ve seen similar topics on this forum, but none of them solved my issue :confused:
Any help is much appreciated :slight_smile:

Resources:
OpenCV ArUco documentation
Description of the OpenCV ArUco module (section ‘Pose Estimation’)

Hi, welcome to the community!

It looks like OpenCV is giving you a Y-up matrix, whereas Panda expects a Z-up matrix (unless you set Panda to Y-up mode). Rather than applying manual operations to the rotation, I suggest multiplying your matrix with the Y-up to Z-up conversion matrix that is provided by Panda.
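(For intuition, that conversion matrix is essentially just an axis permutation. Here is a plain-NumPy sketch of what I believe `Mat3.convert_mat(CS_yup_right, CS_zup_right)` amounts to, using Panda’s row-vector convention `p_new = p_old @ M` — the exact values are my assumption from the standard right-handed Y-up and Z-up conventions:)

```python
import numpy as np

# Assumed Y-up-right -> Z-up-right conversion, row-vector convention:
YUP_TO_ZUP = np.array([
    [1,  0, 0],   # X (right) stays X
    [0,  0, 1],   # Y (up in Y-up) becomes Z (up in Z-up)
    [0, -1, 0],   # Z (backward in Y-up) becomes -Y
])

up_yup = np.array([0, 1, 0])        # "up" in a Y-up frame
forward_yup = np.array([0, 0, -1])  # "forward" in a right-handed Y-up frame
print(up_yup @ YUP_TO_ZUP)          # [0 0 1] -> Panda's up (+Z)
print(forward_yup @ YUP_TO_ZUP)     # [0 1 0] -> Panda's forward (+Y)
```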

Try this instead to set the rotation:

panda.transNode.setMat(Mat3.convert_mat(CS_yup_right, CS_zup_right) * mat3)

You could simplify the code to not use a dummy node this way:

mat = Mat4.convert_mat(CS_yup_right, CS_zup_right) \
    * Mat4.translate_mat(-xTrans, -yTrans, -zTrans) \
    * Mat4(mat3)

xform = TransformState.make_mat(mat)
panda.poseData['pandaCamPose']['trans'] = list(xform.pos)
panda.poseData['pandaCamPose']['rot'] = list(xform.hpr)
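(Background on why an inversion is involved at all: as the excerpt’s comments say, `rvecs`/`tvecs` give the marker pose in the camera frame, and the camera pose in the marker frame is the inverse rigid transform. A hedged NumPy-only sketch of that inversion, `R_inv = Rᵀ`, `t_inv = -Rᵀ·t`:)

```python
import numpy as np

def invert_rigid(R, t):
    """Invert a rigid transform given by rotation R and translation t."""
    R_inv = R.T          # for a rotation matrix, transpose == inverse
    t_inv = -R.T @ t
    return R_inv, t_inv

# Toy example: marker 2 units straight ahead of the camera
# (OpenCV convention: the camera looks down its own +Z axis).
R = np.eye(3)
t = np.array([0.0, 0.0, 2.0])
R_inv, t_inv = invert_rigid(R, t)
print(t_inv)  # [ 0.  0. -2.] -> the camera sits at z = -2 in the marker frame
```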

Wow! Works like a charm :slight_smile: Thank you so much for taking the time to look into the issue and helping me out :slight_smile:

I have a small follow-up question: the scene is oriented as if the marker were lying flat on the floor (Z-axis up), in which case this result would be correct. Any idea how I would go about setting things up correctly for a marker placed on a wall?

I’ve tried fiddling with the code, but the matrix operations are giving me a hard time :confused:

screenshot

Just apply another rotation matrix at the end:

mat = Mat4.convert_mat(CS_yup_right, CS_zup_right) \
    * Mat4.translate_mat(-xTrans, -yTrans, -zTrans) \
    * Mat4(mat3) \
    * Mat4.rotate_mat(90, (1, 0, 0))

Note that the order of multiplication operations matters.
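(A tiny illustration of that point, using NumPy rotation matrices — column-vector convention here rather than Panda’s row-vector one, but the conclusion is the same:)

```python
import numpy as np

def rot_x(deg):
    a = np.radians(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_z(deg):
    a = np.radians(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

# Rotating 90 deg about X then 90 deg about Z differs from the reverse order:
print(np.allclose(rot_x(90) @ rot_z(90), rot_z(90) @ rot_x(90)))  # False
```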


Seems to work great, thanks again!

I will have a deeper look into the matrix operations later, when I’m less busy, to learn how to work with markers placed around the room at various orientations and positions, so that I can properly sync the real-world camera with the Panda camera and display the AR visualization content.