Inverse Projection and Depth Buffer

I am new to Panda3D and I would like some help with an issue I am struggling to solve on my own.
I am writing a script that does the following:
a. Loads a textured model (“model”) and a rendered image of the model taken from an unknown point of view (“refIm”).
b. Sets up the camera at a specific position and orientation in the world.
c. Renders a frame (“rendIm”) from the camera’s point of view.
In my project I am trying to find the position and orientation of the camera that took the reference image (“refIm”). As part of this effort, I would like to get the world-space location of specific pixels that are projected from the model into rendIm.

Can someone tell me what the right (simple) way to do this is using the Panda3D engine?

To start with, as I recall, pixels in an image correspond to not one point, but to a line, that line being the line that connects the relevant point on the camera’s near-plane with its counterpart in the camera’s far-plane. To explain, look at an arbitrary point on an object in your environment. Now imagine a ray starting at your eye and passing through that point. Any point intersected by that ray would appear to be at the same 2D point in your vision (presuming that the system stayed exactly still, of course). To find a single world-space coordinate, you would want to additionally know the depth of the pixel, I believe.

The “extrude” method of the “Lens” class should give you the line that corresponds to a given pixel–don’t use “extrudeVec”, however, as I believe that this is used for other purposes. You should be able to get the lens from the camera via Camera’s “getLens” method.
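As a rough sketch of how that might look: "extrude" takes a 2D point in film coordinates (from -1 to 1 on both axes) and fills in the near-plane and far-plane points in the camera's own coordinate space. The helper names below (`pixel_to_film`, `pixel_ray_world`) are made up for illustration, and I am assuming that `cam_np` is the NodePath of the Camera node (for example `base.cam`) and that `render_np` is the scene root.

```python
def pixel_to_film(px, py, width, height):
    """Map an image pixel (column px, row py, row 0 at the top) to film
    coordinates in [-1, 1] with +y pointing up, sampling the pixel centre."""
    x = (px + 0.5) / width * 2.0 - 1.0
    y = 1.0 - (py + 0.5) / height * 2.0
    return x, y


def pixel_ray_world(cam_np, render_np, px, py, width, height):
    """Return world-space (near_point, far_point) for a pixel, or None."""
    # Imported here so that pixel_to_film stays usable without Panda3D.
    from panda3d.core import Point2, Point3

    lens = cam_np.node().getLens()
    near_pt, far_pt = Point3(), Point3()
    film = Point2(*pixel_to_film(px, py, width, height))
    if not lens.extrude(film, near_pt, far_pt):
        return None  # the point could not be extruded through this lens
    # extrude() fills the points in camera space; convert them to world space.
    return (render_np.getRelativePoint(cam_np, near_pt),
            render_np.getRelativePoint(cam_np, far_pt))
```

The two returned points define the line that the pixel corresponds to; picking one point on that line still needs the depth.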

As to the depth of the pixel, you should be able to render out your depth-map, which, with a little maths, should give you the relevant depth along the line–perhaps with some margin of error.
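For the "little maths", here is a minimal sketch that assumes a standard perspective projection, where the depth-map stores values from 0 (near plane) to 1 (far plane). Under that assumption the stored value is not linear in distance. The function name `depth_to_distance` is my own, and the result is the distance along the camera's forward axis rather than along the pixel's ray.

```python
def depth_to_distance(depth, near, far):
    """Convert a depth-map value in [0, 1] to distance along the view axis.

    Assumes a standard perspective projection; an orthographic lens
    would need plain linear interpolation instead.
    """
    return near * far / (far - depth * (far - near))
```

With near = 1 and far = 100, a stored depth of 0.5 comes out at roughly 1.98 units, which shows how strongly the values are bunched towards the near plane and why some margin of error is to be expected for distant points.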


I am aware that each 2D point corresponds to a line. I would like to avoid doing the ray tracing myself by using the Panda3D engine for this purpose.
Anyway, I wanted to try your suggestion, but I couldn’t find sufficient documentation for the “extrude” method of the “Lens” class, so I have a few more questions:

  1. Do you have a code example I might use in order to understand?
  2. I understood (from my Google search) that I need to write a shader in order to get the depth buffer (is that right?). Do I need to use the extrude method inside a shader?


Have you searched the forum? A quick search seems to turn up a few examples.

No, I think that you can get the depth buffer results without writing a shader. This thread seems to have useful information, and I believe that the “Motion Trails” sample program demonstrates (indirectly, I think) a means of doing this by adding a new rendering texture to an extant graphics output (such as the default Panda window). See this API entry, and the one below it, for more information.
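A hedged sketch of that idea, with no shader involved: attach a depth texture to the existing graphics output and have Panda copy it to RAM every frame. The function names are invented for illustration. I am also assuming that the depth values arrive in RAM as 32-bit floats and that the bottom row is stored first; it would be worth checking the texture's `getComponentType()` on your own setup before relying on either.

```python
import array


def decode_depth(raw, width, height):
    """Turn raw depth bytes into a list of rows, top row first.

    Assumes 32-bit float components stored bottom row first.
    """
    values = array.array("f")
    values.frombytes(bytes(raw))
    if len(values) != width * height:
        raise ValueError("unexpected depth image size")
    rows = [list(values[r * width:(r + 1) * width]) for r in range(height)]
    rows.reverse()  # flip so that row 0 is the top of the image
    return rows


def attach_depth_texture(base):
    """Ask the main graphics output to copy its depth buffer to RAM."""
    # Imported here so that decode_depth stays usable without Panda3D.
    from panda3d.core import GraphicsOutput, Texture

    depth_tex = Texture("depth")
    depth_tex.setFormat(Texture.FDepthComponent)
    base.win.addRenderTexture(depth_tex,
                              GraphicsOutput.RTMCopyRam,
                              GraphicsOutput.RTPDepth)
    return depth_tex


def read_depth(base, depth_tex):
    """Render one frame and return the depth-map as rows of floats."""
    base.graphicsEngine.renderFrame()
    return decode_depth(depth_tex.getRamImage(),
                        depth_tex.getXSize(), depth_tex.getYSize())
```

`attach_depth_texture` would be called once after the ShowBase instance is created, and `read_depth` after each camera move.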

Following your advice to try “extrude” and the depth buffer, I came across an example that uses a CollisionRay and “traverse” to get the 3D point I am looking for. Are you familiar with it? How accurate is it? How is it implemented?

You want the world coordinates of the pixels on screen?

Sorry, I want to understand this. You want to decipher the position and orientation of the camera based on the rendered image? Is that correct? Are you ignoring the fact that you already have these values anyway?

Hum, I’m not familiar with that approach. Are you sure that it wasn’t intended for picking objects with collision geometry? That’s what I’m inclined to expect, given the use of a CollisionRay and “traverse”–but it’s entirely possible that the class has useful methods that I’m either not aware of, or have forgotten.

Would you link me to the example that you discovered, please?

I think that the mathematical approach will probably be the simplest here, honestly. Just apply the depth-value read from the depth-texture to the line provided by “extrude”, and thus produce a single result-point. I’m not sure whether the value in the depth-buffer corresponds linearly with the world-space depth of the pixel; you might want to research that.
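To make "apply the depth to the line" concrete, here is a small sketch under the assumption that the depth-value has already been converted to a distance along the camera's forward axis, and that the near and far points lie on the lens's near and far planes. With a perspective lens the forward coordinate varies linearly along the line, so the point can be found by plain interpolation. The name `point_on_ray` is hypothetical, and points are plain (x, y, z) tuples here.

```python
def point_on_ray(near_pt, far_pt, near, far, distance):
    """Interpolate between the near-plane and far-plane points.

    `distance` is measured along the camera's forward axis, so it equals
    `near` at near_pt and `far` at far_pt.
    """
    t = (distance - near) / (far - near)
    return tuple(n + t * (f - n) for n, f in zip(near_pt, far_pt))
```

The interpolation works the same way whether the two end points are expressed in camera space or in world space, and the result is in whichever space they were given in.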

My understanding, based on the first post, is that doing so is a step towards producing a method of determining the position and orientation of an unknown camera based on a reference image.

Here it is:
self.queue.getEntry() seems to hold the information I wish to have.

I completely agree with you, but I am still having trouble implementing it in my code, probably because I am using an offscreen buffer:

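import io
import numpy as np
from PIL import Image
from panda3d.core import PNMImage, StringStream

def renderImage(showBaseTree, cameraXpos, cameraYpos, cameraZpos, cameraAzimuth, cameraElevation, cameraRoll):
    # Render an image according to PnO (calls lost from the post are reconstructed here)
    # Set the camera's position (assumed to be applied to showBaseTree.camera)
    showBaseTree.camera.setPos(cameraXpos, cameraYpos, cameraZpos)
    # Set the camera's orientation
    showBaseTree.camera.setH(-90 + cameraAzimuth)   # rotation about Z
    showBaseTree.camera.setP(0 - cameraElevation)   # rotation about X
    showBaseTree.camera.setR(180 + cameraRoll)      # rotation about Y
    # Render the frame
    showBaseTree.graphicsEngine.renderFrame()
    # Create a PNMImage
    screenshot = PNMImage()
    # Get the display region
    dispReg = showBaseTree.camNode.getDisplayRegion(0)
    # Copy the image from the GPU into the PNMImage in RAM
    dispReg.getScreenshot(screenshot)
    # Get the data out of the screenshot. The original Image.frombuffer('RGB', ...)
    # arguments were cut off in the post, so this PNG round trip is an assumed stand-in.
    stream = StringStream()
    screenshot.write(stream, 'screenshot.png')
    rendImage = Image.open(io.BytesIO(stream.getData())).convert('RGB')
    rendImageArr = np.asarray(rendImage)
    return rendImageArr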

The first step I tried was to get the rendered image returned by this function together with its depth buffer, but I failed to do so…

You are right.

Ah, yes–this is collision-related, I do believe. You can have a ray collide with visible geometry, so you shouldn’t require the addition of collision geometry, I think. Hmm… Without knowing your algorithm in greater depth, I’m not sure of whether such an approach would help with your overall goal.
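For reference, a minimal sketch of what such a ray test against visible geometry might look like. The function name is made up, `base` is assumed to be a running ShowBase instance, and `film_x`/`film_y` are in the -1 to 1 range that `setFromLens` expects. Colliding against visible geometry tests the ray against the model's polygons, which is typically slower than using dedicated collision solids.

```python
def world_point_under_pixel(base, film_x, film_y):
    """Return the nearest visible-geometry hit under a film point, or None."""
    # Imported inside the function so the sketch can sit in a module that
    # is also imported on machines without Panda3D.
    from panda3d.core import (CollisionHandlerQueue, CollisionNode,
                              CollisionRay, CollisionTraverser, GeomNode)

    # Build a ray that starts at the camera and passes through the film point.
    ray = CollisionRay()
    ray.setFromLens(base.camNode, film_x, film_y)
    node = CollisionNode("pixelRay")
    node.addSolid(ray)
    # Collide with ordinary visible geometry rather than collision solids.
    node.setFromCollideMask(GeomNode.getDefaultCollideMask())
    ray_np = base.cam.attachNewNode(node)

    traverser = CollisionTraverser()
    queue = CollisionHandlerQueue()
    traverser.addCollider(ray_np, queue)
    traverser.traverse(base.render)
    ray_np.removeNode()

    if queue.getNumEntries() == 0:
        return None
    queue.sortEntries()  # nearest hit first
    return queue.getEntry(0).getSurfacePoint(base.render)
```

The returned point is expressed in the coordinate space of `base.render`, that is, in world space.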

I’m not sufficiently familiar with the use of offscreen buffers to be confident in helping here, but what, specifically, is going wrong?

Today, I have a script that works like this:

  1. Load the model and its texture.
  2. Define the parameters of the default Panda camera (nearPlane, farPlane, FOV, etc…).
  3. Set the PnO for this camera.
  4. Render a frame.
  5. Save the image into a Python variable (matrix) for future processing.

Steps 4 and 5 are implemented in the “renderImage” function that I posted as an example of my code.
I would like to merge the depth buffer reading into this function.
How should I do it?

I’m not familiar with that process myself, I’m afraid, so I’ll stand aside for someone else to hopefully provide more help.

That said, have you looked at the shadow-mapping sample-programs? Those use depth-buffers, I believe, and may give you some hints regarding where to look in order to acquire your depth-map.

Actually, have you looked at this thread, just beside yours at time of writing? The original poster there seems to be rendering a depth-texture, and the most recent post (again, at time of writing) seems to indicate that they have it working.