Get distance to camera with shaders and buffers

I’m trying to simulate a depth sensor that captures, for each pixel, the distance from the camera to the object that produced it.

Note that this differs from the usual depth buffer, which captures the distance from the camera plane, rescaled non-linearly [A].

My intended approach was to use a shader that calculates this distance. I’m not too familiar with shaders, but after some reading, I believe the vertex shader would look something like this:

#version 400
 
// Uniform inputs
uniform mat4 p3d_ModelViewMatrix;
uniform mat4 p3d_ProjectionMatrix;
 
// Vertex inputs
in vec4 p3d_Vertex;
 
// Output to fragment shader
out float distanceToCamera;

void main() {
  cs_Position = p3d_ModelViewMatrix * p3d_Vertex;
  distanceToCamera = length(cs_Position.xyz);
  gl_Position = p3d_ProjectionMatrix * cs_Position
}

However, I’m not really sure where to go after this. My questions are:

  1. Do I also need a fragment shader?
  2. How do I get this into a buffer so I can access from the CPU?

Thank you for the help!

A. I believe it’s 1/distance, rescaled to lie between 0 at the near clipping plane and 1 at the far clipping plane.

For a perspective camera, would this not simply be cs_Position.z / cs_Position.w? I could be wrong.

You would need to declare cs_Position (i.e. vec4 cs_Position;) and add a semicolon after the last line for this code to compile, but the concept otherwise looks right.

You do need a fragment shader if you also have a vertex shader, unless you want to use imageStore to write the result to a texture in the vertex shader (but you would get the data per-vertex instead of per-pixel). You can use render-to-texture with RTM_copy_ram mode to automatically copy the texture data into CPU memory.

The alternative approach is not to use a shader at all, but to render a regular depth buffer and rescale the depth values back to the range you need after the image is rendered.
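That rescaling can be sketched in Python as follows, assuming an OpenGL-style perspective projection; `near` and `far` stand for the lens’s clipping distances, and the names here are my own:

```python
import numpy as np

def linearize_depth(d, near, far):
    """Convert nonlinear depth-buffer values d in [0, 1] back to
    eye-space distance along the view axis (OpenGL convention)."""
    # Map [0, 1] window depth back to [-1, 1] NDC depth
    ndc = 2.0 * d - 1.0
    # Invert the perspective projection's depth mapping
    return 2.0 * near * far / (far + near - ndc * (far - near))
```

A quick sanity check: `linearize_depth(0.0, near, far)` gives back `near` and `linearize_depth(1.0, near, far)` gives back `far`.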

I believe that cs_position.z gives the distance along the camera’s viewing axis, which is not what I want (I want the distance from the camera to the object). In the illustration below (where the camera points straight down), I want d, not z.

            (camera)
---------------o---------------
               |\
               | \
             z |  \ d
               |   \
               -----x (object)

The issue with the depth buffer is that, as I understand it, it also stores z (rescaled), so to recover d you need to do (slow) per-pixel geometry using the depth value and the pixel position. Unless I’m misunderstanding something, this means it might be best to use shaders.
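The per-pixel geometry I mean would look something like this: a sketch, assuming a symmetric frustum and a map of already-linearized axis-aligned depths, with names of my own invention:

```python
import numpy as np

def axis_depth_to_distance(z, fov_x_deg, fov_y_deg):
    """Convert a map of linear eye-space depths z (distance along the
    camera axis) into true camera-to-surface distances d, per pixel."""
    h, w = z.shape
    # NDC coordinates of each pixel center, in [-1, 1]
    xs = (np.arange(w) + 0.5) / w * 2.0 - 1.0
    ys = (np.arange(h) + 0.5) / h * 2.0 - 1.0
    u = xs * np.tan(np.radians(fov_x_deg) / 2.0)   # tangent-plane x
    v = ys * np.tan(np.radians(fov_y_deg) / 2.0)   # tangent-plane y
    uu, vv = np.meshgrid(u, v)
    # d = z * |(u, v, 1)|  for the ray through each pixel
    return z * np.sqrt(1.0 + uu ** 2 + vv ** 2)
```

Doing this for every pixel on the CPU is exactly the slowness I’d like to avoid, hence the shader approach.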

However, I’ve been trying all day to get the (simplest?) shader example working, to no avail. I’ve asked about this as a separate question, since I thought it would be independently useful. Once I figure out how to get that simple example running, I’ll return to this and see if I can get it working.

Okay, after getting the simple shader example working (thanks @rdb), I now have some code that simulates a depth camera (see the attached images).

My shaders are below.

I still have a question about how best to retrieve the distanceToCamera variable. As you can see, right now I’m rendering to a texture in RGBA format where the RGB channels are all distanceToCamera/10 (otherwise I get values > 1 and the depth image is just white). I then load the buffer into RAM, and I can recover distanceToCamera by rescaling any of the channels.

However, what I’d really like to do is write a float representing the actual distance to the camera (not rescaled) into a single channel, and an alpha. Is there a good way to do this? I’m particularly confused about how to set up the texture/buffer part to do this.

One other question: do you know whether the interpolation of distanceToCamera between vertices is linear?

Vertex Shader

#version 150

// Uniform inputs
uniform mat4 p3d_ProjectionMatrix;
uniform mat4 p3d_ModelViewMatrix;

// Vertex inputs
in vec4 p3d_Vertex;

// Vertex outputs
out float distanceToCamera;

void main() {
  vec4 cs_position = p3d_ModelViewMatrix * p3d_Vertex;
  distanceToCamera = length(cs_position.xyz);
  gl_Position = p3d_ProjectionMatrix * cs_position;
}

Fragment Shader

#version 150

in float distanceToCamera;

out vec4 fragColor;

void main() {
  fragColor = vec4(vec3(distanceToCamera / 10.0), 1.0);
}

Original Image

Depth Image

Yes. I don’t know specifically which API you are using to create the buffer, but there should be a way to specify a FrameBufferProperties on which you called setFloatColor(True) and setRgbaBits(32, 0, 0, 0) to indicate that you need only a single, but high-resolution, channel.

You can use the TexturePeeker to extract the data from the texture by pixel, or the raw getRamImage method for accessing the raw data or putting into another API, or the PfmFile class if you wish to do further operations on it.

The interpolation of variables between the vertex and fragment shaders is linear (I believe using the barycentric coordinates of the triangle).

Okay, this was very helpful. I think I’ve almost got it (see the image below).

My updated fragment shader and script are below. I just have a couple more questions:

  1. Is this the right approach in the fragment shader, to just set the green and blue channels to 0?
  2. I need some way of distinguishing between true depth hits in the simulated depth camera, and when there is simply no object in range:
    a. One approach would be to include a 1-bit mask alpha channel that will tell me this. Is this possible?
    b. The other approach is to return a predetermined max depth value whenever there is no object in sight. I tried to do this by specifying a clear color for the texture and then clearing the texture, but it did not work (otherwise we’d see values of 20 in the depth map, instead of 1). I want to be able to have multiple depth cameras with different max depth values, so I can’t just change the global default clear color. Is there any way to achieve this?

Depth Map

Fragment Shader

#version 150

in float distanceToCamera;
out vec4 fragColor;

void main() {
  fragColor = vec4(distanceToCamera, 0, 0, 1);
}

Code

import os
import numpy as np
from direct.showbase.ShowBase import ShowBase
from panda3d.core import FrameBufferProperties, WindowProperties
from panda3d.core import GraphicsOutput, GraphicsPipe
from panda3d.core import Texture, PerspectiveLens, Shader
from panda3d.core import ConfigVariableString, ConfigVariableBool, Filename, ConfigVariableManager
from panda3d.core import LVecBase4
import matplotlib.pyplot as plt


ConfigVariableString('background-color', '1.0 1.0 1.0 0.0')  # sets background to white


class SceneSimulator(ShowBase):

  def __init__(self):
    ShowBase.__init__(self)

    # set up texture and graphics buffer
    window_props = WindowProperties.size(1920, 1080)
    frame_buffer_props = FrameBufferProperties()
    frame_buffer_props.set_float_color(True)
    frame_buffer_props.set_rgba_bits(32, 0, 0, 0)
    buffer = self.graphicsEngine.make_output(self.pipe,
      'Buffer',
      -2,
      frame_buffer_props,
      window_props,
      GraphicsPipe.BFRefuseWindow,    # don't open a window
      self.win.getGsg(),
      self.win
    )
    texture = Texture()
    texture.set_clear_color(LVecBase4(20, 20, 20, 0))
    texture.clear_image()
    buffer.add_render_texture(texture, GraphicsOutput.RTMCopyRam)
    self.buffer = buffer

    # place a box in the scene
    x, y, side_length = 0, 0, 1
    box = self.loader.loadModel("models/box")
    box.reparentTo(self.render)
    box.setScale(side_length)
    box.setPos(x - side_length / 2, y - side_length / 2, 0)

    # set up camera
    lens = PerspectiveLens()
    lens.set_film_size(1920, 1080)
    lens.set_fov(45, 30)
    pos = (0, 6, 4)
    camera = self.make_camera(buffer, lens=lens, camName='Camera')
    camera.reparent_to(self.render)
    camera.set_pos(*pos)
    camera.look_at(box)
    self.camera = camera

    # load shaders
    vert_path = '/Users/michael/mit/sli/scene/scene/glsl_simple.vert'
    frag_path = '/Users/michael/mit/sli/scene/scene/glsl_simple.frag'
    custom_shader = Shader.load(Shader.SL_GLSL, vertex=vert_path, fragment=frag_path)

    self.render.set_shader(custom_shader)

  def render_image(self) -> np.ndarray:
    self.graphics_engine.render_frame()
    texture = self.buffer.get_texture()
    data = texture.get_ram_image()
    frame = np.frombuffer(data, np.float32)
    # frame.shape = (texture.getYSize(), texture.getXSize(), texture.getNumComponents())
    frame.shape = (texture.getYSize(), texture.getXSize())
    frame = np.flipud(frame)
    return frame


if __name__ == '__main__':
  simulator = SceneSimulator()
  image = simulator.render_image()
  plt.imshow(image)
  plt.colorbar()
  plt.show()

Just to make one reservation: I’m a noob at shaders, but I remember a similar topic.

If your framebuffer has only a red channel, the other channels are ignored.

Using the clear is the right thing to do. However, the clear value should be specified on the buffer, not on the texture. Use buffer.set_clear_color_active(True) and buffer.set_clear_color((v, 0, 0, 0)).

You can’t just add a 1-bit alpha channel because there’s no texture format that has 31 red bits and 1 alpha bit. However, if you can’t use the color channel for some reason, you could use the depth buffer or the depth-stencil buffer to determine whether anything is there.
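If you go with the clear-color approach, you can still recover a per-pixel validity mask on the CPU side by comparing against the clear value. A sketch, where `NO_HIT` is whatever value you chose for the buffer’s clear color red channel:

```python
import numpy as np

NO_HIT = 10.0  # must match the buffer's clear color red channel

def split_depth(frame, no_hit=NO_HIT):
    """Separate true depth hits from 'no object in range' pixels.
    Pixels still holding the clear value were never written by the
    fragment shader, so they mark empty space."""
    valid = frame < no_hit                   # boolean mask of real hits
    depth = np.where(valid, frame, np.nan)   # NaN where nothing was seen
    return depth, valid
```

One caveat: an object sitting exactly at the maximum range would be indistinguishable from the clear value, so pick a clear value safely beyond any real depth you expect.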

Amazing! It’s all working now.

Thank you again for your help!

If anyone is ever trying to replicate this, the working shaders and code can be found below.

Vertex Shader

#version 150

// Uniform inputs
uniform mat4 p3d_ProjectionMatrix;
uniform mat4 p3d_ModelViewMatrix;

// Vertex inputs
in vec4 p3d_Vertex;

// Vertex outputs
out float distanceToCamera;

void main() {
  vec4 cs_position = p3d_ModelViewMatrix * p3d_Vertex;
  distanceToCamera = length(cs_position.xyz);
  gl_Position = p3d_ProjectionMatrix * cs_position;
}

Fragment Shader

#version 150

in float distanceToCamera;
out vec4 fragColor;

void main() {
  fragColor = vec4(distanceToCamera, 0, 0, 1);
}

Code

import os
import numpy as np
from direct.showbase.ShowBase import ShowBase
from panda3d.core import FrameBufferProperties, WindowProperties
from panda3d.core import GraphicsOutput, GraphicsPipe
from panda3d.core import Texture, PerspectiveLens, Shader
from panda3d.core import ConfigVariableString, ConfigVariableBool, Filename, ConfigVariableManager
import matplotlib.pyplot as plt


class SceneSimulator(ShowBase):

  def __init__(self):
    ShowBase.__init__(self)

    # set up texture and graphics buffer
    window_props = WindowProperties.size(1920, 1080)
    frame_buffer_props = FrameBufferProperties()
    frame_buffer_props.set_float_color(True)
    frame_buffer_props.set_rgba_bits(32, 0, 0, 0)
    buffer = self.graphicsEngine.make_output(self.pipe,
      'Buffer',
      -2,
      frame_buffer_props,
      window_props,
      GraphicsPipe.BFRefuseWindow,    # don't open a window
      self.win.getGsg(),
      self.win
    )
    texture = Texture()
    buffer.add_render_texture(texture, GraphicsOutput.RTMCopyRam)
    buffer.set_clear_color_active(True)
    buffer.set_clear_color((10, 0, 0, 0))
    self.buffer = buffer

    # place a box in the scene
    x, y, side_length = 0, 0, 1
    box = self.loader.loadModel("models/box")
    box.reparentTo(self.render)
    box.setScale(side_length)
    box.setPos(x - side_length / 2, y - side_length / 2, 0)

    # set up camera
    lens = PerspectiveLens()
    lens.set_film_size(1920, 1080)
    lens.set_fov(45, 30)
    pos = (0, 6, 4)
    camera = self.make_camera(buffer, lens=lens, camName='Camera')
    camera.reparent_to(self.render)
    camera.set_pos(*pos)
    camera.look_at(box)
    self.camera = camera

    # load shaders
    vert_path = '/Users/michael/mit/sli/scene/scene/glsl_simple.vert'
    frag_path = '/Users/michael/mit/sli/scene/scene/glsl_simple.frag'
    custom_shader = Shader.load(Shader.SL_GLSL, vertex=vert_path, fragment=frag_path)

    self.render.set_shader(custom_shader)

  def render_image(self) -> np.ndarray:
    self.graphics_engine.render_frame()
    texture = self.buffer.get_texture()
    data = texture.get_ram_image()
    frame = np.frombuffer(data, np.float32)
    frame.shape = (texture.getYSize(), texture.getXSize())
    frame = np.flipud(frame)
    return frame


if __name__ == '__main__':
  simulator = SceneSimulator()
  image = simulator.render_image()
  plt.imshow(image)
  plt.colorbar()
  plt.show()