grayscale images in compute shader

Hello,
I'm trying to use a compute shader on a grayscale image.
However, according to the manual, I need to use a "sized" image format, and there seems to be no sized format for single-channel images.
So, my questions are:

  1. Can a texture's format be changed to a sized format AFTER loading it, e.g. from a PNG file, to something accessible by compute shaders?

  2. How do I access single-channel images in a compute shader? (I want to compute over a heightmap of 8192x8192 pixels, so I can't simply add some useless extra channels.)

  3. How does the format of a texture (Texture.getFormat()) interact with the declaration inside the compute shader, layout (binding=0, FORMAT) readonly uniform image2D img_in;?

There are sized formats: for 16 bit that is Texture.F_r16, and for 32 bit Texture.F_r32.
For 8 bit I’m not sure, but you might try Texture.F_red with a component type of Texture.T_unsigned_byte.
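
Roughly like this, if it helps (untested sketch; "heightmap.png" is just a placeholder filename):

from panda3d.core import Texture
from direct.directbase import DirectStart

tex = loader.loadTexture("heightmap.png")

# 16-bit single channel (matches a 16-bit grayscale source)
tex.setFormat(Texture.F_r16)

# 8-bit variant: generic red format plus an explicit byte component type
# tex.setFormat(Texture.F_red)
# tex.setComponentType(Texture.T_unsigned_byte)

# 32-bit float variant
# tex.setFormat(Texture.F_r32)
# tex.setComponentType(Texture.T_float)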

To answer your questions:

  1. You can set the format afterwards, but it has to match the format the texture actually has; Panda does no conversion for you, IIRC.

  2. Just use them as regular samplers, and then fetch the x component. Example:

uniform sampler2D MyHeightmap;

void main() {
  // Texture coordinate of this invocation's texel
  vec2 coord = (vec2(gl_GlobalInvocationID.xy) + 0.5) / vec2(textureSize(MyHeightmap, 0));
  float height = textureLod(MyHeightmap, coord, 0).x; // or texelFetch, whatever
}

Keep in mind that you should prefer texture() over imageLoad(), since that's way faster (sampler2D reads go through the texture cache, which image2D does not use).
So if you are only reading from a texture, bind it as a sampler2D. If you write to it, bind it as an image2D.
If you are both reading and writing, you might bind the texture twice (under different names, e.g. "myTexRead" and "myTexWrite") and use a sampler2D for reading and a writeonly image2D for writing (see the sketch below this list).

  3. The format declared in the compute shader should usually match the format in Panda. However, I think it is possible to specify a different format, e.g. an integer format while writing to a floating-point texture; I'm not sure what results you get with that, though.
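
If it helps, here's a rough, untested sketch of that read/write double binding (the texture size, the names "myTexRead"/"myTexWrite" and the pass-through kernel are just placeholders):

from panda3d.core import Shader, Texture, NodePath, ShaderAttrib
from direct.directbase import DirectStart

tex = Texture("heightfield")
tex.setup2dTexture(512, 512, Texture.T_unsigned_short, Texture.F_r16)

compute_source = """#version 430
layout (local_size_x = 32, local_size_y = 32) in;
uniform sampler2D myTexRead;                                    // cached reads
layout (binding=0, r16) writeonly uniform image2D myTexWrite;   // writes

void main() {
  ivec2 coord = ivec2(gl_GlobalInvocationID.xy);
  // Each invocation only touches its own texel here, just to show the bindings.
  float height = texelFetch(myTexRead, coord, 0).x;
  imageStore(myTexWrite, coord, vec4(height, 0, 0, 0));
}
"""

dummy = NodePath("dummy")
dummy.set_shader(Shader.make_compute(Shader.SL_GLSL, compute_source))
dummy.set_shader_input("myTexRead", tex)   # bound as a sampler2D
dummy.set_shader_input("myTexWrite", tex)  # bound as a writeonly image2D
sattr = dummy.get_attrib(ShaderAttrib)
base.graphicsEngine.dispatch_compute((512 // 32, 512 // 32, 1), sattr,
                                     base.win.get_gsg())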

First, thanks for the fast answer!
Using F_r16 seems to work perfectly; also thanks for the sampler tip, I'll try it.
(I didn't see the new identifier because I was looking at the old 1.8 API reference. :bulb:)

But when loading the PNG heightmap, I came across a behaviour I don't understand:
although it is stored as a 16-bit grayscale PNG, getFormat() returns F_luminance. Can I simply assume that, although the format isn't "sized", the texture is still in a 16-bit format, and thus change the format to F_r16?

Panda's loader doesn't necessarily apply a sized format when loading the texture; in your case it picks a more generic one (F_luminance). I guess this is because in earlier versions of Panda (without compute shader support), it was not really necessary to use a sized format (rdb?).

If you know your texture is in the F_r16 format (single channel, 16-bit precision), then it's perfectly fine to just set that format in order to use it as an image2D. If you only have an 8-bit-precision texture, though, and apply a 16-bit format, Panda will crash or trigger an assertion, IIRC.

If you set "gl-immutable-texture-storage #t" in your configuration, Panda should also automatically assign the correct sized format to your texture, AFAIK.
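
Something along these lines should let you check what the loader actually gave you before switching to a sized format (untested sketch; the filename is a placeholder):

from panda3d.core import Texture, loadPrcFileData
loadPrcFileData("", "gl-immutable-texture-storage #t")  # optional, see above
from direct.directbase import DirectStart

tex = loader.loadTexture("heightmap.png")
print(tex.getFormat() == Texture.F_luminance)              # generic, unsized format
print(tex.getComponentType() == Texture.T_unsigned_short)  # 16 bits per channel?

if tex.getComponentType() == Texture.T_unsigned_short:
    # The data is already 16-bit single channel, so this only relabels it.
    tex.setFormat(Texture.F_r16)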

Sorry for all those "IIRC" and "AFAIK"; I'm not 100% sure about everything, but rdb will hopefully correct me if I'm wrong :stuck_out_tongue:

Damn, I just came across some really strange behaviour.
My first test program simply copies the heightmap's values into another image, using the following code:

main.py:

from panda3d.core import Shader, ComputeNode, Texture, NodePath, ShaderAttrib, Filename

from panda3d.core import loadPrcFileData
loadPrcFileData("", "gl-debug #t")
from direct.directbase import DirectStart

# load source image
img_in = loader.loadTexture("hightmap.png")

img_in.setFormat(Texture.F_r16)
img_in.setName("img_in")
img_out = Texture("img_out")
img_out.setup2dTexture(8192,8192,Texture.T_float, Texture.F_r16)
img_out.makeRamImage()
img_out.setKeepRamImage(True)

shader = Shader.load_compute(Shader.SL_GLSL, "bw.glsl")
dummy = NodePath("dummy")
dummy.set_shader(shader)
dummy.set_shader_input("img_in", img_in)
dummy.set_shader_input("img_out", img_out)
sattr = dummy.get_attrib(ShaderAttrib)

base.graphicsEngine.dispatch_compute((8192/32, 8192/32, 1), sattr, base.win.get_gsg())
base.graphicsEngine.extractTextureData(img_out,base.win.get_gsg())
img_out.setFormat(Texture.F_luminance)
img_out.write(Filename("output.png"))

bw.glsl:

#version 430

// Declare the texture inputs
layout (binding=0, r16) readonly  uniform image2D img_in;
layout (binding=1, r16) writeonly uniform image2D img_out;
layout (local_size_x = 32, local_size_y = 32) in;

void main() {

  ivec2 texelCoords = ivec2(gl_GlobalInvocationID.xy);
  vec4 pixel = imageLoad(img_in, texelCoords);
  imageStore(img_out, texelCoords, pixel);
}

At the beginning, everything looked fine: when comparing the images in Windows Media Player, I saw the same image.
But somehow the output texture, although in a 16-bit format, is just half the size of the original file,
going from 68867 KB to 31068 KB :neutral_face:
Any idea how this could happen?

First, I think you should use "r16f" instead of "r16". I'm not sure if it makes any difference, though.
Also, you don't have to call makeRamImage() and setKeepRamImage(); as soon as you call extractTextureData, that's basically done for you. Since you are only writing to the texture, it doesn't really matter.

You also don't have to set F_luminance on your output image.
Besides that, I can't find any obvious errors. Maybe your source image was not compressed? Are you sure it really has 16 bit?

The image was exported directly from L3DT, so I think it is both uncompressed and 16-bit.

SOLVED:
Calling getComponentType() on the freshly loaded texture returned 1 = T_unsigned_short.
Setting the same component type on the output texture did the trick. :smiley: :smiley: :smiley:
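
For reference, the corrected setup looks roughly like this (same sizes as in my test):

from panda3d.core import Texture
from direct.directbase import DirectStart

img_in = loader.loadTexture("hightmap.png")
print(img_in.getComponentType())  # 1 == Texture.T_unsigned_short

img_out = Texture("img_out")
# Match the input's component type instead of using T_float:
img_out.setup2dTexture(8192, 8192, Texture.T_unsigned_short, Texture.F_r16)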

Now, could you give me some explanation of what the component type actually does?
Isn't the component type already determined by the bit depth specified in the format?
Thanks in advance.

Ah well, if it still had T_unsigned_byte set, it probably wrote 8 bit.
It's a bit of a mess: theoretically the component type should specify the data type (float, int) and the format should specify the size and channels. However, there are, for example, T_unsigned_byte (= 8 bit), T_unsigned_short (= 16 bit) and T_int (= 32 bit) for the integer types, which also implicitly suggest a size.
I guess this is, again, because no sized formats were used in earlier versions. However, I agree it's a bit confusing and should probably be changed.

For what it’s worth, “component_type” controls the data type in RAM, as accessed through the Texture interface. “format” indicates not only the number of components, but also the format that Panda asks OpenGL to store the texture in on the GPU (a request that may or may not be honoured).
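
To illustrate that split with a throwaway texture (the numbers here are arbitrary):

from panda3d.core import Texture

tex = Texture("example")
tex.setup2dTexture(256, 256, Texture.T_unsigned_short, Texture.F_r16)

print(tex.getComponentType() == Texture.T_unsigned_short)  # how the RAM image is laid out
print(tex.getComponentWidth())                             # 2 bytes per component in RAM
print(tex.getNumComponents())                              # 1 channel
print(tex.getFormat() == Texture.F_r16)                    # what Panda asks the GPU to store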

I just found another problem. This time I think it's a bug, though I'm not sure whether it's my graphics driver.
The situation is as follows:

First, a 2D texture is set up; then some data is written to it on the GPU using imageStore() in a compute shader.

For some image sizes, the call to extractTextureData or texture.write() never returns, the Panda window turns grey, and clicking on it gives me a Windows crash report (see below).
Of course, the output file is empty afterwards.

Tested with the following image sizes:
256x256: no crash
255x255: crash
1025x1025: no crash
129x129: crash
128x128: no crash
To sum up, only odd image sizes seem to be affected,
though not every odd image size results in a crash.

  Problem Event Name:	APPCRASH
  Application Name:	python.exe
  Application Version:	0.0.0.0
  Application Timestamp:	5560ae54
  Fault Module Name:	StackHash_cff8
  Fault Module Version:	6.1.7601.19045
  Fault Module Timestamp:	56259295
  Exception Code:	c0000374
  Exception Offset:	00000000000bffc2
  OS Version:	6.1.7601.2.1.0.768.3
  Locale ID:	1031
  Additional Information 1:	cff8
  Additional Information 2:	cff87d79f9b1a3b25a1935a13c1fd06a
  Additional Information 3:	d2c7
  Additional Information 4:	d2c7b514c68f3cdee3fe42a00e5a8eca

Test Program:

from panda3d.core import Shader, ComputeNode, Texture, NodePath, ShaderAttrib, Filename
from direct.directbase import DirectStart
shader_rect = Shader.load_compute(Shader.SL_GLSL,"bw_rect.glsl")
img_out = Texture("img_out")
# here, set the image size.
img_out.setup2dTexture(255,255,Texture.T_unsigned_short, Texture.F_r16)
dummy = NodePath("dummy")
dummy.set_shader(shader_rect)
dummy.set_shader_input("img_out", img_out)
sattr = dummy.get_attrib(ShaderAttrib)
base.graphicsEngine.dispatch_compute((128/32+1, 128/32+1, 1), sattr, base.win.get_gsg())
base.graphicsEngine.extractTextureData(img_out,base.win.get_gsg())
img_out.setFormat(Texture.F_luminance)
img_out.write(Filename("output.png"))

Shader:

#version 450

layout (binding=0, r16) writeonly uniform image2D img_out;
layout (local_size_x = 32, local_size_y = 32) in;


void main() {
  ivec2 texelCoords = ivec2(gl_GlobalInvocationID.xy);
  vec4 pixel = vec4(1,0,0,1);
  imageStore(img_out, texelCoords, pixel);

}

I'm on Windows 7, 64-bit.
Can someone test this to confirm?
It would be good to know.

Did you disable textures-power-2? You can do so with "textures-power-2 none". If you don't disable it, Panda will try to upscale your texture to the next power of two.
It looks like Panda cannot handle the format you are using, so when it tries to upscale the texture, something goes wrong.

I added "textures-power-2 none" to my Config.prc, and also tried setting it via loadPrcFileData("", "textures-power-2 none").
That did not help.
I also tried the other options; it still crashes.

EDIT:
I did further testing, and the problem seems to be related to the x-axis of the texture size.
A 128x129 texture did not crash. However, it crashed at 129x128 and at 129x129. Still, there are exceptions: 1025x1025 does not crash, while 1025x1024 does. Really strange.

EDIT 2:
With Texture.T_float, no crash occurs, so the issue seems to be related to the component type.
Also, I just found out that extractTextureData() DOES return. Executing normal Python code afterwards is not a problem, but involving the texture in any way crashes Panda. Even if texture.write() is not called, Panda3D crashes at the end of the program. I guess this is because Panda3D tries to release the texture memory when Python exits, and the result of extractTextureData is somehow corrupted. Also interesting: according to the Windows error log, the 0xc0000005 (access violation) happens inside ntdll.dll.

Do I possibly have a corrupt DLL? Or is the error somewhere else?

I can reproduce the crash. It seems to be related to the integer saving code. I’ll have a look at it.

EDIT: Okay, the crash is related to the driver aligning the texture data, whereas Panda does not expect the data to be aligned, so the memory gets corrupted. There should be a patch soon that fixes this.
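
In case the numbers help, here's a rough sketch of that alignment effect, assuming the common 4-byte row alignment (the exact value is driver-dependent):

# Bytes per row of a single-channel 16-bit texture, with and without padding.
def padded_row_bytes(width, bytes_per_texel=2, alignment=4):
    row = width * bytes_per_texel
    return (row + alignment - 1) // alignment * alignment

for w in (128, 129, 255, 256, 1024, 1025):
    tight = w * 2
    padded = padded_row_bytes(w)
    print(w, tight, padded, "padded" if padded != tight else "tight")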

EDIT #2: Okay, the patch is committed and should be in the next buildbot version.

Thanks a lot, everything seems to work now! :smiley: