Hardware instancing: unknown semantics "INSTANCEID"

Hello,

I’ve already asked about this on IRC, but today is Sunday (in my part of the world), so I don’t think many people are reading there today.

I am currently trying out hardware instancing to improve the performance of my project. I tried this simple test code, found on the forum, on my Intel HD4000 with (Arch) Linux, kernel 3.12.

Python source:

from panda3d.core import loadPrcFileData
loadPrcFileData('', 'basic-shaders-only 0')
loadPrcFileData('', 'sync-video 0')
loadPrcFileData('', 'show-frame-rate-meter 1')
from direct.actor.Actor import Actor
from direct.showbase.DirectObject import DirectObject
from direct.interval.IntervalGlobal import Sequence
import direct.directbase.DirectStart
from panda3d.core import Point3, Vec4, PTA_LVecBase4f, Shader, UnalignedLVecBase4f


class World(DirectObject):
    def __init__(self):
        self.accept("escape", __import__("sys").exit, [0])

        self.model = Actor('panda-model', {'walk': 'panda-walk4'})
        self.model.setScale(0.005)
        #self.model.flattenLight()
        self.model.setPos(-2.7,300,-5)
        self.model.loop('walk')
        self.model.reparentTo(render)
        interval = self.model.posInterval(20, Point3(-2.7, 200, -5), startPos=Point3(-2.7, 300, -5))
        sequence = Sequence(interval)
        sequence.loop()

        k = 256
        offsets = PTA_LVecBase4f.emptyArray(k)
        count = 0
        for i in range(10):
            for j in range(k // 10):
                offsets[count] = UnalignedLVecBase4f(Vec4(i * 3, j * -8, 0, 0))
                count += 1
        self.model.setShaderInput('offsets', offsets)
        self.model.setShader(Shader.load('instance.cg'))
        self.model.setInstanceCount(k)


w = World()
run()

Shader:

//Cg
//Cg profile gp4vp gp4fp

void vshader(float4 vtx_position: POSITION,
             float2 vtx_texcoord0: TEXCOORD0,
             uniform float4x4 mat_modelproj,
             int l_id: INSTANCEID,
             uniform float4 offsets[256],
             out float4 l_position : POSITION,
             out float2 l_texcoord0 : TEXCOORD0)
{
  l_position = mul(mat_modelproj, vtx_position + offsets[l_id]);
  l_texcoord0 = vtx_texcoord0;
}


void fshader(float2 l_texcoord0: TEXCOORD0,
             uniform sampler2D tex_0: TEXUNIT0,
             out float4 o_color: COLOR)
{
  o_color = tex2D(tex_0, l_texcoord0);
}

When starting the program, only one panda is visible (I don’t know whether several were drawn, but I see just one when looking around).

The console output:

DirectStart: Starting the game.
Known pipe types:
  glxGraphicsPipe
(all display modules loaded.)
:gobj(error): instance.cg: uniform in unknown offsets: unsupported array subclass.
:gobj(error): Shader encountered an error.
:gobj(error): instance.cg: (7) : error C5108: unknown semantics "INSTANCEID" specified for "l_id"

I’m not sure if this is a hardware or driver related problem, but according to glxinfo, instancing should be supported:

% glxinfo| grep instance
    GL_ARB_ES3_compatibility, GL_ARB_base_instance, 
    GL_ARB_draw_instanced, GL_ARB_explicit_attrib_location, 
    GL_ARB_half_float_vertex, GL_ARB_instanced_arrays, 
    GL_EXT_draw_instanced, GL_EXT_framebuffer_blit, 
    GL_ARB_draw_elements_base_vertex, GL_ARB_draw_instanced, 
    GL_ARB_instanced_arrays, GL_ARB_internalformat_query, 
    GL_EXT_draw_instanced, GL_EXT_draw_range_elements, GL_EXT_fog_coord,
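
The extension check above can also be scripted. The following is only a sketch: it scans text shaped like the glxinfo output for a hand-picked list of instancing-related extension names. The helper names and the `SAMPLE` string are made up for illustration, and the presence of an extension says nothing about whether a particular shader compiler path exposes it.

```python
import re

# Shortened sample shaped like the glxinfo output quoted above.
SAMPLE = """
    GL_ARB_ES3_compatibility, GL_ARB_base_instance,
    GL_ARB_draw_instanced, GL_ARB_explicit_attrib_location,
    GL_EXT_draw_instanced, GL_EXT_framebuffer_blit,
"""

# Names taken from the output above; which of them an engine actually
# needs is an assumption this sketch does not try to answer.
INSTANCING_EXTENSIONS = (
    "GL_ARB_draw_instanced",
    "GL_EXT_draw_instanced",
    "GL_ARB_instanced_arrays",
)


def parse_extensions(text):
    """Return the set of GL_* extension names found in the text."""
    return set(re.findall(r"\bGL_[A-Za-z0-9_]+\b", text))


def instancing_extensions(text):
    """Return the instancing-related extensions present, in a fixed order."""
    found = parse_extensions(text)
    return [name for name in INSTANCING_EXTENSIONS if name in found]


present = instancing_extensions(SAMPLE)
```

Feeding it the full, ungrepped `glxinfo` output would avoid missing extensions that happen to sit on lines without the word "instance".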

I think it might have something to do with the NVIDIA Cg compiler or Panda itself (maybe it does not issue an instanced draw call?).

If you need more system information (kernel/Mesa version, glxinfo) or anything else, just ask :)

best regards,
Socke

Which version of Panda3D? If 1.8.1, have you tried upgrading to latest devel?

It’s 1.8.1; I will try the devel version later today. I will also try the code on another machine with an NVIDIA card. I hope to have some results in an hour or two.

Okay, here are some results. I’ve built Panda with the following command:

python2 makepanda/makepanda.py --everything --no-opencv --no-ode --no-ffmpeg --no-maya2012 --no-fmodex --no-gles --no-gles2 --use-bullet --threads 3

After installing and running my test application the first time, I got:

% python2 test.py
DirectStart: Starting the game.
Known pipe types:
  glxGraphicsPipe
(all display modules loaded.)
:gobj(error): Texture::read() - couldn't read: /usr/share/panda3d/models/maps/panda-model.jpg
:gobj(error): Texture "/usr/share/panda3d/models/maps/panda-model.jpg" exists but cannot be read.
:gobj(error): Texture extension "jpg" is unknown.  Supported texture types:
  SGI RGB                         .rgb, .rgba, .sgi
  Targa                           .tga
  Raw binary RGB                  .img
  SoftImage                       .pic, .soft
  Windows BMP                     .bmp
  NetPBM-style PBM/PGM/PPM/PNM    .pbm, .pgm, .ppm, .pnm
  Portable Float Map              .pfm
  PNG                             .png
  MovieTexture                    .asf
  MovieTexture                    .avi
  MovieTexture                    .flv
  MovieTexture                    .mkv
  MovieTexture                    .mov
  MovieTexture                    .mp4
  MovieTexture                    .mpeg
  MovieTexture                    .mpg
  MovieTexture                    .nut
  MovieTexture                    .ogm
  MovieTexture                    .ogv
  MovieTexture                    .wmv
:gobj(error): Tried to load Cg shader, but no Cg support is enabled.

Only one panda is shown, and it’s untextured. Since I had not disabled libpng or anything similar, I simply launched it again without changing anything, and at least the texture load error disappeared:

% python2 test.py
DirectStart: Starting the game.
Known pipe types:
  glxGraphicsPipe
(all display modules loaded.)
:gobj(error): Tried to load Cg shader, but no Cg support is enabled.

But the panda stays white and untextured, and instancing still does not work. Maybe this output is a little more helpful, though.

I still haven’t tried it on the NVIDIA desktop PC (I didn’t have much time today) because the Debian testing installation there seems to dislike your pre-packaged versions (libavformat is missing), so I have to compile my own version. On Wednesday I will have more time and will try it then.

Best regards,
Socke

FYI: I tried your original code and I didn’t receive any errors. Devel + Nvidia card though.

Although I had to change one line, as all the pandas were overlapping, so it looked like there was only one:

offsets[count] = UnalignedLVecBase4f(Vec4(i * 300, j * -800, 0, 0))
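
A plausible reason for the overlap: the shader adds the offset to the vertex position before the model transform, so the model's `setScale(0.005)` presumably shrinks the offsets as well. The sketch below mirrors the nested loop with plain tuples (no Panda3D needed) to show the resulting spacing and how many of the 256 slots the loop actually fills. The function name and defaults are illustrative only.

```python
def grid_offsets(k, rows=10, dx=300, dy=-800):
    """Mirror the nested loop from the test script using plain tuples."""
    per_row = k // rows  # integer division, since range() needs an int
    return [(i * dx, j * dy, 0, 0)
            for i in range(rows)
            for j in range(per_row)]


K = 256
MODEL_SCALE = 0.005  # value passed to setScale() in the script

offsets = grid_offsets(K)
filled = len(offsets)   # slots written by the loop
unfilled = K - filled   # slots left at whatever emptyArray() put there

# Approximate spacing after the model scale is applied.
spacing_new = 300 * MODEL_SCALE  # with the corrected factor
spacing_old = 3 * MODEL_SCALE    # with the original factor
```

With `k = 256` the loop writes 10 × 25 = 250 entries, so the last six instances use whatever the array was initialised with.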

I’ve changed the code as you recommended and tested again. Result with 1.8.1:

DirectStart: Starting the game.
Known pipe types:
  glxGraphicsPipe
(all display modules loaded.)
:gobj(error): instance.cg: uniform in unknown offsets: unsupported array subclass.
:gobj(error): Shader encountered an error.
:gobj(error): instance.cg: (7) : error C5108: unknown semantics "INSTANCEID" specified for "l_id"

Again, no instancing, but the panda is textured.

And with Devel from cvs:

DirectStart: Starting the game.
Known pipe types:
  glxGraphicsPipe
(all display modules loaded.)
:gobj(error): Tried to load Cg shader, but no Cg support is enabled.

No instancing, no texture on the panda. I don’t know why texturing fails with the devel version; maybe it’s just some caching-related problem. In the past, the Intel Linux driver had some issues with compressed textures, but that seems to be fixed (at least for all other OpenGL apps I’ve tried recently), and adding the following line has no effect:

loadPrcFileData('', 'compressed-textures 0')

But texturing is not my problem right now. This is just some information and thoughts for any developer who might want to fix this issue.

Regards,
Socke

It looks like you have not compiled with Cg support. Install nvidia-cg-toolkit and update your build.

As you can see from the makepanda options above, I haven’t disabled Cg support explicitly (at least I hope so). And nvidia-cg-toolkit was installed during the build and is still installed (it’s a dependency of the panda3d-cvs AUR package). But when I look at Panda’s source code (grepping for the error message), it seems you’re right: HAVE_CG seems to be undefined. I’ll check that to determine how to compile it correctly.

At the top of makepanda.py, it’ll tell you about missing packages. Adding --verbose will cause makepanda to be more verbose about why it thinks the package is missing.

I have now spent several hours trying to build the CVS version of Panda, both with and without the help of yaourt/makepkg. The current CVS version always segfaults on my machine, no matter what I do. Damn it, this started as an instancing issue and is now a “get Panda working” issue. Here is the gdb output, including the backtrace:

% gdb --args python2 test.py
GNU gdb (GDB) 7.6.1
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/python2.7...(no debugging symbols found)...done.
(gdb) r
Starting program: /usr/bin/python2 test.py
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffff7ffa000
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
DirectStart: Starting the game.

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff39da0b7 in RenderState::compare_to(RenderState const&) const () from /usr/lib/panda3d/libpanda.so
(gdb) bt
#0  0x00007ffff39da0b7 in RenderState::compare_to(RenderState const&) const () from /usr/lib/panda3d/libpanda.so
#1  0x00007ffff3a0eae8 in SimpleHashMap<RenderState const*, RenderState::Empty, indirect_compare_to_hash<RenderState const*> >::is_element(int, RenderState const* const&) const ()
   from /usr/lib/panda3d/libpanda.so
#2  0x00007ffff3a0ebee in SimpleHashMap<RenderState const*, RenderState::Empty, indirect_compare_to_hash<RenderState const*> >::find(RenderState const* const&) const ()
   from /usr/lib/panda3d/libpanda.so
#3  0x00007ffff39f833f in RenderState::return_unique(RenderState*) () from /usr/lib/panda3d/libpanda.so
#4  0x00007ffff39f894b in RenderState::return_new(RenderState*) () from /usr/lib/panda3d/libpanda.so
#5  0x00007ffff39f95a9 in RenderState::set_attrib(RenderAttrib const*, int) const () from /usr/lib/panda3d/libpanda.so
#6  0x00007ffff399e833 in PandaNode::set_attrib(RenderAttrib const*, int) () from /usr/lib/panda3d/libpanda.so
#7  0x00007ffff3ad0f00 in ?? () from /usr/lib/panda3d/libpanda.so
#8  0x00007ffff7af9839 in PyEval_EvalFrameEx () from /usr/lib/libpython2.7.so.1.0
#9  0x00007ffff7af9552 in PyEval_EvalFrameEx () from /usr/lib/libpython2.7.so.1.0
#10 0x00007ffff7afa290 in PyEval_EvalCodeEx () from /usr/lib/libpython2.7.so.1.0
#11 0x00007ffff7a89b00 in function_call () from /usr/lib/libpython2.7.so.1.0
#12 0x00007ffff7a65c13 in PyObject_Call () from /usr/lib/libpython2.7.so.1.0
#13 0x00007ffff7a7441d in instancemethod_call () from /usr/lib/libpython2.7.so.1.0
#14 0x00007ffff7a65c13 in PyObject_Call () from /usr/lib/libpython2.7.so.1.0
#15 0x00007ffff7af40d7 in PyEval_CallObjectWithKeywords () from /usr/lib/libpython2.7.so.1.0
#16 0x00007ffff7a7509c in PyInstance_New () from /usr/lib/libpython2.7.so.1.0
#17 0x00007ffff7a65c13 in PyObject_Call () from /usr/lib/libpython2.7.so.1.0
#18 0x00007ffff7af59e1 in PyEval_EvalFrameEx () from /usr/lib/libpython2.7.so.1.0
#19 0x00007ffff7afa290 in PyEval_EvalCodeEx () from /usr/lib/libpython2.7.so.1.0
#20 0x00007ffff7afa392 in PyEval_EvalCode () from /usr/lib/libpython2.7.so.1.0
#21 0x00007ffff7b09e8c in PyImport_ExecCodeModuleEx () from /usr/lib/libpython2.7.so.1.0
#22 0x00007ffff7b0a0f5 in load_source_module () from /usr/lib/libpython2.7.so.1.0
#23 0x00007ffff7b0ad19 in import_submodule () from /usr/lib/libpython2.7.so.1.0
#24 0x00007ffff7b0af5d in load_next () from /usr/lib/libpython2.7.so.1.0
#25 0x00007ffff7b0b8c8 in PyImport_ImportModuleLevel () from /usr/lib/libpython2.7.so.1.0
#26 0x00007ffff7af259f in builtin___import__ () from /usr/lib/libpython2.7.so.1.0
#27 0x00007ffff7a65c13 in PyObject_Call () from /usr/lib/libpython2.7.so.1.0
#28 0x00007ffff7af40d7 in PyEval_CallObjectWithKeywords () from /usr/lib/libpython2.7.so.1.0
#29 0x00007ffff7af7174 in PyEval_EvalFrameEx () from /usr/lib/libpython2.7.so.1.0
#30 0x00007ffff7afa290 in PyEval_EvalCodeEx () from /usr/lib/libpython2.7.so.1.0
#31 0x00007ffff7afa392 in PyEval_EvalCode () from /usr/lib/libpython2.7.so.1.0
#32 0x00007ffff7b1308f in run_mod () from /usr/lib/libpython2.7.so.1.0
#33 0x00007ffff7b141ae in PyRun_FileExFlags () from /usr/lib/libpython2.7.so.1.0
#34 0x00007ffff7b15319 in PyRun_SimpleFileExFlags () from /usr/lib/libpython2.7.so.1.0
#35 0x00007ffff7b25c2f in Py_Main () from /usr/lib/libpython2.7.so.1.0
#36 0x00007ffff7474bc5 in __libc_start_main () from /usr/lib/libc.so.6
#37 0x0000000000400741 in _start ()

For those who are interested in fixing this… I am not.

This is getting a little demotivating… I will now read the CVS manual to see whether I can find a code revision that works. Maybe there’s a log or something. Is it possible to find out from the installed files which revision was used to build the engine? (I kept the old package as a backup before building the new one today.) Also, can someone tell me how to install Panda in my home directory? I googled a little for it but found nothing in the first hits (it would be much more comfortable than always generating Arch packages, but I don’t want my system to get “dirty”).

Regards,
Socke

Yeah, Updates:

Installed panda3d-cvs from the AUR, and now I get the following output:

% python2 test.py 
DirectStart: Starting the game.
Known pipe types:
  glxGraphicsPipe
(all display modules loaded.)
:display:gsg:glgsg(error): Could not load Cg vertex program:instance.cg (glslv )

I’m not sure what that means; I think it might be choosing the wrong shader profile. How do I set the correct Cg profile? Is it just that //Cg profile comment in the shader file?

The panda is shown, and even its texture seems to be installed and loaded correctly this time.

Here’s the code again (I think I changed some lines during testing):

from panda3d.core import loadPrcFileData
loadPrcFileData('', 'basic-shaders-only 0')
loadPrcFileData('', 'sync-video 0')
loadPrcFileData('', 'show-frame-rate-meter 1')
loadPrcFileData('', 'compressed-textures 0')
from direct.actor.Actor import Actor
from direct.showbase.DirectObject import DirectObject
from direct.interval.IntervalGlobal import Sequence
import direct.directbase.DirectStart
from panda3d.core import Point3, Vec4, PTA_LVecBase4f, Shader, UnalignedLVecBase4f


class World(DirectObject):
    def __init__(self):
        self.accept("escape", __import__("sys").exit, [0])

        self.model = Actor('panda-model', {'walk': 'panda-walk4'})
        self.model.setScale(0.005)
        #self.model.flattenLight()
        self.model.setPos(-2.7,300,-5)
        self.model.loop('walk')
        self.model.reparentTo(render)
        interval = self.model.posInterval(20, Point3(-2.7, 200, -5), startPos=Point3(-2.7, 300, -5))
        sequence = Sequence(interval)
        sequence.loop()

        k = 256
        offsets = PTA_LVecBase4f.emptyArray(k)
        count = 0
        for i in range(10):
            for j in range(k // 10):
                offsets[count] = UnalignedLVecBase4f(Vec4(i * 300, j * -800, 0, 0))
                count += 1
        self.model.setShaderInput('offsets', offsets)
        self.model.setShader(Shader.load('instance.cg'))
        self.model.setInstanceCount(k)


w = World()
run()

And the shader:

//Cg
//Cg profile gp4vp gp4fp

void vshader(float4 vtx_position: POSITION,
             float2 vtx_texcoord0: TEXCOORD0,
             uniform float4x4 mat_modelproj,
             int l_id: INSTANCEID,
             uniform float4 offsets[256],
             out float4 l_position : POSITION,
             out float2 l_texcoord0 : TEXCOORD0)
{
  l_position = mul(mat_modelproj, vtx_position + offsets[l_id]);
  l_texcoord0 = vtx_texcoord0;
}


void fshader(float2 l_texcoord0: TEXCOORD0,
             uniform sampler2D tex_0: TEXUNIT0,
             out float4 o_color: COLOR)
{
  o_color = tex2D(tex_0, l_texcoord0);
}

Regards

Today I compiled the git version (revision 799e5d9). The problem still exists:

:display:gsg:glgsg(error): Could not load Cg vertex program:instance.cg (glslv )

I added some checking code:

        print("glslv:", base.win.getGsg().getSupportsCgProfile("glslv"))
        print("glslf:", base.win.getGsg().getSupportsCgProfile("glslf"))
        print("geometry instancing:", base.win.getGsg().getSupportsGeometryInstancing())

// output:
('glslv:', True)
('glslf:', True)
('geometry instancing:', True)
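
The same check can be wrapped in a small helper to compare several profiles at once. This is only a sketch: `supported_profiles` and `FakeGsg` are hypothetical names, and the fake object exists purely so the snippet runs without a window. In a real session the object returned by `base.win.getGsg()` would be passed in instead.

```python
def supported_profiles(gsg, names=("glslv", "glslf", "gp4vp", "gp4fp")):
    """Map each Cg profile name to whether the GSG reports support for it."""
    return dict((name, bool(gsg.getSupportsCgProfile(name))) for name in names)


class FakeGsg(object):
    """Hypothetical stand-in for a GraphicsStateGuardian, for illustration."""

    def __init__(self, supported):
        self._supported = set(supported)

    def getSupportsCgProfile(self, name):
        return name in self._supported


# In Panda3D this would be: supported_profiles(base.win.getGsg())
report = supported_profiles(FakeGsg(["glslv", "glslf"]))
```

Listing the gp4 profiles next to the GLSL ones makes it easier to see whether the `//Cg profile` line in the shader asks for something the driver does not report.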

Which seems okay. I thought maybe glslv simply does not support instancing, so I tried compiling the shader manually using cgc:

% cgc -profile glslv instance.cg -entry vshader
% cgc -profile glslf instance.cg -entry fshader
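
For repeated testing, the two cgc invocations can be driven from a script. The sketch below (Python 3) only builds the same command lines as above and records cgc's exit status; the function names are invented here, and nothing is assumed about cgc beyond the flags already shown.

```python
import shutil
import subprocess


def cgc_command(source, profile, entry):
    """Build the argument list for one cgc invocation, as typed above."""
    return ["cgc", "-profile", profile, source, "-entry", entry]


def compile_entries(source="instance.cg",
                    jobs=(("glslv", "vshader"), ("glslf", "fshader"))):
    """Run cgc once per (profile, entry) pair.

    Returns None when cgc is not on PATH, otherwise a dict mapping each
    entry point to the exit status cgc reported.
    """
    if shutil.which("cgc") is None:
        return None
    results = {}
    for profile, entry in jobs:
        proc = subprocess.run(cgc_command(source, profile, entry),
                              stdout=subprocess.PIPE,
                              stderr=subprocess.PIPE)
        results[entry] = proc.returncode
    return results
```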

Both work fine and print a GLSL shader that uses instancing. WTF? So I thought the problem lies somewhere within Panda’s shader compilation process. Looking at the debug output, I found that the shader is compiled correctly but Panda fails to load it. (For whatever reason, the shader is first compiled using the ultimate profile (glslv, glslf), and about 1000 lines later in the log it is compiled again using the active profile (also glslv, glslf).) The source code where the error occurs is:

    if (_cg_vprogram != 0) {
      cgGLLoadProgram(_cg_vprogram);
      CGerror verror = cgGetError();
      if (verror != CG_NO_ERROR) {
        const char *str = (const char *)GLP(GetString)(GL_PROGRAM_ERROR_STRING_ARB);
        GLCAT.error() << "Could not load Cg vertex program:" << s->get_filename(Shader::ST_vertex) << " (" << 
          cgGetProfileString(cgGetProgramProfile(_cg_vprogram)) << " " << str << ")\n";
        release_resources(gsg);
      }
    }

As we can see from the error message that is actually printed (see above), str seems to point to an empty string. That’s the first sign that something is wrong here. (I assume GLP(GetString) is a macro that resolves to glGetString.) Could that error string have been reset by one of the later Cg calls? (It shouldn’t be; nothing there seems to affect OpenGL directly.)

Obviously some error occurs within Cg, because cgGetError returns something other than CG_NO_ERROR. But to me it seems like this error has nothing to do with the cgGLLoadProgram call above. What if an error occurs before cgGLLoadProgram and is not cleared by calling cgGetError? I will put some work into this and hopefully I can fix the issue.
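
The stale-error hypothesis can be illustrated with a toy model. The class below is not the Cg API; it only assumes what the Cg documentation describes for cgGetError, namely a single "last error" slot that is returned and reset on each query, plus the further assumption that a successful call leaves that slot untouched. All names are hypothetical.

```python
NO_ERROR = 0


class StickyErrorApi(object):
    """Toy model of an API with one 'last error' slot (not the real Cg API)."""

    def __init__(self):
        self._last = NO_ERROR

    def failing_call(self, code):
        self._last = code  # an earlier, unrelated call records an error

    def succeeding_call(self):
        pass  # assumption: success does not clear the slot

    def get_error(self):
        # Return the stored error and reset the slot.
        code, self._last = self._last, NO_ERROR
        return code


def load_checked(api, drain_first):
    """Check the error state around a call that itself succeeds."""
    if drain_first:
        api.get_error()  # discard anything left over from earlier calls
    api.succeeding_call()  # stands in for the load call
    return api.get_error()


api = StickyErrorApi()
api.failing_call(7)
misattributed = load_checked(api, drain_first=False)

api.failing_call(7)
clean = load_checked(api, drain_first=True)
```

In the first case the leftover error is blamed on the load call; draining the slot beforehand gives a clean result. Whether this is what happens inside Panda is not established here.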

I posted this primarily for personal logging purposes, but maybe one of you has other ideas about what might cause this issue.

Best regards
Socke

If it compiles fine with the “glslv” and “glslf” profiles, then why don’t you have “//Cg profile glslv glslf” in your shader?

It doesn’t matter which profiles I set there; the error message stays the same. Actually, I changed it to glslv/glslf during my last tests but forgot to mention it here.

Well, after some thought on the problem (I haven’t tried poking around in the code because I don’t know how to install Panda locally in the user’s home directory), I had the following idea:

According to the GLSL specs [1], gl_InstanceID first appears in GLSL 1.40, which corresponds to OpenGL 3.1. But with the Intel driver on Linux, OpenGL 3.1 features (and therefore GLSL 1.40) are only available if version 3.1 is requested at context creation [2]. I searched the source files a little but didn’t find the place where the OpenGL context is requested.
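
The version relationship can be written down as a small lookup. The table lists the GLSL version that shipped with each OpenGL release from 2.0 to 3.3; the helper name is made up, and the sketch deliberately ignores extensions that may expose an instance ID on older versions.

```python
# GLSL version introduced alongside each OpenGL version (2.0 through 3.3).
GLSL_FOR_GL = {
    (2, 0): 110,
    (2, 1): 120,
    (3, 0): 130,
    (3, 1): 140,
    (3, 2): 150,
    (3, 3): 330,
}

# gl_InstanceID is a core built-in from GLSL 1.40 onwards.
INSTANCE_ID_MIN_GLSL = 140


def supports_instance_id(gl_version):
    """True if the core GLSL version for this GL version has gl_InstanceID."""
    return GLSL_FOR_GL[gl_version] >= INSTANCE_ID_MIN_GLSL


on_gl30 = supports_instance_id((3, 0))
on_gl31 = supports_instance_id((3, 1))
```

So a context that tops out at OpenGL 3.0 would stop one GLSL version short of the built-in.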

My questions are:

  1. Where is the context created in Panda’s source code?
  2. Is it possible to set the version Panda requests there?
  3. Does Panda rely on any GL_ARB_compatibility features, i.e. anything that was removed in the core profiles? In other words: would it be problematic to request a version 3.1 or 3.3 context (which then no longer has any of the fixed-function stuff)?

Regards,

Socke

References:
[1] opengl.org/sdk/docs/manglsl/ … anceID.xml
[2] mesa3d.org/relnotes/9.0.html

There’s no need to enable the core OpenGL 3.1 profile in order to use GLSL 1.40. Just put an appropriate #version line at the top of your GLSL file, this should be sufficient.
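
As a sketch of what such a file might look like, the snippet below assembles a GLSL 1.40 vertex shader equivalent to the Cg one from earlier in the thread. The `p3d_` input names are an assumption about how Panda3D names its GLSL inputs and should be checked against the manual for the version in use; `with_version` is a made-up helper. Nothing here has been run against the Intel driver discussed in this thread.

```python
# Vertex shader body; the p3d_* names are assumed, not verified here.
VERTEX_BODY = """
uniform mat4 p3d_ModelViewProjectionMatrix;
uniform vec4 offsets[256];
in vec4 p3d_Vertex;
in vec2 p3d_MultiTexCoord0;
out vec2 texcoord;

void main() {
    // Shift each instance by its own offset, as the Cg shader does.
    gl_Position = p3d_ModelViewProjectionMatrix
                  * (p3d_Vertex + offsets[gl_InstanceID]);
    texcoord = p3d_MultiTexCoord0;
}
"""


def with_version(body, version=140):
    """Prepend a #version directive; it has to be the first line."""
    return "#version %d\n%s" % (version, body.lstrip("\n"))


vertex_source = with_version(VERTEX_BODY)
```

Whether the driver accepts `#version 140` without a 3.1 context is exactly the open question in this thread, so a compile error at that line would itself be informative.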

But to answer your questions: it is created in the GraphicsStateGuardian class of the appropriate implementation, and it wouldn’t work - Panda3D uses functions from OpenGL 2 and will most likely not run in OpenGL 3 core profile mode.

Well, the Intel driver on Linux requires creating a core profile to use functionality above OpenGL 3.0, including GLSL 1.40 (as stated in [2]). I tested that in C some time ago, and it really is true. If Panda does not work with a core profile, instancing will not be possible on these cards :(

I’ll have a look at the source to see what I can do. I think enabling a core profile is the only way instancing (and other future features) is going to work on Linux with Intel cards (other than patching the Intel driver, which may be much harder…).

It’s strange that they would design the driver in such an incompatible manner, and I must say that I’ve never seen this before. I don’t recall running into this issue with my Intel card, but that was on Windows.

Yes, it’s strange. The Windows driver seems to be quite good; it even seems to support OpenGL 4.x. The Linux driver is somewhat behind, supporting only OpenGL 3.3 (and only since a month or so; for most of 2013 we just had OpenGL 3.1) and not a single compatibility feature. I tried creating a context with glfw (the interface is way too unwieldy to use without a wrapper library like glfw), and requesting version 3.3 without a core profile simply fails. Without requesting a 3.x context, it is simply not possible to use any GLSL shaders above #version 130. Sad but true. (Actually it’s a nice “feature”, since it spares the driver developers a lot of unnecessary work and forces game developers not to use deprecated OpenGL features, but well… it breaks compatibility.)

Regards