Render on-demand to numpy array - AI Reinforcement Learning


I’m building a small simulation for training an AI (DQN, if anyone is interested).

  • I’m looking for a way to feed it a numpy array formatted image as input
  • I’m looking for a way to render on demand so that I can progress the simulation in repeatable intervals, regardless of computer resources.

I’ll post my complete code here once I’ve got everything working together nicely.

Thanks everyone.



Sounds interesting! The input image would then be a framebuffer from a camera in your simulation? If so, you can use a memoryview to access that data directly and make a NumPy array out of it.
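As a sketch of the memoryview idea (the buffer contents and image dimensions here are made up purely for illustration; in Panda3D the bytes would come from the offscreen buffer's texture data):

```python
import numpy as np

# Pretend this is the raw framebuffer data exposed by the engine
# (synthetic bytes here, just to show the mechanics).
raw = memoryview(bytes(range(12)))

# Wrap the buffer without copying, then reshape to image dimensions
# (2 x 2 pixels, 3 channels -- illustrative numbers only).
img = np.frombuffer(raw, dtype=np.uint8).reshape(2, 2, 3)
```

`np.frombuffer` shares memory with the source buffer rather than copying it, which is exactly what you want when grabbing frames every step.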

Not sure where exactly you need help right now, could you elaborate where you’re stuck?


…and BTW I’m very interested in AI in general and am looking forward to what comes of your efforts


“If so, you can use memoryview to access such data directly and make a np array out of it.”
Thanks for the advice!

“Not sure where exactly you need help right now, could you elaborate where you’re stuck?”
My second point was about a way to call a RenderAllTheThingsNow() type function instead of letting the engine take care of everything. So far I get how a regular game uses app.taskMgr.add() to register functions to run between renders, but I’d like a little more control so that I could, for instance, reset the scene.
I’m striving for a setup where my simulation could be standalone, consisting of a few basic functions:

  • init(): basic setup of the scene, returns state information (image)
  • reset(): Re-setup of the scene for the things that have changed, returns state
  • step(action): Advances the simulation one step using whatever action the AI chose, returns state info, reward (score) and whether the game is “over”
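The interface above could be sketched as a small class. All names and numbers here are hypothetical, and the simulation internals are stubbed out with a plain array so only the shape of the API is shown:

```python
import numpy as np

class SimEnv:
    """Gym-style wrapper around a hypothetical simulation."""

    def __init__(self, width=84, height=84):
        # In a real setup this is where the scene graph and physics
        # world would be built; here the state is just a blank image.
        self.width, self.height = width, height
        self.t = 0
        self.state = np.zeros((height, width), dtype=np.uint8)

    def reset(self):
        """Re-set up the parts of the scene that changed; return the state image."""
        self.t = 0
        self.state[:] = 0
        return self.state.copy()

    def step(self, action):
        """Advance one tick with the chosen action; return (state, reward, done)."""
        self.t += 1
        reward = 0.0          # placeholder reward
        done = self.t >= 100  # placeholder termination condition
        return self.state.copy(), reward, done
```

A training loop would then just call `env.reset()` once and `env.step(action)` repeatedly, the same contract OpenAI Gym environments use.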

So far I’ve gotten to this tutorial and I’m able to get it to work:
But I don’t know where to go from here.


You’re not actually bound to anything other than a scene graph, which can start at an arbitrary NodePath you define.

For the physics part you can define how much gets updated by calling bullet_world.do_physics(deltatime, ...), where deltatime is whatever time scale you want to pick per tick/frame, so you’re not bound to do anything in real time. Add a camera to your scene, render it to an offscreen buffer, and access that buffer directly through a memoryview as mentioned earlier.

I don’t think you need the task manager, since you can run the simulation in sequential steps such as:
do_physics -> memoryview of buffer -> DQN stuff -> repeat
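That sequential loop might look roughly like this. The engine calls are stand-ins here (only the loop structure is the point); in a real program `do_physics` would be `bullet_world.do_physics(dt)` and `grab_frame` would read the offscreen buffer through a memoryview:

```python
import numpy as np

FIXED_DT = 1.0 / 60.0  # simulated time per step, independent of wall-clock time

def do_physics(dt):
    """Stand-in for bullet_world.do_physics(dt); advances the simulation."""
    pass

def grab_frame():
    """Stand-in for reading the offscreen buffer into a NumPy array."""
    return np.zeros((84, 84), dtype=np.uint8)

def choose_action(observation):
    """Stand-in for the DQN's policy."""
    return 0

# do_physics -> memoryview of buffer -> DQN stuff -> repeat
for _ in range(3):
    do_physics(FIXED_DT)
    obs = grab_frame()
    action = choose_action(obs)
```

Because the timestep is fixed, the simulation advances by the same simulated interval every iteration regardless of how fast the machine runs, which is the repeatability the original post asked for.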

To save and restore the scene graph, I believe you could save it (the NodePath) to .bam format every step or so (rather wasteful; otherwise, store the position and orientation of every NodePath recursively). That said, there are people on this forum more versed than me who could steer you in the right direction.

This part of the manual might also be of interest:

Also, there’s a Discord and an IRC channel with many helpful people who can help you navigate to the best solution.

As a side note: regarding the framebuffer data you plan to feed the DQN, it might be useful to use a depth buffer instead of a color image, since that gives your AI distance information for free, in addition to contours, as a grayscale image.


I am not entirely sure if forcing a render is the right way to go, but to do so you can call render_frame() on a GraphicsEngine instance such as ShowBase.graphicsEngine.


Thanks for all of the advice.
Is there a resource that shows how to make a GraphicsEngine object manually?


Thanks for the suggestion about the depth. Unfortunately depth sensors are out of my budget range (this AI will be operating a small simple robot).

Who knew a game engine could have industrial applications?


I do not know of a resource that shows how to make a GraphicsEngine object manually. Fortunately, it is not difficult. You will need a GraphicsPipe object, which you can get from GraphicsPipeSelection. If I recall correctly, building a GraphicsEngine object looks something like this:

import panda3d.core

pipe = panda3d.core.GraphicsPipeSelection.get_global_ptr().make_default_pipe()
engine = panda3d.core.GraphicsEngine(pipe)

From there you’ll need a window/buffer. An example of that can be found here.

Depending on your needs, you may still want to use ShowBase, which will create the GraphicsEngine for you. You can have ShowBase skip creating a window by setting window-type none in a PRC file/data.
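For reference, the PRC setting in question (placed in a .prc file, or passed through loadPrcFileData before ShowBase is constructed) is just:

```
window-type none
```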


Thanks a ton, Moguri.

So I’ve put all of this together and it works pretty well, except I’m leaking memory pretty badly. If I comment out graphicsEngine.renderFrame() it stops leaking. I even tried using ShowBase to set up the GraphicsEngine, without luck.

Moguri, you mentioned that “forcing a render” may not be the right way to go. Are there any other options?


If you have a small example demonstrating the memory leak, I suggest filing an issue on GitHub. As for better options, I am not sure. I would probably try to figure out why the regular render loop doesn’t work for you and see if there are options to adjust it.


Here is my Github issue, if anyone is interested:


As a heads up, there is an offscreen option for window-type. This will let ShowBase create the Window, DisplayRegion, and Camera for you.

You seem to be using a hybrid of using ShowBase, but manually forcing renders. There are probably other things (e.g., taskMgr) that need to be run as well. Can you put your OpenCV stuff into a Task and let ShowBase run the main loop?


Thanks, I’ll try it out.

I can, but I much prefer not to in order to make this “MyApp” object importable into both my AI training and testing programs. Thanks for the suggestion though! :slightly_smiling_face:

First I’ll try an upgrade to 1.10.2 and see if that fixes it, as rdb suggested.


Here is an offscreen version that lets ShowBase create the window, displayregion and camera.

It is better (0.8 MB per 1k steps vs. 3 MB), but it still suffers from the same issue: leaking memory when rotating the spotlight (or the block relative to it).


Just tested this and it appears to stop leaking:

So at least there’s that.


That makes me wonder if there is something else we need to kick when rendering. Maybe @rdb has an answer to that?


Found it!
They are both normally called in a ShowBase loop.
Adding them stopped the leak.


And here’s the final version, as promised. Thanks everyone.