Partial texture array update

eldee · June 7, 2021, 9:07pm

I’m working on improving the performance of my app, and right now on the drawing time. To render a surface I need dozens if not hundreds of textures: one per displayed tile in the simplest configuration, but this could grow to half a dozen per tile.

To avoid binding hundreds of textures per frame, I will store them in a texture array, with each rendered tile allocated an index in that array. As I cannot have a texture array containing the data for the trillions of tiles of a planet, I need to assign and release tiles dynamically, and so frequently update some pages of the array.
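The dynamic assignment described above can be sketched as a small least-recently-used allocator that maps tile keys to layer indices. This is only an illustration in plain Python: the class name `TileSlotAllocator` and its methods are hypothetical, not part of Panda3D, and LRU is just one possible eviction policy.

```python
from collections import OrderedDict


class TileSlotAllocator:
    """Maps tile keys to layer indices of a fixed-size texture array.

    Hypothetical helper: when the array is full, the layer of the least
    recently used tile is recycled.
    """

    def __init__(self, layer_count):
        # Free layers, stored so that pop() hands out layer 0 first.
        self._free = list(range(layer_count - 1, -1, -1))
        # tile key -> layer index, ordered from oldest to most recent use.
        self._slots = OrderedDict()

    def acquire(self, tile):
        """Return (layer, needs_upload) for the given tile key."""
        if tile in self._slots:
            self._slots.move_to_end(tile)  # mark as most recently used
            return self._slots[tile], False
        if self._free:
            layer = self._free.pop()
        else:
            # Array is full: recycle the layer of the oldest tile.
            _, layer = self._slots.popitem(last=False)
        self._slots[tile] = layer
        return layer, True

    def release(self, tile):
        """Give the layer of a tile back to the free list, if it had one."""
        layer = self._slots.pop(tile, None)
        if layer is not None:
            self._free.append(layer)
```

`acquire` reports whether the returned layer must be (re)uploaded, which is exactly the moment a partial page update would be needed.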

Before starting the implementation, I want to know whether the following partial updates are possible and supported:

1. Update a page in a texture array stored only on the GPU with data from the CPU (usually a texture loaded from disk)
2. Update a page in a texture array stored only on the GPU with data from the GPU (a texture baked using a render-to-texture pipeline)
3. Update a page in a texture array stored on the GPU and the CPU with data from the GPU (a data texture generated by a render-to-texture pipeline, or in the future by a compute shader)
4. Update a page in a texture array stored on the GPU and the CPU with data from the CPU

From what I read in the docs and on the forum, 1 and 2 should work out of the box: 1 with a simple texture page update (I only hope that the full texture array is not sent to the GPU again when a single page is updated), and 2 using a layered FBO and gl_Layer. 4 should probably work as nicely as 1.

I’m more concerned about 3, where I need to combine the RTM_bind_layered mode with RTM_copy_ram. A possible workaround would be to generate the texture individually on the GPU, copy it to RAM as before, and use technique 1 to update the texture array on the GPU.
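The CPU side of that workaround amounts to overwriting one page inside the array’s RAM image. Below is a minimal sketch in plain Python, assuming the pages are stored contiguously one after another in a flat byte buffer; that layout is an assumption made here for illustration, and a `bytearray` stands in for the real texture data. The helper names are hypothetical.

```python
def page_size_bytes(width, height, channels, bytes_per_channel=1):
    """Size in bytes of one page, assuming tightly packed texels."""
    return width * height * channels * bytes_per_channel


def write_page(array_image, page_index, page_data, page_size):
    """Overwrite one page of a flat texture-array image in place.

    Assumes (for illustration) that page N starts at byte N * page_size.
    """
    if len(page_data) != page_size:
        raise ValueError("page data does not match the page size")
    start = page_index * page_size
    end = start + page_size
    if page_index < 0 or end > len(array_image):
        raise IndexError("page index out of range")
    # Slice assignment through a memoryview copies without resizing.
    memoryview(array_image)[start:end] = page_data


# Usage: a 3-page array of 2x2 RGBA tiles, with page 1 replaced.
size = page_size_bytes(2, 2, 4)
image = bytearray(size * 3)
write_page(image, 1, bytes([255]) * size, size)
```

Only the bytes of the target page are touched; whether the subsequent GPU upload is also limited to that page depends on the engine.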

rdb · June 8, 2021, 2:20pm

If you update texture data on the CPU, even just part of an image, Panda will still reupload the entire texture array. However, this feature has been requested a couple of times and I would be willing to implement it. Please ensure that there is a feature request for this on GitHub. This should make 1 and 4 work nicely.

2 should be possible without issues.

For 3, it may in theory be possible to use a second add_render_texture on the same slot with RTM_copy_texture or RTM_copy_ram, though I am not quite sure how this will work out. We could fix this if it doesn’t behave as one might reasonably expect.

If other options fail, you could consider using a compute shader to perform an arbitrary texture copy as desired.

eldee · June 8, 2021, 6:59pm

Thank you for the clarification. I guess it was wishful thinking on my part. I created the feature request on GitHub:

github.com/panda3d/panda3d

Add partial update of texture array

opened 06:58PM - 08 Jun 21 UTC

closed 03:06PM - 04 Jul 21 UTC

el-dee

enhancement

## Description

Right now, when a user updates only a subset of the pages of a… texture array on the CPU side, Panda3D will transfer the whole texture array to the GPU again. Instead it should only transfer the modified pages to the GPU (as allowed by OpenGL) in order to save bandwidth. Though I believe it should still be possible to use the full transfer if the user thinks the overhead of the per-page transfer is greater than transferring the whole array in one go.

## Use Case

This is useful when the texture array is used as a local cache on the GPU for only a subset of a whole dataset (e.g. for mega or virtual textures which are too big to reside even on a high-end GPU), or when the array is used as an atlas with the textures generated on demand on the CPU. See this discussion thread for more details: https://discourse.panda3d.org/t/partial-texture-array-update/27735
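The behaviour requested in the issue could be sketched as a dirty-page tracker that plans either a per-page transfer or a full transfer. This is plain Python for illustration only, not how Panda3D implements it; the class name and the `full_upload_ratio` threshold are hypothetical.

```python
class DirtyPageTracker:
    """Tracks modified pages and decides which pages to transfer."""

    def __init__(self, page_count, full_upload_ratio=0.5):
        self.page_count = page_count
        # Hypothetical knob: above this fraction of dirty pages, fall back
        # to transferring the whole array in one go.
        self.full_upload_ratio = full_upload_ratio
        self._dirty = set()

    def mark(self, page):
        """Record that a page was modified on the CPU side."""
        if not 0 <= page < self.page_count:
            raise IndexError("page index out of range")
        self._dirty.add(page)

    def plan_upload(self):
        """Return the pages to transfer and reset the dirty set."""
        dirty = sorted(self._dirty)
        self._dirty.clear()
        if len(dirty) > self.full_upload_ratio * self.page_count:
            return list(range(self.page_count))  # full transfer
        return dirty  # per-page transfer
```

With the default threshold, marking two pages out of eight yields a two-page plan, while marking five yields a plan covering all eight.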

Good, I will start with 2.

I will test the second add_render_texture approach in a couple of days and report my findings.

A compute shader is indeed an option, but only as a last resort. macOS still does not support compute shaders (unless you use carbon…). On the other hand, if there is no other solution, I will make use of them on Linux and Windows, and accept slightly worse performance on macOS.