Multi-threading

This isn’t completely a Panda question, but since there have been a couple threads about client/server systems, I hope it fits in:

For a server with multiple clients, is a multi-threaded solution the best? If so, should the thread handler be in a separate file to keep the memory footprint down? If not, what would be preferable to multi-threading? A round-robin style socket servicing?

Thanks in advance!

–Joe

I personally detest multi-threading because it’s hard to get right. I feel message passing is much easier to think about and debug.

Most of my servers and clients use message passing over sockets serviced with select() (I guess this is what you call “round-robin style socket servicing”).
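To make the select() approach concrete, here is a minimal sketch of one servicing pass, assuming plain TCP sockets and a trivial echo reply. The function name service_once and the "echo:" prefix are made up for illustration; they are not from any poster's code.

```python
import select
import socket


def service_once(listener, clients, timeout=0.0):
    """Run one select() pass: accept new clients, echo pending messages.

    Returns the number of messages handled in this pass.
    """
    readable, _, _ = select.select([listener] + clients, [], [], timeout)
    handled = 0
    for sock in readable:
        if sock is listener:
            # A new client is waiting; accept() will not block here.
            conn, _addr = listener.accept()
            clients.append(conn)
        else:
            data = sock.recv(4096)
            if data:
                # Hypothetical reply format, just to show message passing.
                sock.sendall(b"echo:" + data)
                handled += 1
            else:
                # Empty read means the client closed the connection.
                clients.remove(sock)
                sock.close()
    return handled
```

A server would call service_once() in its main loop, so every client is serviced from one thread and no locking is needed.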

If I need to keep a multi-core CPU busy, I just start more of them, each in its own Python process.
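A rough sketch of the "one Python process per core" idea, assuming the standard multiprocessing module with queues for the message passing. The worker function and the squaring job are placeholders for real work, not anyone's actual server code.

```python
import multiprocessing as mp


def worker(worker_id, inbox, outbox):
    """Handle messages until a None sentinel arrives."""
    for msg in iter(inbox.get, None):
        # Placeholder job: square the number we were sent.
        outbox.put((worker_id, msg * msg))


def run_workers(jobs, n_workers=2):
    """Spread jobs over n_workers processes and collect the results."""
    inbox, outbox = mp.Queue(), mp.Queue()
    procs = [mp.Process(target=worker, args=(i, inbox, outbox))
             for i in range(n_workers)]
    for p in procs:
        p.start()
    for job in jobs:
        inbox.put(job)
    for _ in procs:
        inbox.put(None)  # one stop sentinel per worker
    # Drain the results before joining so no worker blocks on a full pipe.
    results = sorted(outbox.get()[1] for _ in jobs)
    for p in procs:
        p.join()
    return results


if __name__ == "__main__":
    print(run_workers([1, 2, 3, 4]))
```

Each process has its own interpreter and its own GIL, so the workers can run on separate cores; the cost is that everything crossing a queue gets pickled.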

Note that Python threads do not scale well across multiple cores because of the GIL (global interpreter lock).

One place where multi-threading is a must is the hardware layer (GPU, CPU cores), which must deliver performance rather than ease of programming. I think there is threading support in Panda3D now that does some cool stuff in this area, even bypassing Python’s GIL (being C++) to get some speedups.

Interesting, I never knew this was an issue at all in Python. So let’s say I want to do an app that’ll take advantage of multiple cores through multiple processes:

  • one for simulation code
  • one for scenegraph manipulation
  • one motion tracking process
  • one network process

How would that work?

I’m coming from SGI Performer, which by default would split everything up into an Application, Culling, and Drawing process. (I never used that for everything, though, because it was only after Performer was killed off that I had a machine with more than one CPU!) So maybe that’s not even the best model to use.

I write protocol stack testers in Python using the asyncore module for network control, with wxPython as the front end for the application. My packet generator can drive a 100Mb link to over 96% utilization and a 1000Mb link to over 50% without multithreading.

And this is on a Dell 600 running Windows XP. CPU utilization is down in the weeds at a couple of percent.

The key to performance with asyncore is to call asyncore.loop() often enough, on a consistent time slice. I use wx.Timer to generate timer events as fast as possible.

When using Panda, I give asyncore its chance to run by interleaving it between frames as follows:

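import asyncore

from panda3d.core import ClockObject
# globalClock and taskMgr are assumed to be the builtins that Panda3D
# installs at startup (e.g. via "import direct.directbase.DirectStart").

# Limit the frame rate.
globalClock.setMode(ClockObject.MLimited)
globalClock.setFrameRate(25)

while True:
    # Run one Panda frame.
    taskMgr.step()

    # Give the sockets some processing time: up to 100 poll passes per frame.
    asyncore.loop(count=100)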
This allows the app to process a max of 2500 packets per second at 25 frames per second.
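For anyone on a newer Python: asyncore was removed from the standard library in Python 3.12, so here is a hedged sketch of the same per-frame budgeting idea using plain select() on a UDP socket instead. This is a swapped-in technique, not the original asyncore code, and drain_datagrams and its parameters are names invented for this example.

```python
import select
import socket


def drain_datagrams(sock, budget=100, bufsize=2048):
    """Read at most `budget` pending datagrams without ever blocking."""
    packets = []
    while len(packets) < budget:
        # Timeout 0 makes select() a pure poll: return immediately.
        ready, _, _ = select.select([sock], [], [], 0)
        if not ready:
            break  # nothing left to read this frame
        data, _addr = sock.recvfrom(bufsize)
        packets.append(data)
    return packets
```

Called once per frame after taskMgr.step(), a budget of 100 at 25 frames per second gives the same ceiling as above: 100 x 25 = 2500 packets per second.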

In tests unfettered by frame rate limits, I can ping-pong UDP frames between hosts at rates over 5000 frames per second.

In summary, modern hardware with efficient Ethernet drivers shouldn’t require multithreading to achieve decent performance.

David,

Wasn’t there a push at one time to make the C++ stuff run in separate threads from Python? So that Python code, GPU drawing, and node munging could live in different threads?

benchang,

You are optimizing before you have stats. In most cases all those operations could be done on one core and you would still have 90% of the core free, as is the case in my game.

The real bottleneck is in waiting on the GPU, sending data to the GPU, and walking the scene graph. All of this is on the C++ side, so theoretically Panda3D can take all the scary thread stuff away from you and be fast, and you would not have to worry. I think this was the goal.

Right, this has always been the goal, and Panda has been designed internally with this in mind from the beginning. We are close to that goal now but not quite there yet. Incidentally, this is almost precisely the App/Cull/Draw model from Performer; the original Panda developers were old-hand Performer users too, and this is still a promising model even on today’s hardware.

David