threading problems

muhma · February 6, 2010, 9:47am

Hi, I’m working with a group of students on a university project. For our prototype it would be useful to use Python, but we have run into two problems with the Python/Panda threading modules.

First of all a little code example:

import time

#from threading import Thread, Event
from direct.stdpy.threading2 import Thread, Event 
from pandac.PandaModules import Thread as PandaThread #@UnresolvedImport

class MyThread1(Thread):
    def __init__(self):
        Thread.__init__(self)

    def run(self):
        print "1"
        time.sleep(20)
        print "11"
        
class MyThread2(Thread):
    def __init__(self):
        Thread.__init__(self)

    def run(self):
        print "2"
        time.sleep(0.5)
        print "22"

def main():
    #PandaThread.considerYield()
#    test_event = Event()
    print "Support panda threading: " , PandaThread.isThreadingSupported()
    print "0"
    thread1 = MyThread1()
    thread2 = MyThread2()
    print "00"
    
    thread1.start()
    print "000"
    thread2.start()
    print "0000"
    
    time.sleep(4)
    print "00000"

if __name__ == "__main__":
    main()

And this is the console output:

Support panda threading:  1
0
00
1
11
000
2
22
0000
00000

This isn’t what I would expect. If I change the import to “from threading import Thread” and use Python’s own threading, it works fine.
Console output with Python threading:

Support panda threading:  1
0
00
1
000
2
0000
22
11
00000

This is how it should be. But why doesn’t it work with the built-in Panda threading, which is supposed to be preferred? And if I change the import to “from direct.stdpy.threading import Thread”, the threads don’t start at all!

The second problem also concerns the difference between Python and Panda threading:
If I run a Panda thread, called A, alongside the Panda task (for rendering and so on), the frame rate drops by 50% (from 60 to ~30).

Here is an example of the run method of thread A:

def run(self):
   while not self.__stopped:
      PandaThread.considerYield()
      print time.clock()
   time.sleep(self.__dtime)

Using Python threads solves the frame rate problem. But what is the problem with the Panda threading module?

Thanks

b1g4l · February 6, 2010, 5:44pm

I think what’s happening is that when you call:

time.sleep(0.5)

it’s blocking all code execution. In other words - it’s sleeping all threads, not just the one you called from. The order of the print statements suggests this.

Are you using a standard build of Panda or have you re-compiled for true threading? What OS are you on?

drwr · February 6, 2010, 6:03pm

Right, use Thread.sleep(0.5) (or, in your example above, PandaThread.sleep(0.5)) instead of time.sleep(0.5) to sleep just the current thread.
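A minimal sketch of that change applied to the example from the first post, written for Python 3. It assumes a Panda3D install where `direct.stdpy.threading2` and `panda3d.core.Thread` can be imported; the fallback to the standard library is there only so the snippet also runs without Panda3D. The names `Worker`, `thread_sleep` and `log` are made up for illustration.

```python
import time

try:
    # Panda3D available: use its thread class and its per-thread sleep.
    from direct.stdpy.threading2 import Thread
    from panda3d.core import Thread as PandaThread
    thread_sleep = PandaThread.sleep
except ImportError:
    # Assumption for portability only: fall back to the standard library.
    from threading import Thread
    thread_sleep = time.sleep

log = []  # records when each worker starts and finishes


class Worker(Thread):
    def __init__(self, label, delay):
        Thread.__init__(self)
        self.label = label
        self.delay = delay

    def run(self):
        log.append(self.label + " start")
        thread_sleep(self.delay)  # meant to sleep this thread only
        log.append(self.label + " done")


slow = Worker("slow", 0.5)
fast = Worker("fast", 0.1)
slow.start()
fast.start()
slow.join()
fast.join()
print(log)
```

With a per-thread sleep, the fast worker should finish while the slow one is still sleeping, which is the interleaving the first post expected.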

Also note that Panda threads will switch context only on one of the following calls:

mutex.acquire() or release(), or related synchronization primitive operation

Thread.sleep()

task completion or yielding

an explicit call to Thread.considerYield() or Thread.forceYield()

So, if any of your threads engage in a long operation without calling any of the above functions, you should insert a call to Thread.considerYield() somewhere in the middle of the function, so that it gets called from time to time and your other threads get a chance to run.

This is all necessary because Panda’s “simple threads” module is a cooperative threading environment; threads are not preempted based on a timer. This is what allows the threading model to run so much more efficiently than a true preemptive threading model, because Panda doesn’t have to protect low-level operations like reference count adjustments.
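The effect of cooperative scheduling can be modelled in plain Python 3 with generators. This is only a toy model, not how Panda's simple threads are implemented: each generator stands in for a thread, `yield` plays the role of Thread.considerYield(), and the scheduler `run_round_robin` is a hypothetical helper.

```python
from collections import deque


def worker(name, steps, log):
    """A cooperative 'thread' that yields after every step."""
    for i in range(steps):
        log.append("%s step %d" % (name, i))
        yield  # stands in for Thread.considerYield()


def greedy(name, steps, log):
    """Does all its work before yielding, so nothing else runs meanwhile."""
    for i in range(steps):
        log.append("%s step %d" % (name, i))
    yield


def run_round_robin(tasks):
    """Resume each task in turn; a switch happens only when a task yields."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
            queue.append(task)  # not finished yet, schedule it again
        except StopIteration:
            pass  # task finished, drop it


polite_log = []
run_round_robin([worker("A", 2, polite_log), worker("B", 2, polite_log)])
print(polite_log)  # ['A step 0', 'B step 0', 'A step 1', 'B step 1']

greedy_log = []
run_round_robin([greedy("A", 2, greedy_log), worker("B", 2, greedy_log)])
print(greedy_log)  # ['A step 0', 'A step 1', 'B step 0', 'B step 1']
```

The second run shows the same pattern as the console output in the first post: a task that never gives up control finishes completely before the other one gets to start.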

David

muhma · February 7, 2010, 10:08am

Thank you very much for your quick responses. Yes, the issue was the timer. I see now that I mixed up the Python and Panda thread environments. So both problems are solved.

@b1g4l: As far as I know, the problems occur on Windows XP and Windows 7 (32-bit and 64-bit). Linux was not tested. At the moment we use the standard 1.7 build of Panda.

And a further question (sorry for going off-topic): I think we will use a lot of threads in our project, especially for network and event management. Can you say something about the speed-up, or the benefits (as the manual puts it: “…unless you fully make use of the benefits that threading gives.”), of using the full threading implementation?

drwr · February 7, 2010, 12:41pm

Threading is a complex subject. Its naive use rarely gives any speed-up at all; usually, its use results in an overall performance penalty.

The penalty comes from the additional overhead of assuring protection from race conditions and deadlocks. This kind of protection requires additional work at every level of code, including very low-level operations that are performed over and over again (particularly reference-count manipulations).
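As a rough illustration of that kind of protection, here is a small Python 3 sketch using the standard `threading` module rather than Panda's own primitives. `SafeCounter` and `hammer` are hypothetical names. Every increment has to acquire and release a lock, and that per-operation cost is the sort of overhead described above.

```python
import threading


class SafeCounter:
    """A counter that stays correct when several threads update it."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The lock makes the read-modify-write atomic; taking and
        # releasing it on every call is the extra work being paid for.
        with self._lock:
            self.value += 1


def hammer(counter, times):
    for _ in range(times):
        counter.increment()


counter = SafeCounter()
threads = [threading.Thread(target=hammer, args=(counter, 10000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value)  # 40000
```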

You can gain a performance benefit from threading only if you successfully isolate and parallelize different parts of your code, so that they can run on different cores simultaneously, and their results later unified. This is not possible with Python code, since the Python interpreter is single-threaded by design; so it can only be done for custom C++ code.
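The "isolate, run independently, unify" structure can be sketched in Python 3 as follows. This shows only the shape of the approach: for the reason given above, a thread pool in the standard Python interpreter should not be expected to make CPU-bound work like this faster. `partial_sum` and `parallel_sum_of_squares` are made-up names.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk):
    """Isolated piece of work: touches nothing but its own chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(values, workers=4):
    # Split the input into roughly equal, independent chunks.
    size = max(1, (len(values) + workers - 1) // workers)
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    # Hand each chunk to the pool.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum, chunks))
    # Unify the partial results.
    return sum(partials)


total = parallel_sum_of_squares(list(range(1000)))
print(total)  # 332833500
```

Because the chunks share no state, no locking is needed while they run; the only synchronization point is the final sum.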

There exists some half-finished code within Panda to achieve this kind of parallelization for the actual rendering code. When this code is completed, it is expected that the time to draw the frame will be reduced substantially. However, this effort still has some work to go.

In the meantime, there is a whole science of multiprogramming and multiprocessing techniques to study. Colleges give semester-long courses on the subject; graduate students write papers on it. If you do wish to tackle performance optimization through parallelization, there is much information out there to help you.

David

adr · February 7, 2010, 1:54pm

From what I gather, you only see a slowdown due to I/O blocking. While threading gets around this, it doesn’t mean the code runs any faster unless you use multiprocessing techniques.

So what would be the difference between the above example and just using:

taskMgr.setupTaskChain('chain_name', numThreads = 2)  #<-

drwr · February 7, 2010, 5:33pm

The task chain system is a more convenient way to spawn tasks on one or more sub-threads. Rather than starting and stopping threads explicitly to serve your tasks, you can simply add your tasks to a threaded task chain. Depending on your application design, this may or may not be convenient. But, yes, you could have written the above example using the task chain interface instead of with the explicit threading interface.
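A minimal sketch of the task chain route, in Python 3. The helper `add_threaded_task` and the task `heartbeat` are hypothetical; the task manager is passed in as a parameter so the snippet has no import-time dependency on Panda3D. It uses `setupTaskChain(..., numThreads=...)` as shown above and assumes `add(..., taskChain=...)` is available on the task manager.

```python
def heartbeat(task):
    """Hypothetical task body; returning task.cont asks to be run again."""
    return task.cont


def add_threaded_task(task_mgr, chain_name="chain_name", num_threads=2):
    # Configure a task chain that is served by its own threads.
    task_mgr.setupTaskChain(chain_name, numThreads=num_threads)
    # Add the task to that chain instead of the default one.
    return task_mgr.add(heartbeat, "heartbeat", taskChain=chain_name)
```

In a Panda3D application this would be called with the global task manager, for example `add_threaded_task(taskMgr)`. The context-switch rules listed earlier in the thread still apply to tasks running on a threaded chain.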

In Panda’s simple_thread model, this is generally true, though there is still some additional overhead for the thread context switches that means you will get a (very slight) performance degradation simply for using threads at all.

In true threading, where Panda is compiled to allow threads to run in parallel on multiple cores, there is considerably more overhead, even if you never use threads. There is even more overhead if you do use threads. The thread slowdown I’m referring to is largely due to this overhead. If your application is suited to it, you can design your algorithms to run in parallel, and if you are successful at this, you can compensate for the thread overhead, and hopefully come out ahead at the end of the game. This may be challenging to do, though. There’s a reason that multiprogramming is not used universally.

David