Iterating over 600-odd objects

I discovered the following issue while attempting to put together a simple particle-system mechanism using MeshDrawer, I believe, but my main concern is for my current project itself, which I fear may give rise to similar issues.

I’ve read that Python can be slow when iterating over large datasets - but what constitutes “large”? In attempting to build the particle mechanism mentioned above, I’ve hit a significant slowdown when running a very large system, holding around 600 particles at any given time. PStats seems to confirm that the main issue is in my update code.

Have I hit performance issues native to Python, or is this a lack of power on my computer’s part? My machine is a netbook running a dual-core 1.5GHz Intel Atom N550, I believe, with Ubuntu being the operating system.

To be honest, 600 elements doesn’t seem to me to be a terribly large dataset…

(For reference, I’m not yet sure of how many objects my main game is likely to have per “level”, but it could well be a fair few. If this does prove to be a problem I do already have it in mind to only update within a limited region at any given time and to perhaps limit the level size even more than I had intended, but it remains a worry.)

Iterating over elements and performing some task on them is simply going to cost the number of elements multiplied by how long each element takes to process (n * t). You can expect that 600 elements will take twice as long as 300 elements. This is assuming you are not doing something that increases the work quadratically, like comparing every element to every other element (n * n * t).
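To make that concrete, here is a rough, hypothetical timing sketch - the element counts and the per-element “work” are placeholders, not anyone’s actual particle code:

import time

# Hypothetical illustration of linear vs. quadratic growth; the "work"
# done per element is a stand-in for a real particle update.
def linear_pass(elements):
    for e in elements:              # n iterations: cost ~ n * t
        e * 2

def pairwise_pass(elements):
    for a in elements:              # n * n iterations: cost ~ n * n * t
        for b in elements:
            a + b

for n in (300, 600):
    data = list(range(n))
    start = time.perf_counter()
    linear_pass(data)
    middle = time.perf_counter()
    pairwise_pass(data)
    end = time.perf_counter()
    print(n, "linear:", middle - start, "pairwise:", end - middle)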

If you iterate over 600 elements but do nothing to them, I think you’ll find that it takes very little time in comparison. You should break down your update code into smaller parts and again use PStats or timers to isolate exactly which part is taking the majority of the time. Sometimes you might be surprised at which part of the code is taking the most time.
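One simple (hypothetical) way of doing that with plain timers might look like the following; the section names and the wrapped functions are just placeholders:

import time

# Hypothetical breakdown: accumulate the time spent in each part of an
# update and print the totals every so often.
timings = {"timers": 0.0, "movement": 0.0, "drawing": 0.0}

def timed(section, func, *args):
    start = time.perf_counter()
    result = func(*args)
    timings[section] += time.perf_counter() - start
    return result

# Inside the per-frame update, each stage would be wrapped, e.g.:
#     timed("timers", update_timers, dt)
#     timed("movement", update_positions, dt)
#     timed("drawing", draw_particles, mesh_drawer)
# and printing `timings` every few hundred frames shows where the time goes.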

I’m pretty confident that I have done just that; indeed, I seem to recall that at one point I had commented out almost all of the update code for individual particles, leaving only the timer update, an if-statement (which contained only “pass”) and an else that deactivated the particle once its timer had run out.

Note that this did seem to help: I think that my numbers showed a system with fully updating particles running at around 13-15 frames per second, and the “nearly-empty update” version running at around 25 frames per second (this in an otherwise empty scene, I believe).

It really does seem to simply be the sheer number of particles.

(I’m pretty sure that my system should be O(n), at least with a single system in place as I’m currently testing, although I haven’t thought it through properly.)

Something else to try would be writing the update code in C/C++ and then wrapping that into Python; that should speed things up a little.

Other than that, it might be your processor. Have you tried running the code on a faster processor?

Also consider how you are getting the elements that you are iterating over. Do you have a list stored (which would be fast), or are you using some function that returns the elements (which might be slow)?
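To illustrate the difference (with made-up classes, not your actual code):

# A stored list is iterated directly, while a method that rebuilds a
# filtered list every frame adds extra work to each update.
class StoredListSystem:
    def __init__(self, particles):
        self.particles = particles          # built once

    def update(self, dt):
        for p in self.particles:            # cheap to iterate
            pass

class RebuiltListSystem:
    def __init__(self, particles):
        self.all_particles = particles

    def active_particles(self):
        # Rebuilding this list on every call costs time and memory.
        return [p for p in self.all_particles if getattr(p, "active", True)]

    def update(self, dt):
        for p in self.active_particles():
            pass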

I have considered this, but if feasible I’d rather not pursue it, simply due to preference for Python and not having put together such a “wrapped system” before. If worst comes to worst with regards to my main game I might reconsider, but for the particle system I’m more likely to either look again at the built-in system, look for another somewhere, or try to find another solution.

I might do that, although I’d hoped to leave the one better computer where I am at the moment for later tests without Panda installed (to improve my chances of catching missing files in my installation). There is another computer elsewhere that I might use, however.

(I still find it surprising to think that 600 objects should be too much for a 1.5GHz processor, however… :/)

I’m iterating over a list, which should be fast, I believe. There are a few levels in the call stack involved, but I tried removing at least one of them, to little effect, as I recall.

The overall structure is something like the following. The actual code is, I believe, a little more complex, but it should give you an idea of what’s going on. (There’s actually a pool of particles from which spawning particles are taken and to which spent particles are returned, but that doesn’t seem to be my main problem at the moment.)

class Controller():
    def __init__(self):
        self.systems = [] # There should be only one at the moment.

    def update(self, dt):
        # Again, there should be only one system at the moment
        for system in self.systems:
            system.update(dt)

class System():
    def __init__(self):
        self.particles = [] # Around 600 of these

    def update(self, dt):
        for particle in self.particles:
            particle.update(dt)

class Particle():
    def __init__(self):
        pass

    def update(self, dt):
        # Code here.  Quite a bit of time seems to be
        # spent in this code
        pass

Would you mind posting a part of your update method? The rest seems to be fine.

By the way, it is conceivable that Python is the reason your code is so slow. For 25 FPS and 600 particles, update() gets called 25 * 600 = 15000 times per second. Each call requires various stack manipulations by the interpreter, and that, together with whatever branches you have in your code, may be slow. (You are not doing any object initialization during update(), are you? If you are, that is likely to be the reason for your code being so slow.)
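To get a feel for what those 15000 calls alone cost, you could time an empty update() with something like this (a rough sketch using only the numbers above):

import timeit

# Rough, hypothetical measurement of pure call overhead: an update()
# that does nothing, called 25 * 600 = 15000 times, i.e. one second of
# frames at 25 FPS with 600 particles.
class DummyParticle:
    def update(self, dt):
        pass

particles = [DummyParticle() for _ in range(600)]

def one_second_of_updates():
    for _ in range(25):          # 25 frames
        for p in particles:      # 600 particles per frame
            p.update(0.04)

print(timeit.timeit(one_second_of_updates, number=1))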

Bottom line: it’s possible that it’s actually not your code’s fault. Python is absolutely not intended for writing a real-time particle engine. Maybe you would consider writing that part in C/C++?

Here you go, along with a few other pieces; I’ve deleted some portions for brevity.

class MeshParticleController():
    def __init__(self, """other params"""):
        self.particleSystems = []
        
        # Further initialisation here...
        
        self.updateTask = taskMgr.add(self.update, "update")
        self.updateTask.lastTime = 0
        
        self.time = 0
    

    def update(self, task):
        if task.lastTime == 0:
            dt = 0
        else:
            dt = task.time - task.lastTime
        task.lastTime = task.time
        
        self.meshDrawer.begin(base.cam, self.root)
        
        for system in self.particleSystems:
            system.update(dt, self.meshDrawer)
        
        self.meshDrawer.end()
        
        self.time += dt
        
        return Task.cont

class MeshParticleSystem():
    
    def __init__(self, """other params here"""):
        
        # Other initialisation here...

        # Particles are created in __init__ and
        # added to particlePool, from which they
        # are taken as called for and placed in
        # particlesInUse, to be returned once spent.
        self.particlePool = []
        self.particlesInUse = []
        # Particle creation code here...
    
    def update(self, dt, meshDrawer):
        for particle in self.particlesInUse:
            particle.timer += dt
            if particle.timer < particle.lifeSpan:
                pass
                # This would normally have particle update
                # code; it has been removed to better get
                # an idea of how quickly it runs without the
                # additional code, such as Vector additions.
            else:
                particle.active = False
            
            if particle.active:
                particle.draw(meshDrawer)
            else:
                self.particlePool.append(particle)
        self.particlesInUse = [x for x in self.particlesInUse if x.active]
        
        # Particle spawning handled here...

class MeshParticle():
    def __init__(self, lifespan, frameList):
        # Initialisation...
        pass

    def draw(self, meshDrawer):
        meshDrawer.particle(self.position, self.frame, self.size,
                            self.colour, self.rotation)
    

The above seems to run (admittedly with two other applications - Firefox included - running) at about 19 frames per second. With the particle update code reinstated it runs at about 15 frames per second, I believe.

There shouldn’t be any initialisation other than updates to the lists used for the particle pool and “active particle list”. Particle spawning involves popping a particle, setting various of its members and pushing it onto said “active particle list”. That said, I seem to recall that my PStats tests did isolate the particle update section, and that alone seems significantly slow, so my list manipulation is unlikely to be the sole culprit.

(There is also, by the way, an OnscreenText that should be being updated every few frames, but that doesn’t seem to slow things down by more than a few frames per second.)

As to C++, as mentioned above, I have considered it, but would prefer to avoid it if feasible. With regard to the particle system, I’m more likely to look for another solution (such as using a pre-existing particle system or avoiding lots of particles) - again, as mentioned above, I’m rather more concerned for my main game, since I don’t yet know how many objects I’m likely to have per “level” (I’m hoping for a fair bit of interactivity).

(Hopefully I’ll soon be in a good position to start testing the main game on such matters.)

I’m not sure if this will work, but if it’s Python structure traversal that’s causing the problem, then maybe store your sprites in a simpler structure.

Perhaps use NumPy to hold an array of particles, where each particle value (pos, vel, etc.) is an element in the array. Then you can traverse it at higher speed.

NumPy can’t hold the MeshDrawer, but maybe it could index into a Python array of mesh-draw objects…

Anyway, just a thought. NumPy has fast array manipulation and traversal capability, but it means going ‘old school’ on your data structures…
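Something like the following - a minimal sketch with made-up fields (positions, velocities, timers) rather than your actual particle attributes:

import numpy as np

# Hypothetical "structure of arrays" layout: one NumPy array per
# particle attribute instead of one Python object per particle.
n = 600
positions = np.zeros((n, 3), dtype=np.float32)
velocities = np.random.uniform(-1.0, 1.0, (n, 3)).astype(np.float32)
timers = np.zeros(n, dtype=np.float32)
lifespans = np.full(n, 2.0, dtype=np.float32)
active = np.ones(n, dtype=bool)

def update(dt):
    global active
    # One vectorised pass replaces the per-particle Python loop.
    timers[active] += dt
    positions[active] += velocities[active] * dt
    active = active & (timers < lifespans)

update(0.04)
print(int(active.sum()), "particles still active")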

Hmm… Interesting.

I don’t think that I’ve used “Numpy” before. If it can handle multi-dimensional arrays it might just work. (I’m thinking of an array of systems, each holding an array of particles, each being an array of particle data, all initialised at startup and simply accessed as called for using a set of indices to indicate which are active…)

Thank you for that suggestion - I may well try that out. :slight_smile:

NumPy took a little getting used to, but it supports multi-dimensional arrays (and, via SciPy, sparse arrays) very well.

Things like pop are expensive, but NumPy solutions like resize are cheap.

I have been very impressed with its speed. Check out the new OpenCV (cv2), which uses NumPy exclusively for all image and array calculations. It’s very fast.

I guess one has to take a closer look at what you are actually doing. What does 600 particles mean? Egg-texture-cards with frames? Just geometry scaling? Why 600 anyway - 600 explosions going on?

Your code is fine, so I assume that Python is the root of the issue. However, I do not think that using NumPy will resolve the issue. While it’s true that a NumPy array is somewhat faster than normal object traversal, the difference is not orders of magnitude, i.e. you would still be unable to spawn multiple particle systems. I guess there really is no other way of going about this than switching to C / Cython, or going the proper (or at least more “modern”) way and moving the whole thing to the GPU using a geometry shader (which is as blazingly fast as it is complicated).

EDIT:

Particles are normally drawn as simple geometry, yes. And 600 particles means one explosion that consists of that number of particles, not 600 explosions. You should read up on how particle systems work.

Hmm… Fair enough, and thanks.

For particle systems I’ll probably leave it for now and look for a new solution at a later stage - either revisiting Panda particles, looking for another implementation elsewhere or looking for a new way of going about what I intend.

For my main game I suppose that I’ll see how it goes. As I think that I mentioned above, I have a few refinements in mind that might help if it comes to that.

In this case it means a single particle system - a long, roughly linear fire rather than an explosion.

As for “why 600”, it’s simply a largish number - if I recall correctly, I was expecting rendering to be my main problem with so large a system, and, wanting to test it, simply put together a single, large system that could cover a reasonably large section of the screen and produce a large number of particles to draw. This ended up stabilising at around 600 particles at a given moment.

As to the drawing method, I’m using MeshDrawer, and the “particle” method in particular, I believe.