Performance with many models in the scene

Here’s a runnable piece of test code that draws a somewhat irregular rotating spiral composed of small spheres. The spheres shrink and fade out as they move outwards, until they disappear. With the given settings (see the SpiralBlobber instantiation), there are about 1000 spheres in the spiral on screen at any time. Here’s the task that updates the spiral (which is most of the code in the archive):

    def loop (self, task):
        if not self.alive:
            return task.done

        # Time elapsed since the previous frame.
        dt = task.time - task.prev_time
        task.prev_time = task.time
        self._update_spiral(dt)
        return task.cont

    def _update_spiral (self, dt):
        # Remove expired blobs.
        mod_blobs = []
        for node, age, drift in self._blobs:
            if age < self._duration:
                mod_blobs.append((node, age, drift))
        self._blobs = mod_blobs
        # Create new blobs to replace expired.
        self._time_to_next -= dt
        while self._time_to_next <= 0.0:
            node = self._model.copyTo(self._topnode)
            age = -self._time_to_next - dt
            drift = random.uniform(-0.05, 0.05)
            self._update_pos(node, age, drift)
            self._blobs.append((node, age, drift))
            self._time_to_next += self._period
            #print len(self._blobs)
        # Move blobs.
        mod_blobs = []
        for node, age, drift in self._blobs:
            age += dt
            agefrac = age / self._duration
            node.setAlphaScale(1.0 - agefrac)
            scale = self._scale1 + agefrac * (self._scale2 - self._scale1)
            node.setScale(scale)
            self._update_pos(node, age, drift)
            mod_blobs.append((node, age, drift))
        self._blobs = mod_blobs

    def _update_pos (self, node, age, drift):
        radspeed = self._radspeed * (1 + drift)
        tanspeed = self._tanspeed * (1 + drift)
        rad = radspeed * age
        if rad < self._initrad:
            # Inside the initial radius the blob moves straight outwards.
            ang = 0.0
        else:
            age0 = self._initrad / self._radspeed
            ang = (tanspeed / radspeed) * math.log(age / age0)
        node.setPos(rad * math.cos(ang), 0.0, rad * math.sin(ang))
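
For anyone who wants to check the spiral math without Panda3D, here is a minimal standalone sketch of the same position formula. The logarithm appears to come from moving at a constant tangential speed while the radius grows linearly with age: the angular rate is then tanspeed / rad, and integrating that from age0 gives the log term. The function name `spiral_pos` and the default parameter values are assumptions for illustration, not the values from the archive.

```python
import math

def spiral_pos(age, drift, radspeed=1.0, tanspeed=2.0, initrad=0.1):
    """Return (x, z) of a blob on the spiral for a given age and drift.

    Defaults are illustrative assumptions, not the settings from the archive.
    """
    rs = radspeed * (1 + drift)   # per-blob radial speed
    ts = tanspeed * (1 + drift)   # per-blob tangential speed
    rad = rs * age
    if rad < initrad:
        # Inside the initial radius the blob moves straight outwards.
        ang = 0.0
    else:
        # Age at which an undrifted blob reaches the initial radius.
        age0 = initrad / radspeed
        ang = (ts / rs) * math.log(age / age0)
    return rad * math.cos(ang), rad * math.sin(ang)

if __name__ == "__main__":
    for age in (0.05, 0.5, 1.0, 2.0):
        print(age, spiral_pos(age, 0.0))
```

Whatever the angle, the distance from the origin is always rad, which gives an easy sanity check.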

On an i7 at 3.33 GHz with a Radeon 5870, I get about 25 fps with this code. Here’s also a screenshot from pstats:

I wonder if performance could be improved somehow? (I have no prior experience with 3D programming.)

I’ve read the performance tuning section in the manual, but could not speed the code up. For example, if I remove the two state changes on each sphere (node.setAlphaScale(…) and node.setScale(…)), the time in the pstats “App” section does drop by about a factor of two, but that also changes the result. Also, each default sphere has 80 polygons (“sphere80” in the code), and if I use a sphere with 1280 polygons (“sphere1280”, also in the linked archive) I get no further performance decrease, so I guess the problem is the (high?) number of models in the scene rather than the total number of polygons.

Graphics cards really can’t handle a large number of mesh batches. A card can render a single million-polygon object with no problem, but give it a million one-polygon objects and it will grind to a halt. A good target is to have around 300 mesh batches on screen at any time.

Here is the section of the manual on it: … any_Meshes
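
To put rough numbers on this, here is a back-of-the-envelope sketch in plain Python. It assumes each separate sphere node ends up as its own batch, and that blobs are spawned once per period and removed after duration, as in the posted update code. The function names and the example values for duration and period are hypothetical, chosen only to reproduce the "about 1000 spheres" figure from the question.

```python
import math

def steady_state_count(duration, period):
    """Approximate number of live blobs: one spawned per period, each living for duration."""
    return round(duration / period)

def spheres_per_batch(visible_spheres, target_batches=300):
    """Smallest group size that keeps the batch count at or below the target."""
    return math.ceil(visible_spheres / target_batches)

def batches_needed(visible_spheres, per_batch):
    """Number of batches when spheres are merged in groups of per_batch."""
    return math.ceil(visible_spheres / per_batch)

if __name__ == "__main__":
    # Hypothetical settings that give about 1000 spheres on screen.
    count = steady_state_count(duration=10.0, period=0.01)
    group = spheres_per_batch(count)
    print(count, group, batches_needed(count, group))
```

So under these assumptions, merging the spheres in groups of about four would already bring the scene under the 300-batch target. How to merge them is what the manual section linked above covers.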