Threaded Web Request for Data Visualization

b1g4l · November 12, 2009, 6:18pm

I’ve written a data visualizer that pulls fresh data off the internet every 5 minutes and displays this data within the 3D Panda environment.

Initially, I had the update code running as a doMethodLater task that just repeats every 5 minutes. Since the visualization is always moving (think animated tag cloud), you would see the graphics freeze while the update method was running. Although the internet request usually completes in less than a second, I want to get rid of the freeze when updating the data. In this update method I’m not doing anything Panda-specific (no Panda library calls). I’m simply building a list of the data returned from an XML request.

In the future I also will want to move this closer to real-time data either through the use of long-polling or pushing data to the application. I’ve read all the documentation as well as any forum posts talking about threading, and I’m thinking this will be the way to go. I’ve tried something similar to what you see below, but it still seems that the graphics freeze during the update. I would appreciate any ideas on how I could implement this.

I’ve really simplified the code here to show my method of implementing threaded updates. This code seems to be leaking memory (possibly creating infinite new threads?) as well as not fixing the graphics freezing issue.

<Other Panda Imports>
from direct.stdpy import threading2 as threading

class UpdateThread(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)        
        self.myData = []

    def run(self):
        # Here I make an internet request and process the returned data
        pass


class MyVisualization(DirectObject):
    def __init__(self):
        self.updateObj = None
        self.myData = []
        
        #Repeating task (every 5 seconds) to kick off update thread
        self.updateTask = taskMgr.doMethodLater(5, self.update, 'update task')

    def update(self, task):
        # Start a new update thread only if none is currently running
        if self.updateObj is None or not self.updateObj.isAlive():
            self.updateObj = UpdateThread()
            self.updateObj.start()
        return task.again

drwr · November 12, 2009, 6:43pm

I’m guessing you ran into an error similar to the one discussed in this thread. You don’t show the mechanism you are using to poll the web server, but if it isn’t Panda’s native mechanism (HTTPClient), for instance if you are using urllib or something else, then it will block all of the threads while it runs.

Your three choices are the same as the ones discussed in the linked thread: either switch your getter to non-blocking mode (not possible with urllib, but possible with PyCurl), switch to using Panda’s own http getter (HTTPClient), or recompile Panda yourself to use true threading.

Actually, you have a fourth option, since you’re not using any Panda calls in your thread: you could use Python threads (import threading) instead of Panda threads, for that thread only. But I don’t recommend this option, as you’d have to be very careful never to make any Panda calls in there, and that includes running the Python garbage collector, which might try to destruct Panda objects and cause a crash.

David
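
For reference, the fourth option above can be sketched with the standard library alone. This is a minimal sketch written for Python 3, not code from the thread: `fetch_items` and `drain_latest` are hypothetical names, and `fetch_items` stands in for the real request and parsing step. The worker touches only plain Python objects and hands its result over through a `queue.Queue`, which the main thread can check without blocking.

```python
import queue
import threading


def fetch_items():
    """Hypothetical stand-in for the real fetch-and-parse step."""
    return ["alpha", "beta", "gamma"]


class UpdateThread(threading.Thread):
    """One-shot worker: fetch the data, then hand it over through a queue."""

    def __init__(self, fetch, results):
        super().__init__(daemon=True)
        self.fetch = fetch
        self.results = results

    def run(self):
        # No Panda calls in here: only plain Python objects are produced.
        self.results.put(self.fetch())


def drain_latest(results):
    """Return the newest queued result without blocking, or None if empty."""
    latest = None
    while True:
        try:
            latest = results.get_nowait()
        except queue.Empty:
            return latest


results = queue.Queue()
worker = UpdateThread(fetch_items, results)
worker.start()
worker.join()  # only for this demo; the main loop would poll instead of waiting
my_data = drain_latest(results)
```

In the application, `drain_latest` would be called from a Panda task on the main thread, so every Panda call stays out of the worker.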

b1g4l · November 13, 2009, 4:17pm

David, thanks for the information - it’s very informative. I’m using the feedparser library to fetch and parse the XML, which does use urllib.

I’ve switched to using Python’s threading, and that seems to do the trick. I never plan to make any Panda calls in this thread, as its only purpose is to retrieve and prepare the data. I’ll be stress testing to make sure this doesn’t cause any unusual side effects.

Thanks again.
Alex
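
One way to keep such a data-only thread from being recreated on every update is to run a single long-lived worker that sleeps between fetches. The sketch below assumes Python 3 and the standard `threading` module; `PeriodicFetcher`, `fake_fetch`, and the short interval are illustrative names and values, not part of the original code.

```python
import threading
import time


class PeriodicFetcher:
    """Runs `fetch` every `interval` seconds on one background thread."""

    def __init__(self, fetch, interval):
        self.fetch = fetch
        self.interval = interval
        self.stop_event = threading.Event()
        self.lock = threading.Lock()
        self.snapshot = []
        self.thread = threading.Thread(target=self._loop, daemon=True)

    def _loop(self):
        while not self.stop_event.is_set():
            data = self.fetch()
            with self.lock:
                self.snapshot = data
            # Sleep until the next fetch, waking early if stop() is called.
            self.stop_event.wait(self.interval)

    def start(self):
        self.thread.start()

    def latest(self):
        """Return a copy of the most recent data; safe to call from the main thread."""
        with self.lock:
            return list(self.snapshot)

    def stop(self):
        self.stop_event.set()
        self.thread.join()


calls = []


def fake_fetch():
    # Hypothetical stand-in for the feed request; records each call.
    calls.append(1)
    return ["item %d" % len(calls)]


fetcher = PeriodicFetcher(fake_fetch, interval=0.01)
fetcher.start()
time.sleep(0.1)
fetcher.stop()
```

With this layout the main thread only ever calls `latest()`, which copies the list under a lock, so the rendering side never sees a half-built result.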