Audio/Music visualizer....possible in Panda3D?

Liquid7800 · May 14, 2007, 2:57pm

Hi, you know those cool audio "visualizers" that Windows Media Player and similar programs show while music is playing? Is it possible to build one of those in Panda3D, or does it need special commands (i.e. special sound or audio functions)?
Hopefully it is possible because that would be very neat!

Thanks!

enn0x · May 14, 2007, 5:13pm

Hmm… I think you will need direct access to a 2D device context. Since such a player will have more GUI elements than just the visualization panel I would try to use wxPython’s wx.GraphicsContext and wx.GraphicsPath. They wrap GDI+ on Windows, Cairo on GTK and CoreGraphics on OS X, and provide an advanced 2D API.

I think PCM (pulse code modulation) representation of audio data will be best for analyzing and generating effects. But I don’t know if Panda3D has ways to get such data.

Anyway, having a look at the source code of this open source visualization plugin might give you valuable information. It uses xmms for audio and SDL for graphics:

http://infinity-plugin.sourceforge.net/index.php

enn0x

EDIT: using SDL or Allegro might be faster than GDI+, but I don’t know any decent Python wrappers for SDL or Allegro, besides PyGame which AFAIK uses SDL.

ynjh_jo · May 14, 2007, 6:29pm

I found this on the forum :
discourse.panda3d.org/viewtopic.php?t=2438

Liquid7800 · May 14, 2007, 7:40pm

Thanks to you both for the direction… I'll start on it and share as I get going.

—Thanks

raytaller · May 14, 2007, 8:21pm

Those visualizers are, in most cases, graphical effects based on the sound spectrum.

Look at this post : discourse.panda3d.org/viewtopic.php?t=2727
This is a classical feedback effect. You could start by rotating the torus knot according to the low frequencies of the sound.

Getting the spectrum only requires an FFT operation, which is pretty common, though implementing it in pure Python is discouraged.
I'm sure FMOD does that very well, but also OpenAL or the C library FFTW (for a Python binding, look at pylab.sourceforge.net/)

Laurens · May 14, 2007, 8:32pm

I think you can do it with pygame (to load the sound into an array) and numpy (numpy.fft can do the fourier transformation to find the spectrum).

I made a small test, but I have no time to test it further and I did not look at what the negative frequencies are that numpy.fft gives. So for what it is worth:

import pygame
import pygame.mixer
import pygame.sndarray

import numpy
from numpy.fft import fft

# Initialize mixer
pygame.mixer.init()

# Load a sound
sound = pygame.mixer.Sound('/home/laurens/video/pvo_dvd/smaller/Myra08.wav')

# Put it in an array
a = pygame.sndarray.array(sound)
a = numpy.array(a)
left = a[:,0]
right = a[:,1]

# Get playback frequency
nu_play, format, stereo = pygame.mixer.get_init()
# Time resolution of frequency analysis (s)
sample_length = 0.1
sample_num = int(sample_length*nu_play)

# Frequency (Hz) of each bin kept below; bin 0 (DC) is skipped
frequency = numpy.arange(1, 1 + sample_num//2)/sample_length

# Spectrum of the first time window, left channel
spectrum = fft(left[:sample_num])
positive = spectrum[1:1+sample_num//2]
power = (positive*positive.conj()).real

frequency and power are what you will probably want.
Would be cool if you could make a panda model dance to music .
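
On the negative frequencies mentioned above: for real-valued input such as audio samples, the FFT output is conjugate-symmetric, so the negative-frequency half mirrors the positive half and can be dropped. A minimal sketch, assuming a numpy recent enough to have numpy.fft.rfft and numpy.fft.rfftfreq, and using a synthetic two-tone signal (sample rate and tone frequencies are made-up values) instead of a real sound file:

```python
import numpy as np

sample_rate = 8000            # samples per second (made-up value)
n = 800                       # 0.1 s worth of samples
t = np.arange(n) / sample_rate
# Synthetic stand-in for audio: a 440 Hz tone plus a quieter 1000 Hz tone
signal = np.sin(2 * np.pi * 440.0 * t) + 0.5 * np.sin(2 * np.pi * 1000.0 * t)

full = np.fft.fft(signal)     # positive and negative frequencies
half = np.fft.rfft(signal)    # non-negative frequencies only
freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
power = np.abs(half) ** 2

# Each negative-frequency bin is the complex conjugate of its positive twin
mirror_ok = np.allclose(full[1:n // 2], np.conj(full[-1:-(n // 2):-1]))
print(mirror_ok, freqs[np.argmax(power)])
```

So for audio it should be enough to keep the first half of the fft output, or to call rfft directly.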

Liquid7800 · May 15, 2007, 6:08am

Nice experiment Laurens! And thanks for the links too.

I have 2 questions though:

Are FFTs the only type of algorithm able to analyse the sound spectrum?

Can a real-time Fast Fourier Transform extract the component frequencies and drive the 3D graphics, lighting and camera according to set methods and functions? Or would this be too much for Python to handle?

Thanks for the great info.

Arkaein · May 15, 2007, 5:48pm

FFT is the basic method to use for sound analysis. Essentially, it takes a short sound clip (an array sampling amplitudes over time) and converts it into an array showing spectrum strength (an array sampling amplitudes across all frequencies).
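
To make that concrete, here is a minimal sketch, assuming numpy and a synthetic 200 Hz tone rather than a real sound file (the helper name spectrum and the 44100 Hz rate are assumptions for illustration). It also shows that the spacing between frequency bins is one over the clip length, so shorter clips give coarser spectra:

```python
import numpy as np

def spectrum(clip, sample_rate):
    """Return (frequencies in Hz, power) for a real-valued clip."""
    freqs = np.fft.rfftfreq(len(clip), d=1.0 / sample_rate)
    power = np.abs(np.fft.rfft(clip)) ** 2
    return freqs, power

sample_rate = 44100           # assumed CD-quality rate
results = {}
for duration in (0.04, 0.1):  # clip lengths in seconds
    n = int(round(duration * sample_rate))
    t = np.arange(n) / sample_rate
    clip = np.sin(2 * np.pi * 200.0 * t)   # pure 200 Hz tone
    freqs, power = spectrum(clip, sample_rate)
    # freqs[1] is the bin spacing, roughly 1 / duration
    results[duration] = (freqs[1], freqs[np.argmax(power)])

for duration, (spacing, peak) in results.items():
    print(duration, spacing, peak)
```

With a 0.04 s clip the bins are about 25 Hz apart, so anything below roughly 25 Hz cannot be resolved at that update rate.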

FFT should be plenty fast when done on very short sound samples. Because numpy is coded in C (I would guess), the heavy lifting isn't done in Python.

For something like this I’m not sure if there are any alternative methods that would make sense, since FFT does exactly what you’re looking for.

Laurens · May 15, 2007, 8:47pm

I couldn’t resist and turned that test code into a few classes:

import pygame
import pygame.mixer
import pygame.sndarray

import numpy
from numpy.fft import fft
from math import log10

# Initialize mixer
pygame.mixer.init()

class SoundSpectrum:
    """
    Obtain the spectrum in a time interval from a sound file.
    """

    left = None
    right = None
    
    def __init__(self, filename, force_mono=False):
        """
        Create a new SoundSpectrum instance given the filename of
        a sound file pygame can read. If the sound is stereo, two
        spectra are available. Optionally mono can be forced.
        """
        # Get playback frequency
        nu_play, format, stereo = pygame.mixer.get_init()
        self.nu_play = 1./nu_play
        self.format = format
        self.stereo = stereo

        # Load sound and convert to array(s)
        sound = pygame.mixer.Sound(filename)
        a = pygame.sndarray.array(sound)
        a = numpy.array(a)
        if stereo:
            if force_mono:
                self.stereo = 0
                self.left = (a[:,0] + a[:,1])*0.5
            else:
                self.left = a[:,0]
                self.right = a[:,1]
        else:
            self.left = a

    def get(self, data, start, stop):
        """
        Return spectrum of given data, between start and stop
        time in seconds.
        """
        duration = stop-start
        # Filter data
        start = int(start/self.nu_play)
        stop = int(stop/self.nu_play)
        N = stop - start
        data = data[start:stop]

        # Get frequencies (bin 0, the DC component, is skipped below)
        frequency = numpy.arange(1, 1 + N//2)/duration

        # Calculate spectrum
        spectrum = fft(data)[1:1+N//2]
        power = (spectrum*spectrum.conj()).real

        return frequency, power

    def get_left(self, start, stop):
        """
        Return spectrum of the left stereo channel between
        start and stop times in seconds.
        """
        return self.get(self.left, start, stop)

    def get_right(self, start, stop):
        """
        Return spectrum of the right stereo channel between
        start and stop times in seconds.
        """
        return self.get(self.right, start, stop)

    def get_mono(self, start, stop):
        """
        Return mono spectrum between start and stop times in seconds.
        Note: this only works if sound was loaded as mono or mono
        was forced.
        """
        return self.get(self.left, start, stop)

class LogSpectrum(SoundSpectrum):
    """
    A SoundSpectrum where the spectrum is divided into
    logarithmic bins and the power summed into each bin
    is returned.
    """

    def __init__(self, filename, force_mono=False, bins=20, start=1e2, stop=1e4):
        """
        Create a new LogSpectrum instance given the filename of
        a sound file pygame can read. If the sound is stereo, two
        spectra are available. Optionally mono can be forced.
        The number of spectral bins as well as the frequency range
        can be specified.
        """
        SoundSpectrum.__init__(self, filename, force_mono=force_mono)

        start = log10(start)
        stop = log10(stop)
        step = (stop - start)/bins
        self.bins = 10**numpy.arange(start, stop+step, step)

    def get(self, data, start, stop):
        """
        Return spectrum of given data, between start and stop
        time in seconds. Spectrum is given as the power summed
        into logarithmically equally sized bins.
        """
        f, p = SoundSpectrum.get(self, data, start, stop)

        bins = self.bins
        length = len(bins)
        result = numpy.zeros(length)
        ind = numpy.searchsorted(bins, f)
        for i,j in zip(ind, p):
            if i<length:
                result[i] += j
        
        return bins, result

To use this, load a new sound like:
s = LogSpectrum('mysound.wav')

and get the spectrum for a small time interval between start and stop (in seconds):
f,p = s.get_left(start, stop)

or get_right for the right channel. f are the bin frequencies and p is the power summed into each bin. This should be tested a bit more and actually visualized in a cool way.
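
The binning loop in LogSpectrum.get can also be written without a Python-level loop. This is only a sketch under assumptions: log_bin and its parameter names are made up for illustration, and it uses half-open bins between n_bins + 1 edges, which is slightly different from the edge convention in the class above.

```python
import numpy as np

def log_bin(frequency, power, n_bins=20, f_min=1e2, f_max=1e4):
    """Sum power into logarithmically spaced frequency bins."""
    frequency = np.asarray(frequency, dtype=float)
    power = np.asarray(power, dtype=float)
    edges = np.logspace(np.log10(f_min), np.log10(f_max), n_bins + 1)
    # Bin i covers [edges[i], edges[i+1]); -1 or n_bins means out of range
    idx = np.searchsorted(edges, frequency, side="right") - 1
    inside = (idx >= 0) & (idx < n_bins)
    binned = np.zeros(n_bins)
    # add.at accumulates correctly when several frequencies share a bin
    np.add.at(binned, idx[inside], power[inside])
    return edges, binned

# Two entries at 150 Hz share a bin; 50 Hz and 20 kHz fall outside the range
frequency = np.array([50.0, 150.0, 150.0, 5000.0, 20000.0])
power = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
edges, binned = log_bin(frequency, power)
print(binned)
```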

raytaller · May 15, 2007, 11:28pm

Great
True, such elegant and concise code is irresistible. I hesitated three times today over trying to implement this myself, but I wouldn't have done better.
Too bad I only started to like maths once my short studies were finished.

Liquid7800 · May 16, 2007, 4:04pm

EXCELLENT example Laurens!

I agree, really clean code. Also thanks for the info too… I would never have realized what could analyze audio spectra.

@raytaller:

Yeah me too! but after seeing the cool work you’ve done then perhaps there is hope for me to become a creative programmer too!

Laurens · May 17, 2007, 8:50am

Thanks! Yeah, for some reason they only teach the boring basics and not the cool applications of math.

Laurens · May 17, 2007, 10:34am

Now I had to see if it works. This is a simple visualizer:

#!/usr/bin/env python
#
# A spectrum visualizer.

import direct.directbase.DirectStart
from pandac.PandaModules import CardMaker
from direct.task import Task
from waveform import LogSpectrum

class Visualizer:
    """
    Show a spectrum visualizer and update it
    regularly.
    """
    
    playing = False
    filename = None
    last = 0
    taskname = 'Visualizer'
    paused_time = 0

    # Minimum time between updates. If this is too short, your
    # spectrum will lack low frequencies.
    dtmin = 0.04

    # This visualizer is only mono.
    force_mono = True

    # Scale the height of the spectrum
    scale = 1e-13
    
    def __init__(self, filename=None, bins=20):
        """
        Create a new Visualizer instance. If a filename is
        provided, a sound will be loaded. Optionally the number
        of spectral bins (number of elements in the visualization)
        can be given.
        """
        self.bins = bins
        self.create_visualization()

        if filename:
            self.filename = filename
            self.load(filename)
    
    def create_visualization(self):
        """
        Setup the visualization.
        """
        bins = self.bins

        elements = []
        cm = CardMaker('Visualizer')
        cm.setColor(1, 1, 1, 1)
        for i in range(bins):
            cm.setFrame(-0.01, 0.01, 0, 0.1)
            card = aspect2d.attachNewNode(cm.generate())
            card.setPos(-1 + 2./bins*i, 0, -0.5)
            elements.append(card)

        self.elements = elements

    def update(self, task):
        """
        Update the visualization.
        """
        # current = self.sound.getTime() Only returns integer value!!
        current = task.time + self.paused_time # hack
        last = self.last
        if current-last >= self.dtmin:
            self.last = current
            f,p = self.spectrum.get_mono(last, current)
            self.update_visualization(f, p)

        return Task.cont 

    def update_visualization(self, frequency, power):
        """
        Update the visualization given a spectrum. This is
        called by update().
        """
        power = power*self.scale/self.dtmin
        for e,p in zip(self.elements, power):
            e.setScale(1, 1, p)
    
    def load(self, filename):
        """
        Load a new sound from filename.
        """
        if self.playing:
            self.stop()
        
        self.filename = filename
        self.spectrum = LogSpectrum(filename, self.force_mono, self.bins)
        self.sound = loader.loadSfx(filename)
    
    def play(self, paused=False):
        """
        Start playing the sound and the visualization.
        """
        self.playing = True
        self.sound.play()
        if not paused:
            self.last = 0
        self.sound.setTime(self.last)
        taskMgr.add(self.update, self.taskname)
    
    def stop(self):
        """
        Stop playing the sound and the visualization.
        """
        self.playing = False
        self.sound.stop()
        taskMgr.remove(self.taskname)

    def pause(self):
        """
        Toggle playing/pausing the sound and visualization.
        """
        if self.playing:
            self.paused_time = self.last
            self.stop()
        else:
            self.play(True)

if __name__ == '__main__':
    import sys
    from direct.gui.DirectGui import *
    title = OnscreenText("Panda3D: spectrum visualizer demo",
                         style=3, fg=(1,1,1,1), pos=(0.8,-0.95), scale = .07)

    base.accept("escape", sys.exit)

    v = Visualizer('/home/laurens/sound/test.wav')
    base.accept('space', v.pause)
    v.play()
    
    # Main loop
    run()

I found that sound.getTime() returns an integer value, so I used the time of the task.
You can easily make visualizations using this class by overriding
create_visualization and update_visualization.
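
If you override update_visualization, one thing to watch is that raw power spans many orders of magnitude (hence the tiny scale factor above). A possible approach, sketched here under assumptions, is to convert power to decibels and map it to a 0..1 bar height. The helper power_to_heights, its ref default (simply the inverse of the 1e-13 scale, a guess) and floor_db are all hypothetical and not part of the posted classes:

```python
import numpy as np

def power_to_heights(power, ref=1e13, floor_db=-60.0):
    """Map raw spectral power to bar heights in the range 0..1."""
    power = np.asarray(power, dtype=float)
    # Decibels relative to ref; the tiny offset avoids log10(0)
    db = 10.0 * np.log10(power / ref + 1e-30)
    # floor_db maps to height 0, 0 dB (power == ref) maps to height 1
    return np.clip(1.0 - db / floor_db, 0.0, 1.0)

heights = power_to_heights([0.0, 1e7, 1e10, 1e13, 1e15])
print(heights)

# Inside an overridden update_visualization this could be used as:
#     for e, h in zip(self.elements, power_to_heights(power)):
#         e.setScale(1, 1, h)
```

The reference level would need tuning per sound file, since it depends on the sample format and the length of the time window.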

Liquid7800 · May 17, 2007, 3:50pm

Dang! This looks awesome. I hope this question isn't too elementary.

When I try to load up your most recent post I get an error message in the DOS window:

No module named waveform

Is there something I need to install in one of the bins? I am pretty sure it has to do with:

from waveform import LogSpectrum

Hopefully I am on the right track since I’m excited to try this out!
UPDATE:
Sorry, unless I am wrong, I think I need to have the classes saved in the same directory as your latest visualizer code (and Pygame installed) first. I am going to try that now… thanks, and sorry for the question. I am new to this, so I am slowly putting these OOP concepts into practice outside of my books.
Thanks

Laurens · May 17, 2007, 9:40pm

Yes, you need to install pygame and numpy first. You can get them from pygame.org and numpy.scipy.org.

I saved the two classes from the earlier post (SoundSpectrum and LogSpectrum) in a file called waveform.py, that’s why I do “from waveform import LogSpectrum”. I have uploaded the files with the code from my posts:
phys.uu.nl/~keek/visualizer.py
phys.uu.nl/~keek/waveform.py

visualizer.py is the one you can run, but you have to change the filename of the sound file that is loaded to your own file.

Liquid7800 · May 18, 2007, 1:57pm

Thanks for the explanation and posts. This is going to help a whole lot!

stuaxo · February 22, 2008, 12:04am

Hi,
The vis class is very cool - I’ve turned it into a networked vis, but I can’t work out how to make the window not appear.

I get this error:

self.sound = loader.loadSfx(filename)

NameError: global name 'loader' is not defined

Is there any way to just use the sound loader part and not load the rest?

treeform · February 22, 2008, 12:08am

from pandac.PandaModules import loadPrcFileData
loadPrcFileData("", "window-type none")

Put this before anything else and no window will be opened.

stuaxo · February 22, 2008, 7:30pm

Cheers; I’ll give it a go tonight: my next question is whether it’s possible to use it with a stock python 2.5 (well… could settle for stock 2.4, but would prefer 2.5)

I’m on windows for the moment, but in the downloads there only seems to be a “full fat” version that includes python.

(Ideally I’d really like to just use the audio parts in a stock 2.5 python).

Josh_Yelon · February 22, 2008, 7:56pm

99% of panda3d is devoted to 3D graphics. If all you want is audio, you’re probably better off with pygame or the like.

The python that comes with Panda3D is a plain old copy of python. In the latest daily build of panda, it’s python 2.5. It differs only in that it contains a file “panda.pth” which tells it where to find the panda DLLs. If you put a similar “panda.pth” file into any other copy of python, then that copy of python will be able to use panda as well.
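
For anyone unfamiliar with .pth files: Python reads every *.pth file in a site directory at startup and appends each listed directory that exists to sys.path. The sketch below only demonstrates that mechanism in a throwaway directory. The exact contents of Panda's own panda.pth are not shown in this thread, so listing an install directory and its bin subdirectory is an assumption, and the directory name and write_pth helper are made up:

```python
import os
import site
import sys
import tempfile

def write_pth(site_dir, directories, name="panda.pth"):
    """Write a .pth file listing one directory per line."""
    path = os.path.join(site_dir, name)
    with open(path, "w") as handle:
        for directory in directories:
            handle.write(directory + "\n")
    return path

# Throwaway stand-in for a real site-packages directory
demo_site = tempfile.mkdtemp()
fake_panda = os.path.join(demo_site, "Panda3D-x.y")  # hypothetical install dir
os.makedirs(os.path.join(fake_panda, "bin"))

pth_file = write_pth(demo_site, [fake_panda, os.path.join(fake_panda, "bin")])
site.addsitedir(demo_site)    # processes every .pth file in demo_site
added = os.path.abspath(fake_panda) in sys.path
print(added)
```

For a real setup the simplest route is probably to copy the panda.pth that ships with Panda3D into the other Python's site-packages, as described above.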