[SOLVED] Get the actual sound heard by one or more listeners

Sorry, I didn’t get it. Do you mean what @wezu suggested: calculate a new spectrum from the original spectrum, based on the distance between the listener and the sound source?

I’m thinking of adding a message to the sound in an inaudible frequency range, something like Morse code for a bot, carrying information about the source and its type.

When you play a sound using the Audio3DManager, you already get a different volume in each channel (ear) when the sound plays from some 3D position; you don’t need two listeners for that. If you want to record the sound with some third-party software, then you don’t need anything else. If you want the numeric values of the sound being played, I don’t think that’s exposed to Python, and it would be simpler to write some custom code. I also don’t think the 3D audio is physically accurate: I suspect sound travels instantly, passes through walls, and the listener has no ears, just a forward vector :wink:

By nature I hear only with my left ear, so to locate a source I need to turn my head.

If you have access to the L and R channel levels from Python, you can make a radar.

The radar algorithm needs to seek, for example, the maximum level in L and the minimum in R; that would mean you (the radar) are turned with your left ear toward the source.
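A minimal sketch of that idea (a hypothetical helper, not part of any Panda3D API): estimate the bearing of a source from the two channel levels, assuming simple constant-power panning (L = cos θ, R = sin θ).

```python
import math

def estimate_bearing(level_l, level_r):
    """Estimate the horizontal angle of a source from L/R channel levels.

    Assumes simple constant-power panning: L = cos(theta), R = sin(theta),
    theta in [0, 90] degrees. Returns 0 for hard left, 90 for hard right.
    """
    if level_l == 0 and level_r == 0:
        return None  # silence carries no direction information
    return math.degrees(math.atan2(level_r, level_l))

# Maximum level in L and minimum in R -> the source is at the left ear
print(estimate_bearing(1.0, 0.0))        # -> 0.0  (hard left)
print(estimate_bearing(0.0, 1.0))        # -> 90.0 (hard right)
print(estimate_bearing(0.7071, 0.7071))  # -> 45.0 (centered)
```

In a game loop you would feed this with the per-ear levels sampled each frame and turn the bot until the estimate stabilizes.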

Thank you guys, you helped a lot. The next question is: how do I get the sound output of the L and R channels in real time? I believe there is a way to capture the output of the speakers, but it’s probably tricky because other sounds from the game are being emitted as well.

You are moving in the direction of in-depth sound analysis.

I do not think you can replicate the human brain for a bot; its intellect would be like phannson88 and ponchhaiya22x :smiley: :smiley:

The best way is to implement your own sound manager, which will know where each sound is.

3D sound in a game is just an imitation; for it you need only two parameters: volume and panning.

from panda3d.core import loadPrcFileData
loadPrcFileData("", "audio-library-name p3fmod_audio")  # use the FMOD audio backend

from direct.gui.DirectGui import *
import direct.directbase.DirectStart

# base.disableMouse()

# Pivot node around which the sound "source" rotates
pivot = render.attachNewNode('pivot')

# Small teapot marking the position of the audio source
point_audio = loader.loadModel('teapot')
point_audio.setScale(0.05)
point_audio.reparentTo(pivot)

# Box representing the listener's head
head = loader.loadModel('box')
head.setScale(0.4)
head.reparentTo(render)

mySound = loader.loadSfx('sound.wav')
mySound.setLoop(True)
mySound.play()

def setPan():
    # Pan the sound and rotate the pivot so the marker matches the panning
    mySound.setBalance(slider_pan['value'])
    pivot.setHpr(-slider_pan['value'] * 90, 0, 0)

def setDist():
    # Move the marker away from the head and fade the volume with distance
    point_audio.setPos(pivot, 0, slider_dist['value'], 0)
    mySound.setVolume(1 - (slider_dist['value'] / 10))

slider_pan = DirectSlider(range=(-1, 1), pos=(0, 0, -0.8), value=0, command=setPan)
slider_dist = DirectSlider(range=(0, 10), pos=(base.getAspectRatio() - 0.1, 0, 0), value=2, command=setDist, orientation=DGG.VERTICAL)

base.run()

Yeah… But this still doesn’t solve the original question: how to capture the actual sound (frequency and amplitude) reaching each ear (channel) of the head. I need these numeric values, or the sound sample emitted at a given frame.

I’ve read about PyJack (https://github.com/umlaeute/pyjack), a Python binding to JACK (http://www.jackaudio.org/):

(If I understood correctly), basically the game could generate sounds as a server, and the same sounds (separated by channel) could be read back by a client.

Here is a tutorial on how to do this:
https://turion.wordpress.com/2013/01/08/little-pyjack-tutorial/

You are confusing Panda3D with a spectrograph.

You must first implement that, and then think about how to get a sound recording out of Panda3D.

Theoretically you could make an asynchronous recording from each listener, but I think that is not suitable for calculations: the loop would have to keep switching from one ear to the other.

In fact, in networked games it is not the sound that is transmitted, but the position of the sound. Just as an aside.

You could alternately switch the point (ear) at which you record the sound, or try to use multiple cores, but I am not sure of the exact result. Example: multiple audio3d listeners

With this library you can, I think, get data from the microphone, and the application could be written around it:
docs.python.org/3/library/audioop.html
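For what it’s worth, audioop (removed from the standard library in Python 3.13) has helpers like tomono() and rms() for exactly this; the per-channel level of a 16-bit stereo fragment can also be computed by hand, as a rough sketch:

```python
import math
import struct

def stereo_rms(fragment):
    """RMS level of each channel of a 16-bit little-endian stereo fragment.

    Equivalent to splitting the fragment with audioop.tomono() and
    measuring each half with audioop.rms().
    """
    samples = struct.unpack("<%dh" % (len(fragment) // 2), fragment)
    left = samples[0::2]   # interleaved frames: L, R, L, R, ...
    right = samples[1::2]
    rms = lambda ch: math.sqrt(sum(s * s for s in ch) / len(ch)) if ch else 0.0
    return rms(left), rms(right)

# A fragment that is loud on the left and silent on the right
frames = struct.pack("<4h", 1000, 0, -1000, 0)
print(stereo_rms(frames))  # -> (1000.0, 0.0)
```

This only measures raw PCM you already have in hand (e.g. from a microphone or a capture client); it does not by itself tap the game’s audio output.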

The book Lang, C. – Panda3D 1.7 Game Developer’s Cookbook (2011) has an example.

I already have some scripts for drawing graphs from samples; that is not my concern. My only question that still hasn’t been answered is: how do I get the actual sound? I already have some ideas about getting these channel outputs using PyJack, but simpler solutions are welcome.

And in your opinion, what comes out of the sound card is not the sound from the program, or what?

You could run two applications, one for L and another for R, and synchronize them with a dedicated server.

Yes, man… I already said this: I need to get the L and R channel sounds reaching the speakers. The question is HOW??

Never mind, I’ll try on my own using PyJack…

I repeat: Panda3D is not an audio editor. And why should Panda3D do this? For an example of getting data from the audio output, I cited docs.python.org/3/library/audioop.html

youtu.be/k-5V1HjoZQQ :stuck_out_tongue:

To whomever it may interest… I finally managed to get the spectrum of the actual 3D sound reaching each ear! For this I had to use the FMOD library directly through pyfmodex (https://github.com/tyrylu/pyfmodex/tree/master/pyfmodex), a very nice Python binding to the FMOD library.

The code below is an example showing, in real time, the 3D panning of a sound passing in front of your head from left to right and then from right to left. You will notice that when the sound moves to the left side, the bars of the spectrum chart representing the right ear gradually disappear as the volume in that ear drops.

I tested with only one listener, since the FMOD documentation says that some DSP effects are disabled when using multiple listeners, to avoid confusion (https://www.fmod.org/docs/content/generated/overview/3dsound.html):

The code to get the left and right spectrums (you must have PyQt or PySide):

import sys
import pyfmodex
from pyfmodex.constants import FMOD_SOFTWARE, FMOD_LOOP_NORMAL, FMOD_3D
from PyQt5 import QtWidgets, QtGui, QtCore

LEFT_CHANNEL = 0
RIGHT_CHANNEL = 1

FMOD_DSP_FFT_WINDOW_RECT = 0


class FrequencyAnalysis(QtWidgets.QWidget):

    def __init__(self, fmod, sound):
        QtWidgets.QWidget.__init__(self)

        self.init_ui()

        # Normalization toggle and sample size
        self.fmod = fmod
        self.sound = sound
        self.enable_normalize = False
        self.sample_size = 64

    def init_ui(self):
        self.picture = QtWidgets.QLabel(self)
        self.picture.setScaledContents(True)

        layout = QtWidgets.QHBoxLayout()
        layout.setContentsMargins(0, 0, 0, 0)
        layout.addWidget(self.picture, 0, QtCore.Qt.AlignCenter)

        self.setLayout(layout)
        self.setWindowTitle("FMOD 3D Frequency Analysis")
        self.setSizePolicy(QtWidgets.QSizePolicy(QtWidgets.QSizePolicy.Fixed, QtWidgets.QSizePolicy.Fixed))
        self.resize(1600, 350)
        self.show()

    def keyPressEvent(self, event):
        key = event.key()

        # Toggle pause (in pyfmodex, pausing is done on the playing Channel)
        if key == QtCore.Qt.Key_P:
            self.sound.paused = not self.sound.paused

        # Toggle normalization
        if key == QtCore.Qt.Key_N:
            self.enable_normalize = not self.enable_normalize

        # Decrease FFT sample size (integer division keeps sample_size an int)
        if key == QtCore.Qt.Key_1:
            self.sample_size = max(self.sample_size // 2, 64)

        # Increase FFT sample size
        if key == QtCore.Qt.Key_2:
            self.sample_size = min(self.sample_size * 2, 8192)

    def paintEvent(self, event):
        qp = QtGui.QPainter()
        qp.begin(self)
        qp.fillRect(self.rect(), QtCore.Qt.black)

        # Find frequency range of each array item
        hz_range = (44100 / 2) / float(self.sample_size)

        # Draw display
        qp.setPen(QtCore.Qt.white)
        qp.setFont(QtGui.QFont("Verdana", 8))
        qp.drawText(10, 10, "Press P to toggle pause, N to toggle normalize, 1 and 2 to adjust FFT size")
        qp.drawText(10, 30, "Sample size: " + str(self.sample_size) + "  -  Range per sample: " + str(hz_range) + "Hz")

        def draw_spectrum(title, channel, start_x):

            # Get spectrum for the channel
            spec = self.fmod.get_spectrum(self.sample_size, channel, FMOD_DSP_FFT_WINDOW_RECT)

            # Find max volume
            max_vol = max(spec)

            # Normalize
            if self.enable_normalize and max_vol != 0:
                def normalize(db):
                    return db / float(max_vol)
                spec = [normalize(db) for db in spec]

            # Draw display
            qp.setPen(QtCore.Qt.white)
            qp.setFont(QtGui.QFont("Verdana", 8))
            qp.drawText(start_x + 10, 70, title)
            qp.drawText(start_x + 10, 80, "Max vol this frame: %.3f" % max_vol)

            # Get painter dimensions
            width = self.rect().width() // 2  # each channel gets half the window
            height = self.rect().height()

            # VU bars
            block_gap = 4 / (self.sample_size / 64)
            block_width = int((float(width) * 0.8) / float(self.sample_size) - block_gap)
            block_max_height = 220

            # Left-hand X co-ordinate of bar, left-hand Y co-ordinate of bar, width of bar, height of bar (negative to draw upwards), paintbrush to use
            for b in range(self.sample_size - 1):
                rect = QtCore.QRect(start_x + int(width * 0.1 + (block_width + block_gap) * b),
                                    height - 20,
                                    block_width,
                                    int(-block_max_height * spec[b]))
                gradient = QtGui.QLinearGradient(rect.topLeft(), rect.bottomRight())  # Diagonal gradient from top-left to bottom-right
                gradient.setColorAt(0, QtCore.Qt.green)
                gradient.setColorAt(1, QtCore.Qt.red)
                qp.fillRect(rect, gradient)

        # Draw the spectrums perceived by each ear
        draw_spectrum("LEFT EAR", LEFT_CHANNEL, start_x=0)
        draw_spectrum("RIGHT EAR", RIGHT_CHANNEL, start_x=self.rect().width() // 2)

        qp.end()


def main():

    def change_listener(listener):
        current_listener.position = listener
        fmod.update()

    # FMOD initialization
    fmod = pyfmodex.System()
    fmod.init()

    # Load the sound
    sound1 = fmod.create_sound("sine.wav", mode=FMOD_LOOP_NORMAL | FMOD_3D | FMOD_SOFTWARE)

    # Play the sound
    channel = sound1.play()
    channel.volume = 0.7
    channel.min_distance = 50
    channel.max_distance = 10000  # Need this for sound fall off

    # Create listeners positions
    listener1 = (0, 0, 0)
    listener2 = (0, 10, 0)

    # Create a listener in the center of the scene
    current_listener = fmod.listener(id=0)
    change_listener(listener1)

    # Open the form
    app = QtWidgets.QApplication(sys.argv)
    fa = FrequencyAnalysis(fmod, channel)  # pass the playing channel so the P key can pause it

    # Walk the sound back and forth in front of your head
    min_x = -30
    max_x = 30
    sound_pos = (max_x, 3, 0)
    x = min_x
    inc = 1

    def tick():
        nonlocal x, inc
        if x <= min_x:
            inc = 1
        elif x >= max_x:
            inc = -1
        x += inc
        channel.position = [x, sound_pos[1], sound_pos[2]]
        print("Playing at %s" % (channel.position,))

        # Update FMOD
        fmod.update()
        fa.repaint()
        
    timer = QtCore.QTimer()
    timer.timeout.connect(tick)
    timer.start(100)

    sys.exit(app.exec_())

if __name__ == "__main__":
    main()

To simplify the direct use of the FMOD library over Panda3D’s default 3D audio, I created a custom Audio3DManager with methods similar to those of Panda’s original class. This way, someone could reuse this class without many drastic changes to their current code.

The only things to note are the methods playSound and stopSound, which I implemented myself. You should use them instead of sound.play() or sound.stop(), because they store the channels, which the class later uses to update channel positions.
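The reason for this is that FMOD hands back a new channel object on every play() call, and stopping or repositioning the sound later requires that handle. A stripped-down illustration of the bookkeeping, with stand-in classes (not the real pyfmodex types):

```python
class FakeChannel:
    """Stand-in for a pyfmodex Channel."""
    def __init__(self):
        self.playing = True

    def stop(self):
        self.playing = False


class FakeSound:
    """Stand-in for a pyfmodex Sound: each play() yields a fresh channel."""
    def play(self):
        return FakeChannel()


channels = {}  # sound -> currently playing channel


def play_sound(sound):
    channel = sound.play()
    channels[sound] = channel  # keep the handle for later updates and stops
    return channel


def stop_sound(sound):
    channels.pop(sound).stop()  # without the stored handle, we couldn't stop it


s = FakeSound()
ch = play_sound(s)
print(ch.playing)  # -> True
stop_sound(s)
print(ch.playing)  # -> False
```

The full manager below does the same thing with real FMOD channels, and additionally uses the stored handles each frame to push position and velocity updates.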

import numpy

import pyfmodex
from pyfmodex.constants import FMOD_SOFTWARE, FMOD_LOOP_NORMAL, FMOD_3D

FMOD_DSP_FFT_WINDOW_RECT = 0


class Audio3DManager:

    def __init__(self, world, listener_target=None):
        self.world = world
        if listener_target:
            self.listener_target = listener_target
        else:
            self.listener_target = self.world.render

        ## FMOD initialization
        self.fmod = pyfmodex.System()
        self.fmod.init()

        ## Create a FMOD listener based on Panda target object
        self.listener = self.fmod.listener(id=0)
        self.listener.velocity = (0, 0, 0)
        self.listener_velocity_auto = False
        self.last_listener_position = None

        ## Dictionary of objects and their sounds attached
        self.object_sounds = {}

        ## Dictionary of sounds and their properties (velocity, velocity_auto, min_distance, max_distance)
        self.sound_properties = {}

    def attachListener(self, object):
        """
        Sounds will be heard relative to this object. Should probably be the camera.
        """
        self.listener_target = object

    def detachListener(self):
        """
        Sounds will be heard relative to the root, probably render.
        """
        self.listener_target = self.world.render

    def getListenerVelocity(self):
        """Get the velocity of the listener."""
        return self.listener.velocity

    def setListenerVelocity(self, velocity):
        """
        Set the velocity vector (in units/sec) of the listener, for calculating doppler shift.
        This is relative to the sound root (probably render).
        Default: VBase3(0, 0, 0)
        """
        self.listener.velocity = velocity
        self.listener_velocity_auto = False

    def setListenerVelocityAuto(self):
        """
        If velocity is set to auto, the velocity will be determined by the
        previous position of the object the listener is attached to and the frame dt.
        Make sure if you use this method that you remember to clear the previous
        transformation between frames.
        """
        self.listener.velocity = (0, 0, 0)
        self.listener_velocity_auto = True

    def loadSfx(self, name, mode=FMOD_LOOP_NORMAL|FMOD_3D|FMOD_SOFTWARE):
        """
        Load a sound with 3D positioning enabled.
        """
        sound = self.fmod.create_sound(name, mode)
        props = {
            "channel": None,
            "last_position": None,
            "velocity": (0, 0, 0),
            "velocity_auto": False,
            "min_distance": 1.,
            "max_distance": 1000000000.}
        self.sound_properties[sound] = props
        return sound

    def playSound(self, sound, paused=False):
        channel = sound.play(paused=paused)
        self.sound_properties[sound]["channel"] = channel
        return channel

    def stopSound(self, sound):
        self.sound_properties[sound]["channel"].stop()
        self.sound_properties[sound]["channel"] = None

    def attachSoundToObject(self, sound, object):
        """
        Sound will come from the location of the object it is attached to.
        """
        self.detachSound(sound)  ## If the sound is already attached to another object, detach it first
        if object not in self.object_sounds:
            self.object_sounds[object] = []
        self.object_sounds[object].append(sound)

    def detachSound(self, sound):
        """
        Sound will no longer have its 3D position updated.
        """
        for object, sounds in self.object_sounds.items():
            if sound in sounds:
                self.object_sounds[object].remove(sound)
                break

    def getSoundsOnObject(self, object):
        """
        Returns a list of sounds attached to an object
        """
        return self.object_sounds[object]

    def getSoundVelocity(self, sound):
        """
        Get the velocity of the sound.
        """
        return self.sound_properties[sound]["velocity"]

    def setSoundVelocity(self, sound, velocity):
        """
        Set the velocity vector (in units/sec) of the sound, for calculating doppler shift.
        This is relative to the sound root (probably render).
        Default: VBase3(0, 0, 0)
        """
        self.sound_properties[sound]["velocity"] = velocity
        self.sound_properties[sound]["velocity_auto"] = False

    def setSoundVelocityAuto(self, sound):
        """
        If velocity is set to auto, the velocity will be determined by the
        previous position of the object the sound is attached to and the frame dt.
        Make sure if you use this method that you remember to clear the previous
        transformation between frames.
        """
        self.sound_properties[sound]["velocity"] = (0, 0, 0)
        self.sound_properties[sound]["velocity_auto"] = True

    def getSoundMinDistance(self, sound):
        """
        Controls the distance (in units) that this sound begins to fall off.
        Also affects the rate it falls off.
        Default is 3.28 (in feet, this is 1 meter)
        """
        return self.sound_properties[sound]["min_distance"]

    def setSoundMinDistance(self, sound, dist):
        """
        Controls the distance (in units) that this sound begins to fall off.
        Also affects the rate it falls off.
        Default is 3.28 (in feet, this is 1 meter)
        Don't forget to change this when you change the DistanceFactor
        """
        self.sound_properties[sound]["min_distance"] = dist

    def getSoundMaxDistance(self, sound):
        """
        Controls the maximum distance (in units) that this sound stops falling off.
        The sound does not stop at that point, it just doesn't get any quieter.
        You should rarely need to adjust this.
        Default is 1000000000.0
        """
        return self.sound_properties[sound]["max_distance"]

    def setSoundMaxDistance(self, sound, dist):
        """
        Controls the maximum distance (in units) that this sound stops falling off.
        The sound does not stop at that point, it just doesn't get any quieter.
        You should rarely need to adjust this.
        Default is 1000000000.0
        """
        self.sound_properties[sound]["max_distance"] = dist

    def getDistanceFactor(self):
        """
        Control the scale that sets the distance units for 3D spacialized audio.
        Default is 1.0, which Panda adjusts to be feet.
        """
        raise NotImplementedError()

    def setDistanceFactor(self, factor):
        """
        Control the scale that sets the distance units for 3D spacialized audio.
        Default is 1.0, which Panda adjusts to be feet.
        When you change this, don't forget that it affects the scale of setSoundMinDistance
        """
        raise NotImplementedError()

    def getDopplerFactor(self):
        """
        Control the presence of the Doppler effect. Default is 1.0
        Exaggerated Doppler, use >1.0
        Diminished Doppler, use <1.0
        """
        raise NotImplementedError()

    def setDopplerFactor(self, factor):
        """
        Control the presence of the Doppler effect. Default is 1.0
        Exaggerated Doppler, use >1.0
        Diminished Doppler, use <1.0
        """
        raise NotImplementedError()

    def getDropOffFactor(self):
        """
        Exaggerate or diminish the effect of distance on sound. Default is 1.0
        Valid range is 0 to 10
        Faster drop off, use >1.0
        Slower drop off, use <1.0
        """
        raise NotImplementedError()

    def setDropOffFactor(self, factor):
        """
        Exaggerate or diminish the effect of distance on sound. Default is 1.0
        Valid range is 0 to 10
        Faster drop off, use >1.0
        Slower drop off, use <1.0
        """
        raise NotImplementedError()

    def getSpectrum(self, sample_size, channel, mode=FMOD_DSP_FFT_WINDOW_RECT):
        return self.fmod.get_spectrum(sample_size, channel, mode)

    def disable(self):
        """
        Detaches any existing sounds and removes the update task.
        """
        channels_playing = [self.sound_properties[sound]["channel"]
                            for sound in self.sound_properties
                            if self.sound_properties[sound]["channel"] and
                               self.sound_properties[sound]["channel"].is_playing]
        for channel in channels_playing:
            channel.stop()
        self.object_sounds = {}
        self.sound_properties = {}

    def update(self, time_per_frame):
        """
        Updates position of sounds in the 3D audio system. Will be called automatically in a task.
        """
        if self.fmod.channels_playing > 0:
            def calculate_velocity(last_position, curr_position):
                """
                Velocity (units/sec) from the last and current positions and the frame dt
                """
                distance = numpy.array(curr_position) - numpy.array(last_position)
                dt = time_per_frame if time_per_frame else 1
                return list(distance / dt)

            ## Update listener properties
            self.listener.position = self.listener_target.getPos()
            if self.listener_velocity_auto and self.last_listener_position:
                self.listener.velocity = calculate_velocity(self.last_listener_position, self.listener.position)
            self.last_listener_position = self.listener.position

            ## Update sounds properties
            sounds_playing = [sound
                              for sound in self.sound_properties
                              if self.sound_properties[sound]["channel"] and
                                 self.sound_properties[sound]["channel"].is_playing]
            for sound in sounds_playing:
                props = self.sound_properties[sound]
                channel = props["channel"]

                ## Move the sound to the position of the object it is attached to
                object = next(obj for obj, sounds in self.object_sounds.items() if sound in sounds)
                channel.position = object.getPos()

                ## Set sound velocity
                if props["velocity_auto"] and props["last_position"]:
                    props["velocity"] = calculate_velocity(props["last_position"], channel.position)

                channel.velocity = props["velocity"]
                channel.min_distance = props["min_distance"]
                channel.max_distance = props["max_distance"]
                props["last_position"] = channel.position
                self.sound_properties[sound] = props

            ## Once 3D objects are updated, update the FMOD system
            self.fmod.update()