realtime lipsync?


I’m wondering if there is a Python module that is good for realtime lipsync. In the end, the module should recognize phonemes from an audio stream, which could then trigger the appropriate morph targets.

Does somebody know of something like that?


Well, detecting phonemes is a science in itself. I don’t think there is a Python library for it. What you can do is simply rotate a jaw joint depending on the volume, so your avatar opens and closes its mouth. It’s proven to work quite well, and the result is OK for most things. If you really need realtime detection of phonemes… good luck. Maybe you can try your luck with speech-to-text algorithms or something similar. Not impossible, but definitely not easy either.
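
A rough sketch of the volume-driven jaw idea, in case it helps. It assumes you already get the audio as chunks of float samples in the range -1.0 to 1.0; the names `JawDriver`, `max_angle`, `gain` and `smoothing` are made up for illustration, and the values are guesses you would tune by eye. Applying the returned angle to the actual joint depends on your engine and is not shown.

```python
import math


def rms(samples):
    """Root-mean-square level of one audio chunk (floats in -1.0..1.0)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


class JawDriver:
    """Maps chunk loudness to a smoothed jaw angle in degrees (hypothetical helper)."""

    def __init__(self, max_angle=25.0, gain=4.0, smoothing=0.5):
        self.max_angle = max_angle  # fully open jaw, in degrees (assumed value)
        self.gain = gain            # boosts quiet speech before clamping
        self.smoothing = smoothing  # 0 = no smoothing, closer to 1 = slower jaw
        self.angle = 0.0

    def update(self, samples):
        # Loudness scaled and clamped to 0..1, then converted to a target angle.
        target = min(1.0, rms(samples) * self.gain) * self.max_angle
        # Move only part of the way to the target so the jaw does not jitter.
        self.angle += (target - self.angle) * (1.0 - self.smoothing)
        return self.angle
```

You would call `update()` once per audio chunk (or once per frame with the latest chunk) and set the jaw joint's rotation to the returned angle.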

ah, alright, well, thanks anyway

Do you have a static set of audio files that you want to detect the phonemes from? Or is it something like in-game VoIP?

Because if you just have a set of audio files for the game, you could precompute the phonemes for each audio file and write them into a file with time codes. When you then load and play that sound, you can load its phoneme file with it…
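
To make that concrete, here is a minimal sketch of the lookup side. The file format is purely an assumption for the example: one entry per line, a start time in seconds followed by a phoneme label, with `#` for comments. The function names `parse_phoneme_track` and `phoneme_at` are hypothetical as well. How the phonemes get into the file in the first place is a separate problem and is not covered here.

```python
import bisect


def parse_phoneme_track(text):
    """Parse lines of '<start_seconds> <phoneme>' into two parallel sorted lists."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        start, phoneme = line.split(None, 1)
        entries.append((float(start), phoneme))
    entries.sort()
    starts = [start for start, _ in entries]
    phonemes = [phoneme for _, phoneme in entries]
    return starts, phonemes


def phoneme_at(track, t):
    """Return the phoneme active at playback time t, or None before the first entry."""
    starts, phonemes = track
    # Index of the last entry whose start time is <= t.
    i = bisect.bisect_right(starts, t) - 1
    return phonemes[i] if i >= 0 else None
```

While the sound plays, you would ask `phoneme_at(track, playback_time)` each frame and switch the morph target whenever the returned label changes.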

Well, a real actor should control the character/avatar in realtime, so I was searching for a realtime lipsync solution, e.g. like this:

But in the end it’s too expensive right now and the results aren’t that convincing, so I skipped it.