Bay 12 Games Forum

Author Topic: Quest to build a speech synthesis/voice transformer thingy  (Read 1726 times)

Muz

  • Bay Watcher
Quest to build a speech synthesis/voice transformer thingy
« on: January 15, 2010, 11:28:07 am »

I'm making an emotion transformer for my final year thesis. If it works as expected, I'm pretty sure I could turn it into an actual app. Since you guys seem interested in this Science stuff, I thought I'd post it here. Having an audience also gives me enough motivation and pressure to keep working on this :P

Besides, I might need a little help with changing it into an application. Prototyping is easy enough, but making it into something most people can use takes a bit more effort.

Emotion transformer
Basically, the idea is to insert an emotion into a neutral speech signal. My suggested approach is to strip out the phoneme information (the vocal-tract filter), get the pitch, modify pitch/loudness/speech rate/whatever, then reassemble it into speech with a more extreme emotion. I have no idea how it'll turn out, but hopefully it'll work.
- LPC extraction (done): I managed to remove the filter and reinsert it, so you get all the "ay", "ee", "ahh" sounds reassembled, even if the pitch is different.
- Pitch detection: This is a bit of a bitch. It's simple in theory: just find the fundamental frequency. But pitches are like ocean waves; you can't really tell which is the first big wave.
- Pitch modification: Hell, this is simple enough in theory, but the mathematics make my eyes cry, then bleed. I'm probably going to do it just in theory, invent my own formula, and hope they match up.
- Other stuff: Assuming I get the detection and modification, this part should be a breeze.
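
The pitch detection step above could be sketched with autocorrelation, which is one common way around the "which wave is first" problem: instead of hunting for a single peak, it looks for the lag at which the frame best matches a shifted copy of itself. This is only a minimal sketch in Python rather than MATLAB. The function name and the fmin/fmax search range are assumptions made for illustration, not part of the thesis work.

```python
import math

def estimate_pitch_autocorr(samples, sample_rate, fmin=60.0, fmax=400.0):
    """Estimate the fundamental frequency (Hz) of one voiced frame.

    fmin and fmax are assumed bounds on plausible speech pitch.
    Returns None when no positive correlation peak is found.
    """
    n = len(samples)
    mean = sum(samples) / n
    x = [s - mean for s in samples]            # remove any DC offset

    min_lag = int(sample_rate / fmax)          # shortest period considered
    max_lag = min(int(sample_rate / fmin), n - 1)

    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation at this lag.
        score = sum(x[i] * x[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score

    if best_lag == 0:
        return None                            # silence or no periodicity
    return sample_rate / best_lag
```

Calling `estimate_pitch_autocorr(frame, 8000)` on a 50 ms voiced frame returns an estimate in Hz. Real speech would probably also need a voiced/unvoiced check and some smoothing across frames, which this sketch leaves out.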

Gonna be prototyped in MATLAB for now. Not real time either.


Speech synthesis app
Building on similar techniques, I could actually make some software that synthesizes speech at a certain pitch. Pronunciation would be all messed up, but since everything is generated from filters instead of stored recordings, it could be a really small app, less than 1 MB.
- Filter library: I'd have to find real life phonemes and get a whole list of filters for every phoneme anyone could pronounce.
- Simulate pitch: Just a simple algorithm that creates a pitch. Shouldn't take too long.
- Speech synthesis!: Combine the two together and you get... phonemes. Well, if you combine a few phonemes together you get some mess of a word.
- Emotion: The Microsoft Bob voice is so boring. Even if my robot is speaking like a retard, I want it to speak like an angry/depressed one.
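
The "simulate pitch, then apply a phoneme filter" idea in the list above could look roughly like the following. It is a hedged sketch only: an impulse train stands in for the pitch source, and a cascade of two-pole resonators stands in for one entry of the filter library. All function names are hypothetical, and any formant frequencies and bandwidths passed in would be placeholders, not measured phoneme data.

```python
import math

def impulse_train(pitch_hz, sample_rate, n_samples):
    """Crude pitch source: one unit impulse per pitch period."""
    period = int(round(sample_rate / pitch_hz))
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]

def resonator_coeffs(formant_hz, bandwidth_hz, sample_rate):
    """Denominator coefficients [a1, a2] of a two-pole resonator."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)   # pole radius
    theta = 2.0 * math.pi * formant_hz / sample_rate      # pole angle
    return [-2.0 * r * math.cos(theta), r * r]

def all_pole_filter(source, a):
    """Apply y[n] = x[n] - a[0]*y[n-1] - a[1]*y[n-2] - ..."""
    out = []
    for n, x in enumerate(source):
        y = x
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                y -= ak * out[n - k]
        out.append(y)
    return out

def synthesize(pitch_hz, formants, sample_rate, n_samples):
    """Run the pitch source through one resonator per (freq, bandwidth)."""
    signal = impulse_train(pitch_hz, sample_rate, n_samples)
    for freq, bw in formants:
        signal = all_pole_filter(signal, resonator_coeffs(freq, bw, sample_rate))
    return signal
```

For example, `synthesize(100.0, [(700.0, 100.0), (1200.0, 120.0)], 8000, 800)` produces 0.1 s of a buzzy vowel-like tone. Stringing several of these together with different filters is where the "mess of a word" would come from.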


Voice morpher
Hah, now moving another step forward, I could possibly even build that into a full voice morpher/transformer. Your voice in one end, a different one out the other end. It's basically going to be like the first one, but with a whole lot more crap coded in C++.
- Development of an LPC algorithm in C++: Linear prediction is painful :(
- Development of correlation, filters, etc: More hurt.
- Efficiency: I'd probably be tempted to be sloppy, but hey, with efficient memory management, processing time should be lower than it would be for images. Which means... real-time use. So much fun :3
- Connecting the inputs and outputs to stuff, opening file formats, etc
- Avoiding a lawsuit and other legal stuff.
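
As a reference point for the LPC and correlation items above, here is a hedged sketch of the autocorrelation method with the Levinson-Durbin recursion, plus the inverse filter that "removes" the vocal-tract filter from a frame. It is written in Python for readability rather than C++, the function names are made up for this example, and it is a textbook formulation, not necessarily the one the MATLAB prototype uses.

```python
def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation r[0..max_lag] of one frame."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values.

    Returns (a, err) with a[0] == 1, where the predictor is
    x[n] ~ -sum(a[k] * x[n-k] for k in 1..order).
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break                               # degenerate (e.g. silent) frame
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                          # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)                    # remaining prediction error
    return a, err

def lpc(frame, order):
    """LPC coefficients for one frame (autocorrelation method)."""
    return levinson_durbin(autocorrelation(frame, order), order)

def residual(frame, a):
    """Inverse-filter the frame: e[n] = sum_k a[k] * x[n-k]."""
    return [sum(a[k] * frame[n - k] for k in range(len(a)) if n - k >= 0)
            for n in range(len(frame))]
```

The residual is roughly the pitch excitation with the filter stripped off. Feeding a modified excitation back through the all-pole filter defined by `a` is the "reinsert it" step.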
Disclaimer: Any sarcasm in my posts will not be mentioned as that would ruin the purpose. It is assumed that the reader is intelligent enough to tell the difference between what is sarcasm and what is not.

eerr

  • Bay Watcher
Re: Quest to build a speech synthesis/voice transformer thingy
« Reply #1 on: January 15, 2010, 11:59:12 am »

What is your goal, exactly?

Armok

  • Bay Watcher
  • God of Blood
Re: Quest to build a speech synthesis/voice transformer thingy
« Reply #2 on: January 15, 2010, 02:30:19 pm »

If this works, it's awesome and also something I could really use.
For a look at the closest thing to this that I can think of having already been done, google Praat, I think.
So says Armok, God of blood.
Sszsszssoo...
Sszsszssaaayysss...
III...

Muz

  • Bay Watcher
Re: Quest to build a speech synthesis/voice transformer thingy
« Reply #3 on: January 16, 2010, 12:09:53 pm »

Yeah, Praat is pretty awesome. Praat's a bit better at doing the stuff, but it depends.


Quote from: eerr
What is your goal, exactly?

To make a newbie-friendly speech thingy. For one thing, you could type in stuff, and it'd say it in different voices and tones (albeit awkwardly).

Oh, and you can morph your voice into something more dwarven. Or a girl's voice. Not that you should XD

alfie275

  • Bay Watcher
Re: Quest to build a speech synthesis/voice transformer thingy
« Reply #4 on: January 16, 2010, 05:44:40 pm »

What would be cool is if you had a translator built into it.
I do LP of videogames!
See here:
http://www.youtube.com/user/MrAlfie275

Dwarf

  • Bay Watcher
  • The Light shall take us
Re: Quest to build a speech synthesis/voice transformer thingy
« Reply #5 on: January 16, 2010, 05:56:26 pm »

Quote from: alfie275
What would be cool is if you had a translator built into it.

That's a completely different cup of tea/can of worms, though.
Quote from: Akura
Now, if we could only mod Giant War Eagles to carry crossbows, we could do strafing runs on the elves who sold the eagles to us in the first place.

bjlong

  • Bay Watcher
  • [INVISIBLE]
Re: Quest to build a speech synthesis/voice transformer thingy
« Reply #6 on: January 16, 2010, 06:26:49 pm »

Can't you use an FFT for the pitch detection? Or would that just pick up the phoneme's features?
I hesitate to click the last spoiler tag because I expect there to be Elder Gods in it or something.

winner

  • Bay Watcher
Re: Quest to build a speech synthesis/voice transformer thingy
« Reply #7 on: January 16, 2010, 06:32:51 pm »

Quote from: bjlong
Can't you use an FFT for the pitch detection? Or would that just pick up the phoneme's features?

You should be able to pick up or reconstruct the fundamental pretty easily.
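
One way the spectrum-based approach might be sketched is a harmonic product spectrum. The strongest spectral peak is often a harmonic boosted by a formant (the "phoneme's features") rather than the fundamental, so the sketch multiplies the spectrum by downsampled copies of itself to favor the bin whose harmonics line up. The names and the n_harmonics/fmin defaults are assumptions for illustration, and a naive O(N^2) DFT stands in for a real FFT to keep the example dependency-free.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (a stand-in for an FFT)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]

def strongest_peak_hz(frame, sample_rate):
    """Frequency of the single largest non-DC spectral peak."""
    mags = magnitude_spectrum(frame)
    k = max(range(1, len(mags)), key=lambda i: mags[i])
    return k * sample_rate / len(frame)

def pitch_from_hps(frame, sample_rate, n_harmonics=3, fmin=60.0):
    """Harmonic product spectrum pitch estimate in Hz."""
    mags = magnitude_spectrum(frame)
    n = len(frame)
    first_bin = max(1, int(fmin * n / sample_rate))
    last_bin = (len(mags) - 1) // n_harmonics
    best_bin, best_val = first_bin, -1.0
    for k in range(first_bin, last_bin + 1):
        prod = 1.0
        for h in range(1, n_harmonics + 1):
            prod *= mags[k * h]                 # energy at k, 2k, 3k, ...
        if prod > best_val:
            best_bin, best_val = k, prod
    return best_bin * sample_rate / n
```

On a frame whose second harmonic is louder than its fundamental, `strongest_peak_hz` lands on the harmonic while `pitch_from_hps` can still recover the fundamental. It does need some energy at the fundamental itself, so a completely missing fundamental would call for a different method, such as autocorrelation.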
The great game of Warlocks!