Text output

6 comments, last by Sammy70 22 years, 2 months ago
OK ... this is just an idea that popped into my head today while I was taking a shower, so don't flame too hard if it's completely stupid or has been discussed to death already (I did a search, but it didn't come up with anything).

Most large roleplaying games involving NPC interaction either have a limited number of NPCs or rely only on written text to handle interactive dialogues between the player and the NPC. (Sea Dogs, for example, plays a small sound sample when you talk to an NPC, but all the useful dialogue is written.) As far as I can see, there are two problems with having all dialogues recorded:

1.) large dialogues require a lot of memory
2.) you actually need to record every sentence, which can be kind of expensive.

Only recording phonemes might be a solution, but I have yet to hear a halfway convincing example.

BUT ... did anybody ever try to record whole words? After having written all the dialogues needed by the NPCs, you could compute a list of the words and common sentences needed (common sentences like 'Oh Gosh!', 'Oh Yeah!', 'come over!' ...) while differentiating between words that end a sentence and words used in the middle of a sentence ('now', 'now!' and 'now.' would be considered three different words). You save only the single words (best in MP3 or another space-saving format). When a player speaks with an NPC, you just need to combine the single words into the sentence that the NPC needs to say and output the result.

I am aware that the result wouldn't be completely natural, but it would probably be better than any synthetic voice. (Hmm ... now that I write it, I realize that this is how many automated phone announcements are assembled. I have yet to see it used in games, though.)

So ... did any game use this technique? (The game I'm currently working on won't need it at all, I'm afraid, so don't wait for me to try it.)

Bye, Sammy
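The word-list step described above can be sketched roughly as follows. This is only an illustration under assumptions: the names `COMMON_PHRASES`, `clip_keys` and `required_clips` are made up for this example, and the tokenizing rule (lowercase words with an optional trailing `.`, `!` or `?`) is one possible reading of the "now" / "now!" / "now." distinction, not something taken from an actual game.

```python
import re

# Hypothetical phrases that would be recorded as one clip each.
COMMON_PHRASES = ["oh gosh!", "oh yeah!", "come over!"]

# A word, optionally followed by sentence-ending punctuation, so that
# "now", "now!" and "now." become three different clip keys.
TOKEN_RE = re.compile(r"[a-z']+[.!?]?")


def clip_keys(line, phrases=COMMON_PHRASES):
    """Split one dialogue line into the clip keys needed to speak it."""
    text = line.lower()
    keys = []
    pos = 0
    while pos < len(text):
        # Prefer a whole recorded phrase when one starts here.
        for phrase in phrases:
            if text.startswith(phrase, pos):
                keys.append(phrase)
                pos += len(phrase)
                break
        else:
            match = TOKEN_RE.match(text, pos)
            if match:
                keys.append(match.group())
                pos = match.end()
            else:
                pos += 1  # skip spaces, commas and other characters
    return keys


def required_clips(dialogues):
    """Return the sorted set of clips that would have to be recorded."""
    needed = set()
    for line in dialogues:
        needed.update(clip_keys(line))
    return sorted(needed)


if __name__ == "__main__":
    script = ["Oh Gosh! Come over now.", "Now, now! Not now."]
    print(required_clips(script))
    print(clip_keys(script[0]))
```

At playback time, the list returned by `clip_keys` would be used to look up and queue the recorded clips in order; how the clips are stored and played is left open here.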
Modern speech synths use a similar method.

They have someone record some phrases, which are then used to extract phonemes and, often, common words. These are then played in the correct sequence, with pitch, speed, volume, etc. modulation in order to make them flow.

I haven't heard of it used in games, but that is how most modern speech synths work.
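As a rough illustration of making recorded units "flow", here is a minimal sketch that joins clips with a short linear crossfade at each seam. Note that this is a simpler, swapped-in technique: it blends volume only and does not do the pitch or speed modulation mentioned above. Clips are assumed to be plain lists of float samples, and `crossfade_concat` is a hypothetical helper name.

```python
def crossfade_concat(clips, overlap):
    """Join clips (lists of float samples), blending `overlap` samples
    at each seam instead of cutting hard from one clip to the next."""
    out = list(clips[0]) if clips else []
    for clip in clips[1:]:
        # Never overlap more samples than either side actually has.
        n = min(overlap, len(out), len(clip))
        start = len(out) - n
        for i in range(n):
            t = (i + 1) / (n + 1)  # weight of the incoming clip, rising toward 1
            out[start + i] = out[start + i] * (1 - t) + clip[i] * t
        out.extend(clip[n:])
    return out


if __name__ == "__main__":
    loud = [1.0] * 4
    quiet = [0.0] * 4
    print(crossfade_concat([loud, quiet], overlap=2))
```

The overlap length is a tuning choice: longer overlaps hide clicks better but smear the start of the next word.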

Do not meddle in the affairs of moderators, for they are subtle and quick to anger. ANDREW RUSSELL STUDIOS
Cool Links :: [ GD | TG | MS | NeHe | PA | SA | M&S | TA ]
Got Clue? :: [ Start Here! | Google | MSDN | GameDev.net Refrence | OGL v D3D | File Formats | Go FAQ yourself ]

I've seen (well, heard) this done before, in games made for the blind. It's sort of like automated telephone services: it looks up the appropriate words and sequences them into sentences.
I've not seen this sort of thing featured yet. It'd be hard to do with certain tones, since it'd sound robotic and cheesy.

Check out how the technology is coming along with this link:
http://directory.google.com/Top/Computers/Speech_Technology/Speech_Synthesis/

A friend of mine used both a male and a female voice from the AT&T site in his StarCraft campaign. It worked quite well.
Sorry, I was speaking along the lines of recording words, not phonemes or whatnot. (In case that was a point of confusion.)
They wouldn't record words; it would take up too much space. Phonemes take up far less space and require only slightly more processing power.

They only record common words, such as "The", "And", "A", etc.

This all comes together for quite natural speech.

Try the MS Speech SDK, or searching for Speech Synth on Google.


Understood.
The games I was referring to were purely audio, so there was a lot of CD room to have entire sentences recorded, and the real-time aspects centered around dynamic things: numbers, card games and such.
Hi,

Around 10 years ago, Lankhor games on the Atari, such as "Maupiti Island", were using speech synthesis.

They were not using your method, because MP3 didn't exist, and there was only 720 KB on a floppy disk and 512 KB of RAM.

They used phoneme samples AND samples of the transitions between phonemes (to smooth the result).
There was a bit of lip sync with a 2D face, and most intonations were reproduced.

It was quite impressive at that time.
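To make the "phonemes plus transitions" idea concrete, here is a small sketch that turns a phoneme list into a playback order with a transition sample between each neighbouring pair. It is not how the Lankhor games actually did it, which the post does not describe in detail; the `a>b` naming of transition samples and the function name `unit_sequence` are assumptions made for this example.

```python
def unit_sequence(phonemes):
    """Interleave phoneme samples with the transition sample that
    bridges each pair of neighbouring phonemes."""
    units = []
    for i, phoneme in enumerate(phonemes):
        units.append(phoneme)
        if i + 1 < len(phonemes):
            # Hypothetical naming scheme: "h>e" is the h-to-e transition.
            units.append(f"{phoneme}>{phonemes[i + 1]}")
    return units


if __name__ == "__main__":
    print(unit_sequence(["h", "e", "l", "o"]))
```

With P distinct phonemes there can be up to P * P transition samples, so a real system on a 720 KB floppy would presumably keep only the pairs that actually occur in its text.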



----
David Sporn AKA Sporniket

