Synthetic voices from Librivox recordings

sbuchholz
Posts: 9
Joined: December 21st, 2010, 3:12 am

Post by sbuchholz » January 5th, 2011, 10:35 am

Hi,

I wanted to share some information on what we are able to do with Librivox recordings, and give people a thread to discuss it, if they want.

I work for Toshiba Research Europe Ltd, in a group that does research on speech synthesis ("text-to-speech", "computer voices", i.e. where the computer reads out a text). As you might know, speech synthesis has improved a lot since it was first implemented, and is now often quite good for limited domains. However, it is still a long way away from humans for tasks such as reading a book. This is why there is a lot of research interest in this field. And as speech synthesis relies heavily on data these days, data (i.e. recordings of people reading books) is very important. This is how we came across Librivox. It's great :-)

We have downloaded some recordings and processed them. We can now turn recordings into individual synthetic voices, with which new texts can be synthesized. We have not yet made any texts synthesized with these voices public. However, we would like to do so. For example, we would like to be able to get feedback from people on which synthetic voices sound best. And maybe allow people to download synthesized books. And in the future, it is theoretically possible that the company would also like to use these voices for products (although it would be very unusual to use non-studio recordings directly for this).

I understand (and someone from the Librivox Admin Team has kindly confirmed this) that the public domain license also gives us the legal right to do this. However, we are planning to contact individual readers before we use the synthetic version of their voice publicly. So some of you might be getting a Private Message from me in the near future. But feel free to express your views here instead/in addition.

We know we are not the only ones using these recordings but there doesn't seem to be any previous forum discussion about this topic.

Thanks to everybody who contributed to the Librivox resource!

Dr Sabine Buchholz
Speech Technology Group
Cambridge Research Laboratory
Toshiba Research Europe Limited
http://www.toshiba-europe.com/research/crl/stg/index.html

TriciaG
LibriVox Admin Team
Posts: 49682
Joined: June 15th, 2008, 10:30 pm
Location: Toronto, ON (but Minnesotan to age 32)

Post by TriciaG » January 5th, 2011, 10:39 am

Thanks for letting us know. I think this is great!

And yes, public domain means that you can use our recordings as you will, without permission. But it's very considerate of you to ask the readers. :thumbs:

(P.S. My later recordings, from about September 15, 2010 onward, are of better quality due to a better microphone. This is just in case you ever wanted to use my recordings. :wink:)
Bulwer-Lytton novel: The Caxtons
Boring works 30-70 minutes long: Insomnia Collection 5
Psychological Warfare: HERE

Starlite
Posts: 16600
Joined: April 30th, 2006, 2:17 pm
Location: Thunder Bay Ontario, Canada

Post by Starlite » January 5th, 2011, 11:06 am

Wow that is exciting! So glad we can help benefit many more people like this too.

Esther :)
"Reasonable people adapt themselves to the world. Unreasonable
people attempt to adapt the world to themselves. All progress,
therefore, depends on unreasonable people." George Bernard Shaw

neckertb
Posts: 12806
Joined: March 9th, 2009, 7:47 am
Location: French in Denmark

Post by neckertb » January 5th, 2011, 11:59 am

I second Tricia's opinion :D There's all kinds of people using our stuff out there, it is very nice of you to take the time to post and explain your goals, thank you!
Interested in French recordings? :wink:
Nadine

Les enfants du capitaine Grant

Live in a death + 70 country? Have a look at Legamus

Hokuspokus
LibriVox Admin Team
Posts: 8023
Joined: October 24th, 2007, 12:17 pm
Location: Germany

Post by Hokuspokus » January 6th, 2011, 12:38 am

Text to Speech is a wonderful thing, I use it quite a lot (there are not so many German recordings in our catalog). It's great to know that our recordings are used to create new and better voices and I really appreciate that you take the trouble to tell us about it.
As a potential customer I'd like to say that I'm not so much interested in buying text to speech recordings. I'd rather buy the voice and generate the mp3 from the text I want to listen to.

My hope is that in the future tts will have developed so far that it will be possible to create voices easily. One day, I hope, there will be voices with a free license, so people can share the computer generated mp3 with others, like a sort of Librivox for tts. (What you get today is for personal use only or horribly expensive.)

Every piece of research is a tiny step in this direction and it's great to know that LV recordings are a part of it.

sbuchholz
Posts: 9
Joined: December 21st, 2010, 3:12 am

Post by sbuchholz » January 6th, 2011, 2:43 am

neckertb wrote:Interested in French recordings? :wink:
Definitely. Hardly anybody in our team is a native English speaker, so we are keen on other languages as well. We are just starting with English because it's so big.

sbuchholz
Posts: 9
Joined: December 21st, 2010, 3:12 am

Post by sbuchholz » January 6th, 2011, 3:51 am

Hokuspokus wrote:My hope is that in the future tts will have developed so far that it will be possible to create voices easily. One day, I hope, there will be voices with a free license, so people can share the computer generated mp3 with others, like a sort of Librivox for tts. (What you get today is for personal use only or horribly expensive.)
This might be what you are looking for:

http://mary.dfki.de/ (They have different voices, including German ones, with different licenses, and you can build your own. And I'm sure they appreciate feedback from a keen TTS user.)

or http://www.cstr.ed.ac.uk/projects/festival/ and http://festvox.org/

Hokuspokus
LibriVox Admin Team
Posts: 8023
Joined: October 24th, 2007, 12:17 pm
Location: Germany

Post by Hokuspokus » January 6th, 2011, 6:20 am

Looks very interesting.
Thank you!

sjmarky
Posts: 3067
Joined: August 28th, 2006, 8:47 pm
Location: Poictesme

Post by sjmarky » January 8th, 2011, 8:25 am

You mean that someday I could click the text-to-voice feature on my Kindle...and hear myself??? I can't tell you how frightening that sounds.
"Bringing you yesterday's tomorrow...today!"

My website
My Librivox reader page

sbuchholz
Posts: 9
Joined: December 21st, 2010, 3:12 am

Post by sbuchholz » January 8th, 2011, 9:06 am

sjmarky wrote:You mean that someday I could click the text-to-voice feature on my Kindle...and hear myself???

Well, yes, theoretically... But I wouldn't hold my breath :-)
sjmarky wrote:I can't tell you how frightening that sounds.
Double thanks then!

BellonaTimes
Posts: 3665
Joined: February 15th, 2009, 6:25 pm
Location: Florida

Post by BellonaTimes » January 8th, 2011, 9:06 pm

Not to throw a wet blanket on things, but this reminds me of The Stepford Wives where the heroine is seemingly innocently made to record several hundred words, only to have them (spoiler alert)
.
.
.
.
.
.
.
.
.
installed into the robot replicant of herself by the mad scientists of Stepford.
:cry:


:hmm:
They call me Threadkiller.
My Catalog Page

hugh
LibriVox Admin Team
Posts: 8001
Joined: September 26th, 2005, 4:14 am
Location: Montreal, QC

Post by hugh » January 19th, 2011, 3:50 pm

just popping in to say: exciting project!

sbuchholz
Posts: 9
Joined: December 21st, 2010, 3:12 am

Post by sbuchholz » May 22nd, 2011, 9:55 am

Hi, so, finally (took a long time to set the website up):
Check out
http://www.speechtechgroup.co.uk/eBook/enUS/index.php
to hear some synthetic versions of real Librivox readers.

Voices and the website are just first versions, and comments are very welcome!
Thanks,
Sabine

Samanem
Posts: 1524
Joined: May 14th, 2010, 5:33 am
Location: The Headwaters of the Everglades

Post by Samanem » May 22nd, 2011, 10:09 am

Wow, very interesting! Sounds a bit fuzzy, though, as if there was a lot of noise and heavy Noise Removal was done. Sorry, forgot to remove that hat! :wink:

No British accents, though?

A bit scary to think what they could "make" you say if the technology got really, really good, but if someone really wanted to do something like that, they could take someone's vast Catalog and then cut and paste to make them say just about anything anyway (I think we've had this discussion before... :) ).

Still, very cool and very awesome that Librivox readers are on the cutting edge of technology in this area!!

Dennis
"I ask to be allowed to have a lamp in the evening;
it is indeed wearisome sitting alone in the dark." ~ William Tyndale (1494-1536)

Hokuspokus
LibriVox Admin Team
Posts: 8023
Joined: October 24th, 2007, 12:17 pm
Location: Germany

Post by Hokuspokus » May 22nd, 2011, 10:39 pm

Wow, that's pretty cool!

I like the way the voices deal with grammar, the structure of a sentence, the pauses etc. In that they all sound very human. I'm surprised that the sound quality is so different. Some sound like a very bad human recording with plosives and tinkle from too much noise cleaning, and others sound much better. I like the sound of voices 4-6, especially 6.
I have no idea who the original speakers might be.
With the voices I use on my computer I like that I can adjust the speed. To my taste they are just a tiny bit too fast. That might not be true for native speakers.

Great work!
