Synthetic voices from Librivox recordings

Comments about LibriVox? Suggestions to improve things? News?
sbuchholz
Posts: 9
Joined: December 21st, 2010, 3:12 am

Post by sbuchholz »

Hi,

I wanted to share some information on what we are able to do with Librivox recordings, and give people a thread to discuss it, if they want.

I work for Toshiba Research Europe Ltd, in a group that does research on speech synthesis ("text-to-speech", "computer voices", i.e. where the computer reads out a text). As you might know, speech synthesis has improved a lot since it was first implemented, and is now often quite good for limited domains. However, it is still a long way away from humans for tasks such as reading a book. This is why there is a lot of research interest in this field. And as speech synthesis relies heavily on data these days, data (i.e. recordings of people reading books) is very important. This is how we came across Librivox. It's great :-)

We have downloaded some recordings and processed them. We can now turn recordings into individual synthetic voices, with which new texts can be synthesized. We have not yet made any texts synthesized with these voices public. However, we would like to do so. For example, we would like to be able to get feedback from people on which synthetic voices sound best. And maybe allow people to download synthesized books. And in the future, it is theoretically possible that the company would also like to use these voices for products (although it would be very unusual to use non-studio recordings directly for this).

I understand (and someone from the Librivox Admin Team has kindly confirmed this) that the public domain license also gives us the legal right to do this. However, we are planning to contact individual readers before we use the synthetic version of their voice publicly. So some of you might be getting a Private Message from me in the near future. But feel free to express your views here instead/in addition.

We know we are not the only ones using these recordings but there doesn't seem to be any previous forum discussion about this topic.

Thanks to everybody who contributed to the Librivox resource!

Dr Sabine Buchholz
Speech Technology Group
Cambridge Research Laboratory
Toshiba Research Europe Limited
http://www.toshiba-europe.com/research/crl/stg/index.html
TriciaG
LibriVox Admin Team
Posts: 60799
Joined: June 15th, 2008, 10:30 pm
Location: Toronto, ON (but Minnesotan to age 32)

Post by TriciaG »

Thanks for letting us know. I think this is great!

And yes, public domain means that you can use our recordings as you will, without permission. But it's very considerate of you to ask the readers. :thumbs:

(P.S. My later recordings, after about September 15, 2010, are of better quality due to a better microphone. This is just in case you ever wanted to use my recordings. :wink:)
School fiction: David Blaize
America Exploration: The First Four Voyages of Amerigo Vespucci
Serial novel: The Wandering Jew
Medieval England meets Civil War Americans: Centuries Apart
Starlite
Posts: 16548
Joined: April 30th, 2006, 2:17 pm
Location: Thunder Bay Ontario, Canada

Post by Starlite »

Wow that is exciting! So glad we can help benefit many more people like this too.

Esther :)
"Reasonable people adapt themselves to the world. Unreasonable
people attempt to adapt the world to themselves. All progress,
therefore, depends on unreasonable people." George Bernard Shaw
neckertb
Posts: 12799
Joined: March 9th, 2009, 7:47 am
Location: French in Denmark

Post by neckertb »

I second Tricia's opinion :D There's all kinds of people using our stuff out there, it is very nice of you to take the time to post and explain your goals, thank you!
Interested in French recordings? :wink:
Nadine

Les enfants du capitaine Grant

Live in a death + 70 country? Have a look at Legamus
Hokuspokus
Posts: 8065
Joined: October 24th, 2007, 12:17 pm
Location: Germany

Post by Hokuspokus »

Text to Speech is a wonderful thing, I use it quite a lot (there are not so many German recordings in our catalog). It's great to know that our recordings are used to create new and better voices and I really appreciate that you take the trouble to tell us about it.
As a potential customer I'd like to say that I'm not so much interested in buying text to speech recordings. I'd rather buy the voice and generate the mp3 from the text I want to listen to.

My hope is that in the future tts will have developed so far that it will be possible to create voices easily. One day, I hope, there will be voices with a free license, so people can share the computer generated mp3 with others, like a sort of Librivox for tts. (What you get today is for personal use only or horribly expensive.)

Every piece of research is a tiny step in this direction and it's great to know that LV recordings are a part of it.
sbuchholz
Posts: 9
Joined: December 21st, 2010, 3:12 am

Post by sbuchholz »

neckertb wrote: Interested in French recordings? :wink:
Definitely. Hardly anybody in our team is native English-speaking, so we are keen on other languages as well. We just start with English because it's so big.
sbuchholz
Posts: 9
Joined: December 21st, 2010, 3:12 am

Post by sbuchholz »

Hokuspokus wrote: My hope is that in the future tts will have developed so far that it will be possible to create voices easily. One day, I hope, there will be voices with a free license, so people can share the computer generated mp3 with others, like a sort of Librivox for tts. (What you get today is for personal use only or horribly expensive.)
This might be what you are looking for:

http://mary.dfki.de/ (They have different voices, including German ones, with different licenses, and you can build your own. And I'm sure they appreciate feedback from a keen TTS user.)

or http://www.cstr.ed.ac.uk/projects/festival/ and http://festvox.org/
Hokuspokus
Posts: 8065
Joined: October 24th, 2007, 12:17 pm
Location: Germany

Post by Hokuspokus »

Looks very interesting.
Thank you!
sjmarky
Posts: 4661
Joined: August 28th, 2006, 8:47 pm
Location: Sacto CA

Post by sjmarky »

You mean that someday I could click the text-to-voice feature on my Kindle...and hear myself??? I can't tell you how frightening that sounds.
"Bringing you yesterday's tomorrow...today!"

My website
My Librivox reader page
sbuchholz
Posts: 9
Joined: December 21st, 2010, 3:12 am

Post by sbuchholz »

sjmarky wrote: You mean that someday I could click the text-to-voice feature on my Kindle...and hear myself???
Well, yes, theoretically... But I wouldn't hold my breath :-)

sjmarky wrote: I can't tell you how frightening that sounds.
Double thanks then!
BellonaTimes
Posts: 3647
Joined: February 15th, 2009, 6:25 pm
Location: Florida

Post by BellonaTimes »

Not to throw a wet blanket on things, but this reminds me of The Stepford Wives where the heroine is seemingly innocently made to record several hundred words, only to have them (spoiler alert)
.
.
.
.
.
.
.
.
.
installed into the robot replicant of herself by the mad scientists of Stepford.
:cry:


:hmm:
They call me Threadkiller.
My Catalog Page
hugh
LibriVox Admin Team
Posts: 7972
Joined: September 26th, 2005, 4:14 am
Location: Montreal, QC

Post by hugh »

just popping in to say: exciting project!
sbuchholz
Posts: 9
Joined: December 21st, 2010, 3:12 am

Post by sbuchholz »

Hi, so, finally (took a long time to set the website up):
Check out
http://www.speechtechgroup.co.uk/eBook/enUS/index.php
to hear some synthetic versions of real Librivox readers.

Voices and the website are just first versions, and comments are very welcome!
Thanks,
Sabine
Samanem
Posts: 1524
Joined: May 14th, 2010, 5:33 am
Location: The Headwaters of the Everglades

Post by Samanem »

Wow, very interesting! Sounds a bit fuzzy, though, as if there was a lot of noise and heavy Noise Removal was done.. Sorry, forgot to remove that hat! :wink:

No British accents, though?

A bit scary to think what they could "make" you say if the technology got really, really good, but if someone really wanted to do something like that, they could take someone's vast Catalog and then cut and paste to make them say just about anything anyway (I think we've had this discussion before... :) ).

Still, very cool and very awesome that Librivox readers are on the cutting edge of technology in this area!!

Dennis
"I ask to be allowed to have a lamp in the evening;
it is indeed wearisome sitting alone in the dark." ~ William Tyndale (1494-1536)
Hokuspokus
Posts: 8065
Joined: October 24th, 2007, 12:17 pm
Location: Germany

Post by Hokuspokus »

Wow, that's pretty cool!

I like the way the voices deal with grammar, the structure of a sentence, the pauses etc. In that they all sound very human. I'm surprised that the sound quality is so different. Some sound like a very bad human recording with plosives and tinkle from too much noise cleaning, and others sound much better. I like the sound of voice 4-6, especially 6.
I have no idea who the original speakers might be.
With the voices I use on my computer I like that I can adjust the speed. To my taste they are just a tiny bit too fast. That might not be true for native speakers.

Great work!