LibriVox Recorder

Non-reading activities need your help too!
tshirt
Posts: 43
Joined: January 5th, 2006, 6:12 pm
Location: West Lafayette, IN

Post by tshirt »

Looking for help to design and implement a recorder for LibriVox volunteers,
as discussed in <another thread>.

Our goal is to implement audio software suitable for recording and
listening to audio readings of written text. The book text will be paired
with audio synchronization information, so that it can be displayed as
subtitles for the audio at listening time.

LibriVox contributors: Please share your experiences with your
current recorder, so we can design a better recorder for the community.


Also, if you can, please contribute:
    * your artwork for the user interface
    * your code for the underlying engine
We now have a temporary project page.

Here is a preliminary list of more advanced features:
    * GPL!
    * automatic scrolling of book text as you read and listen.
    * minimalist configurable user interface
    * noise cancelling with software (so that you don't have to invest in
    expensive mics)
    * help for intonation of the text (e.g. high pitch words displayed in bold)
    * automatic evaluation of your intonation as you record
    * manual evaluation of the intonation as you listen to a book (i.e. to be
    given as feedback to the person that recorded the book; or to be
    incorporated into automatic evaluation of other recordings)
    * synchronised navigation of text and recorded audio (so that you can
    find a particular portion of the recording easily for post-editing)
    * ability to run the software on mobile phones and PDAs
    * ability to run from the web (e.g. from librivox.org)
Last edited by tshirt on January 14th, 2006, 4:13 pm, edited 8 times in total.
vee
Posts: 585
Joined: October 10th, 2005, 7:35 pm
Location: Columbia, MD

Post by vee »

I think the ability to place markers in the recording would be really nice. When I use SoundForge to record, I can place a marker as I go. These markers make it easier to just keep recording and make edits later, although this may make it more difficult for you to do the sync. Maybe sync markers can be added later?

If you do implement noise canceling, make sure that the level of noise removal can be adjusted to avoid artifacts. Things like dynamic compression and normalization would be nice to have too, along with a levels meter. Check out
Chris Vee
"You never truly understand something until you can explain it to your grandmother." - Albert Einstein
hugh
LibriVox Admin Team
Posts: 7972
Joined: September 26th, 2005, 4:14 am
Location: Montreal, QC

Post by hugh »

One of the things in your original idea was that the scrolling text would be marked/linked somehow in the audio, and if I understood right, this would make it possible to search audio for text strings, which would be so useful (not specifically for LibriVox, but for our users). Question: what kind of additional memory would this take up? And would this be a regular MP3, or would it be something else?

In fact, you could go backwards: take audio, do a voice-to-text conversion, and then link the text to the audio, so you could search ANY audio for the text strings you are looking for.

Again, this doesn't help LV much, but I see this as a HUGE application.

ASIDE:
Does anyone else think we should make the LibriVox Technology Incubator and get tons of funding from VCs? I'm joking of course, but also serious in a way. We are in many different ways on the cutting edge of internet audio, and there are MANY smart people here with interesting ideas & skills.

Hugh.
kri
Posts: 5319
Joined: January 3rd, 2006, 8:34 pm
Location: Keene NH

Post by kri »

OK, so I'm not an expert (not really a programmer) but here's what I think.
hugh wrote: One of the things in your original idea was that the scrolling text would be marked/linked somehow in the audio, and if I understood right, this would make it possible to search audio for text strings, which would be so useful (not specifically for LibriVox, but for our users). Question: what kind of additional memory would this take up? And would this be a regular MP3, or would it be something else?
I don't think it would take TOO much more memory. One could use a regular .txt file with markers denoting each new line, which the recorder would read and sync with the audio file. This assumes the recorder could also be used to play these audiobooks back, instead of using video or something, which might be the best way to do it. I'm not sure, but you might need to use something other than an MP3 for the audio file, as I'm not familiar with the MP3 file format.
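
To make the plain-text idea a bit more concrete, here is a minimal sketch in Python. The file layout (a start time in seconds, a tab, then one line of book text) and the names `parse_sync` and `find_text` are assumptions invented for illustration; this is not an existing format or anything LibriVox uses. It also shows how such a file could support the text search hugh asked about.

```python
# Hypothetical sync format (an assumption, not a real standard):
# each row is "<start time in seconds><TAB><line of book text>".
SAMPLE = """0.00\tIt was the best of times,
3.20\tit was the worst of times,
6.45\tit was the age of wisdom,
"""


def parse_sync(text):
    """Return a list of (start_seconds, line_text) tuples."""
    entries = []
    for row in text.splitlines():
        if not row.strip():
            continue  # skip blank rows
        stamp, _, line = row.partition("\t")
        entries.append((float(stamp), line))
    return entries


def find_text(entries, query):
    """Return the start time of every line containing query (case-insensitive)."""
    q = query.lower()
    return [start for start, line in entries if q in line.lower()]


entries = parse_sync(SAMPLE)
hits = find_text(entries, "worst")  # times a player could jump to
```

In a layout like this the sync data lives in a separate small text file, a few bytes of timestamp per line of book text, so the audio file itself would not need to change.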

I think perhaps putting in a bookmark capability, for those reading/listening to the audio book would be very helpful.

Other than that I'm not of very much help to this project, as I'm not a programmer, and only slightly tech savvy in certain areas.
kri
Posts: 5319
Joined: January 3rd, 2006, 8:34 pm
Location: Keene NH

Post by kri »

hugh wrote: ASIDE:
Does anyone else think we should make the LibriVox Technology Incubator and get tons of funding from VCs? I'm joking of course, but also serious in a way. We are in many different ways on the cutting edge of internet audio, and there are MANY smart people here with interesting ideas & skills.
Not sure what you mean by VCs. It may be dreaming on our part, especially if we can't find those with the right skills to take this on, but this would be an awesome project if it worked out.
tshirt
Posts: 43
Joined: January 5th, 2006, 6:12 pm
Location: West Lafayette, IN

Post by tshirt »

Hello kri,
kri wrote:It may be dreaming on our part, especially if we can't find those with the right skills to take this on, but this would be an awesome project if it worked out.
First of all, we don't need to build everything from the bottom up:
* Festival speech synthesizer
  o could be used to help people read well and evaluate their intonation
* CMU Sphinx recognizer
  o could be used to autoscroll the pages
* Emu speech database
  o could be used to store the recorded audio, and search for post-editing
... and given the promise of this project, I am pretty sure it will be picked
up by many people as soon as we have a better design description.
Last edited by tshirt on January 14th, 2006, 1:32 pm, edited 1 time in total.
tshirt
Posts: 43
Joined: January 5th, 2006, 6:12 pm
Location: West Lafayette, IN

Post by tshirt »

kri wrote: Other than that I'm not of very much help to this project, as I'm not a programmer, and only slightly tech savvy in certain areas.
* There are many different reading and visual impairments (e.g. dyslexia)
for which an application like this could be useful.

* There are plugins for MP3 players that can display lyrics for the music.

* People doing speech research could use audio created with this program.

* People could use this to podcast text/audio.
tshirt
Posts: 43
Joined: January 5th, 2006, 6:12 pm
Location: West Lafayette, IN

Post by tshirt »

Hello Hugh,
hugh wrote: In fact, if you could go backwards: to take audio, do a voice-to-text conversion, and then link the text to the audio, so you could then search ANY audio for text strings you are looking for.
Speech recognition the way you described it is a very computationally
expensive task. In our case we already know the text as it's being
read, so we can get by with much simpler computation, such as word
boundary detection. Even some simplified speech recognition could be
possible, but then we cannot use off-the-shelf tools as-is.
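
As a rough illustration of why knowing the text makes this cheaper, here is a small Python sketch that finds pauses with a plain energy threshold and hands the known phrases to the voiced stretches in reading order. It works on phrases rather than single words to keep things simple, and the threshold, the frame length, and the names `voiced_segments` and `align` are all assumptions made up for this example. It is pause detection, not speech recognition, and real recordings would need something more robust.

```python
def voiced_segments(energies, threshold=0.1, frame_seconds=0.02):
    """Return (start, end) times of runs of frames whose energy exceeds threshold."""
    segments = []
    start = None
    for i, e in enumerate(energies):
        if e > threshold and start is None:
            start = i  # a voiced run begins
        elif e <= threshold and start is not None:
            segments.append((start * frame_seconds, i * frame_seconds))
            start = None  # the run ended at a pause
    if start is not None:  # audio ended while still voiced
        segments.append((start * frame_seconds, len(energies) * frame_seconds))
    return segments


def align(phrases, energies, **kwargs):
    """Pair each known phrase with the next voiced segment, in reading order."""
    return list(zip(phrases, voiced_segments(energies, **kwargs)))


# Toy per-frame energies: silence, speech, pause, speech, silence.
energies = [0.0, 0.5, 0.6, 0.4, 0.0, 0.0, 0.7, 0.8, 0.0]
pairs = align(["Call me Ishmael.", "Some years ago"], energies, frame_seconds=1.0)
```

Because `zip` stops at the shorter input, a reader who pauses more or less often than expected would throw the pairing off, which is where the simplified speech recognition mentioned above would have to come in.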
hugh wrote: Again, this doesn't help LV much, but I see this as a HUGE application.
I think this could grow beyond LibriVox, to include the podcast community.
Don't forget the visually impaired, people learning languages, kids learning to read, etc.
hugh
LibriVox Admin Team
Posts: 7972
Joined: September 26th, 2005, 4:14 am
Location: Montreal, QC

Post by hugh »

tshirt, I definitely agree: this goes WAAAY beyond LV. I can vaguely see all sorts of really powerful uses. Really exciting stuff.

If you define it well, I have no doubt that you'll get all sorts of interested hackers. What I have learned from LibriVox (as many before me have learned on such projects): if you define an important problem, and lay out a clear & reasonable solution path, you will find the people to help you fix things.

tshirt: did you get my pm about jon udell?
BradBush
Posts: 173
Joined: October 18th, 2005, 3:41 pm
Location: Texas

Post by BradBush »

Someone needs to check out the following link which allows content to be linked to podcasts:
http://www.divicast.com/

Not sure it's totally the same as what this thread was about, but it has some similarities. I don't have time now, or I would.

Brad
kri
Posts: 5319
Joined: January 3rd, 2006, 8:34 pm
Location: Keene NH

Post by kri »

I signed up for an account, and I'll see what I can do to play with it. They seem to really be in beta, because I just signed up and am already having problems :) I'll report back as I figure out if this would be useful.