Readiance

Comments about LibriVox? Suggestions to improve things? News?
k5hsj
Posts: 810
Joined: August 17th, 2010, 12:02 am
Location: Point Richmond, CA

Post by k5hsj »

"https://readiance.org/ helps create and publish 'Read Along' books that provide an experience similar to Audible's 'Immersion Reading' feature. Unlike Audible's 'Immersion Reading,' it is platform-independent and uses public domain audiobooks. The primary source is LibriVox."

Seems to work well on the few books I sampled.

Winston
Be kind. Be interesting. Be useful. Morality ain't hard.--Jack Butler, Living in Little Rock with Miss Little Rock
tovarisch
Posts: 2936
Joined: February 24th, 2013, 7:14 am
Location: New Hampshire, USA

Post by tovarisch »

Pretty cool.

Can we use it as a tool to help proof-listening? I keep wondering if automatic speech recognition with consecutive text comparison can be employed in proof-"listening" of audio books... Might help save time.
tovarisch
  • reality prompts me to scale down my reading, sorry to say
    to PLers: do correct my pronunciation please
VfkaBT
Posts: 1305
Joined: November 28th, 2015, 7:47 am
Location: Florida

Post by VfkaBT »

I've synched a set of French recordings of Baudelaire's Le Chat here:
https://readiance.org/content/librivox/le-chat-by-charles-baudelaire

You're all great but Nathalie Mussard: wow. Listen to her Ophelie in Poemes 3.
My previous LV work: Bellona Times
Newgatenovelist
Posts: 5210
Joined: February 17th, 2015, 7:22 am

Post by Newgatenovelist »

Does it work when the text read from is a scan? The few I sampled seemed to be drawn from PG.
JorWat
Posts: 1682
Joined: February 16th, 2009, 10:20 am
Location: Oxfordshire, England

Post by JorWat »

Newgatenovelist wrote:Does it work when the text read from is a scan? The few I sampled seemed to be drawn from PG.
I just tried, and no, it needs raw text.
Jordan

Alcohol and Maths don't mix. So never drink and derive.
VfkaBT
Posts: 1305
Joined: November 28th, 2015, 7:47 am
Location: Florida

Post by VfkaBT »

Newgatenovelist wrote:Does it work when the text read from is a scan? The few I sampled seemed to be drawn from PG.
Depends on the typeface, I think. I've tried listening to scanned books on Internet Archive via the computerized voice function and it can only read clear text. The optical scans are even worse, the machine unable to comprehend inline illustrations and other distractions.

Incidentally, you and Ms. Mussard have similar voices. You ought to collaborate on something, like a play with two sisters in it. King Lear perhaps. 8-)
My previous LV work: Bellona Times
Newgatenovelist
Posts: 5210
Joined: February 17th, 2015, 7:22 am

Post by Newgatenovelist »

Thanks. That's a shame, but at least there are plenty of LV books out there that are based on PG texts, if this is how some listeners prefer to access their audiobook.

With the IA read-aloud feature, I've only ever had silence. I thought it was just my slow internet connection. It's a pity, but at least I know it isn't just me!
Peter Why
Posts: 5834
Joined: November 24th, 2005, 3:54 am
Location: Chigwell (North-East London, U.K.)

Post by Peter Why »

I've started to put my solos on the readiance site. It's surprisingly easy to do, with a very impressive result. The creator of the site has been very helpful when I had any questions.

I'd encourage other readers / coordinators to add their recordings.

Note: If you read footnotes, it's worth going through the text and moving the footnote within the text to match where you actually read it. You can do this within the on-site window where you upload the text block, but you'll probably need to check your recording first to get the location right.

Peter
"I think, therefore I am, I think." Solomon Cohen, in Terry Pratchett's Dodger
ozdefir
Posts: 15
Joined: February 26th, 2016, 4:05 am

Post by ozdefir »

Hi, I'm Readiance's developer. I'm glad you guys find it useful.

To answer tovarisch's question: although speech recognition could help find major editing errors, I don't think it could fully replace proof-listening if you want to find errors at the word level. You would get too many false positives and false negatives at that level, which would be annoying. Just like election polling, speech recognition algorithms are better at finding the best candidate than at giving a confidence score for it.

As for the editing errors, I have an error reporting mechanism that helps to spot the time intervals where the text doesn't match the audio. It's very crude but still good enough for Readiance. For each synchronization it plots a graph of 'likelihood of mismatch vs time':
https://readiance.org/pt-scans/10628s.png

This is what it looks like when 20 seconds of audio are missing at the 17th minute:
https://readiance.org/pt-scans/10065s.png

If you think it could fit in LibriVox's proof-listening workflow let me know.

Firat
tovarisch
Posts: 2936
Joined: February 24th, 2013, 7:14 am
Location: New Hampshire, USA

Post by tovarisch »

I agree with your assessment, actually.

What I meant by helping in proof-listening was mostly identifying questionable spots, those with the lowest confidence score. Missing words, or words that have been transposed in the sentence, should be flagged. Repeats and stumbles should be flagged. Unrecognized sounds (bumps, clicks), maybe. If the system can also, after processing the recording, display those to me (the reader), it would shorten my editing drastically. In some cases I do notice my mistakes and re-record the text, which I then need to edit. It's the places where I make mistakes I don't know about that I have no other way to find except by listening to the entire recording...
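
A rough sketch of the kind of flagging described above, assuming a speech recognizer has already produced a plain-text transcript of the recording. It is not how Readiance works; it only uses Python's standard difflib to compare the source words with the recognized words. The function names and the sample sentences are made up for illustration.

```python
import difflib
import re


def words(text):
    """Lowercase the text and split it into words, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def flag_differences(source_text, transcript):
    """Return (kind, source_words, spoken_words) for each mismatch.

    kind is "omitted" (in the text but not read), "added" (read but not
    in the text, e.g. a repeat) or "changed" (read differently).
    """
    src, spoken = words(source_text), words(transcript)
    matcher = difflib.SequenceMatcher(a=src, b=spoken, autojunk=False)
    labels = {"delete": "omitted", "insert": "added", "replace": "changed"}
    flags = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            flags.append((labels[tag], src[i1:i2], spoken[j1:j2]))
    return flags


# Made-up example: one word skipped and one word repeated.
source = "The quick brown fox jumps over the lazy dog."
heard = "the quick fox jumps over the the lazy dog"
for flag in flag_differences(source, heard):
    print(flag)
```

With a real recognizer the transcript itself contains recognition errors, so many flags would be false alarms; a comparison like this only narrows down where to listen.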
tovarisch
  • reality prompts me to scale down my reading, sorry to say
    to PLers: do correct my pronunciation please
ozdefir
Posts: 15
Joined: February 26th, 2016, 4:05 am

Post by ozdefir »

I added a quote search feature which also helps to bookmark recordings at the position of the query:
https://readiance.org/audio-quote-search?phrase=When+I+see
There are a couple of glitches but it mostly works fine.
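
For anyone who wants to generate such links in bulk, here is a small sketch that builds the search URL with Python's standard library. It assumes only what the link above shows: a single `phrase` query parameter with spaces encoded as `+`. The helper name `quote_search_url` is made up, and any other parameters the page may accept are unknown.

```python
from urllib.parse import urlencode

# Base address taken from the link above.
QUOTE_SEARCH = "https://readiance.org/audio-quote-search"


def quote_search_url(phrase):
    """Build a quote-search link; urlencode turns spaces into '+'."""
    return QUOTE_SEARCH + "?" + urlencode({"phrase": phrase})


print(quote_search_url("When I see"))
```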

tovarisch, I used to think proof-listeners always check the recordings against the texts, which, as I now understand, isn't part of the standard PL. So if a reader misses a complete sentence, it will not be very obvious in proof-listening, right? In that case, speech recognition could actually be very beneficial as an extra safety net. When I find time, I'm hoping to add an interface for that to Readiance. As for the anomalies, I think they are best handled in Audacity. There might even be some plugins for these, because they are common problems for all kinds of recordings.

Firat
Last edited by ozdefir on January 20th, 2017, 4:06 am, edited 1 time in total.
ozdefir
Posts: 15
Joined: February 26th, 2016, 4:05 am

Post by ozdefir »

For example, in this one the reader* omits a clause ("so they thought they were quite safe"):
https://readiance.org/audio-quote-search?phrase=when+they+were+all+ready

Without that part the sentence is still grammatically and semantically correct, so there's no hint for a proof-listener to notice the omission.

Now if you look at the 'inconfidence' scan you can see that there's a peak at around 4:30: https://readiance.org/pt-scans/11748s.png
It's in fact the highest and largest peak, save for the outro.

So I think the best workflow would be to check the highest peak in the inconfidence scan: if it's OK, the rest must be OK. If not, also check the next one, and so on.

*Sorry to Barbara, I had to pick one as an example.
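
The "check the highest peak first" workflow above can be sketched roughly as follows. This assumes the scan is available as a plain list of mismatch scores, one per time window, which is an assumption about the data and not something the PT Scan page is known to export. The names `ranked_peaks` and `review`, the threshold, and the sample numbers are all made up. The sketch also does not exclude the outro, which would need to be skipped by hand.

```python
def ranked_peaks(scores, threshold=0.0):
    """Return (index, score) for local maxima above threshold, highest first."""
    peaks = []
    for i, value in enumerate(scores):
        left = scores[i - 1] if i > 0 else float("-inf")
        right = scores[i + 1] if i + 1 < len(scores) else float("-inf")
        if value > threshold and value >= left and value > right:
            peaks.append((i, value))
    return sorted(peaks, key=lambda peak: peak[1], reverse=True)


def review(scores, is_real_error, threshold=0.0):
    """Check peaks from highest to lowest; stop at the first one that is fine.

    is_real_error(index) stands in for a person listening to that window.
    """
    confirmed = []
    for index, _ in ranked_peaks(scores, threshold):
        if not is_real_error(index):
            break
        confirmed.append(index)
    return confirmed


# Made-up scores: a large peak in window 2 and a smaller one in window 5.
scores = [0.1, 0.2, 0.9, 0.3, 0.1, 0.5, 0.2, 0.1]
print(ranked_peaks(scores))
print(review(scores, lambda index: index == 2))
```

Stopping at the first harmless peak relies on the assumption that a lower peak is less likely to hide an error than a higher one.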
tovarisch
Posts: 2936
Joined: February 24th, 2013, 7:14 am
Location: New Hampshire, USA

Post by tovarisch »

That's exactly what I was talking about! Yes, our 'Standard PL' does not check against the text. However, I do "word-perfect" proof-listen all my recordings, and such an "inconfidence" graph could perhaps help me concentrate first on the serious mistakes. One can always hope. :wink:
tovarisch
  • reality prompts me to scale down my reading, sorry to say
    to PLers: do correct my pronunciation please
ozdefir
Posts: 15
Joined: February 26th, 2016, 4:05 am

Post by ozdefir »

Before I build an interface, you can test your recordings by uploading them here:
https://readiance.org/node/add/public-timed-text

Make sure to choose a username that ends with "-librivox", e.g. ozdefir-librivox. If you do that, you will see a "PT Scan" link in the timedtext pages, which shows the graph.

You can take a look at this demo: https://www.youtube.com/watch?v=EBzV54mt0b4

If I can get some feedback on this, it would be useful for designing the dedicated interface.