Readiance

k5hsj · Post by **k5hsj** » December 6th, 2016, 12:10 pm

"https://readiance.org/ helps create and publish 'Read Along' books that provide an experience similar to Audible's 'Immersion Reading' feature. Unlike Audible's 'Immersion Reading,' it is platform-independent and uses public domain audiobooks. The primary source is LibriVox."

Seems to work well on the few books I sampled.

Winston

tovarisch · Post by **tovarisch** » December 6th, 2016, 2:09 pm

Pretty cool.

Can we use it as a tool to help with proof-listening? I keep wondering whether automatic speech recognition followed by text comparison could be employed in proof-"listening" of audiobooks... Might help save time.

VfkaBT · Post by **VfkaBT** » December 6th, 2016, 7:40 pm

I've synched a set of French recordings of Baudelaire's Le Chat here:
https://readiance.org/content/librivox/le-chat-by-charles-baudelaire

You're all great but Nathalie Mussard: wow. Listen to her Ophelie in Poemes 3.

Newgatenovelist · Post by **Newgatenovelist** » December 8th, 2016, 7:17 am

Does it work when the text read from is a scan? The few I sampled seemed to be drawn from PG.

JorWat · Post by **JorWat** » December 8th, 2016, 8:00 am

Newgatenovelist wrote: Does it work when the text read from is a scan? The few I sampled seemed to be drawn from PG.

I just tried, and no, it needs raw text.

VfkaBT · Post by **VfkaBT** » December 8th, 2016, 2:31 pm

Newgatenovelist wrote: Does it work when the text read from is a scan? The few I sampled seemed to be drawn from PG.

Depends on the typeface, I think. I've tried listening to scanned books on Internet Archive via the computerized voice function and it can only read clear text. The optical scans are even worse: the machine is unable to cope with inline illustrations and other distractions.

Incidentally, you and Ms. Mussard have similar voices. You ought to collaborate on something, like a play with two sisters in it. King Lear perhaps.

Newgatenovelist · Post by **Newgatenovelist** » December 9th, 2016, 11:28 am

Thanks. That's a shame, but at least there are plenty of LV books out there that are based on PG texts, if this is how some listeners prefer to access their audiobook.

With the IA read-aloud feature, I've only ever had silence. I thought it was just my slow internet connection. It's a pity, but at least I know it isn't just me!

Peter Why · Post by **Peter Why** » December 10th, 2016, 5:43 am

I've started to put my solos on the readiance site. It's surprisingly easy to do, with a very impressive result. The creator of the site has been very helpful when I had any questions.

I'd encourage other readers / coordinators to add their recordings.

Note: If you read footnotes, it's worth going through the text and moving the footnote within the text to match where you actually read it. You can do this within the on-site window where you upload the text block, but you'll probably need to check your recording first to get the location right.

Peter

ozdefir · Post by **ozdefir** » January 7th, 2017, 5:58 am

Hi, I'm Readiance's developer. I'm glad you guys find it useful.

To answer tovarisch's question: although speech recognition could help with finding major editing errors, I don't think it could totally replace proof-listening if you want to find errors at the word level. You would get too many false positives and false negatives at that level, which would be annoying. Just like election polling, speech recognition algorithms are better at finding the best candidate than at giving a confidence score for it.

As for the editing errors, I have an error reporting mechanism that helps to spot the time intervals where the text doesn't match the audio. It's very crude but still good enough for Readiance. For each synchronization it plots a graph of 'likelihood of mismatch vs time':
https://readiance.org/pt-scans/10628s.png

This is what it looks like when 20 seconds of audio are missing at the 17th minute:
https://readiance.org/pt-scans/10065s.png

If you think it could fit in LibriVox's proof-listening workflow let me know.

Firat

tovarisch · Post by **tovarisch** » January 7th, 2017, 7:31 am

I agree with your assessment, actually.

What I meant by helping in proof-listening was mostly identifying questionable spots, those with the lowest confidence score. Missing words or words that have been transposed in the sentence should be flagged. Repeats and stumbles should be flagged. Unrecognized sounds (bumps, clicks), maybe. If the system can also, after processing the recording, display those to me (the reader), it would shorten my editing drastically. In some cases I do notice my mistakes and re-record the text, which I then need to edit. It's the mistakes I don't know about that I have no other way to find except by listening to the entire recording...
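A minimal sketch of the kind of flagging described above, assuming a speech recognizer has already produced a plain-text transcript of the recording. It uses Python's standard `difflib` to align the transcript against the source text word by word; the function names and the sample sentences are made up for illustration, and this is not how Readiance itself works.

```python
import difflib
import re


def words(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def flag_mismatches(source_text, transcript):
    """Return (kind, source_words, heard_words) for each spot where the
    recognized transcript differs from the source text."""
    src, heard = words(source_text), words(transcript)
    matcher = difflib.SequenceMatcher(a=src, b=heard, autojunk=False)
    kinds = {"delete": "missing", "insert": "extra", "replace": "changed"}
    flags = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            flags.append((kinds[tag], src[i1:i2], heard[j1:j2]))
    return flags


# Hypothetical example: one word skipped, one word repeated.
source = "The quick brown fox jumps over the lazy dog."
heard = "the quick fox jumps jumps over the lazy dog"
for flag in flag_mismatches(source, heard):
    print(flag)
```

A skipped word shows up as "missing" and a repeat as "extra". Recognition errors would show up as "changed", so on real audio many of the flags would be false alarms that still need a human ear.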

ozdefir · Post by **ozdefir** » January 20th, 2017, 2:19 am

I added a quote search feature which also helps to bookmark recordings at the position of the query:
https://readiance.org/audio-quote-search?phrase=When+I+see
There are a couple of glitches but it mostly works fine.

tovarisch, I used to think proof-listeners always check the recordings against the texts, which, as I now understand, isn't part of the standard PL. So if a reader misses a complete sentence, it will not be very obvious in proof-listening, right? In that case, speech recognition could actually be very beneficial as an extra safety net. When I find time, I'm hoping to add an interface for that to Readiance. As for the anomalies, I think they are best handled in Audacity. There might even be some plugins for these, because they are common problems for all kinds of recordings.

Firat

ozdefir · Post by **ozdefir** » January 20th, 2017, 4:03 am

For example, in this one the reader* omits a clause ("so they thought they were quite safe"):
https://readiance.org/audio-quote-search?phrase=when+they+were+all+ready

Without that part the sentence is still grammatically and semantically correct, so there's no hint for a proof-listener to notice the omission.

Now if you look at the 'inconfidence' scan you can see that there's a peak at around 4:30: https://readiance.org/pt-scans/11748s.png
It's in fact the highest and largest peak, save for the outro.

So I think the best workflow would be to check the highest peak in the inconfidence scan: if it's OK, the rest must be OK. If not, also check the next one, and so on.
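A rough sketch of that workflow, assuming the scan is available as a list of mismatch-likelihood values sampled at a fixed interval. The thread doesn't say how the scan is stored, so the data format, the `threshold` value and the function name are all hypothetical.

```python
def ranked_peaks(scores, threshold=0.5, step_seconds=1.0):
    """Group consecutive samples above `threshold` into regions and return
    them as (peak_score, start_seconds, end_seconds), highest peak first."""
    scores = list(scores)
    regions = []
    start = None
    # A sentinel below any threshold closes a region that runs to the end.
    for i, score in enumerate(scores + [float("-inf")]):
        if score > threshold and start is None:
            start = i
        elif score <= threshold and start is not None:
            peak = max(scores[start:i])
            regions.append((peak, start * step_seconds, i * step_seconds))
            start = None
    return sorted(regions, key=lambda region: region[0], reverse=True)


# Made-up scan values, one per second.
scan = [0.1, 0.2, 0.9, 0.8, 0.1, 0.6, 0.1, 0.7]
for peak, start, end in ranked_peaks(scan):
    print(f"check {start:.0f}s-{end:.0f}s (peak {peak})")
```

The reader would listen to the regions in the printed order and stop at the first one that turns out to be fine.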

*Sorry to Barbara, I had to pick one as an example.

tovarisch · Post by **tovarisch** » January 20th, 2017, 5:36 am

That's exactly what I was talking about! Yes, our 'Standard PL' does not check against the text. However, I do "word-perfect" proof-listen all my recordings, and such an "inconfidence" graph could perhaps help me concentrate first on the serious mistakes. One can always hope.

ozdefir · Post by **ozdefir** » January 23rd, 2017, 4:57 am

Before I build an interface, you can test your recordings by uploading them here:
https://readiance.org/node/add/public-timed-text

Make sure to choose a username that ends with "-librivox", e.g. ozdefir-librivox. If you do that, in the timedtext pages you will see a "PT Scan" link which shows the graph.

You can take a look at this demo: https://www.youtube.com/watch?v=EBzV54mt0b4

If I can get some feedback on this, it would be useful for designing the dedicated interface.