Readiance
"https://readiance.org/ helps create and publish 'Read Along' books that provide an experience similar to Audible's 'Immersion Reading' feature. Unlike Audible's 'Immersion Reading,' it is platform-independent and uses public domain audiobooks. The primary source is LibriVox."
Seems to work well on the few books I sampled.
Winston
Be kind. Be interesting. Be useful. Morality ain't hard.--Jack Butler, Living in Little Rock with Miss Little Rock
Pretty cool.
Can we use it as a tool to help proof-listening? I keep wondering if automatic speech recognition with consecutive text comparison can be employed in proof-"listening" of audio books... Might help save time.
tovarisch
- reality prompts me to scale down my reading, sorry to say
to PLers: do correct my pronunciation please
I've synched a set of French recordings of Baudelaire's Le Chat here:
https://readiance.org/content/librivox/le-chat-by-charles-baudelaire
You're all great but Nathalie Mussard: wow. Listen to her Ophelie in Poemes 3.
My previous LV work: Bellona Times
Does it work when the text read from is a scan? The few I sampled seemed to be drawn from PG.
Newgatenovelist wrote: Does it work when the text read from is a scan? The few I sampled seemed to be drawn from PG.
I just tried, and no, it needs raw text.
Jordan
Alcohol and Maths don't mix. So never drink and derive.
Newgatenovelist wrote: Does it work when the text read from is a scan? The few I sampled seemed to be drawn from PG.
Depends on the typeface, I think. I've tried listening to scanned books on Internet Archive via the computerized voice function and it can only read clear text. The optical scans are even worse, the machine unable to comprehend inline illustrations and other distractions.
Incidentally, you and Ms. Mussard have similar voices. You ought to collaborate on something, like a play with two sisters in it. King Lear perhaps.
Thanks. That's a shame, but at least there are plenty of LV books out there that are based on PG texts, if this is how some listeners prefer to access their audiobook.
With the IA read-aloud feature, I've only ever had silence. I thought it was just my slow internet connection. It's a pity, but at least I know it isn't just me!
I've started to put my solos on the readiance site. It's surprisingly easy to do, with a very impressive result. The creator of the site has been very helpful when I had any questions.
I'd encourage other readers / coordinators to add their recordings.
Note: If you read footnotes, it's worth going through the text and moving the footnote within the text to match where you actually read it. You can do this within the on-site window where you upload the text block, but you'll probably need to check your recording first to get the location right.
Peter
"I think, therefore I am, I think." Solomon Cohen, in Terry Pratchett's Dodger
Hi, I'm Readiance's developer. I'm glad you guys find it useful.
To answer tovarisch's question: although speech recognition could help find major editing errors, I don't think it could fully replace proof-listening if you want to find errors at the word level. You would get too many false positives and false negatives, which would be annoying. Just like election polling, speech recognition algorithms are better at finding the best candidate than at giving a confidence score for it.
As for the editing errors, I have an error reporting mechanism that helps to spot the time intervals where the text doesn't match the audio. It's very crude but still good enough for Readiance. For each synchronization it plots a graph of 'likelihood of mismatch vs time':
https://readiance.org/pt-scans/10628s.png
This is what it looks like when 20 seconds of audio is missing at the 17th minute:
https://readiance.org/pt-scans/10065s.png
If you think it could fit in LibriVox's proof-listening workflow let me know.
Firat
I agree with your assessment, actually.
What I meant by helping in proof-listening was mostly identifying questionable spots, those with the lowest confidence score. Missing words, or words that have been transposed in the sentence, should be flagged. Repeats and stumbles should be flagged. Unrecognized sounds (bumps, clicks), maybe. If the system could also, after processing the recording, display those to me (the reader), it would shorten my editing drastically. In some cases I do notice my mistakes and re-record the text, which I then need to edit. It's the mistakes I don't know about that I have no other way to find except by listening to the entire recording...
tovarisch
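To make the idea of "speech recognition with text comparison" concrete, here is a minimal sketch in Python. It is not how Readiance works; it assumes you already have a plain-text transcript of the recording from some speech recognizer (not shown here), and it only uses the standard-library difflib module to line up the book's words against the recognized words. The function names normalize and flag_differences are made up for this illustration.

```python
import difflib
import re


def normalize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def flag_differences(book_text, recognized_text):
    """Return (kind, book_words, heard_words) for every spot where the
    recognized transcript differs from the book text.

    kind is 'delete' (in the book, not heard), 'insert' (heard, not in
    the book, e.g. a repeat) or 'replace' (different words heard).
    """
    book = normalize(book_text)
    heard = normalize(recognized_text)
    matcher = difflib.SequenceMatcher(a=book, b=heard, autojunk=False)
    flags = []
    for kind, i1, i2, j1, j2 in matcher.get_opcodes():
        if kind == "equal":
            continue
        flags.append((kind, " ".join(book[i1:i2]), " ".join(heard[j1:j2])))
    return flags


# Made-up example: the reader skips "brown" and repeats "the".
flags = flag_differences(
    "The quick brown fox jumps over the lazy dog.",
    "the quick fox jumps over the the lazy dog",
)
for flag in flags:
    print(flag)
```

Because recognizers mishear words, a real transcript would also produce many 'replace' flags that are not reader errors, which is the false-positive problem mentioned earlier in the thread.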
I added a quote search feature which also helps to bookmark recordings at the position of the query:
https://readiance.org/audio-quote-search?phrase=When+I+see
There are a couple of glitches but it mostly works fine.
tovarisch, I used to think proof-listeners always check the recordings against the text, which, as I now understand, isn't part of the standard PL. So if a reader misses a complete sentence, it won't be very obvious in proof-listening, right? In that case, speech recognition could actually be very beneficial as an extra safety net. When I find time, I'm hoping to add an interface for that to Readiance. As for the anomalies, I think they are best handled in Audacity. There might even be some plugins for these, because they are common problems for all kinds of recordings.
Firat
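For anyone who wants to build such quote-search links from a script, a small sketch in Python follows. The only thing taken from the post is the URL shape (the audio-quote-search path with a phrase parameter); the helper name quote_search_url is made up, and nothing here is an official Readiance API.

```python
from urllib.parse import urlencode

# URL shape copied from the link in the post above.
BASE = "https://readiance.org/audio-quote-search"


def quote_search_url(phrase):
    """Build a quote-search link; spaces in the phrase are encoded as '+'."""
    return BASE + "?" + urlencode({"phrase": phrase})


print(quote_search_url("When I see"))
```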
Last edited by ozdefir on January 20th, 2017, 4:06 am, edited 1 time in total.
For example, in this one the reader* omits a clause ("so they thought they were quite safe"):
https://readiance.org/audio-quote-search?phrase=when+they+were+all+ready
Without that part the sentence is still grammatically and semantically correct, so there's no hint for a proof-listener to notice the omission.
Now if you look at the 'inconfidence' scan you can see that there's a peak at around 4:30: https://readiance.org/pt-scans/11748s.png
It's in fact the highest and largest peak, save for the outro.
So I think the best workflow would be to check the highest peak in the inconfidence scan first. If it's ok, the rest must be ok. If not, check the next one, and so on.
*Sorry to Barbara, I had to pick one as an example.
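The "check the highest peak first" workflow can be sketched in a few lines of Python. This is only an illustration: it assumes the scan is available as a list of mismatch scores sampled at a fixed interval, which is a guess about the data behind the graph, and the name ranked_peaks and its parameters are hypothetical.

```python
def ranked_peaks(scores, step_seconds, threshold=0.0):
    """Return (time_in_seconds, score) for each local maximum above
    threshold, ordered from the highest peak to the lowest."""
    peaks = []
    for i, score in enumerate(scores):
        left = scores[i - 1] if i > 0 else float("-inf")
        right = scores[i + 1] if i < len(scores) - 1 else float("-inf")
        # On a flat top, only the last sample of the plateau is counted.
        if score > threshold and score >= left and score > right:
            peaks.append((i * step_seconds, score))
    return sorted(peaks, key=lambda peak: peak[1], reverse=True)


# Made-up scores, one sample every 30 seconds.
scan = [0.1, 0.2, 0.9, 0.3, 0.1, 0.5, 0.2]
for time_s, score in ranked_peaks(scan, step_seconds=30):
    print(f"check around {time_s // 60}:{time_s % 60:02d} (score {score})")
```

A proof-listener would then work down the list and stop once a peak turns out to be a false alarm.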
That's exactly what I was talking about! Yes, our 'Standard PL' does not check against the text. However, I do "word-perfect" proof-listen all my recordings, and such an "inconfidence" graph could perhaps help me concentrate first on the serious mistakes. One can always hope.
tovarisch
Before I build an interface, you can test your recordings by uploading them here:
https://readiance.org/node/add/public-timed-text
Make sure to choose a username that ends with "-librivox", e.g. ozdefir-librivox. If you do that, in the timedtext pages you will see a "PT Scan" link which shows the graph.
You can take a look at this demo: https://www.youtube.com/watch?v=EBzV54mt0b4
If I can get some feedback on this, it would be useful for designing the dedicated interface.