I am creating a program for listening to audio books.

KD
Posts: 12
Joined: March 31st, 2006, 7:17 pm
Location: Toronto

Post by KD »

This is a great site!
I am thinking about using a few audio books from this site to create some books in an audio-text synchronized format with the software I am developing.

http://www.interactiveselfstudy.com/
ChipDoc
Posts: 1277
Joined: January 4th, 2006, 3:11 am
Location: Tampa, FL

Post by ChipDoc »

I've got to say that's a fascinating concept you're working with, KD. Thanks for showing us. I sure hope it works out well for you; turning good ideas into cash flow is an art unto itself.
-Chip
Retired to Colorado
The man who does not read good books has no advantage over the man who cannot read them.
~Mark Twain
hugh
LibriVox Admin Team
Posts: 7972
Joined: September 26th, 2005, 4:14 am
Location: Montreal, QC

Post by hugh »

hi KD cool project. you might want to check out a LibriVox spin off project (not sure where it's at), called revoxer:

http://revoxer.sourceforge.net/
tshirt
Posts: 43
Joined: January 5th, 2006, 6:12 pm
Location: West Lafayette, IN

Post by tshirt »

Hi Konstantin,

It's great that we have more than one project that
tries to make LibriVox content more accessible!

Could you tell us a little about the format of your
files? I know that there are existing standards for
synchronized audio/text. Do you use any of them?

-umut
KD
Posts: 12
Joined: March 31st, 2006, 7:17 pm
Location: Toronto

Post by KD »

hugh wrote:hi KD cool project. you might want to check out a LibriVox spin off project (not sure where it's at), called revoxer:

http://revoxer.sourceforge.net/
Actually, I found ReVoxer first, and from there I came to the LibriVox website.
tshirt wrote:Hi Konstantin,

It's great that we have more than one project that
tries to make LibriVox content more accessible!

Could you tell us a little about the format of your
files? I know that there are existing standards for
synchronized audio/text. Do you use any of them?

-umut
The format is quite simple, to save development time: it was built as a demo and is still at the demo level. The program can play only wav files, and the interface is still very basic.

I've just put one of the stories on the web site (it is only one-third synchronized, and I have not yet found translations other than Russian).

If you don't want to download it with the wav file, just use http://www.interactiveselfstudy.com/downloads/TamingTheBicycleNoAudio.zip

You can see the format there. The file book.txt contains the text of the book; the files book_t1, book_t2, … contain translations into other languages; and the file book_snd.txt describes how the sound corresponds to the text. In the other books this file is inside an exe file. All files are Unicode. Each sample is described by two lines: the first line contains the index of the paragraph and the indexes of the starting letter and last letter inside the paragraph; the second line contains either a translation or a sound period. In the beginning I edited those files manually. Later I wrote a few programs to make the synchronization process faster. The sound synchronization tool is based on the main program: parts of the text and the corresponding sound samples are selected by mouse clicks, and it looks like this.

Image
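
As a rough illustration of the two-line record layout described above, here is a minimal Python sketch of a reader. The post does not show the exact syntax, so the whitespace-separated integers on the first line, the `Sample` class and the `parse_samples` function are all assumptions made for illustration; this is not the program's actual code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    paragraph: int  # index of the paragraph in book.txt
    first: int      # index of the starting letter inside the paragraph
    last: int       # index of the last letter inside the paragraph
    payload: str    # second line: a translation or a sound period, kept as raw text


def parse_samples(text: str) -> List[Sample]:
    """Parse two-line records.

    ASSUMPTION: the first line of each record holds three
    whitespace-separated integers; the real separator is not documented.
    """
    # Blank lines are skipped so that trailing newlines do not break pairing.
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) % 2 != 0:
        raise ValueError("expected an even number of non-empty lines")
    samples = []
    for header, payload in zip(lines[0::2], lines[1::2]):
        paragraph, first, last = (int(tok) for tok in header.split())
        samples.append(Sample(paragraph, first, last, payload.strip()))
    return samples
```

The files are said to be Unicode, but the exact encoding is not stated, so the sketch takes already decoded text instead of opening the file itself.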


I think it takes about 5-7 minutes to synchronize 1 minute of audio.
It uses voice activity detection by analyzing a wav file (actually a very simple algorithm).
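
The algorithm itself is not described here, so the following is only a sketch of one common simple approach: frame-by-frame amplitude thresholding. It assumes 16-bit mono PCM input and treats the threshold as a mean absolute sample amplitude; the function names and the `noise_level`, `frame_ms` and `min_pause_ms` parameters are hypothetical and not taken from the actual program.

```python
import array
import sys
import wave
from typing import List, Sequence, Tuple


def read_mono_16bit(path: str) -> Tuple[List[int], int]:
    """Read a 16-bit mono PCM wav file; returns (samples, sample_rate)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("this sketch handles 16-bit mono PCM only")
        rate = wav.getframerate()
        data = array.array("h", wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        data.byteswap()  # wav sample data is little-endian
    return data.tolist(), rate


def detect_sound_ranges(samples: Sequence[int], rate: int,
                        noise_level: float = 200,  # ASSUMED amplitude threshold
                        frame_ms: int = 20,
                        min_pause_ms: int = 300) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds of stretches louder than noise_level."""
    frame = max(1, rate * frame_ms // 1000)       # samples per frame
    min_pause = max(1, min_pause_ms // frame_ms)  # quiet frames that end a range
    n_frames = (len(samples) + frame - 1) // frame
    ranges = []
    start = None  # frame index where the current sound range began
    quiet = 0     # consecutive quiet frames seen inside the current range
    for i in range(n_frames):
        chunk = samples[i * frame:(i + 1) * frame]
        loud = sum(abs(s) for s in chunk) / len(chunk) > noise_level
        if loud:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_pause:
                end = i - quiet + 1  # index of the first quiet frame
                ranges.append((start * frame / rate, end * frame / rate))
                start, quiet = None, 0
    if start is not None:  # audio ended while a range was still open
        end = n_frames - quiet
        ranges.append((start * frame / rate,
                       min(end * frame, len(samples)) / rate))
    return ranges
```

Pauses shorter than `min_pause_ms` are kept inside a range, so brief gaps between words do not split a phrase.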


Some other small features are hardcoded for each book separately (go to chapters, illustration positions, the language menu of available translations), just because of laziness.

If someone wants to try synchronizing a book, I can put these tools on the web site and describe how to use them.
tshirt
Posts: 43
Joined: January 5th, 2006, 6:12 pm
Location: West Lafayette, IN

Post by tshirt »

KD wrote: I think it takes about 5-7 minutes to synchronize 1 minute of audio.
It uses voice activity detection by analyzing a wav file (actually very
simple algorithm).
Wow, this sounds great. Do you need user input for this algorithm?
I am guessing that 5-7 minutes is the computation time.

ReVoxer design uses human input for synchronization. Have you ever
tried such an approach? With an intuitive user interface, that could
save a lot of time for synchronization, i.e. annotation would take
about the same time as listening to the recording. We are hoping that
as more people annotate the audio, the synchronization becomes better.

Again, looks like you did a great job there. Thanks for sharing it with us.

Do you think you can port it to run as a java applet?
KD
Posts: 12
Joined: March 31st, 2006, 7:17 pm
Location: Toronto

Post by KD »

Hello,
tshirt wrote: Do you need user input for this algorithm?
I am guessing that 5-7 minutes is the computation time.
Yes, it requires user input: double-clicking in the text. But this process is fast enough, almost like playing Minesweeper.
Full automation of this process would have required some speech recognition, which was too much work and would have limited it to a particular set of languages. Maybe later I will write something more productive.
tshirt wrote: Do you think you can port it to run as a java applet?
No, it is written in C++.

I put this "Synchronizing package" on the web site at http://www.interactiveselfstudy.com/downloads/TamingTheBicycle_map.zip so you can try it.

It is not very user friendly, but it is simple enough. I tried to write some instructions:

- Copy the wav file into the "map" subdirectory as well.
- Run SoundMapper.exe and give Noise level a number between 100 and 300. It should produce two files, book.txt and book_snd.txt, in the subdirectory.
- Run InterBooksSynchSound.exe; it opens a window at the bottom showing the time stamps of the sound ranges.
- Run InterBooksSynch.exe.exe; it opens a window at the top with the text of the book.
- Play the audio in the bottom window using the right mouse button to see which time stamps form complete phrases.
- In the upper window, double-click with the right mouse button on the first symbol of the phrase and on its last symbol; the phrase should change colour to brown or magenta.
- In the lower window, double-click with the left mouse button on the first time line of the phrase and on its last line (or on the same line again); it should change colour to blue or red.
- After repeating this procedure a few times, choose the "Reload Sound Synchro" option in the File menu of the upper window and check the synchronization by playing the audio in the upper window. You can also see the synchronized time stamps in the book_snd.txt file in the main directory.
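
What the two selection steps above amount to can be sketched as pairing a text span with a run of detected time ranges. This is a hypothetical Python illustration only: the `make_record` function, the space-separated output layout and the two-decimal seconds are assumptions, not the actual book_snd.txt syntax.

```python
from typing import List, Tuple


def make_record(paragraph: int, first_char: int, last_char: int,
                sound_ranges: List[Tuple[float, float]],
                first_line: int, last_line: int) -> str:
    """Pair a selected text span with the selected time lines.

    The sound period runs from the start of the first selected time line
    to the end of the last one (both may be the same line).
    ASSUMPTION: the output layout is illustrative, not the real file format.
    """
    if last_char < first_char or last_line < first_line:
        raise ValueError("selection end must not precede its start")
    start = sound_ranges[first_line][0]
    end = sound_ranges[last_line][1]
    # Line one: paragraph index plus first and last letter indexes.
    # Line two: the sound period.
    return f"{paragraph} {first_char} {last_char}\n{start:.2f} {end:.2f}\n"


# Example: a phrase covering the first two detected sound ranges.
ranges = [(0.1, 0.3), (0.5, 0.6), (1.0, 1.4)]
record = make_record(0, 0, 24, ranges, 0, 1)
```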
ab2525
Posts: 628
Joined: June 20th, 2006, 8:55 pm
Location: Woodbridge, Virginia

Post by ab2525 »

How do you make the exe?
What's this little box thingy for? Oh! COLOR
KD
Posts: 12
Joined: March 31st, 2006, 7:17 pm
Location: Toronto

Post by KD »

ab2525 wrote:How do you make the exe?
I added a general exe file to the zip file, but for now an exe file is built for each book with some hardcoded features: the "Go to" and "Translation" menus and the illustration positions. I did it to save development time. Probably, when the number of such books grows, I will make it more general, with one exe file for any book. If you want to make such a book yourself, I can build the exe file for you.
ab2525
Posts: 628
Joined: June 20th, 2006, 8:55 pm
Location: Woodbridge, Virginia

Post by ab2525 »

cool... can you do that for me?
KD
Posts: 12
Joined: March 31st, 2006, 7:17 pm
Location: Toronto

Post by KD »

ab2525 wrote:cool... can you do that for me?
Sure, I just need more information: text, translations, illustrations.

My E-mail isslb@canada.com.
ab2525
Posts: 628
Joined: June 20th, 2006, 8:55 pm
Location: Woodbridge, Virginia

Post by ab2525 »

what if there are no translations or illus.?
KD
Posts: 12
Joined: March 31st, 2006, 7:17 pm
Location: Toronto

Post by KD »

ab2525 wrote:what if there are no translations or illus.?
No problem. They are optional, not required. Do you have the text and audio files?