file formats & ultra-low bandwidth encoding

Cloud Mountain
Posts: 4010
Joined: June 30th, 2006, 8:42 pm
Location: Jersey Shore, N.

Post by Cloud Mountain »

Suggestion, or insight, if I may.

One of the most annoying things for audiobook listeners to deal with is navigating long files. A solution of sorts would be to give book creators the option to place below (as part of) the book summary a list of the contents and the time locations of, say, each chapter or each poem. A TOC/index of sorts.

Wouldn't this be easy to do? It would be but a simple cut-n-paste when prepping the LV cover page for the book, just as the summary is. Or am I mistaken on this? The coordinator/MC might make this as detailed or simple as they'd like and it would require the creation of another database field.
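
To make the idea concrete, here is a minimal sketch of how such a list of time locations could be generated, assuming the chapter recordings are joined end to end into one file and that each chapter's length in seconds is already known. The function names, titles and durations are made up for illustration; nothing like this exists in the LibriVox process.

```python
def format_timestamp(seconds):
    """Format a whole number of seconds as H:MM:SS."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


def build_toc(chapters):
    """Return one TOC line per (title, duration_seconds) pair.

    Each line shows where the chapter starts in the combined file,
    i.e. the sum of the durations of all earlier chapters.
    """
    lines = []
    start = 0
    for title, duration in chapters:
        lines.append(f"{format_timestamp(start)}  {title}")
        start += duration
    return lines


# Hypothetical titles and durations, for illustration only.
example = [("Chapter 1", 754), ("Chapter 2", 1312), ("Chapter 3", 3600)]
for line in build_toc(example):
    print(line)
```

The printed lines could then be pasted under the summary by hand, which keeps the extra step to a single cut-n-paste.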

Alan
thistlechick
Posts: 6170
Joined: November 30th, 2005, 12:14 pm
Location: Michigan

Post by thistlechick »

freqmod wrote:... and integrate it with existing converting software.
We don't do this ourselves... archive.org (host to our files) automatically derives our 128kbps mp3 files into 64kbps and ogg files for us...
~ Betsie
Multiple projects lead to multiple successes!
hugh
LibriVox Admin Team
Posts: 7972
Joined: September 26th, 2005, 4:14 am
Location: Montreal, QC

Post by hugh »

... but if we had a working 128-->spx converter, we might consider integrating it into the validation process. or maybe someone could make such a converter available to many interested LV volunteers, who could do the conversions of different books, and then we could find a way to get em on, at least, the ibiblio server.

cloud: your proposal *would* add a significant amount of work to our existing catalog/management process; however, if someone were to start, say, a wiki project to make such files available for certain projects, that would be wonderful.

or phrased another way, please do not ask LibriVox to add any more complexity to the catalog process, cause it aint gonna happen; however if you think an improvement could be made (more info, better file formats), and wish to make it without adding work to the cataloging process, please do!

if you do a really good job of it, you will become the de facto lead librivox volunteer who makes it happen, and then helps us integrate it in our system without breaking a sweat.
thistlechick
Posts: 6170
Joined: November 30th, 2005, 12:14 pm
Location: Michigan

Post by thistlechick »

Oh and here's an example of a short work that does include the Table of Contents: http://librivox.org/the-parenticide-club-by-ambrose-bierce/

As Hugh indicated, if a member (any member =) were to volunteer to create a single file for any of the existing projects with a table of contents, I have a feeling we could easily persuade an MC to add that information to the individual catalog page (instead of creating a separate wiki page)...

We wouldn't want to upload the single, huge, full-length file to the same project page at the archive.org as we wouldn't want it included in the zip file ... but we could create a single archive.org page to store all of the full-length files for all of the projects that have one (and could be added to as new ones are created).
~ Betsie
Multiple projects lead to multiple successes!
Cloud Mountain
Posts: 4010
Joined: June 30th, 2006, 8:42 pm
Location: Jersey Shore, N.

Post by Cloud Mountain »

thistlechick wrote:Oh and here's an example of a short work that does include the Table of Contents: http://librivox.org/the-parenticide-club-by-ambrose-bierce/

As Hugh indicated, if a member (any member =) were to volunteer to create a single file for any of the existing projects with a table of contents, I have a feeling we could easily persuade an MC to add that information to the individual catalog page (instead of creating a separate wiki page)...

We wouldn't want to upload the single, huge, full-length file to the same project page at the archive.org as we wouldn't want it included in the zip file ... but we could create a single archive.org page to store all of the full-length files for all of the projects that have one (and could be added to as new ones are created).
Thanks, this is exactly what I was talking about. Thanks for pointing it out to me. Sorry for any confusion caused by my wording. Adding info to the summary doesn't add it to the database, as I said. Am I correct? The pages on our site here, showing the links to archive.org, don't have anything to do with archive.org; they only provide links there. Cutting in 4 pieces of Ambrose Bierce is little different than pasting 10 or 20. And that work is done by the person(s) doing the recording and usually the summary. This info makes the files more user friendly. Yes, baby steps of course.

I will have to, just as Hugh obliquely suggests, show good faith in my suggestion by offering help to at least see first hand the actual process of creating the final book info and download page. In the meanwhile I'll hold off saying more, except to say that the more I observe here the greater grows my appreciation for the overall plan and past and current work put in here on a regular, ongoing basis.
Alan's LV catalog: http://librivox.org/newcatalog/people_public.php?peopleid=254
thistlechick
Posts: 6170
Joined: November 30th, 2005, 12:14 pm
Location: Michigan

Post by thistlechick »

cloudmountain wrote:Adding info to the summary doesn't add it to the data base, as I said. Am I correct?
sorry, but we don't actually have a database...

Here is sort of how it works:

1. Files are recorded by members, collected by the Book Coordinator (BC) and transferred to the Metadata Coordinator (MC) (if necessary).
2. The MC uploads the files to temporary server space (sort of a holding area called the Validator).
3. The MC double-checks all of the file names and ID3 tags, enters all of the readers' names and URLs, fixes track numbers, and runs a utility that attempts to equalize the volume across files.
4. Then from this holding stage, the files are transferred to archive.org where they are derived and the page at archive.org is generated.
5. The Validator generates the basic html for the LibriVox catalog page, which is copied to the LV catalog page (using WordPress) ... some tweaking is done by the MC... and .... a few other rituals are observed and the project is made available to the public =)
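
As an illustration of the kind of checking in step 3, here is a rough sketch of a file-name and track-number check. This is not the Validator's code; the naming pattern (lowercase title, underscore, two-digit section number, ".mp3") and the function name are assumptions made up for the example.

```python
import re

# Hypothetical naming pattern: lowercase title, underscore,
# two-digit section number, ".mp3". Not an official LibriVox rule.
NAME_PATTERN = re.compile(r"^[a-z0-9]+_(\d{2})\.mp3$")


def check_files(filenames):
    """Return (tracks, problems).

    tracks maps each well-formed file name to its track number,
    taken from the two-digit section number in the name.
    problems lists malformed names and any gap in the numbering.
    """
    tracks = {}
    problems = []
    for name in sorted(filenames):
        match = NAME_PATTERN.match(name)
        if match is None:
            problems.append(f"bad file name: {name}")
            continue
        tracks[name] = int(match.group(1))
    numbers = sorted(tracks.values())
    expected = list(range(1, len(numbers) + 1))
    if numbers != expected:
        problems.append(f"track numbers {numbers} are not 1..{len(numbers)}")
    return tracks, problems
```

A check like this only flags problems; fixing the names and tags would still be done by a person.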
~ Betsie
Multiple projects lead to multiple successes!
hugh
LibriVox Admin Team
Posts: 7972
Joined: September 26th, 2005, 4:14 am
Location: Montreal, QC

Post by hugh »

yes, i think the really important thing to realize is that the management/cataloging process is arduous, and (more or less) thankless.... that is, most people have very little idea of how much work goes into the time between the last recording being made, and a catalog page appearing on the site.

it's very unglamorous stuff, but without that the project falls apart.

the other important thing, already stated, but I will repeat: if someone wants to do something with/to/for librivox, generally the response will be: great ! go for it ! ... witness: posters, tshirts, bit torrents, recent additions RSS, the wiki, etc etc etc ... there are certainly more ... all of which were initiatives developed outside of the admin group... AND NOTE: the admin group is *just* other volunteers who a) are LV addicts, and b) usually play nice with everybody.

but good ideas need bodies to implement them, and the people who do the cataloging and management of projects (the MCs) already spend WAAAAY more time on LibriVox than they should. hence good ideas get a: "yes! go for it!" ... and usually if the ball gets rolling, MCs and other LV addicts can't help themselves and join in to play too.
Cloud Mountain
Posts: 4010
Joined: June 30th, 2006, 8:42 pm
Location: Jersey Shore, N.

Post by Cloud Mountain »

Thanks TC & H,

It's always so much better (for me at least) to learn in this communicative way. I appreciate the explanations and details and especially the terminology presented in this way. (It must be equally tiring to be saying pretty much the same thing over and over again.)

But I think I can at least from these posts get A VERY GOOD idea of what IS going on behind the scenes.

Above all, the message, clear and clean, is that ideas and suggestions are very good, but the best way for the suggestions to materialize is to have the suggestors materialize their suggestions on their own.

It appears more to be a worker bee hive than a well ordered division of stratified labor, everything in its place, all hands on deck.

Whatever IS going on here is being done very well from the POV of the walk-in-the-doors.


Let me see what I can make of everything I've been taking in and see where I feel I can lend a hand --find, as they say, my niche.

Again, thank you very much all.
kayray
Posts: 11828
Joined: September 26th, 2005, 9:10 am
Location: Union City, California

Post by kayray »

cloudmountain wrote: Whatever IS going on here is being done very well from the POV of the walk-in-the-doors.
Well, that's very nice to hear :)
Kara
http://kayray.org/
--------
"Mary wished to say something very sensible into her Zoom H2 Handy Recorder, but knew not how." -- Jane Austen (& Kara)
thistlechick
Posts: 6170
Joined: November 30th, 2005, 12:14 pm
Location: Michigan

Post by thistlechick »

cloudmountain wrote: Let me see what I can make of everything I've been taking in and see where I feel I can lend a hand --find, as they say, my niche.
You've got it exactly right.... there are so many different ways to contribute to LibriVox that each person has the opportunity to use their skills and interests in the way that best suits the individual... it also provides individuals the chance to learn new skills and develop other interests as well.

There's no rush; take your time getting used to the process here, explore any area that you find interesting, and when you are ready to dive into a project, you'll find support all around you =)
~ Betsie
Multiple projects lead to multiple successes!
freqmod
Posts: 4
Joined: July 26th, 2006, 11:48 am
Location: Norwere

Post by freqmod »

I have changed speexenc to take mp3 files as input (via libmad).
It can resample the files via libsamplerate, and downmix stereo to mono.

Unfortunately some files still come out with a little noise (compared to "madplay in.mp3 -type pcm:out.wav ; speexenc <parameters> out.wav outputfile.spx"). I think it is a bug in the audio ringbuffer (most likely) or in the dither code (from madplay), but I haven't managed to squash it. Sometimes the program gives a segmentation fault after stopdec is printed, but by then the encoding is finished, and the files should not suffer.

Recommended usage is:
speexenc(.exe) -V --monomix --resample 32000 --dtx --denoise inputfile.mp3 outputfile.spx

speexencMP3.c (containing a few more fixes than mp3speex.zip) - tested on MinGW windows, and gcc 4.0.3 (kubuntu) linux
mp3speex.zip (compiled for windows with MinGW) (includes dependencies)
Depends on the following libraries: libogg libmad libsamplerate (lib)speex
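
For a whole book, the recommended command could be generated once per chapter file. The sketch below only builds the argument lists; the flags are copied verbatim from the usage line above and belong to the modified speexenc described in this post (`--monomix` and `--resample` are not part of stock speexenc). The function names and the choice to write each `.spx` next to its `.mp3` are assumptions for illustration.

```python
from pathlib import Path

# Flags copied from the recommended usage in this post; they apply to
# the modified speexenc, not necessarily to a stock build.
FLAGS = ["-V", "--monomix", "--resample", "32000", "--dtx", "--denoise"]


def build_command(mp3_path, encoder="speexenc"):
    """Return the argument list that encodes one MP3 to a .spx beside it."""
    source = Path(mp3_path)
    target = source.with_suffix(".spx")
    return [encoder, *FLAGS, str(source), str(target)]


def build_batch(folder):
    """Return one command per .mp3 file in folder, in name order."""
    return [build_command(p) for p in sorted(Path(folder).glob("*.mp3"))]
```

Each list can be handed to `subprocess.run(cmd)`. Because the program sometimes crashes after the encoding has already finished, it is probably safer to check that the output file exists than to rely on the exit code.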

Maybe the cataloging would be easier if someone (I) made a program that tagged, encoded, normalized and uploaded the files with minimal user input, e.g. ISBN, uploading password, and an interface where data from the recorders is already present and may be edited (only) if it is wrong.

btw. audio decoding sounds much better if line 1142 in <speex source dir>/libspeex/sb_celp.c (which contains "g= exp(((float)quant-10)/8.0);") is commented out or removed. I may ask the author of speex on IRC what it does.
Last edited by freqmod on July 28th, 2006, 3:13 pm, edited 1 time in total.
kri
Posts: 5319
Joined: January 3rd, 2006, 8:34 pm
Location: Keene NH

Post by kri »

The only problem with creating such an automated service (as far as I can see) is that I'm guessing it would take a lot of processing power, and I wonder if we have that sort of processing power at our disposal.
freqmod
Posts: 4
Joined: July 26th, 2006, 11:48 am
Location: Norwere

Post by freqmod »

The best way of doing that would be for the recorders to send FLAC files to the catalogers, who would run a program that encoded all the formats and published the whole book. This would result in better sound than now, because cascading codecs is not good (not that it matters that much for speech).

The downsides are that more bandwidth is required for the catalogers (to download the FLAC files and upload the encoded files) and the recorders (to upload the FLAC files). The processing burden would be put on the catalogers, who have enough work to do already, even if the processing could run unattended while the cataloger is working, sleeping etc.

This would be integrated in a program that would take all the inputs (as outlined in my last post), and then you click process, and the files would be normalized, encoded, uploaded and added to the webpage/catalog.
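
A rough sketch of the encoding part of that idea: every lossy format is derived from the one lossless master, so no lossy file is ever made from another lossy file. The choice of command-line tools (flac, lame, oggenc, speexenc), the bitrates beyond the 128 and 64 kbps mentioned in this thread, and the output file names are all assumptions for illustration. The function only plans the commands and does not run them.

```python
from pathlib import Path


def plan_encodes(flac_path):
    """Return the commands that derive every format from one FLAC master.

    The FLAC is decoded to WAV once, and each lossy format is encoded
    from that WAV rather than from another lossy file.
    """
    master = Path(flac_path)
    wav = str(master.with_suffix(".wav"))
    stem = master.with_suffix("")  # path without the .flac extension
    return [
        ["flac", "-d", "-o", wav, str(master)],           # lossless decode
        ["lame", "-b", "128", wav, f"{stem}_128kb.mp3"],  # 128 kbps MP3
        ["lame", "-b", "64", wav, f"{stem}_64kb.mp3"],    # 64 kbps MP3
        ["oggenc", "-o", f"{stem}.ogg", wav],             # Ogg Vorbis
        ["speexenc", wav, f"{stem}.spx"],                 # Speex
    ]
```

Tagging, normalizing and uploading would be further steps in the same program and are left out here.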
Promoting speex, speech & audiobooks. Speex4Rocbox (runs on iPod): http://www.rockbox.org/tracker/task/5607
kri
Posts: 5319
Joined: January 3rd, 2006, 8:34 pm
Location: Keene NH

Post by kri »

OK, so it's slightly (maybe sort of possibly but probably not) feasible to add more work to the catalogers; however, it's near to unreasonable to ask the recorders to do more work or learn more things they have to do to send a recording. We want this to be easy for the readers, so we can get more people to record. We don't want to scare away the technically shy.
Starlite
Posts: 16548
Joined: April 30th, 2006, 2:17 pm
Location: Thunder Bay Ontario, Canada

Post by Starlite »

*raises hand* - technically shy
"Reasonable people adapt themselves to the world. Unreasonable
people attempt to adapt the world to themselves. All progress,
therefore, depends on unreasonable people." George Bernard Shaw