Hi everyone, I started writing some code to get information about books using the LibriVox API, and here is what I found so far:
1. For each book, the following fields are returned by default: id, title, description, url_text_source, language, copyright_year, num_sections, url_rss, url_zip_file, url_project, url_librivox, url_other, totaltime, totaltimesecs, and authors.
2. If the parameter extended=1 is specified, the additional fields url_iarchive, sections, genres, and translators are available. url_iarchive is especially important because it provides a link to www.archive.org, where more information about the book is available.
3. You can get information about one book (id=52), about all books after a Unix timestamp (since=123456789), about all books with an exact title, author last name, or genre (title=The Title, author=Lastname, genre=WhichGenre), or about all books whose title, author last name, or genre starts with a given prefix (title=^X, author=^Y, genre=^Z for a title starting with X, an author starting with Y, a genre starting with Z).
4. If many books match, only the first 50 are returned by default. This can be changed with the parameters offset and limit, for example offset=100&limit=20 to get books number 100 to 119 (numbering starts at 0).
5. The books are returned as an array. HOWEVER there is a bug in the PHP code when extended=1 is used: the books are returned in a dictionary instead. The dictionary key is the number of sections in the book. And since it is a dictionary, it can't contain two books with the same number of sections. If two books have 21 sections, then only the second one will be returned.
This can be seen by entering the following URLs:
https://librivox.org/api/feed/audiobooks/?format=json&extended=1&limit=10&fields={id,title}
https://librivox.org/api/feed/audiobooks/?format=json&extended=1&limit=40&fields={id,title}
When you try to download 40 titles at a time, "This Side of Paradise" gets overwritten by "Merry Adventures of Robin Hood". No problem if you want the information for a single ID, but a problem if you want to download in bulk. For example, downloading 40 books only returns 33.
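For anyone who wants to script this, here is a minimal Python sketch of how the query URLs above can be put together and how a response can be read regardless of whether the books arrive as an array or as a dictionary. The helper names build_url and books_from_payload are made up for illustration, and the top-level "books" key in the JSON is an assumption on my part, so check it against a real response.

```python
from urllib.parse import urlencode

BASE_URL = "https://librivox.org/api/feed/audiobooks/"


def build_url(**params):
    """Build a query URL; format=json is always added first.

    Braces and commas are left unescaped so that fields={id,title}
    looks the same as in the URLs above.
    """
    query = {"format": "json", **params}
    return BASE_URL + "?" + urlencode(query, safe="{},")


def books_from_payload(payload):
    """Return the books as a list.

    ASSUMPTION: the decoded JSON has a top-level "books" key. Without
    extended=1 it holds an array; with extended=1 it may hold a
    dictionary keyed by the number of sections.
    """
    books = payload.get("books", [])
    if isinstance(books, dict):
        return list(books.values())
    return list(books)
```

Comparing len(books_from_payload(...)) with the limit you asked for is a quick way to spot overwritten entries, as long as you know at least that many books exist at that offset.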
I haven't written a line of PHP code in my life, but looking at
https://github.com/LibriVox/librivox-public/blob/master/application/libraries/Librivox_API.php
the problem seems to be that the code creating sections information re-uses the variable $key, and changing it to $sectionkey in two places might fix the problem. Would be nice if anyone with access could have a look and maybe fix it.
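To show the kind of mistake I mean, here is a toy model in Python. It is NOT the actual Librivox_API.php code, only a guess at the pattern: an inner loop over sections reuses the outer loop variable, so each book ends up stored under its section count. The function names and the shape of the book records are invented for this illustration.

```python
def collect_books_buggy(books):
    """Toy model of the suspected bug: the inner loop overwrites `key`."""
    result = {}
    for key, book in enumerate(books):
        # The inner loop reuses `key`; after it finishes, `key` holds
        # the number of sections instead of the book's position.
        for key, section in enumerate(book["sections"], start=1):
            pass  # stand-in for building the per-section information
        result[key] = book["title"]
    return result


def collect_books_fixed(books):
    """Same loop, with a separate variable for the inner loop."""
    result = {}
    for key, book in enumerate(books):
        for section_key, section in enumerate(book["sections"], start=1):
            pass  # stand-in for building the per-section information
        result[key] = book["title"]
    return result
```

With two books of 21 sections each, the buggy version keeps only the second one under key 21, while the fixed version keeps both under keys 0 and 1.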
My workaround to get all the information about all books in bulk is not too difficult: Download the two fields "id" and "num_sections" for all books (without extended=1, so nothing gets overwritten). Then look for consecutive groups where no two books have the same number of sections, and download each group with extended=1, using offset and limit to select it. Oh well.
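A minimal sketch of the grouping step in Python, assuming you already have the (id, num_sections) pairs in the same order the API returns them. The function name is made up, and the sketch assumes the extended=1 query returns books in the same order as the plain query, which I have not verified.

```python
def group_without_duplicate_sections(books):
    """Split books into consecutive (offset, limit) groups.

    `books` is a list of (book_id, num_sections) pairs in API order.
    Within each returned group, no two books share the same number
    of sections, so none of them can overwrite another.
    """
    groups = []
    start = 0
    seen = set()
    for index, (_book_id, num_sections) in enumerate(books):
        if num_sections in seen:
            # Close the current group just before the clashing book.
            groups.append((start, index - start))
            start = index
            seen = set()
        seen.add(num_sections)
    if start < len(books):
        groups.append((start, len(books) - start))
    return groups
```

Each (offset, limit) pair can then be used for one extended=1 request. The greedy split starts a new group only when a clash appears, which keeps the number of requests low.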