LibriVox API Discussion Thread

RuthieG
Posts: 21957
Joined: April 17th, 2008, 8:41 am
Location: Kent, England

Post by RuthieG »

I'm sorry if the API isn't supplying the needs of third-party developers. No work is being done on it (or indeed on any other aspect of our software) currently. It is not because we don't want to help :).

Ruth
My LV catalogue page | RuthieG's CataBlog of recordings | Tweet: @RuthGolding
ekzemplaro
Posts: 2027
Joined: December 31st, 2011, 7:17 am
Location: Tochigi,Japan

Post by ekzemplaro »

Hello Joe san,

Welcome to LibriVox. I hope you enjoy it here.
joemorris wrote:how often is your db_catalog.json updated on github?
At least once a month. As the JSON is just data, I don't update it very often. By contrast, the JSON on my page is updated
every week.
joemorris wrote:How is it generated?
It's a rather tricky method. I use the LibriVox API and the archive.org API.
I use the archive.org API to fetch the publishing date.
I use the LibriVox API to fetch the data by ID.
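# Fetch the extended JSON record for each book ID in the range, one file per ID.
URL_HEAD='https://librivox.org/api/feed/audiobooks/?id='
URL_TAIL='&extended=1&format=json'
for id in {9010..9035}
do
    url="${URL_HEAD}${id}${URL_TAIL}"
    # -k tells curl to skip TLS certificate verification.
    curl -k "$url" > "ex_${id}.json"
done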
Then I reconstruct the database.
Needless to say, this is a waste of time. If LibriVox were more cooperative, things would be much easier.
What I'm doing is just mirroring the database.
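
A minimal sketch of what the "reconstruct the database" step could look like, not Masa's actual scripts. It assumes the ex_<id>.json files written by the loop above sit in one directory and that a successful reply keeps its records in a top-level "books" list (an assumption about the API's JSON). The names load_books and write_catalog are made up for this example.

```python
import glob
import json
import os


def load_books(directory):
    """Merge every ex_<id>.json file in `directory` into one dict keyed by book id."""
    catalog = {}
    for path in sorted(glob.glob(os.path.join(directory, "ex_*.json"))):
        try:
            with open(path, encoding="utf-8") as handle:
                payload = json.load(handle)
        except (OSError, ValueError):
            # An id with no book may leave an empty or non-JSON file; skip it.
            continue
        # Assumption: a successful reply looks like {"books": [ {...}, ... ]}.
        books = payload.get("books") if isinstance(payload, dict) else None
        if not isinstance(books, list):
            continue
        for book in books:
            if isinstance(book, dict) and "id" in book:
                catalog[str(book["id"])] = book
    return catalog


def write_catalog(catalog, out_path):
    """Write the merged catalog as a single JSON file."""
    with open(out_path, "w", encoding="utf-8") as handle:
        json.dump(catalog, handle, ensure_ascii=False, indent=1, sort_keys=True)
```

Calling write_catalog(load_books("."), "db_catalog.json") would then produce one combined file; the output name here only mirrors the file mentioned earlier in the thread.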

Cheers,
Masa
joemorris
Posts: 12
Joined: June 25th, 2014, 9:41 pm
Location: Oakland, CA

Post by joemorris »

Masa-san, thank you. That looks like it's a bash script, right? Is the source code for the "Then I reconstruct the database" part available?

Ruth, can I help? I can't promise to do anything on any particular timetable as a volunteer developer, but if nothing is being done anyway . . .

For credibility purposes, this is who I am: http://www.mod4llp.com/meet-the-team/joseph-morris/ My day job is as an attorney, but before law school I was a full-time programmer, and now I just do it for fun. The most-used thing I've written lately is this app, which has about 125k downloads:

https://play.google.com/store/apps/details?id=net.xenotropic.quizznworldhistfree&hl=en

Joe
annise
LibriVox Admin Team
Posts: 38571
Joined: April 3rd, 2008, 3:55 am
Location: Melbourne,Australia

Post by annise »

Thanks for the offer. We have had a number of people offering to help with the API, but we cannot accept any of them at present because we are unable to make any changes to our software at the moment. And we Admins have no idea when it will be possible to do so. So we are all waiting with various degrees of patience or impatience.
Having said that, we are a team of volunteers and our prime directive is to produce audiobooks of all PD texts - and it seems to me LV is managing quite well in that direction.

Anne
tbook
Posts: 77
Joined: May 12th, 2012, 7:01 am

Post by tbook »

Hi Joe,

I took your page for a spin, and it looks great! I think it will be a helpful addition to the project. As a fellow developer, I guess I have to echo what you have already heard - the API doesn't really work, and isn't getting much attention. The two ways to get more complete data would be to scrape the web page (hardly ideal) or to go through Archive.org. Since the LibriVox files are all hosted by Archive.org, you can use their (better maintained) API to get a good deal of information. I don't recall how searching by author would work, but you might want to check that out.
Working on iOS and Android apps for LibriVox. You can see the comments from the apps on our web site: LibriVox Audio Books
ekzemplaro
Posts: 2027
Joined: December 31st, 2011, 7:17 am
Location: Tochigi,Japan

Post by ekzemplaro »

Hello Joe san,
joemorris wrote:That looks like it's a bash script, right?
Right. It's a bash script.
joemorris wrote: Is the source code for the "Then I reconstruct the database" part available?
Here's the data & scripts.
http://ekzemplaro.org/librivox/api/new_api.tar.bz2
It's 142 MB, which is too heavy.
I'll put it on GitHub.
Then I'll add explanations in the following thread.

Another LibriVox Catalog

Cheers,
Masa
joemorris
Posts: 12
Joined: June 25th, 2014, 9:41 pm
Location: Oakland, CA

Post by joemorris »

I've started working on a way to combine the data from the Librivox API with Archive.org JSON data in a sqlite database. Right now it gets everything from the audiobooks and audiotracks API feeds from Librivox, and the track author information from archive.org.
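
A minimal sketch of the general idea, not the code in the repository linked below. It assumes the records have already been downloaded and reduced to plain tuples. Every table and column name is hypothetical, and so is the choice of the track's file name as the key that ties a LibriVox track to its archive.org author.

```python
import sqlite3


def build_database(conn, books, tracks, track_authors):
    """Load LibriVox books/tracks and archive.org track authors into SQLite.

    All table and column names here are illustrative assumptions.
    """
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS books (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE IF NOT EXISTS tracks (
            id INTEGER PRIMARY KEY, book_id INTEGER, title TEXT, file_name TEXT);
        CREATE TABLE IF NOT EXISTS track_authors (
            file_name TEXT PRIMARY KEY, author TEXT);
        """
    )
    conn.executemany("INSERT OR REPLACE INTO books VALUES (?, ?)", books)
    conn.executemany("INSERT OR REPLACE INTO tracks VALUES (?, ?, ?, ?)", tracks)
    conn.executemany("INSERT OR REPLACE INTO track_authors VALUES (?, ?)", track_authors)
    conn.commit()


def tracks_with_authors(conn, book_id):
    """Return (track title, author) rows for one book; author is None if unmatched."""
    return conn.execute(
        """
        SELECT t.title, a.author
        FROM tracks AS t
        LEFT JOIN track_authors AS a ON a.file_name = t.file_name
        WHERE t.book_id = ?
        ORDER BY t.id
        """,
        (book_id,),
    ).fetchall()
```

The LEFT JOIN keeps tracks for which archive.org supplied no author, so gaps in the merged data stay visible instead of silently dropping rows.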

Source: https://github.com/xenotropic/librivox-sqlite
Example output (15M) at: http://xenotropic.net/librivox-sqlite/

Once it is more stable I may cron it to run once a week and update to that same location. If there are other fields people would like to see populated from archive.org, let me know and I'll see what I can do. I haven't tested it much myself, so it's also possible you might run into other issues. To reduce clutter here, GitHub issues are probably best for those kinds of requests/problems: https://github.com/xenotropic/librivox-sqlite/issues

Also, I noticed Hugh posted the librivox site source code to github in January.

https://github.com/LibriVox/librivox-public

which includes the API code

https://github.com/LibriVox/librivox-public/blob/fd413241ebacb0633361a69f88b06eedef8f9dc5/application/libraries/Librivox_API.php
https://github.com/LibriVox/librivox-public/blob/fd413241ebacb0633361a69f88b06eedef8f9dc5/application/controllers/api/feed.php

and, tantalizingly, the database structure

https://github.com/LibriVox/librivox-public/blob/master/sql/catalog.sql

So it would theoretically be possible for a volunteer developer to set up a separate development instance of the librivox site/API and then suggest changes to the API code (or other LibriVox site code, for that matter). But without a copy of the database it would be tricky to test changes effectively; and without guidance from Hugh or someone else with engineering authority/access at LibriVox, it's not clear that any changes would ever get incorporated into the site. I emailed Hugh on Monday; haven't heard back yet. Looks like he's mostly moved on to pressbooks, but I figured it was worth a shot.

On the subject of data I'm still looking for for my Gutenberg-LibriVox mashup: I've now got the author name for each track from archive.org JSON (thanks for that pointer tbook -- also impressive ratings & downloads on your librivox android app!), but I still need the text source link for tracks (a Gutenberg link in most cases) and the LibriVox author id for tracks (maybe I can link by archive.org author name; I haven't tried yet to see how well it joins with the LibriVox authors API). If anybody knows where either of those fields is made public (especially the etext source for individual tracks), other than scraping, let me know.
ekzemplaro
Posts: 2027
Joined: December 31st, 2011, 7:17 am
Location: Tochigi,Japan

Post by ekzemplaro »

Hello Joe san,
joemorris wrote:Also, I noticed Hugh posted the librivox site source code to github in January.
https://github.com/LibriVox/librivox-public
Thank you for this useful information. I cloned it on my 64-bit Ubuntu 14.04 machine.
joemorris wrote:So it would theoretically be possible for a volunteer developer to set up a separate development instance of the librivox site/API and then suggest changes to the API code (or other LibriVox site code, for that matter).
As I have MariaDB available, I created a database using these SQL files.
mysql -uroot -ppassword

> create schema librivox_catalog_new;
> use librivox_catalog_new;

> source librivox_catalog_new.sql;
> source ../sql/2012-11-19-section_cols.sql;
> source ../sql/2012-12-11-authors_cleanup.sql;
Now I'm ready to accept data.
Once data is put into my MariaDB, I can create a wonderful catalog page.

Cheers,
Masa
luelusten
Posts: 11
Joined: January 18th, 2015, 8:02 pm
Location: UK

Post by luelusten »

I am creating a plugin (or wrapper) for a Windows program, and I am trying to get my head around the API. For example, when searching the whole catalogue, how do we know how many books there are, so that we can use limit and offset in the first place? I see nothing in the response that tells us how many results there are in total.

If you call /api/feed/audiobooks, it returns 50 results, but nowhere does it say that there are more.
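
A minimal sketch of one way to cope with the missing total, assuming the feed accepts the limit and offset parameters mentioned above together with format=json, and that a JSON reply keeps its records in a top-level "books" list. Treating an HTTP 404 as "no more books" is also an assumption. The names iter_all_books and fetch_page_from_librivox are made up for this example. The idea is to keep raising the offset until a page comes back shorter than the page size.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

API_URL = "https://librivox.org/api/feed/audiobooks/"


def fetch_page_from_librivox(limit, offset):
    """Fetch one page of books as a list (empty when nothing is returned)."""
    query = urllib.parse.urlencode({"format": "json", "limit": limit, "offset": offset})
    try:
        with urllib.request.urlopen(API_URL + "?" + query, timeout=30) as reply:
            payload = json.load(reply)
    except urllib.error.HTTPError as error:
        # Assumption: an offset past the last book is answered with 404.
        if error.code == 404:
            return []
        raise
    return payload.get("books", [])


def iter_all_books(fetch_page, page_size=50):
    """Yield every book by walking offsets until a page comes back short.

    fetch_page(limit, offset) must return a list of books for one request.
    """
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        yield from page
        if len(page) < page_size:
            break
        offset += page_size
```

Counting what iter_all_books(fetch_page_from_librivox) yields gives the total, at the cost of one request per page.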

I also see that there is an undocumented parameter, extended=1, that I want to use, but I don't want to rely on it if it might be removed later on.

I am also creating a cache option, so the plugin won't make another HTTP call if the same call has already been made. I wonder how long I should make this cache last. I was thinking about 5 hours?
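
A minimal sketch of such a cache, assuming replies are keyed by the full request URL and held in memory only. The class name TTLCache and the caller-supplied fetch function are hypothetical, and the 5-hour default simply mirrors the figure floated above rather than any recommendation from LibriVox.

```python
import time


class TTLCache:
    """Remember replies per URL and reuse them until they are `ttl_seconds` old."""

    def __init__(self, fetch, ttl_seconds=5 * 60 * 60, clock=time.monotonic):
        self._fetch = fetch      # function that performs the real HTTP call
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested without waiting
        self._entries = {}       # url -> (time stored, reply)

    def get(self, url):
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]      # still fresh: no HTTP call is made
        reply = self._fetch(url)
        self._entries[url] = (now, reply)
        return reply
```

Since the catalogue mostly changes when new books are added, a lifetime measured in hours seems a reasonable starting point; the right value depends on how fresh the listing needs to be.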

Also, I would like to offer an option to return only complete works. Is there an option for this in the search?

At the moment I am testing with extended=1, but later on this option will be used only for single-book requests, so that I am not passing more data than really needed.

It would be great to hear from anyone with a deep understanding of what can really be sent to the API. I know extended=* is not documented; are there any other useful undocumented values?

Also, what happens when too many requests are sent to the API? Does it stop giving the end user replies, does it block the IP for a set amount of time, or does it flag an error? I want to somehow provide a means to stop flooding within the plugin itself: if the plugin sends requests too many times and gets the flooding error, it will stop working client side, so that no request is even sent for a given amount of time.
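
A minimal sketch of a client-side guard along those lines. Nothing in this thread says how the API actually reacts to flooding, so the class name Throttle, the one-second spacing and the five-minute cool-down are all assumptions chosen for illustration.

```python
import time


class Throttle:
    """Space requests out and refuse to send any during a cool-down period."""

    def __init__(self, min_interval=1.0, cooldown=300.0, clock=time.monotonic):
        self._min_interval = min_interval  # seconds required between two sends
        self._cooldown = cooldown          # seconds to stay silent after a flood error
        self._clock = clock
        self._last_sent = None
        self._blocked_until = None

    def may_send(self):
        """Return True and record the send if a request is allowed right now."""
        now = self._clock()
        if self._blocked_until is not None and now < self._blocked_until:
            return False
        if self._last_sent is not None and now - self._last_sent < self._min_interval:
            return False
        self._last_sent = now
        return True

    def report_flood_error(self):
        """Call when the server signals flooding; blocks sends for `cooldown` seconds."""
        self._blocked_until = self._clock() + self._cooldown
```

The plugin would call may_send() before each HTTP request and report_flood_error() whenever a reply looks like a refusal, so that a misbehaving caller is stopped before anything leaves the machine.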

This option will be in the plugin for when users make their own HTTP calls. As the default calls for the plugin are sluggish in the first place, their software would crash before it was able to flood the LibriVox API.

I know I am asking some deeper questions here, but so many apps and tools use the LibriVox API that I am sure someone understands it in detail :)


Lue Lusten
Windows Desktop App Coming Soon
bart
LibriVox Admin Team
Posts: 7618
Joined: February 16th, 2009, 9:17 am
Location: Utrecht, the Netherlands

Post by bart »

Hello Lue,
The API isn't up to scratch I'm afraid.
While developing the new system we ran out of money, so we couldn't make the API the way we really wanted it to be.
Please check this page: https://librivox.org/api/info
That's what we have now, nothing more. And there is no indication on when we can actually work on the system again.

So if you want to work with this API, please go ahead. We do not have the capacity to help you any further.
Sorry for that.

Bart
Alle Nederlandstalige projecten op de Librivox Boekenplank
luelusten
Posts: 11
Joined: January 18th, 2015, 8:02 pm
Location: UK

Post by luelusten »

Well, this might be the case, but I was told to post here in the hope that others who have used the API might be able to help. It was not aimed at the staff but at others who have used the API: for example, how to use the offset in the right way, and the cache function I am thinking about creating. I would like their ideas on this.
annise
LibriVox Admin Team
Posts: 38571
Joined: April 3rd, 2008, 3:55 am
Location: Melbourne,Australia

Post by annise »

I sent him over, Bart. That's the trouble with doing things by PM. I felt he was most likely to get an answer from someone who actually uses the API.

As an aside - "staff" is not the right word to use here - there are no staff - we are all volunteers. It's just that some of us have access to more parts of the database. But the Admins are not able to change the way things work behind the database.

Anne
luelusten
Posts: 11
Joined: January 18th, 2015, 8:02 pm
Location: UK

Post by luelusten »

Yes, I understand this. It's a shame you can't accept support from other devs, as I am sure there are many devs who would love to help with creating an API for this service. The API does not even have to connect to the main system the site is running on; it just needs access to the DB. But I guess there are reasons why you can't take on outside help.
ekzemplaro
Posts: 2027
Joined: December 31st, 2011, 7:17 am
Location: Tochigi,Japan

Post by ekzemplaro »

Hello luelusten san,

I think my catalog may give you some ideas.
http://ekzemplaro.org/librivox/catalog/
As of now 8991 books are listed.
If you filter by 'completed', 8362 books are listed.
Then if you filter 'English', 7181 books are listed.
Then if you filter 'Children's fiction', 417 books are listed.
Then if you filter 'Andersen' as author, 4 books are listed.

This page doesn't use LibriVox API.
Once you have downloaded the page, you can disconnect from the Internet and continue searching.
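
A minimal sketch of this kind of offline, step-by-step narrowing, not the code behind the catalog page. It assumes the downloaded catalog is a list of dictionaries; the field names status, language, genre and author are hypothetical, and matching is done by case-insensitive substring.

```python
def matches(book, field, wanted):
    """True if `wanted` occurs (case-insensitively) in the book's `field`."""
    return wanted.lower() in str(book.get(field, "")).lower()


def filter_catalog(books, **criteria):
    """Keep only the books that satisfy every field=value criterion."""
    return [
        book for book in books
        if all(matches(book, field, wanted) for field, wanted in criteria.items())
    ]
```

Filters can be chained in the same order as the steps above, for example filter_catalog(filter_catalog(books, status="completed"), language="English"), and no request is made once the data is in memory.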

Cheers,
Masa
luelusten
Posts: 11
Joined: January 18th, 2015, 8:02 pm
Location: UK

Post by luelusten »

For my own project this would be great, but sadly I can't do that for the API plugin I am creating. I know yours uses a copy/clone of the database, but the point of the plugin is that it is native to LibriVox's API. I can do two calls to LibriVox and offer a Page Count mode as well as Next Page. It's a pain, but it's doable.
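
A minimal sketch of a Next Page check. It uses a different technique from the two calls described above: a single request that asks for one book more than the page shows, so the extra book signals that another page exists. It does not give a page count. It assumes the API honours a limit one above the page size, and that fetch_page(limit, offset) is a caller-supplied function returning a list; both names are hypothetical.

```python
def get_page(fetch_page, page_number, page_size=50):
    """Return (books, has_next) for a 1-based page number.

    One extra book is requested; it is never shown, it only tells us
    whether a further page exists.
    """
    if page_number < 1:
        raise ValueError("page_number starts at 1")
    offset = (page_number - 1) * page_size
    books = fetch_page(page_size + 1, offset)
    return books[:page_size], len(books) > page_size
```

The plugin can enable or disable its Next Page button from has_next without knowing the total number of books.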

If I were going to do it your way, I would use the approach you described to copy the database from the API responses and then create my own API. But I don't have those resources to hand right now, nor will I, as I am not going to be making any money from this or from any apps or tools I create.