LibriVox API Discussion Thread

Comments about LibriVox? Suggestions to improve things? News?
vpanayotov
Posts: 10
Joined: March 14th, 2014, 2:42 am

Post by vpanayotov »

By the way, it seems the database used by dev.librivox.org is different from the one used by librivox.org.

Compare the 'genres' info returned by https://dev.librivox.org/api/feed/audiobooks/?offset=0&limit=1&extended=1 :

<genres>
  <genre>
    <id>4</id>
    <name>Classics (Antiquity)</name>
  </genre>
</genres>
vs https://librivox.org/api/feed/audiobooks/?offset=0&limit=1&extended=1 :

<genres>
  <genre>
    <id>20</id>
    <name>Literary Fiction</name>
  </genre>
  <genre>
    <id>53</id>
    <name>Published 1800 -1900</name>
  </genre>
</genres>
IMHO this may be confusing for new API users. It might be better either to redirect requests for the development host to the official one, or to synchronize the databases.
Cori
Posts: 12124
Joined: November 22nd, 2005, 10:22 am
Location: Britain

Post by Cori »

Perhaps we need to consider turning off the dev server API? It was used during our own development, it's not intended for external developers to use, hasn't been documented or publicised for any use other than when we were in the feedback phase of active development -- and it's likely to be very out of date by now. ;)
There's honestly no such thing as a stupid question -- but I'm afraid I can't rule out giving a stupid answer : : To Posterity and Beyond!
ekzemplaro
Posts: 2027
Joined: December 31st, 2011, 7:17 am
Location: Tochigi,Japan

Post by ekzemplaro »

Hello vpanayotov san,

Welcome to LibriVox. I hope you enjoy it here.
And thank you for your feedback.

You are talking about the genre of 'Count of Monte Cristo' (id=47).

On librivox.org the genre is shown as
Genre(s): Literary Fiction, Published 1800 -1900
I checked my database (http://ekzemplaro.org/librivox/statistics/). There it is
Classics (Antiquity)
I suppose the genre was changed during the last 6 months.

The problem with the current API is that there is no way to learn about changes.
According to error reports, information about already catalogued books does get changed.
We need an API that reports these changes.

When LibriVox becomes open source, let's develop this API.

I'll soon update the information for book 47 (Count of Monte Cristo) on my site.

Cheers,
Masa
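
Until the API can report changes, a client can only detect them by comparison. The following is a minimal sketch, not an existing LibriVox feature: it assumes one saved JSON file per book (for example ex_47.json) in a directory from an earlier download and in a directory from a fresh one. The function name list_changed and the directory layout are hypothetical.

```shell
#!/bin/bash
# list_changed OLD_DIR NEW_DIR   (hypothetical helper, not part of the API)
# Prints the name of every .json file in NEW_DIR that is missing from
# OLD_DIR or whose bytes differ from the copy in OLD_DIR.
list_changed() {
  local old_dir=$1 new_dir=$2 new_file name
  for new_file in "$new_dir"/*.json; do
    [ -e "$new_file" ] || continue          # NEW_DIR holds no .json files
    name=${new_file##*/}
    # cmp -s exits non-zero when the files differ or the old copy is absent
    if ! cmp -s "$old_dir/$name" "$new_file" 2>/dev/null; then
      printf '%s\n' "$name"
    fi
  done
}
```

Running list_changed on the two directories after each crawl would list records such as book 47 whose genre was edited. A byte comparison also reports pure formatting differences, so it may over-report.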
vpanayotov
Posts: 10
Joined: March 14th, 2014, 2:42 am

Post by vpanayotov »

Cori wrote:Perhaps we need to consider turning off the dev server API?
In my opinion, at least the URLs at https://dev.librivox.org/public/temp_info/api should be changed to point to the official server, because 3 out of the top 4 results on Google for "librivox api" lead to that page (the first post of this thread is one of these results, BTW).
ekzemplaro wrote:Hello vpanayotov san,

Welcome to LibriVox. I hope you enjoy it here.
Thank you! I am sure I will enjoy it, because LibriVox has a really great community.
ekzemplaro wrote: When LibriVox becomes open source, let's develop this API.
Interesting - do you know when it will be open sourced?

By the way, how are we supposed to iterate over the project records using the API?
I guess one should use the 'offset' and 'limit' parameters and increase the offset in each subsequent request? I noticed that this doesn't work exactly as I would expect, though.
For example, https://librivox.org/api/feed/audiobooks/?offset=15&limit=5&extended=1 returns 4 records (although we are requesting 5), with the first being for the project with id '78' and the last for project '83'. Moreover, if we increase the 'offset' by 4 for the next request (https://librivox.org/api/feed/audiobooks/?offset=19&limit=5&extended=1), we again get the info for project 83 in the first position.
If we remove the 'extended=1' option, the request https://librivox.org/api/feed/audiobooks/?offset=15&limit=5 returns 5 records, and the first record is for project '76', which is for some reason omitted in the 'extended' version. I wonder when I should stop iterating, i.e. are we guaranteed that the API will never return a 'false' empty response (0 records) even though there are more projects?

It seems that the API will require some more work, and I was wondering if the raw data that the API is using is available for download somewhere (e.g. in the form of a database dump)?

Best,
Vassil
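
The numbers reported above are consistent with 'offset' counting catalogue positions rather than returned records: positions 15 to 19 hold projects 76 to 83, and the extended view simply leaves one of them out. Under that reading (an inference from this post, not documented behaviour), a client would advance the offset by 'limit' each time, tolerate short pages, and drop repeated ids. A minimal sketch follows; fetch_ids and iterate_ids are hypothetical names, fetch_ids assumes jq is installed and that the JSON response keeps its records in a top-level "books" array, and the stop rule (give up after a few consecutive empty pages) is a guess, since the thread does not say when stopping is safe.

```shell
#!/bin/bash
# Hypothetical helper: print one project id per line for a given page.
# Assumes jq is available and a top-level "books" array in the response;
# neither is confirmed in this thread.
fetch_ids() {
  local offset=$1 limit=$2
  curl -s "https://librivox.org/api/feed/audiobooks/?offset=${offset}&limit=${limit}&format=json" |
    jq -r '.books[]?.id'
}

# iterate_ids FETCH_FN LIMIT MAX_EMPTY
# Walks the catalogue page by page, always advancing the offset by LIMIT
# (not by the number of records returned), prints each id once, and stops
# after MAX_EMPTY consecutive pages that contain no ids.
iterate_ids() {
  local fetch_fn=$1 limit=$2 max_empty=$3
  local offset=0 empty=0 page
  while [ "$empty" -lt "$max_empty" ]; do
    page=$("$fetch_fn" "$offset" "$limit")
    if [ -z "$page" ]; then
      empty=$((empty + 1))
    else
      empty=0
      printf '%s\n' "$page"
    fi
    offset=$((offset + limit))
  done | awk '!seen[$0]++'    # keep only the first occurrence of each id
}
```

For example, iterate_ids fetch_ids 50 3 would request pages of 50 and stop after three empty pages in a row. A larger MAX_EMPTY lowers the risk of stopping on a 'false' empty response at the cost of extra requests.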
bart
LibriVox Admin Team
Posts: 7618
Joined: February 16th, 2009, 9:17 am
Location: Utrecht, the Netherlands

Post by bart »

vpanayotov wrote:
ekzemplaro wrote: When LibriVox becomes open source, let's develop this API.
Interesting - do you know when it will be open sourced?
I know what Open Source means with regard to software, but what does it mean with regard to a website?

Bart
Alle Nederlandstalige projecten op de Librivox Boekenplank
TriciaG
LibriVox Admin Team
Posts: 60719
Joined: June 15th, 2008, 10:30 pm
Location: Toronto, ON (but Minnesotan to age 32)

Post by TriciaG »

bart wrote: I know what Open Source means with regard to software, but what does it mean with regard to a website?
The workflow, etc. is basically software. It'll be put on some site somewhere for people to download and look at. The website itself (and the workflow) won't be open to changes. At least, that's my understanding.
School fiction: David Blaize
Exploration: The First Four Voyages of Amerigo Vespucci
Serial novel: The Wandering Jew
Medieval England meets Civil War Americans: Centuries Apart
Humor: My Lady Nicotine
bart
LibriVox Admin Team
Posts: 7618
Joined: February 16th, 2009, 9:17 am
Location: Utrecht, the Netherlands

Post by bart »

I don't see why people would want to look at our software.
It's the database that's interesting, and you can browse it through our api.

Bart
annise
LibriVox Admin Team
Posts: 38632
Joined: April 3rd, 2008, 3:55 am
Location: Melbourne,Australia

Post by annise »

It means other people could use the software for their own purposes, just as we use PD wiki and forum software; it does not mean they would be able to change the copy we are actually using.
Free working database software would be handy for many volunteer projects.

Anne
vpanayotov
Posts: 10
Joined: March 14th, 2014, 2:42 am

Post by vpanayotov »

annise wrote:Free working database software would be handy for many volunteer projects
Indeed, if another project needs a web application for managing some sort of categorized items, they could, at least in theory, take LibriVox's code and use it as the base of their own website. And of course going open source may be beneficial to LibriVox as well. The current thread provides a good example. Apparently the 'new' API has been in development for more than a year and still seems to have some rough edges here and there. If the source code for the script(s) serving these requests were available, maybe some of the people complaining here would be willing to have a look and possibly propose concrete solutions, instead of being (effectively) left at the "mercy" of whoever is developing this. For example, they could send 'patches', and if the developer(s) like them, they could be applied to the actual code running on librivox.org - everyone wins.

BTW does anyone know if someone is still working on the API?

Vassil
annise
LibriVox Admin Team
Posts: 38632
Joined: April 3rd, 2008, 3:55 am
Location: Melbourne,Australia

Post by annise »

We are finding it frustrating too.

Anne
bart
LibriVox Admin Team
Posts: 7618
Joined: February 16th, 2009, 9:17 am
Location: Utrecht, the Netherlands

Post by bart »

vpanayotov wrote:BTW does anyone know if someone is still working on the API?

Vassil
We stopped development when the money ran out.
We would like to continue (there is still more to do than just the API) but we can't.

I don't think looking at the LV software would be beneficial to others. The organisation at LV is rather unique. If open source software is needed, it's better to develop it from scratch, so that the structure is as versatile as possible.

Bart
ekzemplaro
Posts: 2027
Joined: December 31st, 2011, 7:17 am
Location: Tochigi,Japan

Post by ekzemplaro »

Hello Vassil san,
vpanayotov wrote:ekzemplaro wrote:
When LibriVox becomes open source, let's develop this API.

Interesting - do you know when it will be open sourced?
No, I don't know. It has only been announced that it 'will be open sourced'. No date has been given yet.
vpanayotov wrote:By the way how are we supposed to iterate over the project records using the API?
The following is my method.
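#!/bin/bash
# Download the extended JSON record for each project id in a fixed range.
URL_HEAD='https://librivox.org/api/feed/audiobooks/?id='
URL_TAIL='&extended=1&format=json'
for id in {8520..8560}
do
  url="${URL_HEAD}${id}${URL_TAIL}"
  # -k skips TLS certificate verification; drop it if the certificate validates.
  curl -k "$url" > "ex_${id}.json"
done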
vpanayotov wrote:It seems that the API will require some more work, and I was wondering if the raw data that the API is using is available for download somewhere (e.g. in the form of a database dump)?
I requested a mysqldump. The answer was no. Please see the following thread.
viewtopic.php?p=805040#p805040
I understand the situation.
So I drop my request for a mysqldump.
bart wrote:I don't see why people would want to look at our software.
It's the database that's interesting, and you can browse it through our api.
We are not satisfied with the current API.
I can't synchronize my database with the one at LibriVox.
Book 47 (Count of Monte Cristo) is a good example.
After looking at the code I might find a solution that is not documented.
Or I could suggest how to improve it.
vpanayotov wrote:BTW does anyone know if someone is still working on the API?
I bet nobody is working on it. So we have to wait for it to be open sourced.

Cheers,
Masa
bart
LibriVox Admin Team
Posts: 7618
Joined: February 16th, 2009, 9:17 am
Location: Utrecht, the Netherlands

Post by bart »

I'm sorry but 'going open source' was never announced and will never happen.
We do want to extend the api, but only if we find the funds.

Bart
Cori
Posts: 12124
Joined: November 22nd, 2005, 10:22 am
Location: Britain

Post by Cori »

Open sourcing was a part of the Mellon funding agreement, Bart. But exactly what becomes open source and how/when is another matter. I do think that there might be some projects that would appreciate code to organise a catalogue. Or perhaps even to submit stuff to archive.org (if they're willing to work with the Archive around access and so on.) They're probably few and far between though and I fully agree with you that our code is so tightly linked with what we do and how our workflow has evolved, that it'll probably need considerable work for anyone else to fit it into their own organisation.
vpanayotov
Posts: 10
Joined: March 14th, 2014, 2:42 am

Post by vpanayotov »

ekzemplaro wrote: No, I don't know. Only 'will be open sourced' is announced. The date is still not given.
Thank you Masa san!
Unfortunately, judging from the other answers this may not happen anytime soon, so I guess we will have to work around the current semi-broken implementation of the API.
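ekzemplaro wrote: The following is my method.
#!/bin/bash
# Download the extended JSON record for each project id in a fixed range.
URL_HEAD='https://librivox.org/api/feed/audiobooks/?id='
URL_TAIL='&extended=1&format=json'
for id in {8520..8560}
do
  url="${URL_HEAD}${id}${URL_TAIL}"
  # -k skips TLS certificate verification; drop it if the certificate validates.
  curl -k "$url" > "ex_${id}.json"
done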
I see, thank you - so you are basically iterating over a predefined range of ids, and I guess you are somehow filtering out the '{"error":"Audiobooks could not be found"}' entries later.

Vassil
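
The filtering step guessed at above could be sketched as follows. This assumes the download loop has already written one ex_<id>.json file per id into a directory, and that a missing project produces exactly the error text quoted in the post. The function name remove_missing is hypothetical.

```shell
#!/bin/bash
# remove_missing DIR   (hypothetical helper)
# Deletes every ex_*.json file in DIR that contains the
# "Audiobooks could not be found" error and prints the deleted file names.
remove_missing() {
  local dir=$1 file
  for file in "$dir"/ex_*.json; do
    [ -e "$file" ] || continue              # no matching files in DIR
    if grep -qF 'Audiobooks could not be found' "$file"; then
      rm -- "$file"
      printf '%s\n' "${file##*/}"
    fi
  done
}
```

Calling remove_missing . after the download loop would leave only the files that hold real records. Matching on the message text is fragile: if the wording of the error changes, the check needs updating.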