Re: LibriVox API Discussion Thread

Posted: January 3rd, 2017, 6:03 pm
by ScottLawton
csbubbles wrote: Could someone please take a look and tell me whether it's a bug on the LibriVox side that needs to be fixed there, or whether I am doing something wrong?!
Even after following the API docs, the particular bad data you found will still show up as listen_url; that field should be ignored. The easiest way to get the correct URL is to grab the RSS.

As noted multiple times in this thread (and unless things have changed while I've been off doing other things), there aren't any resources available to update the API. Still, a complete catalog can be assembled with sufficient effort.

Re: LibriVox API Discussion Thread

Posted: January 6th, 2017, 1:38 pm
by hyiltiz
I am the new maintainer of the LibriVox plugin for the Amarok project. Amarok is a powerful multiplatform media player based on KDE (https://amarok.kde.org/).

Is it possible to perform an API search with substring matching instead of the current exact matching? For example, there are quite a few entries for Doyle's Holmes series:
https://librivox.org/api/feed/audiobooks/author/%5Edoyle

However, searching for Holmes in the title returns nothing, instead of returning some of those, such as "Adventures of Sherlock Holmes" and "A Study in Scarlet":
https://librivox.org/api/feed/audiobooks/title/%5ESherlock

An exact match for this book also won't work:
https://librivox.org/api/feed/audiobooks/author/%5EAdventures%20of%20Sherlock%20Holmes

What is the matching algorithm used for titles? I found that searching for authors is case insensitive (which is very helpful). But if searching for titles isn't based on a match, then it is very hard to predict what to search for to get a given result. For example, to get search results for "A Study in Scarlet" and for "Adventures of Sherlock Holmes", what URLs should be used instead?

Re: LibriVox API Discussion Thread

Posted: January 6th, 2017, 2:45 pm
by RuthieG
With my usual disclaimer that I know nothing about APIs, it appears to me that for titles, you need to drop the definite or indefinite articles (the, a, an) at the beginning of titles.
e.g.
https://librivox.org/api/feed/audiobooks/title/%5EStudy%20in%20Scarlet
https://librivox.org/api/feed/audiobooks/title/%5EAdventures%20of%20Sherlock%20Holmes

Ruth
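
A minimal sketch of the rule described above (drop a leading article, then search with the ^ prefix). The helper name title_search_url is hypothetical, and the article-stripping behaviour is only the observation from this post, not documented API behaviour:

```python
import re
from urllib.parse import quote

API_BASE = "https://librivox.org/api/feed/audiobooks"

# Leading articles that, per the observation above, should be dropped.
_LEADING_ARTICLE = re.compile(r"^(the|an|a)\s+", re.IGNORECASE)


def title_search_url(title: str) -> str:
    """Build a title search URL: strip one leading article, anchor with ^."""
    stripped = _LEADING_ARTICLE.sub("", title.strip(), count=1)
    # quote() percent-encodes "^" as %5E and spaces as %20.
    return f"{API_BASE}/title/{quote('^' + stripped, safe='')}"


print(title_search_url("A Study in Scarlet"))
print(title_search_url("The Adventures of Sherlock Holmes"))
```

For the two titles above this yields the same URLs as the ones Ruth posted.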

Re: LibriVox API Discussion Thread

Posted: January 6th, 2017, 4:12 pm
by hyiltiz
Thank you so much for the reply. That is very helpful for making searches based on full titles. And it is also case insensitive, which is great.

However, searching for "Scarlet" should also return "A Study in Scarlet", in addition to another entry called "Scarlet Letter". The API finds titles that *start* with the keyword "scarlet", but not titles that *contain* "scarlet" elsewhere in the string.

Is this already possible but not yet explicitly documented, or is it something that could be added to the API? I guess it might be an easy fix that corresponds to a switch in the underlying database search API.
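
To make the distinction concrete, here is a small client-side sketch of the two matching modes. It does not call the API; the list of titles is a hypothetical stand-in for a locally cached catalog, and both function names are made up for illustration:

```python
def starts_with(titles, keyword):
    """Prefix match: the behaviour observed for the title search."""
    key = keyword.casefold()
    return [t for t in titles if t.casefold().startswith(key)]


def contains(titles, keyword):
    """Substring match: the behaviour asked for above."""
    key = keyword.casefold()
    return [t for t in titles if key in t.casefold()]


# Hypothetical sample data, written without leading articles.
titles = ["Study in Scarlet", "Scarlet Letter", "Adventures of Sherlock Holmes"]

print(starts_with(titles, "scarlet"))  # ['Scarlet Letter']
print(contains(titles, "scarlet"))     # ['Study in Scarlet', 'Scarlet Letter']
```

Until the API supports substring search, filtering a local copy of the catalog this way is one possible workaround.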

Re: LibriVox API Discussion Thread

Posted: January 6th, 2017, 4:17 pm
by annise
Just repeating, as you can't be expected to read the whole thread: we know the API is not perfect, and we hope to have it recoded sometime, but we are unable to do any tweaking at present. If you want to read through the thread, you may find out how others are handling it.

Anne

Re: LibriVox API Discussion Thread

Posted: September 6th, 2017, 10:18 am
by gluejar
So, I asked how to get a list of LibriVox URLs given a Gutenberg ID, and was pointed here.
The suggestion was:
https://librivox.org/api/feed/audiobooks?url_text_source=gutenberg.org

but there seems to be a 50 item result limit.

Any suggestions?

Eric

Re: LibriVox API Discussion Thread

Posted: September 6th, 2017, 3:46 pm
by dlolso21
gluejar wrote: So, I asked about how to get a list of librivox urls given a gutenberg id, and was pointed here.
the suggestion was:
https://librivox.org/api/feed/audiobooks?url_text_source=gutenberg.org

but there seems to be a 50 item result limit.

Any suggestions?

Eric
Eric,

The best info on the LibriVox API is located here -> https://librivox.org/api/info

You can change the number of results returned with limit, and page through the full result set with offset.

For example:
https://librivox.org/api/feed/audiobooks?limit=500
https://librivox.org/api/feed/audiobooks?limit=500&offset=500
https://librivox.org/api/feed/audiobooks?limit=500&offset=1000

The ability to search on specific fields using the API is on the DEV Notes To Do list and is not available yet. If it were available, you could specify something like

https://librivox.org/api/feed/audiobooks?url_text_source=gutenberg.org/etext/113
or
https://librivox.org/api/feed/audiobooks?url_text_source=113

David O
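
Since field search is not available, one workaround is to page through the whole catalog with limit and offset and filter on the client. The sketch below makes several assumptions that are not confirmed in this thread: that format=json responses carry a top-level "books" list, that each book has a url_text_source field, and that a page shorter than limit means the end of the catalog. All function names are hypothetical.

```python
import json
import re
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://librivox.org/api/feed/audiobooks"


def page_url(limit, offset):
    """URL for one page of the catalog, using the limit and offset parameters."""
    query = urlencode({"format": "json", "limit": limit, "offset": offset})
    return f"{API_URL}?{query}"


def fetch_books(url):
    """Fetch one page. Assumes the JSON body has a top-level "books" list."""
    with urlopen(url) as response:
        return json.load(response).get("books", [])


def is_gutenberg_source(book, gutenberg_id):
    """Heuristic: source URL is on gutenberg.org and contains the id as a whole number."""
    source = book.get("url_text_source") or ""
    pattern = rf"(?<!\d){gutenberg_id}(?!\d)"
    return "gutenberg.org" in source and re.search(pattern, source) is not None


def books_for_gutenberg_id(gutenberg_id, limit=500, fetch=fetch_books):
    """Walk the catalog page by page and keep the books that match the id."""
    matches, offset = [], 0
    while True:
        page = fetch(page_url(limit, offset))
        matches.extend(b for b in page if is_gutenberg_source(b, gutenberg_id))
        if len(page) < limit:  # short or empty page: assume the catalog is exhausted
            return matches
        offset += limit


# Example (performs real network requests):
# for book in books_for_gutenberg_id(113):
#     print(book.get("id"), book.get("url_text_source"))
```

The fetch parameter is there so the paging logic can be exercised with canned data. How the API responds to an offset past the end of the catalog has not been checked here.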

Re: LibriVox API Discussion Thread

Posted: September 6th, 2017, 4:37 pm
by gluejar

Re: LibriVox API Discussion Thread

Posted: September 6th, 2017, 4:51 pm
by gluejar
And in case you're wondering, there are 11,746 librivox books with gutenberg.org source.

Re: LibriVox API Discussion Thread

Posted: September 7th, 2017, 11:52 am
by Quarrie
Hello everyone! I am Quarrie. I am 14, and I have been using LibriVox for many years, but I'm new to the LibriVox forums. Even though I'm young, I'm a computer science geek. I'm also about ready to head off to college. This summer, I spent some time working on an Alexa skill to tap into the LibriVox library. It is going pretty well so far. I just made this demonstration video, and would love opinions and input:

https://www.youtube.com/watch?v=rTTsmdfM6-g

I haven't tapped into the API yet, but obviously will need to. I'm back in full swing with classes again, so I don't have a lot of time right now. But, hopefully, I'll get some more time soon to flesh out the program. Anybody else working on this?

Cheers,

Q

Re: LibriVox API Discussion Thread

Posted: September 7th, 2017, 5:41 pm
by ScottLawton
The video intro could be shorter ... but the Alexa work is well done.

Re: LibriVox API Discussion Thread

Posted: September 30th, 2017, 6:16 am
by ranjitiyer
Hello,

Glad to have found this discussion thread about the API. Like the previous poster (Mr Q), I've also been working on an Alexa skill that plays LibriVox books. I went ahead and duplicated the book metadata in Elastic Search so that I can search by author and other fields and offer that experience to the Alexa user ('Play a Poetry book').

The way I've gone about building up the database may sound somewhat brute-force, but I think it works.

1. I iterate over IDs 0 through 15000 and request this URL for each one to get the book metadata (https://librivox.org/api/feed/audiobooks?id=12020)
2. I then use the book name to search for section metadata on the Internet Archive (archive.org). It appears that the individual section MP3 files are available for download there.
3. I combine the metadata from LibriVox and the Internet Archive into a JSON object describing everything about the book, and store that in an Elastic Search cluster
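
The three steps above could be sketched roughly as follows. This is only an outline under stated assumptions: fetch_book and fetch_sections are hypothetical callables supplied by the caller (the real LibriVox and Internet Archive requests, and the Elastic Search write, are left out), and the "title" key is an assumed field name.

```python
API_URL = "https://librivox.org/api/feed/audiobooks"


def book_url(book_id):
    """Step 1: the per-book metadata URL, in the form quoted above."""
    return f"{API_URL}?id={book_id}"


def build_catalog(ids, fetch_book, fetch_sections):
    """Yield one merged record per id that resolves to a book.

    fetch_book(url) returns a dict of book metadata, or None if there is
    no book for that id. fetch_sections(title) returns a list of section
    records. Both are hypothetical helpers, not real library calls.
    """
    for book_id in ids:
        book = fetch_book(book_url(book_id))
        if book is None:  # gaps in the id range are expected
            continue
        record = dict(book)  # step 3: start from the LibriVox metadata
        record["sections"] = fetch_sections(book.get("title", ""))  # step 2
        yield record


# Example wiring with stand-in fetchers:
# for record in build_catalog(range(0, 15001), my_fetch_book, my_fetch_sections):
#     store(record)  # e.g. index the JSON document
```

Keeping the fetchers as parameters means the merge logic can be tried out on canned data before running the full crawl.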

I was wondering if there is a way to be notified programmatically when a new book is published?

Open to hearing feedback!

Ranjit

Re: LibriVox API Discussion Thread

Posted: September 30th, 2017, 6:58 am
by ScottLawton
ranjitiyer wrote: I was wondering if there was a way to be notified in a programmatic way when a new book is published
I just query in a script that's run as a cron job. You could fetch the RSS, or do something like this API call:
https://librivox.org/api/feed/audiobooks/?since=UNIX_TIMESTAMP_HERE&fields=id&limit=1000

Note that the book is often available before the cover image.
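
A rough sketch of one polling cycle built around the since call above. The function names are hypothetical, fetch_ids stands in for whatever performs the request and extracts the ids, and storing the timestamp between cron runs is left to the caller:

```python
import time
from urllib.parse import urlencode

API_URL = "https://librivox.org/api/feed/audiobooks/"


def new_books_url(since, limit=1000):
    """URL listing the ids of books added since a UNIX timestamp."""
    query = urlencode({"since": int(since), "fields": "id", "limit": limit})
    return f"{API_URL}?{query}"


def poll_once(last_checked, fetch_ids, now=time.time):
    """Run one polling cycle and return (new_ids, next_timestamp).

    The next timestamp is taken before the request is made, so a book
    published while the request is in flight is picked up next time.
    """
    started = int(now())
    new_ids = fetch_ids(new_books_url(last_checked))
    return new_ids, started
```

A cron job would load the last timestamp from disk, call poll_once, handle the new ids, and save the returned timestamp for the next run.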

Re: LibriVox API Discussion Thread

Posted: September 30th, 2017, 8:04 am
by ranjitiyer
Thanks Scott. That works!

Re: LibriVox API Discussion Thread

Posted: May 14th, 2018, 7:33 pm
by dalewking
Regarding the extended=1 bug in the API, I have stumbled on another data point. As others have pointed out, setting extended=1 in the call can cause the books to be returned in a map keyed by the number of sections, which can cause collisions between books. What I discovered is that a request for a single book does return the expected array when the book has only one section.

For example, here is a book with a single section that returns an array for books:

https://librivox.org/api/feed/audiobooks/?format=json&extended=1&id=3222

But the next book has multiple sections and returns the map:

https://librivox.org/api/feed/audiobooks/?format=json&extended=1&id=3223
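
A client can guard against both shapes with a small normalizing step. This is a sketch that assumes the parsed JSON has a top-level "books" key holding either an array or a map, as described above; the function name is hypothetical:

```python
def normalize_books(payload):
    """Return the "books" value as a list, whether it arrived as a list or a map."""
    books = payload.get("books", [])
    if isinstance(books, dict):
        # Map form: the keys carry no reliable meaning, so keep only the values.
        return list(books.values())
    return list(books)


array_form = {"books": [{"id": "3222"}]}
map_form = {"books": {"12": {"id": "3223"}}}

print(normalize_books(array_form))  # [{'id': '3222'}]
print(normalize_books(map_form))    # [{'id': '3223'}]
```

This only smooths over the shape difference. Books already lost to a key collision in a multi-book response cannot be recovered on the client, so requesting one book per call is the safer option with extended=1.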