LibriVox API Discussion Thread

Comments about LibriVox? Suggestions to improve things? News?
ScottLawton
Posts: 241
Joined: October 14th, 2011, 1:38 pm

Post by ScottLawton » January 3rd, 2017, 6:03 pm

csbubbles wrote:Could please someone take a look and tell whether it's a bug on LibriVox side and needs to get fixed there, or I am doing something wrong?!
Even after following the API docs, the particular bad data you found will still show up as listen_url; that field should be ignored. The easiest way to get the correct URL is to grab the RSS.

As noted multiple times in this thread (and unless this has changed while I've been off doing other things), there aren't any resources to update the API. Still, a complete catalog can be assembled with sufficient effort.
Cheers,

Scott
Aplt1.com - alternate LibriVox catalog that puts more info up front; optional iOS app

hyiltiz
Posts: 2
Joined: January 6th, 2017, 5:20 am

Post by hyiltiz » January 6th, 2017, 1:38 pm

I am the new maintainer of the LibriVox plugin for the Amarok project. Amarok is a powerful multiplatform media player based on KDE (https://amarok.kde.org/).

Is it possible to perform an API search with substring matching instead of the current exact matching? For example, there are quite a few entries for Doyle's Holmes series:
https://librivox.org/api/feed/audiobooks/author/%5Edoyle

However, searching for Holmes in the title returns nothing, instead of returning some of those "Adventures of Sherlock Holmes" and "A Study in Scarlet" entries:
https://librivox.org/api/feed/audiobooks/title/%5ESherlock

An exact match for this book also won't work:
https://librivox.org/api/feed/audiobooks/author/%5EAdventures%20of%20Sherlock%20Holmes

What is the matching algorithm used for titles? I found that searching for authors is case insensitive (which is very helpful). But if searching for titles isn't based on a match, then it is very hard to predict what to search for to get a given result. For example, to get search results for "A Study in Scarlet" and for "Adventures of Sherlock Holmes", what URLs should be used instead?

RuthieG
Posts: 22040
Joined: April 17th, 2008, 8:41 am
Location: Kent, England

Post by RuthieG » January 6th, 2017, 2:45 pm

With my usual disclaimer that I know nothing about APIs, it appears to me that for titles, you need to drop the definite or indefinite articles (the, a, an) at the beginning of titles.
e.g.
https://librivox.org/api/feed/audiobooks/title/%5EStudy%20in%20Scarlet
https://librivox.org/api/feed/audiobooks/title/%5EAdventures%20of%20Sherlock%20Holmes
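For anyone scripting this, here is a minimal Python sketch of that rule. It only mirrors the two URLs above: the `^` prefix (encoded as `%5E`) is copied from them, the list of articles covers English only, and `build_title_url` is a hypothetical helper name, not part of the LibriVox API.

```python
import re
from urllib.parse import quote

API_TITLE_BASE = "https://librivox.org/api/feed/audiobooks/title/"

# Leading articles to drop (assumption: English "the", "a", "an" only).
_LEADING_ARTICLE = re.compile(r"^(the|an|a)\s+", re.IGNORECASE)


def build_title_url(title: str) -> str:
    """Build a title-search URL after dropping one leading article."""
    stripped = _LEADING_ARTICLE.sub("", title.strip(), count=1)
    # quote() percent-encodes "^" as %5E and spaces as %20.
    return API_TITLE_BASE + quote("^" + stripped, safe="")


print(build_title_url("A Study in Scarlet"))
```

The article is only removed when it is followed by whitespace, so a title such as "Another Tale" is left untouched.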

Ruth
My LV catalogue page | RuthieG's CataBlog of recordings | Tweet: @RuthGolding

hyiltiz
Posts: 2
Joined: January 6th, 2017, 5:20 am

Post by hyiltiz » January 6th, 2017, 4:12 pm

Thank you so much for the reply. That is very helpful for making searches based on full titles. And it is also case insensitive, which is great.

However, searching for "Scarlet" should also return "A Study in Scarlet" in addition to the other entry called "Scarlet Letter". The API finds titles that *start* with the keyword "scarlet", but not titles that *contain* "scarlet" elsewhere in the string.

Is this already implemented and possible to do, just not explicitly documented, or is it something that could be added to the API? I guess it might be an easy fix that corresponds to a switch in the underlying database search.
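Until the server supports it, one workaround is to do the substring match on the client over records that have already been downloaded. A rough Python sketch, assuming each record is a dict with a "title" key (the sample records below are made up for illustration):

```python
def titles_containing(books, keyword):
    """Return the records whose title contains keyword, ignoring case."""
    needle = keyword.casefold()
    return [b for b in books if needle in str(b.get("title", "")).casefold()]


# Hypothetical sample records, not real API output.
sample = [
    {"title": "A Study in Scarlet"},
    {"title": "Scarlet Letter"},
    {"title": "Adventures of Sherlock Holmes"},
]

print([b["title"] for b in titles_containing(sample, "scarlet")])
# prints ['A Study in Scarlet', 'Scarlet Letter']
```

This trades extra downloads for predictable matching, so it only makes sense if you already keep a local copy of the catalog.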

annise
LibriVox Admin Team
Posts: 29889
Joined: April 3rd, 2008, 3:55 am
Location: Melbourne,Australia

Post by annise » January 6th, 2017, 4:17 pm

Just repeating, as you can't be expected to read the whole thread: we know the API is not perfect, and we hope to have it recoded sometime, but we are unable to do any tweaking at present. If you want to read through the thread, you may find out how others are handling it.

Anne

gluejar
Posts: 4
Joined: September 6th, 2017, 8:40 am

Post by gluejar » September 6th, 2017, 10:18 am

So, I asked about how to get a list of librivox urls given a gutenberg id, and was pointed here.
the suggestion was:
https://librivox.org/api/feed/audiobooks?url_text_source=gutenberg.org

but there seems to be a 50 item result limit.

Any suggestions?

Eric

dlolso21
LibriVox Admin Team
Posts: 3977
Joined: January 11th, 2011, 12:13 pm

Post by dlolso21 » September 6th, 2017, 3:46 pm

gluejar wrote:So, I asked about how to get a list of librivox urls given a gutenberg id, and was pointed here.
the suggestion was:
https://librivox.org/api/feed/audiobooks?url_text_source=gutenberg.org

but there seems to be a 50 item result limit.

Any suggestions?

Eric
Eric,

The best info on the LibriVox API is located here -> https://librivox.org/api/info

You can increase or decrease the number of results returned by a search with limit, and page through them with offset.

For example:
https://librivox.org/api/feed/audiobooks?limit=500
https://librivox.org/api/feed/audiobooks?limit=500&offset=500
https://librivox.org/api/feed/audiobooks?limit=500&offset=1000
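A loop over those URLs might look like the Python sketch below. It is only a sketch under assumptions: it requests format=json, expects the records under a "books" key, and treats a 404 response as "no more results". None of that is confirmed here, so check it against the API info page. The function names are hypothetical.

```python
import json
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://librivox.org/api/feed/audiobooks"


def fetch_page(limit, offset):
    """Fetch one page of records (assumes JSON output with a "books" list)."""
    query = urlencode({"format": "json", "limit": limit, "offset": offset})
    try:
        with urlopen(f"{API_URL}?{query}") as response:
            payload = json.load(response)
    except HTTPError as err:
        if err.code == 404:  # assumption: 404 means the offset is past the end
            return []
        raise
    return payload.get("books", [])


def iter_all_books(page_size=500, fetch=fetch_page):
    """Yield every record by stepping offset until a short or empty page."""
    offset = 0
    while True:
        page = fetch(page_size, offset)
        if not page:
            break
        yield from page
        if len(page) < page_size:
            break
        offset += page_size
```

The fetch parameter exists so the paging logic can be tried against a fake data source without hitting the server.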

The ability to search for specific fields using the API is on the DEV Notes To Do list and is not available yet. If it were available, then you could specify something like

https://librivox.org/api/feed/audiobooks?url_text_source=gutenberg.org/etext/113
or
https://librivox.org/api/feed/audiobooks?url_text_source=113
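In the meantime the lookup can be done on the client once the records are downloaded. The Python sketch below is an illustration under assumptions: it assumes each record has url_text_source and url_librivox fields, and that Gutenberg links look like gutenberg.org/etext/113 (as above) or gutenberg.org/ebooks/113. Other link shapes would be missed.

```python
import re

# Matches the numeric ID in the two Gutenberg URL shapes assumed above.
_GUTENBERG_ID = re.compile(r"gutenberg\.org/(?:etext|ebooks)/(\d+)")


def gutenberg_id(url_text_source):
    """Return the Gutenberg ID found in a source URL, or None."""
    match = _GUTENBERG_ID.search(url_text_source or "")
    return int(match.group(1)) if match else None


def index_by_gutenberg_id(books):
    """Map Gutenberg ID -> list of LibriVox URLs (field names are assumptions)."""
    index = {}
    for book in books:
        gid = gutenberg_id(book.get("url_text_source"))
        if gid is not None:
            index.setdefault(gid, []).append(book.get("url_librivox"))
    return index
```

One Gutenberg text can have several LibriVox recordings, which is why each ID maps to a list.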

David O

gluejar
Posts: 4
Joined: September 6th, 2017, 8:40 am

Post by gluejar » September 6th, 2017, 4:51 pm

And in case you're wondering, there are 11,746 librivox books with gutenberg.org source.

Zuarrie
Posts: 3
Joined: September 6th, 2017, 1:54 pm

Post by Zuarrie » September 7th, 2017, 11:52 am

Hello everyone! I am Quarrie. I am 14, and I have been using LibriVox for many years, but I'm new to the LibriVox forums. Even though I'm young, I'm a computer science geek. I'm also about ready to head off to college. This summer, I spent some time working on an Alexa skill to tap into the LibriVox library. It is going pretty well so far. I just made this demonstration video, and would love opinions and input:

https://www.youtube.com/watch?v=rTTsmdfM6-g

I haven't tapped into the API yet, but obviously will need to. I'm back in full swing with classes again, so I don't have a lot of time right now. But, hopefully, I'll get some more time soon to flesh out the program. Anybody else working on this?

Cheers,

Q

ScottLawton
Posts: 241
Joined: October 14th, 2011, 1:38 pm

Post by ScottLawton » September 7th, 2017, 5:41 pm

The video intro could be shorter ... but the Alexa work is well done.
Cheers,

Scott
Aplt1.com - alternate LibriVox catalog that puts more info up front; optional iOS app

ranjitiyer
Posts: 6
Joined: September 14th, 2017, 7:52 am

Post by ranjitiyer » September 30th, 2017, 6:16 am

Hello,

Glad to have found this discussion thread about the API. Like the poster before (Mr Q), I've also been working on an Alexa skill that would play LibriVox books. I went ahead and duplicated the book metadata in Elasticsearch so that I can search by author and other fields and provide that experience to the Alexa user ('Play a Poetry book').

The way I've gone about building up the database may sound somewhat brute force, but I think it works.

1. I range over 0 through 15000 and hit this URL for each ID to get book metadata (https://librivox.org/api/feed/audiobooks?id=12020).
2. I then use the book name to search for section metadata on the Internet Archive (archive.org). It appears that the individual section MP3 files are available for download there.
3. I combine the metadata from LibriVox and the Internet Archive into a JSON object describing everything about the book and store that in an Elasticsearch cluster.
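The three steps can be sketched roughly as below (Python, purely illustrative). The two fetch functions are hypothetical placeholders passed in by the caller, standing in for the LibriVox and Internet Archive requests. The "title" key and the shape of the section data are assumptions.

```python
def build_catalog(book_ids, fetch_book, fetch_sections):
    """Combine per-book metadata with its section metadata.

    fetch_book(book_id) returns a dict or None (hypothetical callable).
    fetch_sections(title) returns a list of sections (hypothetical callable).
    """
    catalog = []
    for book_id in book_ids:
        book = fetch_book(book_id)
        if book is None:  # not every ID in the range has a book
            continue
        sections = fetch_sections(book["title"])
        # One merged document per book, ready to be indexed.
        catalog.append({**book, "sections": sections})
    return catalog
```

Calling it with range(0, 15001) and real fetchers would reproduce the crawl described above.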

I was wondering if there is a programmatic way to be notified when a new book is published?

Open to hearing feedback!

Ranjit

ScottLawton
Posts: 241
Joined: October 14th, 2011, 1:38 pm

Post by ScottLawton » September 30th, 2017, 6:58 am

ranjitiyer wrote:I was wondering if there was a way to be notified in a programmatic way when a new book in published
I just query in a script that's run as a cron job. You could fetch the RSS, or do something like this API call:
https://librivox.org/api/feed/audiobooks/?since=UNIX_TIMESTAMP_HERE&fields=id&limit=1000
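For a cron script, filling in the timestamp could look like this Python sketch. The since, fields, and limit parameters are taken from the URL above; `new_books_url` is a hypothetical helper, and how the script remembers the time of its last run is left out.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

API_URL = "https://librivox.org/api/feed/audiobooks/"


def new_books_url(last_checked: datetime, limit: int = 1000) -> str:
    """Build the polling URL for books added since last_checked.

    Pass a timezone-aware datetime; a naive one is read as local time.
    """
    timestamp = int(last_checked.timestamp())  # Unix time in whole seconds
    query = urlencode({"since": timestamp, "fields": "id", "limit": limit})
    return f"{API_URL}?{query}"


print(new_books_url(datetime(2017, 9, 30, tzinfo=timezone.utc)))
```

Saving the time of each successful run and passing it in on the next one keeps the requests small.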

Note that the book is often available before the cover image.
Cheers,

Scott
Aplt1.com - alternate LibriVox catalog that puts more info up front; optional iOS app

ranjitiyer
Posts: 6
Joined: September 14th, 2017, 7:52 am

Post by ranjitiyer » September 30th, 2017, 8:04 am

Thanks Scott. That works!

dalewking
Posts: 11
Joined: September 1st, 2010, 6:03 am

Post by dalewking » May 14th, 2018, 7:33 pm

Regarding the extended=1 bug with the API, I have stumbled on another piece of data about the bug. As others have pointed out, setting extended=1 in the call can cause the books to be returned in a map keyed by the number of sections, which can cause collisions between multiple books. What I discovered is that when requesting a single book, it actually returns the expected array if the book has only one section.

For example here is a book with a single section that returns an array for books:

https://librivox.org/api/feed/audiobooks/?format=json&extended=1&id=3222

But the next book has multiple sections and returns the map:

https://librivox.org/api/feed/audiobooks/?format=json&extended=1&id=3223
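A client can paper over the two shapes with a small normalizing step. This Python sketch assumes the parsed JSON keeps the records under a "books" key, as in the two responses above; `normalize_books` is a hypothetical helper name.

```python
def normalize_books(payload):
    """Return the books as a list, whether the API sent an array or a map."""
    books = payload.get("books", [])
    if isinstance(books, dict):
        # Map form: the keys are not book IDs, so keep only the values.
        return list(books.values())
    return list(books)
```

This cannot recover books that were lost to key collisions, since those are gone before the client parses the response. Requesting one book per call avoids that case.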
