Offlining Librivox

Posts: 5
Joined: March 28th, 2019, 8:01 am

Post by vu2tve » April 8th, 2019, 7:00 pm

(This was originally posted elsewhere but moved to the help forum, as was suggested.)


I am a volunteer with the open source Internet-in-a-Box project. We work with remote communities around the world to provide them with high-quality CC-*/CC0-licensed content in offline and semi-offline settings. Our content collection includes Wikipedia, OpenStreetMap, TED, and many other projects.

I think having the LibriVox recordings as part of this project would be absolutely wonderful, and I would love to volunteer to make that happen.

Any thoughts on where I should begin?

For starters, I'd need:
1. As much metadata as possible about the LibriVox recordings.
2. Download URLs for the recordings themselves.

I could try offlining tools like "wget -drc" and other spiders, but I'd much rather make this a proper workflow, so that it becomes possible to easily do this in the future.

Any pointers would be much appreciated!


LibriVox Admin Team
Posts: 4274
Joined: January 11th, 2011, 12:13 pm

Post by dlolso21 » April 9th, 2019, 5:18 pm


All of our published recordings are in the Public Domain so you are free to add them to an Internet-in-a-box project.

LibriVox does have a very basic API that will get you some of the data you are requesting. API information is here:

All of our files are hosted on servers as "The LibriVox Free Audiobook Collection".
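
For the API route, a minimal sketch of what a metadata harvest might look like is below. The endpoint URL, the `format`/`offset`/`limit` parameters, and the `books`, `id`, `title`, and `url_zip_file` field names are assumptions on my part and should be checked against the API information page before relying on them. The function names are hypothetical. It only builds the request URL and parses a response body, so the fetching and retry logic is left to whatever workflow you settle on.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint; verify against the LibriVox API information page.
API_BASE = "https://librivox.org/api/feed/audiobooks"


def build_query_url(offset=0, limit=50):
    """Build a paged request URL (parameter names are assumptions)."""
    params = {"format": "json", "offset": offset, "limit": limit}
    return API_BASE + "?" + urlencode(params)


def extract_records(payload_text):
    """Pull a few metadata fields out of a JSON response body.

    The "books" key and the per-book field names are assumptions;
    missing fields come back as None rather than raising.
    """
    data = json.loads(payload_text)
    records = []
    for book in data.get("books", []):
        records.append({
            "id": book.get("id"),
            "title": book.get("title"),
            "zip_url": book.get("url_zip_file"),
        })
    return records
```

Paging through the catalogue would then be a loop that bumps `offset` by `limit` until `extract_records` returns an empty list, writing each batch to disk so the run can be resumed.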

Information on how to use the advanced search features can be found here:
