Another LibriVox Catalog

ekzemplaro
Posts: 2030
Joined: December 31st, 2011, 7:17 am
Location: Tochigi,Japan

Post by ekzemplaro » March 22nd, 2014, 3:02 am

Hello everybody,

I've started building a LibriVox catalog page.
The URL is:
http://ekzemplaro.org/librivox/catalog/
The page uses JavaScript with jQuery 2.1.0 (jquery-2.1.0.min.js).
Your feedback is welcome.
All the code is visible, and you are welcome to reuse it on your own pages.

My 'to do' list:
1) Listing section titles for collections.
As of now, only book titles are searched.

2) Filtering by genre.
Not implemented yet.

3) Sorting by title & author.
As of now, books are sorted by cataloged date.

Downloading the data files might take a few minutes.
But once the data is downloaded, all operations are done on the client side.
This means it's very quick.
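
As a rough illustration of the client-side approach (not the actual code from the page), filtering by genre and sorting by title could look like the sketch below. The record fields (`title`, `author`, `genres`, `cataloged`) and the sample entries are made up for the example; the real data layout may differ.

```javascript
// Made-up sample records; field names are assumptions, not the real schema.
const sampleCatalog = [
  { title: 'Gamma', author: 'Author C', genres: ['Poetry'], cataloged: '2014-01-10' },
  { title: 'Alpha', author: 'Author A', genres: ['Fiction'], cataloged: '2014-02-05' },
  { title: 'Beta', author: 'Author B', genres: ['Poetry', 'Nature'], cataloged: '2014-03-01' },
];

// Keep only the books tagged with the given genre.
function filterByGenre(books, genre) {
  return books.filter((book) => book.genres.includes(genre));
}

// Return a sorted copy, ordered by a string field such as 'title' or 'author'.
function sortByField(books, field) {
  return [...books].sort((a, b) => a[field].localeCompare(b[field]));
}

const poetryByTitle = sortByField(filterByGenre(sampleCatalog, 'Poetry'), 'title');
console.log(poetryByTitle.map((book) => book.title)); // [ 'Beta', 'Gamma' ]
```

Because both functions work on the array that is already in the browser, no further request to the server is needed.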

If you have questions, please feel free to ask.

Cheers,
Masa

annise
LibriVox Admin Team
Posts: 29669
Joined: April 3rd, 2008, 3:55 am
Location: Melbourne,Australia

Post by annise » March 22nd, 2014, 3:27 am

Hello Masa

You are free to do exactly as you like - with the proviso that if whatever you do uses so much of our resources that our catalogue etc. is compromised, then you will be limited.
However, please keep it clear in all your postings and on your site that this is not official, and certainly that any complaints about it are directed at you.

Anne

Lucy_k_p
Posts: 2926
Joined: February 16th, 2009, 7:19 am
Location: Bath, UK

Post by Lucy_k_p » March 22nd, 2014, 5:04 am

I'm glad to have the ability to list all the books on one page back. (You could do it with the old LV catalogue.) It makes browsing for a random title (as in, scanning to see what catches my eye) much easier.

And including the catalogued dates is helpful as well. (At one point I was putting LV books onto Goodreads, starting with the oldest, but the new catalogue had no easy way to get at those books once I'd gone through a few pages.)
So little space, so much to say.

Post by ekzemplaro » March 23rd, 2014, 4:15 am

Hello Lucy san,

Thank you for your comments.
Lucy_k_p wrote:I'm glad to have the ability to list all the books on one page back.
Lucy_k_p wrote:And including the catalogued dates is helpful as well.
Today I got the 'Genres' filter working.
I also made it possible to select any language.

Cheers,
Masa

Post by ekzemplaro » April 11th, 2014, 9:13 pm

Hello Everybody,

I put the code on GitHub.
https://github.com/ekzemplaro/librivox_catalog

I think I included all the necessary data.
But if something is missing, please let me know.
Even a simple 'it works' is useful information for me, as I am a newbie at GitHub.

Cheers,
Masa

Post by ekzemplaro » July 1st, 2014, 5:50 am

Hello everybody,

For those who want to generate the JSON, I put the code & data on GitHub.
https://github.com/ekzemplaro/librivox_database/

I use bash & node.js on Ubuntu 14.04.

There are 2 folders.
ext ---- generates combied.json
combied.json is used at http://ekzemplaro.org/librivox/statistics/

catalog --- generates db_catalog.json
db_catalog.json is used at http://ekzemplaro.org/librivox/catalog/

db_catalog.json includes the information in combied.json.

Please feel free to ask, if you have any questions.

Cheers,
Masa

ScottLawton
Posts: 241
Joined: October 14th, 2011, 1:38 pm

Post by ScottLawton » October 19th, 2014, 5:56 pm

Masa san,

I am also frustrated that the API is missing lots of useful information, so I would like to learn more about what you and others are doing with the data.

As a starting point, please consider adding a license statement to your README.md file on github. I prefer the MIT License, though BSD is just as good; Apache is fine ... or even public domain, like Librivox books!

(An aside to the admins: we can move this discussion elsewhere if you prefer.)

Cheers,

Scott

Post by ekzemplaro » October 20th, 2014, 4:00 am

Hello Scott san,

Thank you for your advice.
ScottLawton wrote:As a starting point, please consider adding a license statement to your README.md file on github.
Done. I always welcome your advice.
ScottLawton wrote:I am also frustrated that the API is missing lots of useful information,
We've been requesting a mysqldump of the catalog database.
The catalog database runs on MySQL, so I guess the server administrator creates a mysqldump every week or so. What we're asking is to copy that file to somewhere on the HTTP server, or to make a link to the dump data.
It can be done with one command: 'ln -s folder/mysqldump_latest.dump .'

Cheers,
Masa

Post by ScottLawton » October 20th, 2014, 12:25 pm

Masa san,

Thanks for adding the MIT license! I think it's important for projects to be clear about usage since there are 2 opposing schools of thought.

Let's assume that we won't get a DB dump. Instead, 'outsiders' can work together to create something that we find useful. If we can eventually expand that in a way that helps other volunteers, so much the better!

Could you point me to a description of which data you're getting from Librivox and which from Archive.org? Perhaps a quick summary of your data gathering scripts would be useful.

e.g. https://github.com/ekzemplaro/librivox_catalog/blob/master/go_get just points to something that lives in /home/uchida/ but that script does not seem to be included on Github.

Thanks,

Scott
Cheers,

Scott
Aplt1.com - alternate LibriVox catalog that puts more info up front; optional iOS app

Post by ekzemplaro » October 21st, 2014, 2:58 am

Hello Scott san,
ScottLawton wrote:Could you point me to a description of which data you're getting from Librivox and which from Archive.org?
I fetch the 'public date' from archive.org; all other information is fetched from librivox.org.
Here are samples showing how I fetch information from librivox.org.
Here are samples showing how I fetch information from archive.org.
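
A minimal sketch of how such request URLs could be built, assuming the public LibriVox API feed (`https://librivox.org/api/feed/audiobooks/` with the `id`, `format` and `extended` parameters) and the `?output=json` form of an archive.org details page. The function names and the example identifier are only illustrative, and this is not necessarily how the scripts in the repository do it.

```javascript
// URL of the LibriVox API record for one book id (extended=1 asks for the extra fields).
function librivoxApiUrl(id) {
  const url = new URL('https://librivox.org/api/feed/audiobooks/');
  url.searchParams.set('id', String(id));
  url.searchParams.set('format', 'json');
  url.searchParams.set('extended', '1');
  return url.toString();
}

// Turn an archive.org details URL into its JSON form.
function archiveJsonUrl(detailsUrl) {
  const url = new URL(detailsUrl);
  url.searchParams.set('output', 'json');
  return url.toString();
}

console.log(librivoxApiUrl(3300));
// 'example_item' is a placeholder identifier, not a real project.
console.log(archiveJsonUrl('https://archive.org/details/example_item'));
```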
ScottLawton wrote:https://github.com/ekzemplaro/librivox_catalog/blob/master/go_get just points to something that lives in /home/uchida/ but that script does not seem to be included on Github.
The fetched information is merged into big JSON files.
combied.json is the database for 'librivox_statistics'.
db_catalog.json is the database for 'librivox_catalog'.

The data fetch & JSON creation are shown here.
https://github.com/ekzemplaro/librivox_database
combied.json is calculated under 'ext'
db_catalog.json is calculated under 'catalog'.

db_catalog.json includes information in 'combied.json'.
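
One way to picture that relationship is the sketch below, which assumes both files are objects keyed by book id and that combied.json carries a `publicdate` field. The layout, the field names and the sample data are guesses for illustration, not taken from the repository.

```javascript
// Copy the fields of the combied.json records into the matching catalog records.
// Assumed layout: plain objects keyed by book id (hypothetical).
function mergeIntoCatalog(catalogRecords, combiedRecords) {
  const merged = {};
  for (const [id, record] of Object.entries(catalogRecords)) {
    // Fields from the catalog record win if both sides define the same key.
    merged[id] = { ...(combiedRecords[id] || {}), ...record };
  }
  return merged;
}

// Made-up sample data.
const combiedSample = { 101: { publicdate: '2014-01-01' } };
const catalogSample = { 101: { title: 'Example A' }, 102: { title: 'Example B' } };
console.log(mergeIntoCatalog(catalogSample, combiedSample));
```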

The go_get command just copies 'db_catalog.json' after the calculation on my server.

Cheers,
Masa

Post by ScottLawton » October 21st, 2014, 8:44 am

Masa san,

Very helpful, thanks! I see that the archive.org url is in the extended API result as url_iarchive so ?output=json is easy from there.

I will keep you posted on my progress. (And, of course I might have more questions!)

Scott

Post by ekzemplaro » November 11th, 2014, 4:55 am

Hello,

Today I noticed that they have started to use previously unused id numbers.

Kitten --- 8505

This project started on October 16th, 2014.

Solitude --- id is not known yet.

This is a recently cataloged weekly poetry. But the id must be under 8000.

As of now the biggest id is 9356.

Cheers,
Masa

Carolin
LibriVox Admin Team
Posts: 38867
Joined: May 26th, 2010, 8:54 am
Location: the Netherlands

Post by Carolin » November 11th, 2014, 8:45 am

ekzemplaro wrote: Solitude --- id is not known yet.

This is a recently cataloged weekly poetry. But the id must be under 8000.

As of now the biggest id is 9356.
why would it be negative to reuse the unused numbers? does it cause a problem?
Carolin

Let me know if you are looking for a book suggestion for your next solo project!

TriciaG
LibriVox Admin Team
Posts: 39292
Joined: June 15th, 2008, 10:30 pm
Location: Toronto, ON (but Minnesotan to age 32)

Post by TriciaG » November 11th, 2014, 2:17 pm

Solitude --- id is not known yet.

This is a recently cataloged weekly poetry. But the id must be under 8000.
Yep - it's 3300. 8-)
Fiction, open sections mostly adventure in Australia: It Is Never too Late to Mend
Irish Home Rule Arguments: Handbook on Home Rule
Sci-Fi removing memories: Dr. Heidenhoff's Process
The Curious Lore of Precious Stones

Post by ekzemplaro » November 12th, 2014, 3:03 am

Hello,

Thank you for your messages.
Carolin wrote:why would it be negative to reuse the unused numbers? does it cause a problem?
Yesterday I noticed that 'Kitten' & 'Solitude' didn't appear in my catalog.
So I added logic to search ids over 8000, and then 'Kitten' appeared in my catalog.
There were 2 possibilities for 'Solitude'.
a) Re-using an id under 8000.
b) Using an id over 10000.

So I just asked about it, and I got an answer.
TriciaG wrote:Yep - it's 3300.
Thank you. I searched the unused ids between 1000 & 8000 and located it.
Now 'Solitude' appears in my catalog.
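
The gap search can be sketched roughly as follows: collect the ids already in the catalog, then list every id in the range that is missing, so that only those need to be checked. The function name and the sample ids are illustrative assumptions.

```javascript
// Return every id in [low, high] that is not in the list of known ids.
function findUnusedIds(knownIds, low, high) {
  const known = new Set(knownIds);
  const unused = [];
  for (let id = low; id <= high; id += 1) {
    if (!known.has(id)) unused.push(id);
  }
  return unused;
}

// Made-up example: ids 1002, 1004 and 1005 are the gaps to probe.
console.log(findUnusedIds([1000, 1001, 1003], 1000, 1005)); // [ 1002, 1004, 1005 ]
```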

Needless to say, this method is tricky. In my humble opinion, the catalog information should be open to the public.
It would help developers of websites and apps. We are just asking for the mysqldump to be made available.

Cheers,
Masa
