LibriVox API Discussion Thread

annise
LibriVox Admin Team
Posts: 38542
Joined: April 3rd, 2008, 3:55 am
Location: Melbourne,Australia

Post by annise »

https://librivox.org/a-woman-at-bay-by-nicholas-carter/ is our "catalogue page"

https://archive.org/details/woman_at_bay_2103_librivox is the Archive project details page - normally referred to as "Archive Page"
It is a bit confusing I agree
We provide information to Archive for some of the things on the page but that is all. Some of the app providers use the API from Archive to get information that is not available in the current LV one but this would not help you with wiki links.

Anne
msfry
Posts: 11628
Joined: June 4th, 2013, 9:09 am
Location: Baton Rouge, Louisiana

Post by msfry »

williamjones wrote: March 29th, 2021, 4:04 pm Michele (msfry) has been finding wikipedia(?) pages for Librivox books. In the wikipedia page she is looking in the External Links section to see whether there is a link back to the Librivox book page. Similarly for authors. Then inserting those missing links. There are author pages in wikipedia which do not link back to the Librivox Author page which lists all the Librivox books for that author. I hope I am saying that right. When I'm programming I keep open images of the various LV and wiki pages. I have asked several times for consistent names for the various pages and not received consistent responses, or sometimes any response at all.
There is a Video on the Wikipedia Links page on our Wiki that illustrates my process (which is still being fine-tuned), and a printable Instruction Sheet. I have also Zoomed with Bill to illustrate my process. I refer to the pages thus:
LV Book Page (the catalogued book)
WP Book Page (links back to all iterations of the book)
LV Author Page (all projects by that author)
WP Author Page (the author's page, which links back to the LV author page)
I never refer to Internet Archive in my process.

I never comb through Wikipedia to find pages that relate to Librivox pages. I rely on the MC to add book and author links to the LV Book Page if they are available. I presume they have a process. I just open each link on the LV Book Page to see if the WP pages listed link back to our pages. If they don't I add a link.
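
The link-back check described above could be sketched roughly as follows. This is only an illustration in Python, not anyone's actual tool: it assumes the HTML of the WP page has already been obtained by some other means, and the names `links_back_to` and `sample` are hypothetical.

```python
from html.parser import HTMLParser


class _HrefCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def _normalize(url):
    # Compare URLs without scheme, "www." prefix, trailing slash, or case.
    url = url.strip().lower()
    for prefix in ("https://", "http://", "//"):
        if url.startswith(prefix):
            url = url[len(prefix):]
            break
    if url.startswith("www."):
        url = url[4:]
    return url.rstrip("/")


def links_back_to(wp_html, lv_url):
    """Return True if any link in the WP page HTML points at lv_url."""
    collector = _HrefCollector()
    collector.feed(wp_html)
    target = _normalize(lv_url)
    return any(_normalize(href) == target for href in collector.hrefs)


# Hypothetical snippet standing in for a WP page's External Links section.
sample = (
    '<ul><li><a href="https://librivox.org/a-woman-at-bay-by-nicholas-carter">'
    "LibriVox recording</a></li></ul>"
)
print(links_back_to(sample, "https://librivox.org/a-woman-at-bay-by-nicholas-carter/"))
```

The comparison is an exact match after light normalization, so a WP page that points at LibriVox through some other form of URL would be reported as not linked and would still need a human look.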

Occasionally I run across a WP Book or Author Page that has been made since our book was catalogued. I let David Olson know and, using his MC magic wand, he is able to add those links onto our Catalogued page, thus haphazardly once in a while updating older catalogued projects.

This method leaves some holes, but it fills a bunch of holes too. It is slow and tedious and will take a small army of volunteers to complete and keep current, unless we can convince MC's to add the link-adding routine to every new project they catalogue. I am willing to train anyone who wants to learn to do that. It's relatively simple once you get the hang of it.

Bill has been working several months on developing spreadsheets that automate adding every author and book as they occur, check for extant WP links, and fill them in if available. He projects that his program can eventually create the links on WP without a human hand intervening. The elusive bit is that someone needs to direct the program to scour LV projects and/or Wikipedia every week or so to update it. And there is no fail-safe to be sure it's done right. And not every project fits into a clean mold. I have run across anomalies such as duplicate Project IDs, collections, series, multiple authors on one book, etc.

While I welcome automating this process if it can be done, I am frankly skeptical. But then I don't know much about scouring web pages.

Meanwhile I will continue developing my process of adding links manually, and enticing other volunteers to help.
williamjones
Posts: 2248
Joined: April 26th, 2016, 7:47 pm
Location: Florida

Post by williamjones »

msfry wrote: March 29th, 2021, 5:30 pm
<snip>

While I welcome automating this process if it can be done, I am frankly skeptical. But then I don't know much about scouring web pages.

Meanwhile I will continue developing my process of adding links manually, and enticing other volunteers to help.
Oh Ye of little faith! Of course it can be done *IF* a firm recipe can be written up. Such a recipe would end up being Pseudo-code for the program that does it.
-- Bill Jones

When you think that you have exhausted all possibilities, remember this: you haven't.
--- Thomas Edison
annise
LibriVox Admin Team
Posts: 38542
Joined: April 3rd, 2008, 3:55 am
Location: Melbourne,Australia

Post by annise »

I do suspect that Wikipedia has survived because it does not allow automatic entries :D . They certainly would need it based on our experience when our Wiki filled with links to Russian porn sites.
The ungodly would love to get their links into any site like PG and Wikipedia and Internet Archive and us - as would little boys who think they are so clever :D

Anne

That's a good explanation Michele - when I get a chance I might shift the non API stuff somewhere else.

Anne
CSCO
Posts: 393
Joined: April 6th, 2010, 10:48 am
Location: Toyokawa, Japan

Post by CSCO »

Today's Reader is good. When the page is reloaded, a random reader and one of that reader's recordings, chosen at random, could be shown.
!!!!!!.!!!!!!.!!!!.!!!!!!!!!..!!!.!!!!!!!!!!!...!!!!!!!!!.!!!!!!.!!!!.!!!!!!.!!!!
No way. He stole away a pretty thing, you know.
That's your heart.
!!!!.!!!!!!.!!!!.!!!!!!!!!..!!!.!!!!!!!!!!!...!!!!!.!!!!!!.!!!!!!!!.!!!!!!.!!!!!!
williamjones
Posts: 2248
Joined: April 26th, 2016, 7:47 pm
Location: Florida

Post by williamjones »

CSCO wrote: March 30th, 2021, 12:19 pm Today's Reader is good. When the page is reloaded, a random reader and one of that reader's recordings, chosen at random, could be shown.
What is this all about? It looks like gibberish.
-- Bill Jones

annise
LibriVox Admin Team
Posts: 38542
Joined: April 3rd, 2008, 3:55 am
Location: Melbourne,Australia

Post by annise »

Maybe not the right place for the suggestion - but I'm not sure where I would finish up in a Japanese language forum. A daily author button with a random selection from that author would give some interesting results.

Anne
CSCO
Posts: 393
Joined: April 6th, 2010, 10:48 am
Location: Toyokawa, Japan

Post by CSCO »

Hi, ladies and gentlemen.


Please excuse the noises I made. What I meant is the idea (Today's Reader) would help the linkers greatly. When we eat popcorn in a pot, random picks could empty the pot eventually. Hitting the F5 key is not a heavy task to get information for the next Hyperlink to be added to some page manually, is it?
TriciaG
LibriVox Admin Team
Posts: 60512
Joined: June 15th, 2008, 10:30 pm
Location: Toronto, ON (but Minnesotan to age 32)

Post by TriciaG »

CSCO - featuring a random reader doesn't relate to the problems they are having with the API. Adding readers, projects, or authors randomly would be a very messy process. One could not tell what was not done yet.

Randomly shooting paintballs at a wall might eventually get it fully painted, but it's a very inefficient way of doing it.
Serial novel: The Wandering Jew
Medieval England meets Civil War Americans: Centuries Apart
Humor: My Lady Nicotine
CSCO
Posts: 393
Joined: April 6th, 2010, 10:48 am
Location: Toyokawa, Japan

Post by CSCO »

Hi, TriciaG-san.

Oh, yes, it is an inefficient way. For example, a collaborative work has multiple readers, so it would eventually be shown multiple times, and the linkers must take care not to add duplicate links on a page. Oh, it is an inefficient way... But if the linkers don't have a list of the needed information, the poor way would help them a little.

If we could view a complete list of the stored audio files and the related information on an LV page, a third-party developer could copy it and paste it (to import it) into MS Excel spreadsheets without any special HTTP requests. He (and even kids) could build the databases easily without an LV API expansion.

Oh, I said silly things... I'm sorry.
Last edited by CSCO on April 2nd, 2021, 10:29 am, edited 2 times in total.
williamjones
Posts: 2248
Joined: April 26th, 2016, 7:47 pm
Location: Florida

Post by williamjones »

annise wrote: March 29th, 2021, 4:38 pm https://librivox.org/a-woman-at-bay-by-nicholas-carter/ is our "catalogue page"

https://archive.org/details/woman_at_bay_2103_librivox is the Archive project details page - normally referred to as "Archive Page"
It is a bit confusing I agree
We provide information to Archive for some of the things on the page but that is all. Some of the app providers use the API from Archive to get information that is not available in the current LV one but this would not help you with wiki links.

Anne
Thank you very much!
I'm assuming that whether I talk to Lynne/t, Michele or TriciaG using these names, there will be unanimous agreement about what appears on the screen.
-- Bill Jones

williamjones
Posts: 2248
Joined: April 26th, 2016, 7:47 pm
Location: Florida

Post by williamjones »

msfry wrote: March 29th, 2021, 5:30 pm I refer to the pages thus:
LV Book Page (the catalogued book)
WP Book Page (links back to all iterations of the book)
LV Author Page (all projects by that author)
WP Author Page (the author's page, which links back to the LV author page)
I never refer to Internet Archive in my process.
<snip>
Thank you, Michele!
This helps a lot.
-- Bill Jones

williamjones
Posts: 2248
Joined: April 26th, 2016, 7:47 pm
Location: Florida

Post by williamjones »

I hope this post will be read mostly by
alg1001 (amy)
msfry (Michele)
TriciaG (Tricia)
knotyouraveragejo
annise (anne)
LynneT (Lynne)

BIG QUESTION: HOW CAN I GET TWO .ZIP FILES TO INTERESTED PARTIES? One easy way is for y'all to send me an email and thereby give me your email address, to which I'll send this data as an email attachment. I don't want to type my email address in plain text here in this forum, so I hope you MCs have my email address somewhere in your system.

I have gathered together a herd of Excel spreadsheets which present the results of my struggling with the API to give me all the LV data. All the basic data fields are present in an MS ACCESS database, and I've created some queries (or "views" in the SQL language of the 70s) which are hopefully useful to some of you folks. I have exported these queries/views/tables as Excel spreadsheets. When Microsoft broke off ACCESS from the rest of the OFFICE suite, the price was substantial and most people were willing to live with the crude facilities within Excel.

Here is an inventory of what I can offer in the form of Excel spreadsheets (ZIPPED):
AllbookData.zip
Contains AllBookData.xlsx
This is the entirety of the All Books database
16,488 books
Fields:
ID - the unique BookID for each book
Book_Title - straight text title
LV_Url - Librivox page link
Linked - Wikipedia book page contains link to LV
Wikipedia_Book_Page - link to the Wikipedia book page
CatDate - Date Catalogued
RT - Running Time (txt)
RThrs - Running Time hours
RTmins -Running Time minutes
RTsecs - Running Time seconds
MiscNum - not used
Author1 - primary author link
A1L - This author Linked to LV?
Author2 - another author link
A2L - This author Linked to LV?
ReadBy - Link to ReadBy
BC - link to BC
MC - link to MC
MiscTXT - unused
MiscDate - unused
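
To illustrate how the RT text field might map onto the RThrs, RTmins and RTsecs columns, here is a small Python sketch. It assumes RT is text in "hh:mm:ss" form, which is a guess, since the actual format in the spreadsheet is not shown here; `split_running_time` is a hypothetical helper name.

```python
def split_running_time(rt):
    """Split a running-time string into (hours, minutes, seconds).

    Assumes "hh:mm:ss" (or "mm:ss") text. Returns None for empty or
    unparseable values, since not every book has a running time.
    """
    if not rt or not rt.strip():
        return None
    parts = rt.strip().split(":")
    if len(parts) not in (2, 3) or not all(p.isdigit() for p in parts):
        return None
    numbers = [int(p) for p in parts]
    if len(numbers) == 2:
        # Treat "mm:ss" as zero hours.
        numbers = [0] + numbers
    hours, minutes, seconds = numbers
    return hours, minutes, seconds


print(split_running_time("05:42:07"))  # (5, 42, 7)
print(split_running_time(""))          # None
```

Returning None rather than zeros keeps books with no running time distinguishable from books whose running time is genuinely zero.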

AllBooksViews.zip
Contains several spreadsheets of selective data from the All Books database.
Views:
Authors_By_Name_And_Count.xlsx - list of all authors and the number of books in the database
Has_Wiki_Page_But_No_LV_url.xlsx - list of books which have a wikipedia page but no LV_URL page (157 books)
Wiki_Books_Not_Linking_To_LV_Book.xlsx - books whose wikipedia page does NOT link to the LV book (1,618 books)
Book with non-empty Running-Time values - books with some Running Time values (15,247 books)
No_Wikipedia_Book_Page.xlsx - books without a wikipedia page (11,267 books)
All_LINKED.xlsx - all books with links from wikipedia (2,864 books)
All_NOTlinked.xlsx - all books lacking links from wikipedia (13,043 books)
==================================
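
As a rough illustration of how views like those listed above could be rebuilt from the AllBookData fields, here is a Python sketch. It assumes the sheet has been exported to CSV and that the Linked column holds a Yes/No style value; both are assumptions, and the sample rows and URLs are made up for the example.

```python
import csv
import io

# Hypothetical sample rows using some of the field names listed above.
SAMPLE_CSV = """ID,Book_Title,LV_Url,Linked,Wikipedia_Book_Page
1,Book A,https://example.org/lv/book-a/,Yes,https://example.org/wiki/Book_A
2,Book B,https://example.org/lv/book-b/,No,https://example.org/wiki/Book_B
3,Book C,https://example.org/lv/book-c/,No,
"""


def is_linked(row):
    # Assumption: "Linked" is stored as text such as Yes/No or True/False.
    return row["Linked"].strip().lower() in ("yes", "true", "1")


def build_views(rows):
    """Group book rows into views named after the spreadsheets above."""
    views = {
        "No_Wikipedia_Book_Page": [],
        "Wiki_Books_Not_Linking_To_LV_Book": [],
        "All_LINKED": [],
    }
    for row in rows:
        if not row["Wikipedia_Book_Page"].strip():
            views["No_Wikipedia_Book_Page"].append(row)
        elif is_linked(row):
            views["All_LINKED"].append(row)
        else:
            views["Wiki_Books_Not_Linking_To_LV_Book"].append(row)
    return views


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
views = build_views(rows)
for name, books in views.items():
    print(name, [book["Book_Title"] for book in books])
```

Each book lands in exactly one of the three groups here, which is a simplification; the real views may overlap or use different rules.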

There are probably some Excel sharpies out there who can modify what I'm sending to fit all their needs; if not, just describe to me how you'd like some data to be gathered and I'll create a View, query or pseudo-table to deliver what you need.
-- Bill Jones

williamjones
Posts: 2248
Joined: April 26th, 2016, 7:47 pm
Location: Florida

Post by williamjones »

Addendum: if there are any MS ACCESS people out there who would like to get a copy of my whole AllBooks database, code, tables and all, just let me know. I'll share.
-- Bill Jones

TriciaG
LibriVox Admin Team
Posts: 60512
Joined: June 15th, 2008, 10:30 pm
Location: Toronto, ON (but Minnesotan to age 32)

Post by TriciaG »

BIG QUESTION: HOW CAN I GET TWO .ZIP FILES TO INTERESTED PARTIES?
The uploader takes zips. I assume they're under 100 MB each. :) You can use the ZZ-Nonproject folder.