How Often Are LibriVox Books Listened To?

TheBanjo
Posts: 1309
Joined: January 23rd, 2021, 8:19 pm
Location: Melbourne, Australia

Post by TheBanjo »

InTheDesert wrote: March 3rd, 2024, 7:29 am If you parse the referer data at the end of the IA API views data, you get some interesting information.
Quite possibly of real value when they need to approach philanthropists for grant moneys.
redrun
LibriVox Admin Team
Posts: 2941
Joined: August 11th, 2022, 8:32 pm

Post by redrun »

Referring to the big post on the last page:

1. I believe that the Archive search change has more to do with the URL you were on when you started the search than with any sudden change on Archive's part.

2. 'LibriVox Management' in this case being... me sending a pull request, which anyone can do in the way I explained, but which is not as simple as your WordPress site. I've probably gone more than halfway to figuring it out in detail, so I may well finish. I wanted to learn, anyway...

3. You can believe what you want about Google; I want no part in it myself. There are alternatives, which can be discussed if we even agree that we want tracking and are going to make use of it for... something.

4. Elijah has told you that no, that YouTube Channel, while operated by someone who has an account here, is not, and never has been, official. I've explained in some detail why very few people volunteer to work on our code. No, we don't have a database admin, we have a sysadmin with a growing family and a full-time job. And me, and whoever else can both figure it out and put up with it.
Tricia has already told you that the 2013 site rewrite went over budget and came in under on expected features... but it also gave us the tools we've relied on for so long. A hired gun would want to throw out the whole pile and rewrite it, but there's no guarantee we'd get everything back. Even if we did, we'd likely trade the old quirks we're used to for new ones we'd have to figure out all over again. I've only gotten started, and all of this would take way more time than you seem to realize from doing what we do.

Not understanding why things are complex does not make them any less so, and no matter how big a visitor count you either guesstimate or precisely track, LibriVox is a volunteer project for readers which happens to have a site, not a web site for audio books which are produced by volunteers.


So for your next questions:
A) Eventually, maybe. Someone has to learn how to do that safely. I'm assuming since you are not a web developer, you are not volunteering to apply any of this on the codebase I linked you to. If someone else does, then I will certainly help test, because I'd like to see it happen.

B) Since many of the books that didn't come from Gutenberg may have come from Open Library, that would be another source for tags. Everything else would be either manual, or a grab-bag.

C) Me learning professional-level skills, probably, unless someone else jumps in with code (which I'll help test!). See point 4 above.

D) Hm. Interesting idea. I wouldn't want to put an ebook search link in our audiobook search page. We also have a Help page (did you know we had one of those?) here:
https://librivox.org/pages/help/#5
...but that seems like the more elementary instructions. I don't know where I'd put it - but not on the home page, and a brand new page should have more of a purpose than saying "you can search by LoC tag at Gutenberg, and then come back here for audio editions." :hmm:

Edited to add:
Are LV audiobook records the same at the IA collection as at the Librivox site? Or would importing subject headings be separate problems/projects at the two sites?
They are separate. Some of the entries on our database (including the tags) are carried over when we first catalog a completed book. After that, any changes would need to be done on both sides.
I'll be out for a bit on this last weekend of April, but still checking in as I get the chance. I will try to follow up on Monday, with anything I can't do on the go.
TedL
Posts: 570
Joined: October 24th, 2022, 3:06 am
Location: Wisconsin

Post by TedL »

At Internet Archive, you can now search for a term in the Librivox collection and the results will all be Librivox audiobooks. Then you can hover over the books and see the topics with hyperlinks. A big step forward! However, clicking on the linked topics still searches the 4-million book library for all media. Those links should also only search the Librivox collection.
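
A minimal sketch of what "only search the Librivox collection" means as a query: the subject term is combined with a collection filter. It assumes that `librivoxaudio` is the collection identifier at archive.org and that the `advancedsearch.php` endpoint accepts the `q`, `fl[]`, `rows` and `output` parameters; the function names are made up for illustration. It only builds the URL and does not contact archive.org.

```python
from urllib.parse import urlencode

# Assumption: "librivoxaudio" is the identifier of the LibriVox collection.
LIBRIVOX_COLLECTION = "librivoxaudio"


def librivox_subject_query(subject: str) -> str:
    """Build a query that matches a subject only inside the LibriVox collection."""
    escaped = subject.replace('"', '\\"')
    return f'collection:{LIBRIVOX_COLLECTION} AND subject:"{escaped}"'


def librivox_subject_search_url(subject: str, rows: int = 50) -> str:
    """Return a search URL for the query above (hypothetical helper)."""
    params = [
        ("q", librivox_subject_query(subject)),
        ("fl[]", "identifier"),  # fields to return for each result
        ("fl[]", "title"),
        ("rows", str(rows)),
        ("output", "json"),
    ]
    return "https://archive.org/advancedsearch.php?" + urlencode(params)


print(librivox_subject_search_url("History"))
```

Without the `collection:` part, the same subject query would run against every item at the Archive, which is the behaviour of the topic links described above.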

Can anyone confirm if a text and a link will be added to the Librivox.org site that leads users to the Internet Archive Librivox collection to carry out a subject search?
redrun
LibriVox Admin Team
Posts: 2941
Joined: August 11th, 2022, 8:32 pm

Post by redrun »

TedL wrote: March 5th, 2024, 9:17 am At Internet Archive, you can now search for a term in the Librivox collection and the results will all be Librivox audiobooks. Then you can hover over the books and see the topics with hyperlinks. A big step forward!
I really don't think this was a change: to the best of my knowledge, this has been one of several search modes for quite some time. (Edited)

TedL wrote: March 5th, 2024, 9:17 am However, clicking on the linked topics still searches the 4-million book library for all media. Those links should also only search the Librivox collection.
The tags on archive.org are working as the folks at archive.org intend them to. PLEASE do not contact them requesting that they change things for us.
I would love to put similar functionality on our site, but I don't see it happening soon. No promises.

TedL wrote: March 5th, 2024, 9:17 am Can anyone confirm if a text and a link will be added to the Librivox.org site that leads users to the Internet Archive Librivox collection to carry out a subject search?
This one's simpler, and I still can't promise anything. Of the many items on our wish-list, I can at least say I know mostly how to do this one "right" in the context of this wonky code-base of ours. More importantly, I know enough to be sure I won't break something in what's left of the guess-work.
Since I also think this link would be useful, this puts it on my own short-list of "technical work", for when I'm not otherwise: recording, editing, proof-listening, coordinating, cataloging, or answering questions.
Then our volunteer sysadmin can review and apply the code in his own time. It will probably happen, possibly soon, but definitely not today.
TheBanjo
Posts: 1309
Joined: January 23rd, 2021, 8:19 pm
Location: Melbourne, Australia

Post by TheBanjo »

It's been a pretty steep learning curve for me (I had never heard of Ansible or CodeIgniter before, and had never used git in earnest), but I have now created a pull request to have some text added below the Advanced Search Submit button pointing users to the IA search facility (see https://github.com/LibriVox/librivox-catalog/pulls). Look forward to seeing what happens to this little change proposal from here... one day.
TriciaG
LibriVox Admin Team
Posts: 60810
Joined: June 15th, 2008, 10:30 pm
Location: Toronto, ON (but Minnesotan to age 32)

Post by TriciaG »

TheBanjo wrote: March 10th, 2024, 1:40 am It's been a pretty steep learning curve for me (I had never heard of Ansible or CodeIgniter before, and had never used git in earnest), but I have now created a pull request to have some text added below the Advanced Search Submit button pointing users to the IA search facility (see https://github.com/LibriVox/librivox-catalog/pulls). Look forward to seeing what happens to this little change proposal from here... one day.
Such a pull request was already merged yesterday: https://github.com/LibriVox/librivox-catalog/pull/203
School fiction: David Blaize
America Exploration: The First Four Voyages of Amerigo Vespucci
Serial novel: The Wandering Jew
Medieval England meets Civil War Americans: Centuries Apart
redrun
LibriVox Admin Team
Posts: 2941
Joined: August 11th, 2022, 8:32 pm

Post by redrun »

Happens to have been fortuitous timing. I was thinking I might have to experiment with writing a new CSS class in order to get a decent look for it, which would have taken more time to figure out what's what.
I also wasn't sure when it could be reviewed, what would be requested in the review (I also went with target=_blank, but we don't use that elsewhere), or when the change could be merged or deployed after passing review.

All this to say: sorry for 'ninja-ing' you, but I knew better than to promise anything. And now it looks like we'll get the link slipped in with some system and software updates that were already slated to happen sometime soon.
TedL
Posts: 570
Joined: October 24th, 2022, 3:06 am
Location: Wisconsin

Post by TedL »

Great job! The new referral from the Advanced Search page to the Internet Archive Librivox collection is now a reality. Thanks very much!
TedL
Posts: 570
Joined: October 24th, 2022, 3:06 am
Location: Wisconsin

Post by TedL »

I posted 2 pages closely related to this thread on my website. Everyone is welcome to visit them.

Search Internet Archive for Librivox Audiobooks at https://centurypast.org/search-internet-archive-for-librivox-audiobooks/ .

Make Sure Your Audiobook is Heard for Years at https://centurypast.org/make-sure-your-book-is-heard-for-years/ .

Thank you.
redrun
LibriVox Admin Team
Posts: 2941
Joined: August 11th, 2022, 8:32 pm

Post by redrun »

Well, this does indeed look like a way to "maximize" the listens a given book receives: pick a popular book, and make it easy to find! Solid advice, for when that is your goal.

I also appreciate that you've broken out the search options in such detail. That looks like a great resource, and I will likely send folks there, if they ask me how to find something in particular. :thumbs:

But at the risk of belaboring a couple of things that were side-points in your articles:

1. Please don't make this out to be a big "LibriVox Management recently recognized" thing.
You got to watch TheBanjo and me figure out what goes where to add the link, and other admins have known the search was not working to spec since the rewrite in 2013!
Point being: like many other improvements people have wanted for a long time, it was not a matter of "recognizing" or "deciding". Somebody, anybody, had to volunteer to put in the work, in a totally different domain of expertise.

2. It is, very very explicitly, NOT "the Book Coordinator's responsibility to ensure that the audiobook is found by as many individuals and audiobook website managers as possible." That bears saying again. We do not ask or expect Book Coordinators to choose "popular" books, or to market them to listeners in any way.
Some people may choose to read popular books, yes. Some may choose to read them because they are popular, and having a wide listenership is again a goal some of our readers have. Personally, I would even encourage BCs to be thoughtful with their choice of tags, in most of the ways you suggest.
But LibriVox as an "organization" does not have "make sure at least X people listen to each of our recordings" as a goal, and neither do we ask BCs to market or advertise the books they coordinate.
TheBanjo
Posts: 1309
Joined: January 23rd, 2021, 8:19 pm
Location: Melbourne, Australia

Post by TheBanjo »

For some screenshots (plus a little commentary) showing how I've been able to mock up, on a local copy of librivox.org on my PC, keyterm functionality that more or less mirrors the way archive.org exposes and hyperlinks keyterms associated with our audiobooks, see the PDF file I have uploaded in my initial posting to this GitHub "issue": https://github.com/LibriVox/librivox-catalog/issues/210

There's an old saying to the effect that "you don't know what you don't know", and when it comes to programming in an Open Source environment using git, CodeIgniter and AJAX, none of which I've ever had any familiarity with, that's certainly true for me. Perhaps what I'm proposing in this PDF file is not, in fact, a good idea, for any of a host of possible reasons that have not yet occurred to me. However, I do hope that at least some of those who have been monitoring this topic for the last few weeks will have a look at the PDF file I've mentioned. The approach mooted in it (well, not just mooted, but actually implemented on my own PC) is meant to help catalog users identify other audiobooks in our collection that may be thematically related to one in which they already have an interest. Have a think about whether it would be worth adopting, perhaps with yet further modifications, in our "real" system.
TriciaG
LibriVox Admin Team
Posts: 60810
Joined: June 15th, 2008, 10:30 pm
Location: Toronto, ON (but Minnesotan to age 32)

Post by TriciaG »

My initial thoughts are positive, but I have some questions/reservations/thoughts.

1. Why do you want to change "keywords" to "keyterms"? Is that just personal preference? Note that any wording changed on the template generator requires us to get translations in all the languages and change all the strings that contain that term. I don't consider that part of the proposal to be worthwhile.

2. While on your local device, the project count for each keyword comes up quickly, I'd be concerned that it would slow things down on the live server when searching through and compiling results from the entire database, in real time.

3. If this is implemented, we would have to limit the number of keywords people enter, or the catalog pages would be overrun! One nice thing about them being hidden is that the admins don't have to police them. :lol:

4. While clicking a keyword to pull up the other projects with the same word is a nice feature, it would obviously give very broad results. If someone wants to narrow down the search, they'd need to use keywords in the advanced search (an "and" search), if and when the search functionality is improved. :hmm: Right now, clicking on "History" would give you thousands of results, whereas in the advanced search, one should be able to put in "History" and "South America" to narrow their results.
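
The difference between a single-keyword click and an "and" search in point 4 can be sketched in a few lines. This is only an illustration: the project titles, keywords and the `search_all` function are made up, and nothing here reflects how the catalog search is actually coded.

```python
# Hypothetical data: project title -> set of keywords (not real catalog records).
CATALOG = {
    "Project A": {"history", "south america"},
    "Project B": {"history", "europe"},
    "Project C": {"travel", "south america"},
}


def search_all(catalog, wanted):
    """Return titles whose keywords include every wanted keyword (an "and" search)."""
    needed = {w.strip().lower() for w in wanted}
    # `needed <= kws` is true only when all wanted keywords are present.
    return sorted(title for title, kws in catalog.items() if needed <= kws)


print(search_all(CATALOG, ["History"]))
print(search_all(CATALOG, ["History", "South America"]))
```

Each added keyword can only shrink the result list, which is why an "and" search narrows results where a single keyword link cannot.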
InTheDesert
Posts: 7786
Joined: August 20th, 2019, 8:25 pm

Post by InTheDesert »

TriciaG wrote: March 19th, 2024, 6:05 am 3. If this is implemented, we would have to limit the number of keywords people enter, or the catalog pages would be overrun! One nice thing about them being hidden is that the admins don't have to police them. :lol:
What if it was hidden by default and the user had to click on 'view keywords' for it to appear?
Female Scripture Characters by William Jay (1769 - 1853) 97% 1 left! "The Penitent Sinner Part 2"
St. Augustine (Vol.6 Psalms 126-150) 94% 3 left!
PL pls: DPL 43 27-28
TriciaG
LibriVox Admin Team
Posts: 60810
Joined: June 15th, 2008, 10:30 pm
Location: Toronto, ON (but Minnesotan to age 32)

Post by TriciaG »

That would be a lot of programming. I'm not sure it's feasible.
TheBanjo
Posts: 1309
Joined: January 23rd, 2021, 8:19 pm
Location: Melbourne, Australia

Post by TheBanjo »

TriciaG wrote: March 19th, 2024, 6:05 am My initial thoughts are positive, but I have some questions/reservations/thoughts.

1. Why do you want to change "keywords" to "keyterms"? Is that just personal preference? Note that any wording changed on the template generator requires us to get translations in all the languages and change all the strings that contain that term. I don't consider that part of the proposal to be worthwhile.
It's a little more than personal preference, I think. I don't see that "History -- South American" is a word. To me that's more a term.

I take your point about changing all the translations for the template generator not being worth it. I think the change is worth it in the other places I have indicated however, because of what I (at any rate!) see as a clear distinction in meaning between word and term.
TriciaG wrote: March 19th, 2024, 6:05 am 2. While on your local device, the project count for each keyword comes up quickly, I'd be concerned that it would slow things down on the live server when searching through and compiling results from the entire database, in real time.
I have no expertise in this area. I do recall some years ago suggesting that searching our catalog by reader was incredibly slow, and asking if someone could look at whether the relevant table columns had been indexed in the database — and then subsequently noticing that the speed of these searches changed dramatically. I don't know if that improvement came about as the result of indexing a previously unindexed column or so, or in some other way.

I would point out that the computing expense here occurs only when a user displays a book's catalog page, not a listing of books. From what I've seen while I've been logging MariaDB calls as I've been working on this, there are vastly more database calls being made when a user is generating a listing of books than when displaying the catalog page for a single book. I would also mention that showing the numbers in brackets does go some way to addressing the issue you raise below, at point 4. If you can see on a book's catalog page that it has the keyterm "history" but that 2447 other books have the same keyterm (and I did just make up that figure), you'd have to be pretty stupid to click the link for that keyword and imagine you were going to be "narrowing down your search". If we don't show the numbers, we're making it much harder for users to see if clicking that link is likely to yield a meaningful result or not.
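
As a rough sketch of the per-page cost under discussion, the following counts, for one book, how many other projects share each of its keyterms. Everything here is an assumption for illustration: the table and column names are invented rather than taken from the LibriVox schema, and SQLite stands in for MariaDB so the example runs on its own. The index on `keyword_id` is the kind of indexing mentioned above.

```python
import sqlite3

# Hypothetical schema, for illustration only (not the real LibriVox tables).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE keywords (id INTEGER PRIMARY KEY, value TEXT NOT NULL);
    CREATE TABLE project_keywords (
        project_id INTEGER NOT NULL,
        keyword_id INTEGER NOT NULL
    );
    -- Index the column used for counting, so a count need not scan the table.
    CREATE INDEX idx_project_keywords_keyword ON project_keywords (keyword_id);
""")
conn.executemany(
    "INSERT INTO keywords VALUES (?, ?)",
    [(1, "history"), (2, "south america")],
)
conn.executemany(
    "INSERT INTO project_keywords VALUES (?, ?)",
    [(10, 1), (10, 2), (11, 1), (12, 1)],
)


def keyterm_counts(conn, project_id):
    """For one project, list each keyterm with the number of OTHER projects using it."""
    rows = conn.execute(
        """
        SELECT k.value,
               (SELECT COUNT(*)
                  FROM project_keywords o
                 WHERE o.keyword_id = pk.keyword_id
                   AND o.project_id <> pk.project_id) AS others
          FROM project_keywords pk
          JOIN keywords k ON k.id = pk.keyword_id
         WHERE pk.project_id = ?
         ORDER BY others DESC, k.value
        """,
        (project_id,),
    )
    return rows.fetchall()


print(keyterm_counts(conn, 10))
```

The work is one small count per keyterm on the page being displayed. Whether that stays fast on the live server would still need measuring against the real database.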
TriciaG wrote: March 19th, 2024, 6:05 am 3. If this is implemented, we would have to limit the number of keywords people enter, or the catalog pages would be overrun! One nice thing about them being hidden is that the admins don't have to police them. :lol:
Whether a catalog page might be "overrun" could be a legitimate consideration, I guess. But then, how strict are we in policing such a consideration when it comes to a project description, for which no particular limit seems to be in place? If your concern is with the risk of an overcluttered screen, it would be quite easy in code to implement a rule to the effect that we display only, say, the first ten (or five, or fifteen, or whatever) of all the keyterms associated with this book, with the list sorted by frequency of occurrence, as in my screenshots.
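
A display cap of that kind might look something like this. It is a sketch only: the function name, the sample counts and the choice of most-used-first ordering are assumptions, not a description of the code behind the screenshots.

```python
def visible_keyterms(counts, limit=10, descending=True):
    """Return at most `limit` (keyterm, count) pairs ordered by count, ties alphabetical."""
    sign = -1 if descending else 1
    ordered = sorted(counts.items(), key=lambda item: (sign * item[1], item[0]))
    return ordered[:limit]


# Made-up counts, for illustration only.
example = {
    "history": 400,
    "history -- south american": 35,
    "travel": 120,
    "incas": 4,
}

print(visible_keyterms(example, limit=3))
```

Passing `descending=False` would show the rarest keyterms first instead, which may suit users looking for small, closely related groups of books.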

I can see that keeping keyterms hidden from ordinary users' view has benefits for administrators in not having to "police" them, but the obvious downside of this "benefit" is that it becomes quite impractical for users to search by keyterms, as they won't have a clue what possible keyterms to enter.

If you're really seriously concerned about the issues you've raised under point 3, the easiest and best solution would be to do away with keyterms altogether.
TriciaG wrote: March 19th, 2024, 6:05 am 4. While clicking a keyword to pull up the other projects with the same word is a nice feature, it would obviously give very broad results. If someone wants to narrow down the search, they'd need to use keywords in the advanced search (an "and" search), if and when the search functionality is improved. :hmm: Right now, clicking on "History" would give you thousands of results, whereas in the advanced search, one should be able to put in "History" and "South America" to narrow their results.
You've not, of course, had the benefit of playing around with a working example of this approach, as I have done. If you had done so, I don't think you would suggest that using keyterm hyperlinks "would obviously give very broad results". Yes, if you click a keyterm you can see has been used in 400 other books, you are going to get "broad results", but in the quick browsing around I've done, I've been surprised by how often a helpful-looking multiword keyterm comes up that is used in 30 or 40 other books. And sometimes, for one book, three or four such interesting keyterms have been supplied, often pointing to quite small groups of books.

Even if we imagine for a moment that the advanced search feature has been enhanced in the way you describe, it's still going to be hard for users to get full value from it if they have no visibility into what kinds of keyterms are currently associated with our audiobooks.

If I had to choose between (a) a world where we had the advanced search capability you have described yet with no way of seeing all the keyterms associated with an audiobook, or (b) something very like what I've implemented here, I'd personally prefer scenario (b) — while of course acknowledging that having both (a) and (b) would be best of all.

Overall, though, thanks very much for your thoughtful consideration of what I'm suggesting here.