How Often are Librivox Books Listened To?

TedL
Posts: 569
Joined: October 24th, 2022, 3:06 am
Location: Wisconsin

Post by TedL »

Winnifred wrote: February 27th, 2024, 5:23 pm I've been trying (not terribly successfully) to follow this discussion. In theory, it sounds good (the end result seems like it would be a net positive), but it does sound like a lot of work and most of it would fall on our Admins (the only "management" that Librivox has). There are only 29 of them, and I'm not even sure that all of them are currently active. Plus, some might not be interested in this kind of work. The ones who are active are often coordinating multiple book projects of their own, reading for group and solo projects, and PL/DPLing, in addition to serving as Meta Coordinators for a lot of projects. I don't see how they'd have time to take on entering the requisite data into the New Project Generator and Internet Archive (possibly dumb question: do they even have access to do this last on IA?)

Below are my <queries/comments> on one section of your proposal:

"How LOC Subject Headings will be Entered into Audiobook Records
For new books (still in process), Team Leaders will copy the LOC subject headings into the tags box in the New Project Template Generator before the book <each book individually?> is completed. Book Coordinators will no longer enter anything into the tags box. <This would require Team Leaders to have access to the "New Project Template Generator" for each project, after the Book Coordinator has completed their part of the form, but before the BC posts the results to the New Projects Launch Pad. You'd have to reprogram the generator to allow for a multi-stage process like that. Wouldn't Team Leaders then have to be given a separate level of access or be Admins themselves? This additional step would add a layer of back and forth that would make setting up a solo or group project take longer and be even more complicated.>

For existing audiobooks already in the Internet Archive (IA) Librivox collection, Team Leaders will submit the list of LOC subject headings with their book title to one of the designated Librivox ‘Admins’ who are authorized to update book records. <So the Team Leaders would need to comb through (or direct their team to comb through) all books on IA, and set up a list of subjects for each book? That makes sense (though it's a lot of work).>

The Admin will update all versions of that book in the collection, by adding the subject headings to the ‘Topics’ field. Any subjects already in the ‘Topics’ field will be deleted. <This is rather tedious data entry work, isn't it? I have to wonder whether it'll be easy to get enough volunteer Admins to do it.>

Management needs to revise the instructions above the Tags box in the New Project Template Generator, and may need to revise Help documents referring to that form. <There is no "Management" here. The Admins are our management. They're all volunteers. So this would fall on them too.>

If there's a way to have AI generate the subject headings automatically somehow, that would reduce the amount of work involved, because manually assessing each of the books on IA to determine which subject heading applies seems to me to be a lot of work that could take an awfully long time. Most of the volunteers on here are doing what they do because they enjoy it. I'm not sure how many librarians and indexers we have around who would embrace this kind of task, and I know from experience that volunteer management/engagement is not an easy feat.

Sorry to rain on your parade, Ted, but I think the process might need some more refining to be workable.

Cheers,
No need to be sorry, Winnifred. I don't own any stock in Librivox, and I have no stake in the organization. I don't even listen to audiobooks.

I spend much of my time growing and managing my website, a free online library that helps people find books and magazines at other sites on the web. Most links to books on my site are in the Internet Archive (IA) "Books to Borrow" collection. I use the hyperlinked topics on the IA books to lead my puny audience of less than 10,000 people per month to books on subjects they're interested in.

As a typical small website owner, I'm constantly tracking the volume of visits on all the pages of my site. It helps me improve the site and meet the needs of users. There are many software tools that enable website users to do that. I used one to which I subscribe, Ubersuggest, to do a brief analysis of the Librivox collection, looking a bit deeper into what at first glimpse appears to be a phenomenally successful organization, with 20 million views per month.

What I found was disturbing: nearly all those 20 million views go to a tiny slice of the collection, and 90% of Librivox books are rarely heard by the public. The reason? Site visitors cannot access most of them.

The solution to the problem was obvious to me because the Internet Archive has already implemented it. They added hyperlinked LOC subject headings to most of the 4 million books in their "Books to Borrow" collection. That's 200 times as many books as Librivox has. Partly as a result of that project, traffic to books in that IA collection has gone from less than 1 million views per month 6 years ago to nearly 6 million per month now.

I'm not very familiar with the process Librivox uses to record books and post records, but I tried to imagine a process to duplicate in Librivox what Internet Archive is already doing. I was warned by longtime Librivoxers that I was wasting my time because managers are indifferent as to whether anyone listens to the books they create. But I've done what I see as my duty: exposed the problem and pointed out the proven solution. Management hasn't yet shown an interest in following up, but thus far around 1,500 people have read this thread and are now aware.
TheBanjo
Posts: 1309
Joined: January 23rd, 2021, 8:19 pm
Location: Melbourne, Australia

Post by TheBanjo »

TedL wrote: February 28th, 2024, 2:47 am I don't even listen to audiobooks.
Not the worst sin in the world, Ted, but it might explain what could be a bias in your thinking about audiobook versus text book 'readers'. I myself have borrowed or viewed text versions of books in the IA collection perhaps a dozen times, but it has always been because I was seeking a particular piece of information or looking to check a particular passage in a book. I have never once read the entirety of an IA book online. By contrast, when I have listened to Librivox audiobooks, I have almost invariably listened to the whole book. In other words, I have never used Librivox audiobooks as a 'reference library'. I don't know whether this different usage pattern for text books versus audiobooks is the same for other users, but if it is, it might imply that different approaches to searching for reading/listening material apply in each case.
TedL wrote: February 28th, 2024, 2:47 am What I found was disturbing: nearly all those 20 million views go to a tiny slice of the collection, and 90% of Librivox books are rarely heard by the public. The reason? Site visitors cannot access most of them.
I fail to see that there is anything in the least bit "disturbing" about the fact that 90% of Librivox books are rarely heard by the public. To me, it would be completely extraordinary if as many people listened to the third of Carlyle's three volumes on the French Revolution (which I have recorded) as listen to a Sherlock Holmes collection. There are an extraordinary number of old religious texts which, for some reason I personally don't understand, some people clearly like to record. Would I be surprised to learn that very few people listen to them — far fewer than listen to The Art of War? Most certainly not. Are you in all seriousness suggesting that the principal reason few people will ever listen to "Stopfkuchen - Eine See- und Mordgeschichte von Wilhelm Raabe, von Wilhelm Raabe" or "A Body of Practical Divinity, by Thomas Watson" or "Guide for Catholic Young Women: Especially For Those Who Earn Their Own Living, by Rev. George Deshon (1823 - 1903)" is simply that "readers cannot access most of them"? Might it not simply be the case that readers do not want to access most of them? Is that not at least an entirely plausible hypothesis?
TedL wrote: February 28th, 2024, 2:47 am Partly as a result of that project, traffic to books in that IA collection has gone from less than 1 million views per month 6 years ago to nearly 6 million per month now.
By your own reckoning, even if it could be proved that this jump in usage was entirely due to this initiative — which you concede it might not have been — the raw figure of 6 million per month is not the figure we should be looking at. Rather, by your reckoning, we should be asking ourselves "how many of those 6 million Internet Archive book views per month go to only a tiny slice of the collection"? You have not at all shown that the Internet Archive's initiative has had the kind of effect that you are seeking to achieve for Librivox at all.
TedL wrote: February 28th, 2024, 2:47 am I've done what I see as my duty: exposed the problem and pointed out the proven solution.
As a matter of principle, I'm personally all for exposing problems and pointing to proven solutions. I think that's a reasonable thing to attempt, even if, as it happens, you don't yourself listen to audiobooks.

I'm unconvinced, however, that you have in fact "exposed a problem" of any great weight, and I do not think the data you have cited here in any way indicates that you have come up with a "proven solution". If it is true that 90% of Librivox audiobooks are rarely listened to, I do not think that necessarily constitutes a problem that it is the responsibility of Librivox to "fix". The vast majority of new, contemporary books that are published every year struggle to find even a tiny audience, but that does not mean either that they should not be published, or that their failure to find an audience comes down to a problem of subject keywording — rather, it seems to be more the case that not many people want to read these books. As for the "proof" that your solution works, you have not indicated that improved keywording has led to a shift in the overall access pattern for books on Internet Archive, causing numbers of accesses for relatively obscure texts to rise at a faster rate than for more intrinsically popular texts. Nor have you accounted for the possibility that users may be more inclined to use a comprehensive library of text books as a reference library than they would typically use an audiobook library, and thus have, for a text library, rather more demanding search requirements.
redrun
LibriVox Admin Team
Posts: 2941
Joined: August 11th, 2022, 8:32 pm

Post by redrun »

Ted,

I really do mean: thank you for your time. It's both the most free gift, and the most costly investment one can make.

I'd also like to make our books easier to browse, where I can do that without distracting from our mission. I've been reading with interest, and you've given me a couple of ideas for things I might be able to do on the side, myself. (If they work well, someone will notice. :lol: )

So while I'm not going to spend my volunteer time to recruit and organize a new volunteer track, and ask our existing volunteers to change their work-flow to accommodate... that's not because I don't care, or haven't been listening. I don't speak for other admins, but here's my take.

This may be an obvious proposal for some organizations (perhaps your suggestion is modeled on Open Library's "Librarian" volunteer track?), but I'm afraid it doesn't quite translate.
Though we care about many of the same things, LibriVox does not aim to be an actual library. Categorization and tagged searching are nice to have, and we do make an effort, but rather than focus on a library's mission, we are rightly more focused on our own: recording Public Domain books, and releasing those recordings to the Public Domain, where they can be freely categorized, shared, discussed and even sold by absolutely anyone else.

Honestly, I think the kind of free-form tags our Book Coordinators enter are often going to be more useful to everyday folks doing Google searches (that 20 million number), than LOC codes are. I'd love to have LOC codes too, and I'll see what it takes to add them to my own projects in future. Maybe other BCs and MCs will do the same, and maybe later we'll all decide that we like them so much, it's worth going back to add them to our existing catalog.

Would I like our projects to add LOC tags where convenient to do so? Yes! And this discussion has brought up some tools that might help with that, going forward.
Would I like even our non-standard tags to be more visible and functional on our own web site? Yes! And that's... somewhere on my list of technical side-projects. (But technical changes affect everyone, for better or for worse, so they need to be done right... and I'm no software engineer.) It all takes time, and all of our time is volunteer.
As, I realize, is yours. Thanks again for helping people find our works on your site. I can't promise we'll make it much easier for you, but you've given me something to think about.

Regards,
redrun
TheBanjo

Post by TheBanjo »

redrun wrote: February 28th, 2024, 10:28 pm Ted, I really do mean: thank you for your time. [...]
These seem very wise remarks to me. I'd particularly want to second redrun's thanks to Ted for putting as much time into this question as he has.

redrun contemplates the possibility of adding LOC tags to his projects in future. For my own part, I know that I have certainly used the PG tags in the past, and I believe these are generally taken from the LOC catalog.

It's a relatively small issue in among the big picture matters that have been discussed here, but I should point out that right now we don't really have any effective way of allowing users to SEARCH librivox.org by many LOC tags.

Let me explain why I say this with an example. If you look up the LOC tags for James Joyce's novel "Ulysses", you will see that one of its several tags is "psychological fiction". If you do a keyword search for "psychological fiction" at librivox.org sorted by date, you will see, included on the first page of results, "The House at Pooh Corner". Why is this? Surely no-one has ever tagged "The House at Pooh Corner" at librivox.org as "psychological fiction"? Well no, they haven't, but it HAS been tagged as "fiction", and right now our keyword search is implemented using OR logic. In this case, it is returning a list of hits where an audiobook has either the keyword "psychological" OR "fiction", and there is no way of forcing it to do an exact match on the keyword phrase entered. So... if we ever do want to get serious about using LOC keywords, we're going to need to change that search behaviour to allow books to be found by multi-word tags.
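To make the difference concrete, here is a minimal Python sketch of the two behaviours: an OR-style word match of the kind described above, and an exact match on the whole phrase. It is not the actual librivox.org search code, which is not shown in this thread. The function names and the tiny sample catalog are made up for illustration.

```python
# Hypothetical sample catalog: title -> list of tags (illustrative data only).
SAMPLE_BOOKS = {
    "Ulysses": ["psychological fiction"],
    "The House at Pooh Corner": ["fiction"],
}


def or_search(query, books):
    """Return titles where ANY word of the query matches ANY word of any tag."""
    query_words = set(query.lower().split())
    hits = []
    for title, tags in books.items():
        tag_words = {word for tag in tags for word in tag.lower().split()}
        if query_words & tag_words:
            hits.append(title)
    return hits


def phrase_search(query, books):
    """Return titles that carry the whole query as a single tag."""
    wanted = " ".join(query.lower().split())
    hits = []
    for title, tags in books.items():
        normalized = {" ".join(tag.lower().split()) for tag in tags}
        if wanted in normalized:
            hits.append(title)
    return hits


or_hits = or_search("psychological fiction", SAMPLE_BOOKS)
phrase_hits = phrase_search("psychological fiction", SAMPLE_BOOKS)
```

With this sample data, the OR-style search returns both titles, while the phrase search returns only the one tagged with the full phrase.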
InTheDesert
Posts: 7786
Joined: August 20th, 2019, 8:25 pm

Post by InTheDesert »

TheBanjo wrote: February 29th, 2024, 12:25 am If you do a keyword search for "psychological fiction" at librivox.org sorted by date, you will see, included on the first page of results, "The House at Pooh Corner". Why is this?
I think you might be underestimating the complex inner life of Tigger explored in chapter 2.
TheBanjo

Post by TheBanjo »

Yeah, I know. Tigger is your classic “manic”, Eeyore your classic “depressive”, and Owl has just a touch of Sigmund Freud about him, but the audiobook is certainly NOT tagged “psychological fiction” at Librivox!!
TedL

Post by TedL »

redrun wrote: February 28th, 2024, 10:28 pm Ted, I really do mean: thank you for your time. [...]
I appreciate your note, redrun. Regards,
SowasVon
Posts: 205
Joined: January 24th, 2022, 5:00 pm

Post by SowasVon »

Of course Librivox is not a library, and "not enough manpower available to tackle this" is definitely an argument against change. If it's too much, then it's too much.

The negativity with regard to improving book-finding bothers me a teensy bit though.
I sometimes listen to Librivox books myself. In my native language, there are fewer books than in English, so when I saw that a category had only, say, 20 books, that made it appear as though that was simply all Librivox had to offer for it. But then I found to my surprise that books from the same genre were in yet another category, that category names in general were somewhat arbitrary, and that it was similar for other genres. That was annoying. If I'm interested in e.g. "crime novels", then I don't want to browse through pages of categories to find where else they might be listed.
I cannot solve this problem, but I agree that it is a problem.
TheBanjo

Post by TheBanjo »

It would be possible, at least in theory, for a third party such as TedL to build an independent catalog of Librivox audiobooks where that catalog contained all the fields currently exposed by the Librivox API (https://librivox.org/api/info) plus any such search fields as that third party chose to populate (say, for example, an LOC keyterms field). Sure, it would be quite an effort, but it needn't involve any interaction with Librivox personnel at all.

Right now, the Librivox API does not expose the key terms currently associated with each audiobook. This strikes me as an odd omission. I have just put a request on Github to have that field added to the list of fields returned by the API.
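As a rough illustration of the independent-catalog idea, the Python sketch below takes a JSON payload and attaches a separately maintained LOC keyterms field to each record. The payload shape (a top-level "books" list with "id" and "title" fields) is an assumption about what the API returns. The sample payload, the `loc_keyterms` field name and the `build_catalog` helper are hypothetical.

```python
import json

# Assumed response shape; the records here are made up, not real catalog data.
SAMPLE_RESPONSE = json.dumps({
    "books": [
        {"id": "1", "title": "Example Title One"},
        {"id": "2", "title": "Example Title Two"},
    ]
})


def build_catalog(api_json, loc_terms_by_id):
    """Copy every API record and add a third-party 'loc_keyterms' field to it."""
    records = json.loads(api_json)["books"]
    catalog = []
    for record in records:
        enriched = dict(record)  # keep all fields the API exposed
        enriched["loc_keyterms"] = list(loc_terms_by_id.get(record["id"], []))
        catalog.append(enriched)
    return catalog


# Keyterms maintained by the third party, keyed by the (assumed) book id.
catalog = build_catalog(SAMPLE_RESPONSE, {"1": ["Psychological fiction"]})
```

Books with no entry in the third-party mapping simply get an empty keyterm list, so the catalog can be filled in gradually.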
knotyouraveragejo
LibriVox Admin Team
Posts: 22132
Joined: November 18th, 2006, 4:37 pm

Post by knotyouraveragejo »

Like the advanced search option in the catalog, the API was never fully finished when the Mellon Project wrapped up. The shortcomings of the API have their very own, very long forum thread here:

viewtopic.php?t=44129
Jo
TedL

Post by TedL »

I used the info@archive.org address (the only contact info I know about for Internet Archive) to ask this, among other questions:

"Most of the books in Books to Borrow now have hyperlinked “topics” in LoC format. This must have been a massive effort. Who did it? Was it volunteers at IA, and if so, were they librarians? Or was it done at libraries that contributed the books? I’d love to have any information that’s available about this project."

For the first time, I got a response from them longer than one sentence:

"We cannot discuss the inner working of the organization. The metadata is programmatically ingested from LoC by us by agreement with LoC. We would not do that for items uploaded by users and, the system only uses these records for text items."

LoC is Library of Congress.

Maybe this info is useful to someone.
TheBanjo

Post by TheBanjo »

Because Librivox exposes its own API (though one, alas, that does not provide any keyterm information) and because Project Gutenberg make their whole catalogue available as a spreadsheet, it turns out to be a relatively simple exercise to iterate through the Librivox catalog, extract corresponding keyterms from the Project Gutenberg catalog for Librivox audiobooks that cite a Project Gutenberg source, and spit out the results as a spreadsheet. Here is that spreadsheet: https://drive.google.com/file/d/1-0LIq7K3wDBvync73my9VWm3dcXbHQ-Q/view?usp=sharing You will see that it contains Project Gutenberg keyterms for around 10,670 audiobooks in our collection that cite a Project Gutenberg text as their source.
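The script that produced the spreadsheet is not shown here, so the following Python sketch only illustrates the general approach under stated assumptions. It assumes each audiobook record has a `url_text_source` field holding the source URL. It also assumes the Project Gutenberg catalog is a CSV with `Text#` and `Subjects` columns, with subjects separated by semicolons. The sample rows and helper names are hypothetical.

```python
import csv
import io
import re

# Assumed URL patterns for a Project Gutenberg source link.
PG_ID = re.compile(r"gutenberg\.org/(?:ebooks|etext)/(\d+)")

# Made-up catalog excerpt in the assumed column layout.
PG_CSV_SAMPLE = (
    "Text#,Title,Subjects\n"
    '99999,Sample Title,"Sample subject one; Sample subject two"\n'
)

# Made-up audiobook records; only the first cites a Project Gutenberg source.
SAMPLE_AUDIOBOOKS = [
    {"title": "Sample Audiobook A",
     "url_text_source": "https://www.gutenberg.org/ebooks/99999"},
    {"title": "Sample Audiobook B",
     "url_text_source": "https://example.org/some-other-source"},
]


def pg_number(source_url):
    """Extract the Project Gutenberg ebook number from a source URL, if any."""
    match = PG_ID.search(source_url or "")
    return match.group(1) if match else None


def load_pg_subjects(csv_text):
    """Map ebook number -> list of subject strings from the catalog CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["Text#"]: [s.strip() for s in row["Subjects"].split(";") if s.strip()]
        for row in reader
    }


def keyterms_for(audiobooks, pg_subjects):
    """Return title -> PG subjects for audiobooks that cite a known PG text."""
    result = {}
    for book in audiobooks:
        number = pg_number(book.get("url_text_source"))
        if number in pg_subjects:
            result[book["title"]] = pg_subjects[number]
    return result


subjects = load_pg_subjects(PG_CSV_SAMPLE)
matched = keyterms_for(SAMPLE_AUDIOBOOKS, subjects)
```

Audiobooks whose source is not a recognizable Project Gutenberg link are skipped, which matches the point above that the spreadsheet only covers titles citing a Project Gutenberg text.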

Armed with this spreadsheet, I imagine it would be relatively simple for Librivox programmers to update the keyterms in the catalog at librivox.org if a decision was taken to do this. Options when doing so could include (a) adding the keyterms in this spreadsheet to the existing keyterms in the librivox.org catalog; (b) entirely replacing the existing keyterms in the librivox.org catalog with the keyterms in this spreadsheet; or (c) merging the two sets of keyterms by, for example, dropping any keyterms currently in the librivox.org catalog that are duplicated in, or appear in, the PG set for the same book, but retaining any that don't.
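The three options can be sketched in a few lines of Python. This is only an illustration: the `merge_keyterms` function and its mode names are hypothetical, and for option (c) "appear in" is interpreted as "is contained, ignoring case, in one of the PG keyterms", which is one possible reading.

```python
def merge_keyterms(existing, pg_terms, mode):
    """Combine existing catalog keyterms with PG keyterms for one book."""
    if mode == "add":
        # (a) keep everything already there and append the PG keyterms
        return list(existing) + list(pg_terms)
    if mode == "replace":
        # (b) discard the existing keyterms entirely
        return list(pg_terms)
    if mode == "merge":
        # (c) drop existing keyterms that duplicate, or are contained in,
        #     a PG keyterm (assumed interpretation), and keep the rest
        pg_lower = [term.lower() for term in pg_terms]
        kept = [
            term for term in existing
            if not any(term.lower() in pg_term for pg_term in pg_lower)
        ]
        return kept + list(pg_terms)
    raise ValueError("unknown mode: " + mode)


# Made-up keyterms for a single book.
existing_terms = ["fiction", "bears"]
pg_set = ["Psychological fiction"]

added = merge_keyterms(existing_terms, pg_set, "add")
replaced = merge_keyterms(existing_terms, pg_set, "replace")
merged = merge_keyterms(existing_terms, pg_set, "merge")
```

In this example the merge mode drops "fiction" because it is contained in "Psychological fiction", and keeps "bears" because the PG set has nothing like it.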

Two larger issues, with not so simple a solution, relate to how, if this change were implemented, users would be able to search against these keyterms.

First, our existing keyterms search facility is woefully inadequate (as I have mentioned in a recent posting in this forum under a different heading) and, to be useful, really needs to provide options such as "search for this exact keyterm" or "search for audiobooks whose keyterms include any of the following comma-separated phrases". The current search facility returns junk when a keyterm made up of two or more space-separated words is entered.
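The second option, matching any of several comma-separated phrases, could look something like the Python sketch below. It treats each comma-separated piece as one whole phrase rather than splitting it into words. The function names and sample catalog are hypothetical, and this is not a description of how the site's search is built.

```python
def parse_phrases(query):
    """Split a comma-separated query into normalized whole phrases."""
    return [" ".join(part.lower().split()) for part in query.split(",") if part.strip()]


def any_phrase_search(query, books):
    """Return titles with at least one keyterm equal to one of the phrases."""
    wanted = set(parse_phrases(query))
    hits = []
    for title, keyterms in books.items():
        normalized = {" ".join(term.lower().split()) for term in keyterms}
        if wanted & normalized:
            hits.append(title)
    return hits


# Made-up catalog: title -> keyterms.
SAMPLE_CATALOG = {
    "Book A": ["Psychological fiction"],
    "Book B": ["Fiction"],
    "Book C": ["Sea stories"],
}

phrases = parse_phrases("psychological fiction,  Sea   stories")
found = any_phrase_search("psychological fiction, sea stories", SAMPLE_CATALOG)
```

Here "Book B" is not returned, because the single word "fiction" is never matched on its own.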

Second, right now users of our catalog have no way of seeing what keyterms are associated with each audiobook. If they can't see that information, it will be very difficult to guess at what might be intelligent keyterms to enter when searching for another audiobook that might be about a similar topic.
Last edited by TheBanjo on March 1st, 2024, 1:16 pm, edited 2 times in total.
TheBanjo

Post by TheBanjo »

knotyouraveragejo wrote: February 29th, 2024, 5:12 pm Like the advanced search option in the catalog, the API was never fully finished when the Mellon Project wrapped up. The shortcomings of the API have their very own, very long forum thread here:

viewtopic.php?t=44129

Wow! That's really interesting. Looks like I may have just wasted a few hours building a spreadsheet of PG keyterms for our audiobooks with PG sources, when in fact, as I have learned from the thread you just pointed me to, Internet Archive exposes an API that would (presumably) allow one to grab all their keyterms programmatically. TedL (separate post in this thread) has just identified that they use LOC keyterming. All we'd have to do is strip out their keyterms (from memory) "librivox" and "audiobook", and we'd be effectively keyterming our whole collection using LOC keyterms. That is, of course, if we thought it was worth the effort to do so. And I most certainly appreciate that is a real consideration.
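The stripping step might look like the Python sketch below. It assumes the subjects for an item arrive either as a single semicolon-separated string or as a list of strings. The `clean_subjects` helper and the default exclusion list are hypothetical. Whether the subjects on LibriVox items at Internet Archive are in fact LOC headings is not confirmed in this thread.

```python
def clean_subjects(subject_field, exclude=("librivox", "audiobook")):
    """Normalize a subject field to a list and drop generic collection tags."""
    if subject_field is None:
        return []
    if isinstance(subject_field, str):
        parts = subject_field.split(";")  # assumed separator for string values
    else:
        parts = list(subject_field)
    drop = {term.lower() for term in exclude}
    cleaned = []
    for part in parts:
        term = part.strip()
        if term and term.lower() not in drop:
            cleaned.append(term)
    return cleaned


# Made-up subject values in the two assumed shapes.
from_string = clean_subjects("librivox; audiobook; Psychological fiction")
from_list = clean_subjects(["LibriVox", "Sea stories"])
```

The comparison ignores case, so "LibriVox" and "librivox" are both removed, while the remaining terms keep their original spelling.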

I must say, the more I look at this, the clearer it becomes that we actually have ready access to a very decent body of authoritative keyterms that we could pretty easily, programmatically, associate with our own librivox.org catalog. The big question in my mind would be: is there enough value for us in doing this to warrant the effort, when a user could just as easily search Internet Archive right now to find the same information? I guess, in a sense, that comes down to what we see our mission as being: creating audiobooks, publishing audiobooks, or creating AND publishing audiobooks. My sense is that we're mainly in the "creating audiobooks" game. And that's, personally, where I'm at.
TedL

Post by TedL »

TheBanjo wrote: March 1st, 2024, 12:58 pm [...] Second, right now users of our catalog have no way of seeing what keyterms are associated with each audiobook. If they can't see that information, it will be very difficult to guess at what might be intelligent keyterms to enter when searching for another audiobook that might be about a similar topic.
Great work! Regarding the question of how do users know what keywords to use: Yes, using LOC keywords is much more difficult than using "common sense" keywords, and librarians have spent a lot of time and effort trying to come up with alternatives. In my opinion it is better that we don't get involved in that quest. I think your approach of making keyword phrases for a book visible to the user is the right approach. They find a book in their subject, it gives them the right search term, and they search for all books in that category.
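The "find one book, then follow its keyterm to the rest" flow amounts to an index from keyterm to titles. Here is a minimal Python sketch of that idea, with made-up data and hypothetical function names. It says nothing about how the real catalog stores its keyterms.

```python
from collections import defaultdict


def build_index(books):
    """Map each keyterm (lower-cased) to the titles that carry it."""
    index = defaultdict(list)
    for title, keyterms in books.items():
        for term in keyterms:
            index[term.lower()].append(title)
    return index


def related_by_keyterm(title, books, index):
    """For each keyterm shown on a book's page, list the other books sharing it."""
    return {
        term: [other for other in index[term.lower()] if other != title]
        for term in books[title]
    }


# Made-up catalog: title -> keyterms.
SAMPLE_SHELF = {
    "Book A": ["Sea stories", "Adventure stories"],
    "Book B": ["Sea stories"],
    "Book C": ["Adventure stories"],
}

shelf_index = build_index(SAMPLE_SHELF)
links = related_by_keyterm("Book A", SAMPLE_SHELF, shelf_index)
```

Each keyterm displayed on a book's page would then act as a link to the list of other titles filed under the same term.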

Maybe this is apples and oranges, but I wonder if the 'fix' you have in mind would also work on the Librivox YouTube page. It has the same issue as the two websites: lots of people read books that were recently published, but when they can no longer scroll down easily to them, books with unfamiliar titles or authors are no longer seen. Again, a subject search would help a lot.
TedL

Post by TedL »

TheBanjo wrote: March 1st, 2024, 1:08 pm [...] I guess, in a sense, that comes down to what we see our mission as being: creating audiobooks, publishing audiobooks, or creating AND publishing audiobooks. My sense is that we're mainly in the "creating audiobooks" game. And that's, personally, where I'm at.
You're not wrong about "our" mission, but maybe it depends on how you look at this. Librivox already gets some 600 volunteers a year. I've looked at our forums and info for volunteers, and it doesn't seem to me that Librivox focuses on trying to recruit volunteers with the skills and inclination to work on databases and programming and website development. I think that like most other organizations, we could have departments that specialize in issues like this. That would take this burden off the people who want to focus on audiobook production.

And as I've said before, I think most readers would appreciate knowing that Librivox management is doing all it can to ensure that their books are being put in front of as many readers as possible, for years after they are published. Speaking as a reader, that matters to me.