How Often are Librivox Books Listened To?

redrun
LibriVox Admin Team
Posts: 3186
Joined: August 11th, 2022, 8:32 pm

Post by redrun »

TedL wrote: February 20th, 2024, 5:21 am Librivox.org

Here are some metrics on visits to Librivox.org, and visits to the book pages within the site. I used my subscription to Ubersuggest at <link removed> to get these metrics, because Librivox does not put them on its website.
Technical point here, as one of those volunteers: LibriVox doesn't share these numbers partly because LibriVox doesn't collect them.

This service gets its information from people being tracked (read: spied on) as they browse the internet. Many (most?) sites cooperate and help track their users, but LibriVox does not, and I wouldn't ever be a party to changing that. This means you should take their information with a generous pinch of salt.

We also don't keep page view counts of our own - not counting at all is the only way to be sure nothing tracks who is viewing which pages. Keeping counts would also take space in a database, and volunteer time to implement, as Tricia mentions.

Edited: I'd rather not link or directly mention commercial sites more than necessary. I've removed the link and some of the mentions from my post.
TedL
Posts: 599
Joined: October 24th, 2022, 3:06 am
Location: Wisconsin

Post by TedL »

redrun wrote: February 20th, 2024, 6:50 am
TedL wrote: February 20th, 2024, 5:21 am Librivox.org

Here are some metrics on visits to Librivox.org, and visits to the book pages within the site. I used my subscription to Ubersuggest at https://app.neilpatel.com/ to get these metrics, because Librivox does not put them on its website.
Technical point here, as one of those volunteers: LibriVox doesn't share these numbers partly because LibriVox doesn't collect them.

Ubersuggest gets its information from people being tracked (read: spied on) as they browse the internet. Many (most?) sites cooperate and help track their users, but LibriVox does not, and I wouldn't ever be a party to changing that. This means you should take Ubersuggest's information with a generous pinch of salt.

We also don't keep page view counts on our own - the only way to be sure it doesn't track who is viewing which pages. That would take space in a database, and volunteer time to implement, as Tricia mentions.
Thanks for your feedback. I checked a couple of other sites besides Ubersuggest to see what they said about Librivox traffic, and they're pretty consistent. I have now added a mention of those sites in my post.

I opened a recent Librivox.org book page, for "To the Clouds", and found this: UA-1429228-8. It's near the bottom of the head. I think it's the Google Analytics tag used to track page views.
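For anyone who wants to repeat this check on a saved copy of a page, here is a minimal sketch in Python. It only pattern-matches text that looks like a Google Analytics ID; the regular expression and the sample page fragment are my own assumptions for illustration, and only the ID itself comes from the page mentioned above. Finding an ID shows that the tag is present in the page source, not what data is actually being collected.

```python
import re

# Assumed ID shapes: Universal Analytics IDs look like "UA-<account>-<property>",
# GA4 measurement IDs look like "G-<letters and digits>".
GA_ID_PATTERN = re.compile(r"\b(UA-\d{4,10}-\d{1,4}|G-[A-Z0-9]{6,12})\b")


def find_analytics_ids(html: str) -> list[str]:
    """Return distinct Google Analytics-style IDs, in order of first appearance."""
    seen = []
    for match in GA_ID_PATTERN.finditer(html):
        if match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# Hypothetical page fragment, made up for this example.
sample_head = """
<head>
  <title>To the Clouds</title>
  <script>
    ga('create', 'UA-1429228-8', 'auto');
    ga('send', 'pageview');
  </script>
</head>
"""

print(find_analytics_ids(sample_head))
```

The same function could be pointed at the text of any page saved from a browser's "view source" window.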

Post by redrun »

TedL wrote: February 20th, 2024, 7:37 am I opened a recent Librivox.org book page, for "To the Clouds", and found this: UA-1429228-8. It's near the bottom of the head. I think it's the Google Analytics tag used to track page views.
This does indeed look like Google Analytics. I have a blocker in place for this, but it usually also tells me there was something to block. Something to look into. :|

Honestly, I'll ask other admins about removing this. Libraries make a big point of not tracking their patrons, for several good reasons. I don't think we need to be helping Google sell information about anyone's reading habits, even if we do also use their analytics tools for some other, more benevolent purpose I'm unaware of.

I had a look at your traffic estimation tool's FAQ, though, and reading between the lines, it looks like they A) know strictly less than Google does, and B) have to engage in something like "20 Questions" to guess at whatever Google does know. Other services might well do the same thing, explaining their similar but different estimates.

Post by TedL »

redrun wrote: February 20th, 2024, 8:07 am
TedL wrote: February 20th, 2024, 7:37 am I opened a recent Librivox.org book page, for "To the Clouds", and found this: UA-1429228-8. It's near the bottom of the head. I think it's the Google Analytics tag used to track page views.
This does indeed look like Google Analytics. I have a blocker in place for this, but it usually also tells me there was something to block. Something to look into. :|

Honestly, I'll ask other admins about removing this. Libraries make a big point of not tracking their patrons, for several good reasons. I don't think we need to be helping Google sell information about anyone's reading habits, even if we do also use their analytics tools for some other, more benevolent purpose I'm unaware of.

I had a look at your traffic estimation tool's FAQ, though, and reading between the lines, it looks like they A) know strictly less than Google does, and B) have to engage in something like "20 Questions" to guess at whatever Google does know. Other services might well do the same thing, explaining their similar but different estimates.
My understanding from Google Analytics and other sites is that the default info that you get from GA is so minimal that it is not possible for a site owner to track an individual. For my site it reports the number of hits and the region they come from. I don't use Google Analytics much, but I use Google Search Console every day, and their data comes from Analytics. I could not run my website without traffic data.

I believe you can ask for data on gender, age range, etc., but whatever info they report has been approved by the EU, which has much stricter privacy standards than the U.S. Personally, I think Analytics data collection is less of a concern than it used to be. But with a little research you should be able to find that out.

Post by TedL »

TriciaG wrote: February 20th, 2024, 6:19 am
I think the solution is to improve visitors' access to the full body of the collection, so they aren't limited to browsing through the most recent or the most popular books. I suggest that making subject searches work the same way within the Librivox collection as they now work within the Internet Archive "Books to Borrow" collection would be a solution. I'll cover this in more detail in a future post.
Keep in mind that any changes to the back end (how things like searches are done) are highly dependent on getting volunteer programmers to develop and do the work. And those volunteers are very few and far between - not only do they need to have the time and inclination, they need to be able to work through and understand our "flaming pile of ----" (as one developer close to me colourfully called the code) to change it. We've got a couple of people who are currently working on some of the low-hanging fruit, but we've got dozens of changes we'd like to see done (some of higher priority than others). Feel free to make your proposals, but keep in mind that the chances of them being implemented - especially in a timely manner - are not high.
My proposal will call for two things:
1. Having searches in the Librivox collection at Internet Archive be done within the Librivox collection, and not the wider book collection, and;
2. Having volunteers edit the subjects, or topics, in book entries.

Feb 25 Update

I learned from Internet Archive that a person can easily search within only their Librivox collection, rather than their entire site collection, by going to "Collections" in the left column and clicking the Librivox collection.

End Feb 25 Update

#2 will indeed be fairly labor intensive, but maybe there are potential volunteers who don't wish to be readers or listeners. However, I think we would need only about two people to actually open records to edit the subjects. I'd love to have at least one experienced librarian involved.

I'm more concerned about making the changes at IA than at Librivox, because they will have a bigger impact there. But at Librivox, we should think about what it would take to do the same subject searches as at IA.

I'm going to wait a couple of days to release my proposal. I'm still working on it.

Note that this should have a big impact on traffic, and revive thousands of Librivox audiobooks that are now never heard. It might be worth temporarily diverting some volunteer time from the ongoing effort to bring more public domain books into the collection. At our present rate of 1,000 books a year, it will take 2,000 years to record all the public domain books that are just at Internet Archive (roughly 2 million). Is it a big deal if we extend that by 20 or 30 years?
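The arithmetic behind that estimate can be checked with a short sketch. The two input figures are the rough numbers quoted in the paragraph above, not measured values, and the function name is just for illustration.

```python
# Rough figures quoted in the post; both are estimates.
PUBLIC_DOMAIN_BOOKS_AT_IA = 2_000_000
BOOKS_RECORDED_PER_YEAR = 1_000


def years_to_record(total_books: int, books_per_year: int) -> float:
    """Years needed to record every book at a constant yearly rate."""
    return total_books / books_per_year


baseline_years = years_to_record(PUBLIC_DOMAIN_BOOKS_AT_IA, BOOKS_RECORDED_PER_YEAR)
print(f"Baseline: {baseline_years:.0f} years")

# How large is a 20- or 30-year diversion relative to that baseline?
for delay_years in (20, 30):
    share = delay_years / baseline_years * 100
    print(f"{delay_years}-year delay = {share:.1f}% of the baseline")
```

On these assumptions the delay comes to between 1% and 1.5% of the total.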
Last edited by TedL on February 25th, 2024, 1:16 pm, edited 1 time in total.

Post by redrun »

TedL wrote: February 20th, 2024, 11:02 am Personally, I think Analytics data collection is less of a concern than it used to be. But with a little research you should be able to find that out.
It's slightly less of a concern now that they don't directly sell the information, yes. Most folks don't know how much data Google has on them, and many probably wouldn't care if they did. But some do, and some certainly would.
I'm sure the limited information GA reports to site owners is still quite useful for sites that place advertisements, or make technical decisions based on viewer statistics, but we don't really do either here. That's why I'd much rather we didn't enable this data collection any more than we have to.

Sorry for the long tangent. Back to the main topic, if we've made the recordings and shared them, we've done our primary mission. I'm all for making it easier for folks to find them after that, but there are other sites that do that, as you've noted. I'll watch to see the proposal. Thank you for your time and interest, in any event. Public Domain audio books are something we all care about, or we wouldn't be here to argue about them. :wink:
TriciaG
LibriVox Admin Team
Posts: 61051
Joined: June 15th, 2008, 10:30 pm
Location: Toronto, ON (but Minnesotan to age 32)

Post by TriciaG »

#2 will indeed be fairly labor intensive, but maybe there are potential volunteers who don't wish to be readers or listeners. However, I think we would need only about two people to actually open records to edit the subjects. I'd love to have at least one experienced librarian involved.
Only LV admins (and maybe some people at Archive itself) have access to edit the data at Archive, so this would fall entirely on the admins. Not to be a downer, but I highly doubt that's going to happen. Our primary objective is making audiobooks, not making them easier to find. :hmm:
School fiction: David Blaize
America Exploration: The First Four Voyages of Amerigo Vespucci
Serial novel: The Wandering Jew
Medieval England meets Civil War Americans: Centuries Apart
knotyouraveragejo
LibriVox Admin Team
Posts: 22178
Joined: November 18th, 2006, 4:37 pm

Post by knotyouraveragejo »

We record old books. They are not everyone's cup of tea. Especially the nonfiction, much of which is only of historical interest. Keep in mind, also, that relying solely on IA data for downloads leaves out many other places/ways our recordings are available - phone apps, YouTube sites and various other online services that use our recordings, to name a few. While some of these use the IA links for all their downloads, not all of them do.
Jo

Post by TedL »

TriciaG wrote: February 20th, 2024, 12:24 pm
#2 will indeed be fairly labor intensive, but maybe there are potential volunteers who don't wish to be readers or listeners. However, I think we would need only about two people to actually open records to edit the subjects. I'd love to have at least one experienced librarian involved.
Only LV admins (and maybe some people at Archive itself) have access to edit the data at Archive, so this would fall entirely on the admins. Not to be a downer, but I highly doubt that's going to happen. Our primary objective is making audiobooks, not making them easier to find. :hmm:
What's an Admin? Is that Management? It would be helpful if you directed the attention of all of Management to this thread, as they should have a role in this decision.

I've spent 300+ hours in the last 16 months recording nonfiction books. It has been disheartening to learn that several months after their release, these books sink into a swamp and are rarely heard again. The Internet Archive provides the "Topics" field on a book record to enable books to be found by subject. Most of the millions of modern books in their "Texts to Borrow" collection, and some other collections, have used the Topics field to make their books searchable by subject. They do this in a disciplined way by using standardized Library of Congress subject headings, rather than 'common sense' terms. We should be doing the same thing.

On my own website, Century Past Free Online Library (centurypast.org), I used these Internet Archive subject searches to provide links to subject "collections", and I also provide links to recommended individual books. 75% of users' clicks are on the subject collections. Users really like to browse through all available books on a particular subject and select their own books.

On Librivox it has apparently always been left to book coordinators to fill in the topics. This method clearly has not worked to make useful subject searches on Librivox. Standardized terms must be used on every book. We should follow the Internet Archive's lead and use Library of Congress subject headings.

Post by TedL »

knotyouraveragejo wrote: February 20th, 2024, 12:54 pm We record old books. They are not everyone's cup of tea. Especially the nonfiction, much of which is only of historical interest. Keep in mind, also, that relying solely on IA data for downloads leaves out many other places/ways our recordings are available - phone apps, YouTube sites and various other online services that use our recordings, to name a few. While some of these use the IA links for all their downloads, not all of them do.
I listed some of the main sites that have Librivox audiobooks in their collection at the very top of this thread. I went to those sites, looked at their audiobook collections, and used Ubersuggest to check their traffic. These sites have selected dozens of popular 'classic' fiction books for their collections, and ignore all our other audiobooks. They provide a lot of traffic for a handful of books.

A more thorough analysis could be done by anyone with a subscription to Ubersuggest, Ahrefs, SEMrush or similar sites, by following the backlinks to Librivox or the IA Librivox collection. I suspect the results would be the same for other sites though. It makes sense that they would only put the most popular public domain audiobooks on their sites.

Visitors who come from these private sites to Librivox for a classic book would be excellent candidates to listen to many of our other books. However, they need to be able to do subject searches to find them.
TheBanjo
Posts: 1343
Joined: January 23rd, 2021, 8:19 pm
Location: Melbourne, Australia

Post by TheBanjo »

For what it's worth, both Audible.com and Spotify carry around 15 Librivox books each that I have narrated for Librivox, as you can verify if you enter my name as a search term at either distributor's website. I haven't checked, but I imagine they similarly carry Librivox recordings by many other narrators too. They have retained my name as narrator, but have otherwise stripped off anything that would identify these audiobooks as originating from Librivox, or being in the public domain.

To be clear, I'm not complaining about this in any way - I understood that something like this could happen when I started recording audio for Librivox. I mention it only to fill out even a little further the marvellously comprehensive list you have provided. You've done a marvellous job compiling this list. It surely shows, at least in a qualitative way, what a great appetite there is for Librivox recordings.
annise
LibriVox Admin Team
Posts: 38826
Joined: April 3rd, 2008, 3:55 am
Location: Melbourne,Australia

Post by annise »

Ted
I don't understand how you decide no one has listened.
Internet Archive has a good search facility, and if you sort our collection by date, most recent first, you get this:
https://archive.org/details/librivoxaudio?sort=-date

If you look at the views on each item you will see that it is 0 for at least 12 months - I got tired of scrolling down the page then.
Now I know that this is not a true count, as I have downloaded them all once to my personal computer, so they are not updating the counter.
The IA search can be changed - there is a list of filters, and it will pick up words in PDFs and our summaries.
And people do leave reviews or set up favourites. I haven't fully checked out all the possibilities, but I have fossicked around the RadioShows and series, both USA and British.

Anne
Just as an example I searched the LV collection on Saracens
https://archive.org/details/librivoxaudio?tab=collection&query=saracens&sin=TXT&sort=-date

Post by TedL »

redrun wrote: February 20th, 2024, 12:10 pm
TedL wrote: February 20th, 2024, 11:02 am Personally, I think Analytics data collection is less of a concern than it used to be. But with a little research you should be able to find that out.
It's slightly less of a concern now that they don't directly sell the information, yes. Most folks don't know how much data Google has on them, and many probably wouldn't care if they did. But some do, and some certainly would.
I'm sure the limited information GA reports to site owners is still quite useful for sites that place advertisements, or make technical decisions based on viewer statistics, but we don't really do either here. That's why I'd much rather we didn't enable this data collection any more than we have to.

Sorry for the long tangent. Back to the main topic, if we've made the recordings and shared them, we've done our primary mission. I'm all for making it easier for folks to find them after that, but there are other sites that do that, as you've noted. I'll watch to see the proposal. Thank you for your time and interest, in any event. Public Domain audio books are something we all care about, or we wouldn't be here to argue about them. :wink:
Are you aware that the number of views in Librivox increased from 300,000 two years ago to 1.7 million today? That's according to Ubersuggest. If accurate, what could be the reason for this explosion in visitors? I believe you can only figure that out by using the metrics provided to you by Google Analytics. 1.7 million views is a huge number. Are visitors coming from some other site? Ubersuggest doesn't really explain what's going on there.

And what happens to all those people when they reach the Librivox home page? Do they shuffle through a few dozen of the more recent books, or use some other method to find books of interest? Or do they just go away because we have not provided an effective way for them to find an audiobook of interest? We should try to find out what they want, and whether their visit to Librivox.org was successful.

I'd ask the same question about the many millions of visitors to the IA collection of audiobooks. Their 20 million views per month probably means roughly 10 million visitors. That's a stupendous number of visitors, and we should all be grateful to IA for delivering such a huge number of public domain audiobook users right to our doorstep. But what are we doing to help all these people find the books they want?

As a site owner with fewer than 10 thousand visitors a month, I'm envious of the number of people who visit your collections. I do everything I can to keep people on my site and help them find what interests them. Even though I don't make a dime from my site, it's gratifying to know that I've helped them find something good to read without cost.

Post by TedL »

TheBanjo wrote: February 21st, 2024, 4:28 am For what it's worth, both Audible.com and Spotify carry around 15 Librivox books each that I have narrated for Librivox, as you can verify if you enter my name as a search term at either distributor's website. I haven't checked, but I imagine they similarly carry Librivox recordings by many other narrators too. They have retained my name as narrator, but have otherwise stripped off anything that would identify these audiobooks as originating from Librivox, or being in the public domain.

To be clear, I'm not complaining about this in any way - I understood that something like this could happen when I started recording audio for Librivox. I mention it only to fill out even a little further the marvellously comprehensive list you have provided. You've done a marvellous job compiling this list. It surely shows, at least in a qualitative way, what a great appetite there is for Librivox recordings.
I found your books on Audible.com (your own title and 15 works by Joseph Conrad). They are all listed with regular prices between $5 and $11, feature your name as the reader, and, as you say, they don't mention that they were produced by Librivox. Even the sample audio (I like your reading style!) omits the announcement that they are Librivox books. Very unethical of Amazon. However, anyone who actually listens to one of these will soon hear they are from Librivox, and many will visit Librivox.org. It's our job to then help them navigate our collection, as thousands of our books won't be on Audible.com.

Post by TedL »

annise wrote: February 21st, 2024, 4:34 am Ted
I don't understand how you decide no one has listened.
Internet Archive has a good search facility, and if you sort our collection by date, most recent first, you get this:
https://archive.org/details/librivoxaudio?sort=-date

If you look at the views on each item you will see that it is 0 for at least 12 months - I got tired of scrolling down the page then.
Now I know that this is not a true count, as I have downloaded them all once to my personal computer, so they are not updating the counter.
The IA search can be changed - there is a list of filters, and it will pick up words in PDFs and our summaries.
And people do leave reviews or set up favourites. I haven't fully checked out all the possibilities, but I have fossicked around the RadioShows and series, both USA and British.

Anne
Just as an example I searched the LV collection on Saracens
https://archive.org/details/librivoxaudio?tab=collection&query=saracens&sin=TXT&sort=-date
Some good points, and if I said no one listens, I may have exaggerated.

Browsing the list sorted by 'date published' is, I would guess, the most popular way people look through our books. That would explain why books get seen a lot for a few months after their release. The list of 19,000 books is simply too long, and most people won't see more than a couple of hundred books. If a book has been out even just 6 months, most people won't get that far by browsing. Even worse for books that were released years ago.

As to your search for saracens: that worked, and you found 7 books. I searched for Saracens, and found 9 books, 5 of them the same as yours. Then I searched for saracen (singular) and found 4 more books; none were the same as those that turned up in your search.

Most of the books that resulted from searches appeared because saracen appeared in the summary text, not in the topics list. This is valuable and I'm glad it does this. However, if you want search results of audiobooks that are mainly about a particular subject, we also need to be able to search from the topics (or subjects) for books that have the search term in the topics list. Then the search results aren't overwhelmed with books marginally related to a subject.
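To make the distinction concrete, here is a rough Python sketch of a search limited to the topics list rather than the summary text. It only builds a query URL and does not contact Internet Archive. I am assuming that the archive.org advanced search endpoint accepts a fielded query, and that the "Topics" shown on a book record are stored in a field named `subject`; both should be checked against the Internet Archive documentation before relying on this.

```python
from urllib.parse import urlencode

# Assumed endpoint for Internet Archive's advanced search.
ADVANCED_SEARCH_URL = "https://archive.org/advancedsearch.php"


def build_subject_query_url(collection: str, subject: str, rows: int = 50) -> str:
    """Build a search URL limited to one collection and one subject heading."""
    # Assumption: "subject" is the field behind the Topics list on a record.
    query = f'collection:{collection} AND subject:"{subject}"'
    params = [
        ("q", query),
        ("fl[]", "identifier"),  # fields to return for each match
        ("fl[]", "title"),
        ("rows", str(rows)),
        ("output", "json"),
    ]
    return f"{ADVANCED_SEARCH_URL}?{urlencode(params)}"


url = build_subject_query_url("librivoxaudio", "Saracens")
print(url)
```

A query like this would only return books whose topics list contains the term, which is why the topics need to be filled in consistently for it to be useful.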

Yesterday I searched for books on cars. I tried these search terms and got different results for each: Car, car, Cars, cars, Auto, auto, Autos, autos. I didn't try vehicle or motor vehicle, and I didn't search for other terms that may have been used for autos in the 1890s-1920s, like "motor". So we need to use standardized Library of Congress terms in the Topics field (in this case, "Automobiles").
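As a small illustration of what standardizing would mean, the sketch below maps the search variants listed above onto one heading. The lookup table is hypothetical and hand-made for this example; a real project would take its headings from the Library of Congress lists rather than from a table like this.

```python
from typing import Optional

# Hypothetical lookup table: free-text variants mapped to one standard heading.
HEADING_FOR_TERM = {
    "car": "Automobiles",
    "cars": "Automobiles",
    "auto": "Automobiles",
    "autos": "Automobiles",
    "automobile": "Automobiles",
    "automobiles": "Automobiles",
}


def standard_heading(term: str) -> Optional[str]:
    """Return the standard heading for a free-text term, or None if unknown."""
    # Ignore capitalisation and stray whitespace so "Car" and " car " match.
    key = " ".join(term.lower().split())
    return HEADING_FOR_TERM.get(key)


for term in ["Car", "car", "Cars", "cars", "Auto", "auto", "Autos", "autos"]:
    print(f"{term!r} -> {standard_heading(term)}")
```

With every book on the subject tagged "Automobiles", one topic search would find them all, whichever word a cataloguer or listener first thought of.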