Including reader's name in the MP3 tags

quartertone
Posts: 295
Joined: December 27th, 2022, 2:27 pm
Location: Narnia

Post by quartertone »

Is there a(n) (easy) way to add the reader's name somewhere in the ID3 tag in the final MP3 files prior to cataloguing?
(Possible slots might be "Composer" or "Comment")

It would be nice to be able to see who the narrator is with an easy glance, without having to go back and listen to the beginning of the track, end of the track, or go searching in the catalog if the reader decided not to speak their name.

I recognize this may generate additional work for the MCs, but it would definitely add to the listening experience. And isn't that our goal? (The making good listening experience part, not the making more work for MCs part) :D

Besides, there are programs/apps that can fill out the tags based on the filenames with some specified file name structure, so this can be (mostly) automated.

I'm happy to help if it's something I can help with.
annise
LibriVox Admin Team
Posts: 38802
Joined: April 3rd, 2008, 3:55 am
Location: Melbourne,Australia

Post by annise »

I can't see any need, and it would make changing names impossible, which does happen. I can think of a number of cases where we have been asked to take things down and we have just changed the name.
Also, what is in the tags goes out into all the world and cannot be changed.

Anne
quartertone
Posts: 295
Joined: December 27th, 2022, 2:27 pm
Location: Narnia

Post by quartertone »

ID3 tags are not permanent. Or rather, they are just as permanent and public as the data that is displayed in the LV catalog pages. They can be updated; we just need to re-upload the edited file.

Take things down as in removing a reader's association with their recordings?
But what if they had recorded their name in their reading?
annise
LibriVox Admin Team
Posts: 38802
Joined: April 3rd, 2008, 3:55 am
Location: Melbourne,Australia

Post by annise »

Anyone using our files for any purpose takes the files with our mp3 tags - what they do with them then is up to them. I would also say that a very small % of our readers do read their names - but that may depend on what I listen to and what you listen to. I've no stats to prove either of our opinions :D . Anne
redrun
LibriVox Admin Team
Posts: 3141
Joined: August 11th, 2022, 8:32 pm

Post by redrun »

If you'd like to help with automating fixes for MP3 files, using PHP, I'd be glad to talk details elsewhere... but putting reader names in ID3 tags is probably a non-starter because it makes for a more brittle system.

ID3 tags may as well be permanent, for all the work it would take to change them on every MP3 file a reader might ever create.
This is partly because changing MP3 files in a completed project involves a lot more steps than one might assume (honestly, I'd rather cut an intro with Audacity than do some of the other required steps). Automating the process would be nice for other reasons, but still wouldn't solve the other serious problem Anne mentions: the old MP3 file is still out there in wide distribution.

That problem is part of why such care has been taken to point listeners to LibriVox.org "for more information". It is far easier to change the name once, in our own database, after which all future visitors will see what we want them to see.

Yes, web pages can be scraped, and some apps will keep using old data from our API, but most people will see the updated version of our web pages, and app-makers that bother to use our API can get updated information from it far more cheaply than by transferring MP3 files and checking tags.
barleyguy
Posts: 271
Joined: July 23rd, 2014, 1:56 pm

Post by barleyguy »

Just for consideration, there may be some readers who don't read their names in the audio and also wouldn't want their names in the ID3 tags. In other words, doing this across the board would be seen as an improvement by some people, and a defect by other people.
So that's what an invisible barrier looks like... (Time Bandits)
barleyguy
Posts: 271
Joined: July 23rd, 2014, 1:56 pm

Post by barleyguy »

Redrun, I have a question and suggestion... When a reader's name changes in the database, does their index number stay the same?

If so, one possible value for whichever ID3 tag is chosen for the reader is something like "librivox.org/reader/251". (That's the link to Mark Nelson's reader page.) If the index number stays the same when other database changes are made, that link would always be valid as long as LibriVox keeps the same URL format, and if a listener used that link it would drive them to the website even if they got the MP3 somewhere else.

Just an idea,

Harley.
redrun
LibriVox Admin Team
Posts: 3141
Joined: August 11th, 2022, 8:32 pm

Post by redrun »

I think the initial question was whether there was an easy way to do this, and... I suppose that could depend who you are. :lol:

barleyguy wrote: May 16th, 2024, 9:12 am If the index number stays the same when other database changes are made, that link would always be valid as long as LibriVox keeps the same URL format, and if a listener used that link it would drive them to the website even if they got the MP3 somewhere else.
I believe the link does stay the same, so yes, the reader's URL instead of their name would be "safe" to include with the file. That's one question answered! :mrgreen:

But if we're looking at including a link, it probably shouldn't go someplace where either a name or a human-readable comment is expected. (In the Comment field, we might preface the link with something like "This reader's page at LibriVox.org : "... but we'd want that translated, for each language we support. :? )

Perhaps the "WOAR" (Webpage of Official ARtist) tag from the ID3 specification would be appropriate. We might even want to add "WOAS" (Webpage of Official Audio Source) pointing to the project's LibriVox catalog page, while we're at it. Do modern audio players ever display these links, if present? I've never seen them myself, though perhaps that doesn't signify. I don't often see "Composer" or "Comment", either.


If anyone would like to gauge the difficulty, this is the bit of code that ties things together. How do we represent a "WOAR" or a "Comment" tag to the underlying library? Myself, I'd have to read it to find out! I do know that `$section->readers` is an array, which usually has only one entry. We probably want to skip the reader tag if there are more (i.e., in Dramatic Works).
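
The "skip the reader tag if there are more" rule could be sketched roughly as follows. This is only an illustration under assumptions: `reader_url_tag` is a hypothetical helper (not part of the LibriVox code), each reader is assumed to be an object with a `reader_id` property, and `url_artist` is assumed to be the key the tag-writing library maps to WOAR.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns a tag fragment holding the reader's page URL,
 * or an empty array when the section does not have exactly one reader.
 */
function reader_url_tag(array $readers): array
{
    // Skip the tag for multi-reader sections (e.g. Dramatic Works)
    // and for sections with no reader assigned.
    if (count($readers) !== 1) {
        return [];
    }

    $reader = reset($readers);

    // URL pattern as quoted earlier in the thread; 'url_artist' is an assumed key.
    return ['url_artist' => ['https://librivox.org/reader/' . $reader->reader_id]];
}

// Usage with stand-in objects (not real database rows).
$solo = [(object) ['reader_id' => 251]];
$cast = [(object) ['reader_id' => 251], (object) ['reader_id' => 252]];

print_r(reader_url_tag($solo)); // one-reader section: fragment with the URL
print_r(reader_url_tag($cast)); // multi-reader section: empty array
```

The returned fragment could be merged into the existing tag array, so a section with several readers simply gets no reader link at all.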
quartertone
Posts: 295
Joined: December 27th, 2022, 2:27 pm
Location: Narnia

Post by quartertone »

Ah ok, I'm beginning to see the reasoning for keeping the reader names out of the ID3 tags.

Although, I do like the idea of putting the reader id or catalog link in there somewhere.

To redrun's comment about audio players displaying tag data: I don't think it necessarily needs to be displayed by the player. I mean, it would be nice, but just by being present it would provide useful metadata for anyone who might be looking.

As far as multi-reader sections, I think it would be fine to have multiple reader IDs listed in whatever field would hold the narrator data. The presence of these data could be explained in a LV wiki article for example.

If we were to insert the reader IDs into the tags, I think it would make most sense to put them in the "Comment" field.

I think something like this could work:

Code: Select all

function _build_tags($project, $section)
{
	$track = $section->section_number + $project->has_preface; //track 0 --> 1; track 1 --> 1

	$title = $this->_build_chapter_title($project, $section);

	// -------------------
	// Collect the ID of every reader assigned to this section.
	$sectionreaders = [];
	foreach ($section->readers as $reader) {
		$sectionreaders[] = $reader->reader_id;
		// or: $sectionreaders[] = "https://librivox.org/reader/" . $reader->reader_id;
	}
	// -------------------

	// populate data array
	return $tag_data = array(
		'title'	=> array($title),  //title is for chapter
		'artist'	=> array($section->author),
		'album'	=> array($project->full_title),
		//'year'	=> array('2004'),
		'genre'	=> array('speech'),
		'playtime_string' => array('0:00'),
		'track'	=> array($track),

		// -------------------
		// Join the IDs into one string so the comment stays a flat array of strings.
		'comment'	=> array(implode(', ', $sectionreaders)),
		// or
		// 'url_artist'	=> $sectionreaders, // This plugs into "WOAR" according to the getid3 parser
		// -------------------

		//language??
	);

}
I made a pull request for this (my first ever!)
redrun
LibriVox Admin Team
Posts: 3141
Joined: August 11th, 2022, 8:32 pm

Post by redrun »

Well, then we've got something to work with! Thank you for doing the hard work of digging up the proper details for this library, and testing out the code that would change it!
We are very much not guaranteeing that every proposed change gets into the code, but I'm happy to talk about this with you.

quartertone wrote: May 19th, 2024, 8:43 am As far as multi-reader sections, I think it would be fine to have multiple reader IDs listed in whatever field would hold the narrator data. The presence of these data could be explained in a LV wiki article for example.
Here's where I'm coming from, as it regards readers in Dramatic Works:
A) sometimes there are quite a few, and I don't know the limits on ID3 tags well enough to be sure we don't fall afoul of them!
B) the reader assignments aren't always in the database in ways that make sense for this per-file context. For example, one officially acceptable way for the BC to set things up is to add every reader in the whole project to just the first section! The few projects that do this would seem to be "bugged", unless someone thought and took the time to explain this in a wiki page... and you happened to read it.

It's good to provide useful information, but I'm not always sure where the line is between useful, and misleading or confusing. The most useful places to send people for information on readers in Dramatic Works are the catalog page for the project; and the API, which has an HTML-formatted cast list in the 'description' field for the project.


More generally:

Myself, I think the "alternative version" of your code is by far the more useful one. Nobody but a programmer (or at least, someone who can bash something together on the command line) is going to have any clue what to do with a 'Comment' on an MP3 file that is... just a number, without any context. Even those tech-savvy people would have to know to go looking for a new wiki page, which would need to be filled with useful information and then linked someplace where it could be found.

As for that alternate version, it seems nice to have. If we can guarantee that it won't ever add more/longer tags than are allowed in the spec, and we can be reasonably certain it won't break things in any other way, then I'd like to see it added. To be perfectly clear though, mine is not the final say.

Edited to add: Looking at how this WOAR is labeled 'url_artist' by this library, I do see another potential pitfall. We have the author in as the 'artist', already. Some players might render this by showing the author's name as a clickable link... which would lead to the reader page! Talk about misleading! Perhaps if we had the author in all along as the 'lyricist/text-writer', we wouldn't have this added source of confusion. Too late to change that.
Maybe the reader URL as 'User defined URL link frame' would be better, but we'd both probably need to do some digging to be sure. :hmm:

If it seems like I'm just trying to come up with objections, that's to save somebody else that same job. For any code that goes beyond being a toy, and into real use by other people, we need to make sure not to introduce more problems than we solve. It also pays to ask what problem we are solving, for whom, and whether there's already another solution.
End of edit

I think it would also be nice if we had a WOAS (or maybe WPUB?) tag linking to the project itself - it could be useful to anyone who found it, and the tech-savvy could use it to find any project in our API, given just an MP3 file... any project published after we'd start adding these new tags, that is. Every good change starts somewhere!
quartertone
Posts: 295
Joined: December 27th, 2022, 2:27 pm
Location: Narnia

Post by quartertone »

Many good points!
I sometimes (aka always) forget that most people aren't programmers or hackers, and that "good enough for me" usually means "really confusing for most people".

I see your point that the reader data is not very consistent and could easily get unwieldy (eg DR as you mentioned).

I think a reasonable distillation of this whole thing could be to put the project name & url in the tags so that at least any mp3 file taken out of context can be traced back to the catalog page, from where it might be possible to identify the reader, etc, precisely as you described:
redrun wrote: May 19th, 2024, 10:20 am I think it would also be nice if we had a WOAS (or maybe WPUB?) tag linking to the project itself - it could be useful to anyone who found it, and the tech-savvy could use it to find any project in our API, given just an MP3 file... any project published after we'd start adding these new tags, that is. Every good change starts somewhere!
I like this idea!

So then the _build_tags function would be much simpler.

The foreach loop goes away, and all we would need to do is add to the $tag_data array:

Code: Select all

"url_source" => array($project->url_librivox)
or url_publisher, though I think url_source would be more appropriate.
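
Put together, the simplified function might look roughly like this. It is a standalone sketch, not the actual LibriVox code: `build_tags_sketch` is a hypothetical name, the title is passed in instead of coming from `_build_chapter_title`, the field list is abbreviated, and the project and section objects are stand-ins with made-up values.

```php
<?php
declare(strict_types=1);

/**
 * Sketch of the simplified tag builder: the existing fields plus the project URL.
 * $title is passed in here so the example stands alone.
 */
function build_tags_sketch(object $project, object $section, string $title): array
{
    // Same track arithmetic as the function quoted earlier in the thread.
    $track = $section->section_number + $project->has_preface;

    return [
        'title'      => [$title],
        'artist'     => [$section->author],
        'album'      => [$project->full_title],
        'genre'      => ['speech'],
        'track'      => [$track],
        // The one new entry: the project's catalog page URL.
        'url_source' => [$project->url_librivox],
    ];
}

// Stand-in data for illustration only.
$project = (object) [
    'full_title'   => 'Example Project',
    'has_preface'  => 0,
    'url_librivox' => 'https://librivox.org/example-project/',
];
$section = (object) ['section_number' => 1, 'author' => 'Example Author'];

$tags = build_tags_sketch($project, $section, 'Chapter 1');
echo $tags['url_source'][0], PHP_EOL;
```

Since every project has exactly one catalog URL, there is no loop and no multi-reader special case to handle.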

Side note: according to wikipedia (ID3):
An ID3v2 tag consists of a number of optional frames, each of which contains a piece of metadata up to 16 MB in size. For example, a TT2 frame may be included to contain a title. The entire tag may be as large as 256 MB, and strings may be encoded in Unicode.
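
For anyone curious where those numbers come from: the 256 MB figure matches the ID3v2 tag header, which stores the tag size in four bytes with only seven usable bits each (a "synchsafe" integer), and 16 MB is what a three-byte frame size field allows. Here is a rough sketch of that arithmetic, with a hypothetical guard a tag builder could run before writing a value. `fits_in_frame` and the constant names are illustrative only and not part of any library.

```php
<?php
declare(strict_types=1);

// Four size bytes, seven usable bits each: 28 bits in total (just under 256 MiB).
const ID3V2_MAX_TAG_BYTES = (1 << 28) - 1;

// Frame limit matching the quoted 16 MB figure (a three-byte size field).
const QUOTED_MAX_FRAME_BYTES = (1 << 24) - 1;

/** Decode a four-byte synchsafe integer such as the ID3v2 header size field. */
function synchsafe_decode(string $bytes): int
{
    $value = 0;
    foreach (str_split($bytes) as $byte) {
        // Only the low seven bits of each byte carry data.
        $value = ($value << 7) | (ord($byte) & 0x7F);
    }
    return $value;
}

/** Hypothetical guard: does this value fit within the given frame limit? */
function fits_in_frame(string $value, int $limit = QUOTED_MAX_FRAME_BYTES): bool
{
    return strlen($value) <= $limit;
}

// Largest encodable tag size: all four bytes set to 0x7F.
echo synchsafe_decode("\x7F\x7F\x7F\x7F"), PHP_EOL;
var_dump(fits_in_frame('https://librivox.org/reader/251'));
```

A reader or project URL is a few dozen bytes, so it sits far below either limit.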
redrun
LibriVox Admin Team
Posts: 3141
Joined: August 11th, 2022, 8:32 pm

Post by redrun »

quartertone wrote: May 19th, 2024, 1:27 pm So then the _build_tags function would be much simpler.

The foreach loop goes away, and all we would need to do is add to the $tag_data array:

Code: Select all

"url_source" => array($project->url_librivox)
or url_publisher, though I think url_source would be more appropriate.

Side note: according to wikipedia (ID3):
An ID3v2 tag consists of a number of optional frames, each of which contains a piece of metadata up to 16 MB in size. For example, a TT2 frame may be included to contain a title. The entire tag may be as large as 256 MB, and strings may be encoded in Unicode.
That is much simpler! And as such, less likely to find objections. We do have the project title in as 'album' (TALB), but either of those two tags sounds like a fair fit for the project URL, depending on your viewpoint.

If you'd like to send in a new PR for this (with either tag type), then we'll have something for further discussion and testing. Please be aware, though: reviewing code changes, coming up with ways to check if they'll go wrong, and otherwise making absolutely sure someone has thought of every way it could break something... is not what anyone signed up to do with their volunteer time. I try to make it less of a chore, but do not be surprised if this still sits on the back-burner for an indeterminate time.

Side: I was looking at the v2 spec for frame ideas, but I wasn't actually sure that we were using v2 instead of v1 of ID3. Can confirm now! But v1 was not so generous on its limits as v2 is.
quartertone
Posts: 295
Joined: December 27th, 2022, 2:27 pm
Location: Narnia

Post by quartertone »

redrun wrote: May 19th, 2024, 6:59 pm If you'd like to send in a new PR for this (with either tag type), then we'll have something for further discussion and testing. Please be aware, though: reviewing code changes, coming up with ways to check if they'll go wrong, and otherwise making absolutely sure someone has thought of every way it could break something... is not what anyone signed up to do with their volunteer time. I try to make it less of a chore, but do not be surprised if this still sits on the back-burner for an indeterminate time.

Side: I was looking at the v2 spec for frame ideas, but I wasn't actually sure that we were using v2 instead of v1 of ID3. Can confirm now! But v1 was not so generous on its limits as v2 is.
Will do!
And yes, I also did check to make sure that it was the ID3v2 we are currently using. :D