Advanced search of Librivox site by keywords (exact match) yields anomalous result

redrun
LibriVox Admin Team
Posts: 2940
Joined: August 11th, 2022, 8:32 pm

Post by redrun »

Yes, we've had several volunteers working to improve it here and there. Some are changes to the site's behavior as seen by visitors, and others are more "make it easier to change this thing without breaking it". I believe the last search update was the latter, as we've had a good few of those lately. :D

Those changes are much easier to test, to make sure they don't do something unexpected.
I'll be out for a bit on this last weekend of April, but still checking in as I get the chance. I will try to follow up on Monday, with anything I can't do on the go.
TheBanjo
Posts: 1307
Joined: January 23rd, 2021, 8:19 pm
Location: Melbourne, Australia

Post by TheBanjo »

Good thing I'm not working on this code, then, because I can already see I made a mistake. The big culprit is actually this line:

Code:

$keywords = explode(' ', $params['keywords']);
That, combined with the JOIN on each key term, is what's causing a search for the key term "psychological fiction" to end up as a search for audiobooks with the key term "psychological" OR the key term "fiction".
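
To illustrate the splitting step on its own, here is a minimal sketch. The $params array below is a hypothetical stand-in for whatever the search form actually passes in; only the explode() line is taken from the catalog code, and the database side (the JOIN) is not shown.

```php
<?php
// Hypothetical input: the raw string typed into the keywords box.
$params = ['keywords' => 'psychological fiction'];

// The line quoted above: splitting on a space turns one
// multi-word key term into two separate terms.
$keywords = explode(' ', $params['keywords']);

// Each element would then be matched as its own key term.
print_r($keywords);
```

So by the time the query is built, there is no longer any trace that "psychological fiction" was entered as a single term.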

Given that redrun has expressed interest in using LOC key terms in the future (many of which comprise multiple words, as in my example), and I would like to be able to continue to submit such terms too for new projects, I think I should, in fact, persist in asking for the current search behaviour to be altered so as to allow meaningful searching against a multi-term key term. What is the proper process for me to follow at this point? For example, should I raise a Github issue, or should I assume that one or more administrators keeping an eye on this discussion will confer as to whether to add such a change to the work queue, and pass on the request for a change, if they think that's a good idea, internally?
TriciaG
LibriVox Admin Team
Posts: 60808
Joined: June 15th, 2008, 10:30 pm
Location: Toronto, ON (but Minnesotan to age 32)

Post by TriciaG »

Here are the issues on Github with "Search" in them: https://github.com/LibriVox/librivox-catalog/issues?q=is%3Aissue+is%3Aopen+search

Ironically, one of them is asking for "or" searches in addition to "and" searches. Probably the best course of action would be to comment in that issue with the code and findings you've made so far.
TheBanjo
Posts: 1307
Joined: January 23rd, 2021, 8:19 pm
Location: Melbourne, Australia

Post by TheBanjo »

Thanks for your advice, TriciaG. I've done that just now (https://github.com/LibriVox/librivox-catalog/issues/53).

I'm not very familiar with the world of GitHub. I'm hoping the fact that I've made this comment may open the possibility that the priority of this issue gets re-evaluated, though I'm not sure if this is actually how things work. Fingers crossed. If I'm right, all that has to happen in the code is to change a space character into a comma character. And then, yes, of course, test, test, test.
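
For what it's worth, a rough sketch of what the space-to-comma change might look like. The function name split_key_terms is hypothetical and not part of the catalog code, and the trimming and empty-term filtering are my own assumptions about what would be needed around the delimiter change, not something I've confirmed against the rest of the search code.

```php
<?php
// Hypothetical helper, not part of the LibriVox catalog code.
function split_key_terms(string $raw): array
{
    // Split on commas so that multi-word key terms stay whole.
    $terms = explode(',', $raw);

    // Assumption: users will type spaces after commas, so trim each term.
    $terms = array_map('trim', $terms);

    // Assumption: drop empty terms (e.g. from a trailing comma) and reindex.
    $terms = array_filter($terms, function ($term) {
        return $term !== '';
    });
    return array_values($terms);
}

print_r(split_key_terms('psychological fiction, school stories'));
```

One consequence worth checking in testing: anyone who currently enters several single-word keywords separated by spaces would, under this change, have them treated as one multi-word term.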
knotyouraveragejo
LibriVox Admin Team
Posts: 22131
Joined: November 18th, 2006, 4:37 pm

Post by knotyouraveragejo »

The programmer who worked for us on the Mellon project did not have an extensive background in search functionality. The advanced search was working at one point on an earlier version, but the version that went live did not have the advanced search fully coded. It works, but with limited functionality. This is also the reason collections display every item in the collection rather than just the one that matches the search. Some day... :wink:
Jo
InTheDesert
Posts: 7783
Joined: August 20th, 2019, 8:25 pm

Post by InTheDesert »

TheBanjo wrote: February 29th, 2024, 2:45 pm
Good thing I'm not working on this code, then, because I can already see I made a mistake. The big culprit is actually this line:

Code:

$keywords = explode(' ', $params['keywords']);
That, combined with the JOIN on each key term, is what's causing a search for the key term "psychological fiction" to end up as a search for audiobooks with the key term "psychological" OR the key term "fiction".
Why not submit a pull request for this? You're right about it.
TheBanjo
Posts: 1307
Joined: January 23rd, 2021, 8:19 pm
Location: Melbourne, Australia

Post by TheBanjo »

InTheDesert wrote: March 15th, 2024, 4:03 pm
Why not submit a pull request for this? You're right about it.
Thank you.
I'm quite new to working on open source software, and had never even used git in earnest until a couple of weeks ago.
Right now I'm working on another area of catalog functionality. Once I've got what I'm thinking of working locally, I will send some details of what I'm proposing to the LibriVox admins to see if it's something they're interested in adopting, at least in principle, before submitting a code-level change request.
After that, I'm hoping to have a closer look at the whole of the Advanced Search function, which appears to me to work rather oddly and poorly at the moment (although I've not done a thorough analysis yet of its weaknesses). When I get to that, I'll certainly be looking to fix this bug I've noted here.
For what it's worth, I'd add that while there is a bug in the way keyword search works right now, even fixing it with that one line change isn't actually terribly helpful if no-one has a clue what kinds of keywords are actually associated with Librivox audiobooks. It's addressing that that I'm looking at right now.