掲示板 Forums - Dictionary search results improvement (English)
It would be useful if the English search results were improved to be more comprehensive.
If for example I search for "ingredient" in the dictionary, I can't find 材料 (ざいりょう) because it has "ingredients" in its definition. I would expect to find all the entries, regardless of singular vs plural form (although there may be nouns which seem to be the plural form of other nouns but actually have different meanings). Also, if I search for "hot pot" I don't get the results including "hotpot" or "hot-pot" in their definitions.
I don't know exactly how this could be implemented. The easiest thing that comes to my mind is using a database of English terms which links nouns with their singular and plural form, and compound terms like "hot pot" with "hot-pot" and "hotpot".
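For the compound-term half of this idea, a lookup database may not be needed when the query already contains a space or hyphen. Below is a minimal sketch, assuming the query can be expanded before it is sent to the search engine; `compound_variants` is a hypothetical helper name, not part of renshuu or any library.

```python
import re


def compound_variants(query: str) -> list[str]:
    """Return the spaced, hyphenated, and closed spellings of a multi-word query."""
    # Split on any run of whitespace or hyphens, ignoring case.
    parts = [p for p in re.split(r"[\s-]+", query.strip().lower()) if p]
    if len(parts) < 2:
        # A single token (or an empty query) has nothing to recombine.
        return parts
    return [" ".join(parts), "-".join(parts), "".join(parts)]


print(compound_variants("hot pot"))  # ['hot pot', 'hot-pot', 'hotpot']
```

Going the other way (from "hotpot" to "hot pot") cannot be done by string manipulation alone, since the code has no way to know where the word boundary falls. That direction is where a word list or database like the one described above would still be needed.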
I do agree, this would be an excellent improvement. As to adding it, though...
renshuu's word dictionary currently uses a search engine called Sphinx Search. I looked into it a bit to see what functionality is available for handling this within the search system, but my initial research did not turn up anything that seems to work. (I could roll my own search, as I did many years ago, but it was *much much much* slower.) It may need more research, but coming up with my own database of various word forms would probably not be ideal.
I do not think I can work on this in the near future, but I'm putting it on my to-do list so I can hopefully improve this later.
A quick-ish solution on the backend would be to hit your search engine twice, the second time with an 's' or 'es' appended to the word, and merge the two result sets before any sorting is applied.
Won't be perfect, but shouldn't significantly impact speed. You could get a mapping of singular English words to their plural versions to be much more exact, but you'd have to find one 😉
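A rough sketch of that idea, assuming the engine can be wrapped in a function that takes a term and returns matching entries. Here `exact_search` and the two-entry `DEFINITIONS` table are toy stand-ins made up for illustration, not renshuu's actual backend. This version tries both suffixes, so it issues up to three queries rather than two.

```python
import re
from typing import Callable

# Toy stand-in for the dictionary; these entries are illustrative only.
DEFINITIONS = {
    "材料": "ingredients; material",
    "鍋": "pot; pan",
}


def exact_search(term: str) -> list[str]:
    """Hypothetical engine call: whole-word match with no stemming."""
    term = term.lower()
    return [
        word
        for word, definition in DEFINITIONS.items()
        if term in re.findall(r"[a-z]+", definition.lower())
    ]


def search_with_plurals(term: str, search: Callable[[str], list[str]]) -> list[str]:
    """Query the term plus its naive 's' and 'es' plurals, merging hits in order."""
    merged: list[str] = []
    for variant in (term, term + "s", term + "es"):
        for hit in search(variant):
            if hit not in merged:  # drop duplicates, keep first-seen order
                merged.append(hit)
    return merged


print(exact_search("ingredient"))                       # []
print(search_with_plurals("ingredient", exact_search))  # ['材料']
```

It only goes from singular to plural, and it misses irregular forms, which is the gap a real singular-to-plural mapping would close.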
The traditional approach is to stem the words as they are indexed, as well as when they are searched.
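To illustrate why both sides have to be stemmed, here is a small self-contained sketch. The `naive_stem` rules are a deliberately crude suffix stripper written for this example (not the Porter stemmer and not what Sphinx uses), and the sample entries are made up.

```python
import re
from collections import defaultdict


def naive_stem(word: str) -> str:
    """Very rough English plural stripping, for illustration only."""
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"
    if word.endswith("es") and word[:-2].endswith(("s", "x", "z", "ch", "sh")):
        return word[:-2]
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word


def tokens(text: str) -> list[str]:
    """Lowercase, split into alphabetic words, and stem each one."""
    return [naive_stem(t) for t in re.findall(r"[a-z]+", text.lower())]


def build_index(definitions: dict[str, str]) -> dict[str, set[str]]:
    """Inverted index from stemmed definition words to dictionary entries."""
    index: dict[str, set[str]] = defaultdict(set)
    for entry, definition in definitions.items():
        for tok in tokens(definition):
            index[tok].add(entry)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Stem the query the same way, then intersect the per-word hits."""
    hits = [index.get(tok, set()) for tok in tokens(query)]
    return set.intersection(*hits) if hits else set()


index = build_index({"材料": "ingredients; material", "事業": "business; enterprise"})
print(search(index, "ingredient"))   # {'材料'}
print(search(index, "businesses"))   # {'事業'}
```

Because the same `tokens` function runs at index time and at query time, "ingredient" and "ingredients" land on the same key. If only the query were stemmed and the index still held the unstemmed "ingredients", the lookup would miss. Irregular forms such as "octopii" pass through these rules unchanged, which is the kind of case an override list is meant to handle.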
Sphinx Search’s version of this appears to be called morph mapping. It even has a thing called morphdict that lets you override the default stemming rules.
octopii => octopus
business => business
businesses => business
I just noticed that morphdict was introduced with version 3.4 in 2021, so depending on how old renshuu’s version is, it might be necessary to update first.
I actually did try stemming the words on input, but it did not work for me (in my limited testing), and it's somewhat of a black box feeding in queries and getting the results back, so I was not (at the time) able to look further into why it wasn't working.