Forums - Upcoming change - duplicate removal (volunteers needed)

Top > renshuu.org > Feature Requests/Improvements

Page: 1 of 2



マイコー
Level: 304

Due to some data decisions that were made 15+ years ago, renshuu has had the issue of multiple versions of the same term floating around in different lists. For example, 朝ごはん and 朝御飯 (both meaning breakfast, or あさごはん). A lot of work has been done to clean this up so that only the most common version is present in renshuu's materials, but there are still a number of cases where more than one is present.

Additionally, even for cases where it has already been cleaned up, a user might have already studied more than one of them, so they have what seem like duplicates in their word lists.

Removing the remaining duplicates, however, will require actual changes to existing user data, which is something that most changes in renshuu do not cause. Because of that, I want to run a test round with a small number of users first to make sure it works.


So, looking for volunteers! These are the steps I will take for the volunteers:

  1. Isolate the main term among the duplicates.
  2. Maximize the mastery levels on that term, pulling from the mastery levels of the duplicates.
  3. Hide (but not remove... yet) the duplicates in their account so they do not appear in studying. They WILL, however, still appear in lists (just as hidden).
  4. After 1-3 are confirmed, suppress the duplicates in all schedules, and add the primary term to all schedules that do not yet have it present.

This will effectively remove them. If this makeshift step works, then I can roll it out to everyone. At that point, I will actually remove the duplicates from the original schedules/lists (replacing them with the primary term), and remove the duplicates from your local schedules. I have to wait to remove the duplicates from your schedules because if I do so before that step, your schedules will automatically reacquire the duplicate terms from the original source lists.
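Steps 2 and 3 above can be sketched roughly as follows. This is only an illustration, not renshuu's actual code: the `Term` class, the per-vector `mastery` dictionary, and the `hidden` flag are all hypothetical names, and the primary term is assumed to have been chosen already (step 1).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Term:
    # Hypothetical stand-in for one user's record of a studied term.
    written: str
    mastery: Dict[str, int] = field(default_factory=dict)  # vector name -> level
    hidden: bool = False


def merge_duplicates(primary: Term, duplicates: List[Term]) -> None:
    """Raise the primary term's mastery to the best level seen in the
    group (step 2), then hide the duplicates without deleting them (step 3)."""
    for dup in duplicates:
        for vector, level in dup.mastery.items():
            current = primary.mastery.get(vector, 0)
            primary.mastery[vector] = max(current, level)
        dup.hidden = True  # still present in lists, just not studied


# Illustrative data only.
primary = Term("朝ごはん", {"meaning": 5, "reading": 2})
duplicate = Term("朝御飯", {"meaning": 3, "reading": 4})
merge_duplicates(primary, [duplicate])
print(primary.mastery, duplicate.hidden)
```

Nothing is deleted in this sketch, which matches the "hide first, remove later" order described above.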



Please let me know if you are interested! I only need users who know that they have multiple copies of some words in their lists (whether it is in a single schedule, their overall user account, or anything else). So if you are a relatively new user, this most likely does not apply to you.

14
18 hours ago
BakuDekuchan
Level: 991

I'll be happy to help out! :)

0
18 hours ago
theknife
Level: 3

I'd love to have this done to me!

0
17 hours ago
g3kkou
Level: 89

I'm interested in helping out with this!

0
17 hours ago
reddeath68
Level: 496

My account has 1,761 words currently, although idk what dupes there are, but there are likely a few as I manually added some words before they later came up in lessons. I am already beta testing the android app and opted in to other beta tests. I can be one of the testers for this as well.

1
17 hours ago
WildAtelier
Level: 1042

Does this mean that the option to add a different written version will be taken out to only have one version as well? I prefer to add the version that I come across while immersing, and I also prefer to add the version that includes all the kanji. So I have really liked being able to choose the version to add to my schedules. Will this still be available after all the consolidation?

4
16 hours ago

BakuDekuchan wrote: "I'll be happy to help out! :)"

ditto :}

0
16 hours ago
ぶどうの
Level: 457

really happy to hear about the duplicates issue being worked on, and happy to volunteer my account for testing!

0
15 hours ago
エミーム
Level: 267

I’d be happy to help with this! I think I’ve noticed some duplicate words in my schedules. I’ve definitely noticed some onomatopoeia words were added in both katakana and hiragana. Would those be considered duplicates as well?

1
15 hours ago
エマ・キ
Level: 460

I could volunteer! I have a few duplicates and some possible edge cases

- frequently I have hidden all but one of the duplicate words, so whichever you select as the primary version should only be hidden iff all of the duplicates were hidden

- たばこ and タバコ are duplicates to me, but maybe not to others

and it seems like step 4 should not add the word to lists that contained none of the duplicates?
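The two suggestions in this post could be expressed as small checks like the ones below. This is a sketch under assumptions: the function names and the schedule layout (a dict of schedule name to a set of written forms) are made up for illustration, and たばこ is picked as the primary form only for the example.

```python
from typing import Dict, Iterable, List, Set


def primary_hidden_after_merge(hidden_flags: Iterable[bool]) -> bool:
    """The merged term stays hidden only if every member of the
    duplicate group was hidden before the merge."""
    flags = list(hidden_flags)
    return bool(flags) and all(flags)


def schedules_needing_primary(
    schedules: Dict[str, Set[str]], primary: str, duplicates: Iterable[str]
) -> List[str]:
    """Names of schedules that hold at least one duplicate but not the
    primary term. Schedules with none of the duplicates are left alone."""
    dup_set = set(duplicates)
    return [
        name
        for name, terms in schedules.items()
        if terms & dup_set and primary not in terms
    ]


# Illustrative data only.
schedules = {
    "a": {"タバコ"},
    "b": {"たばこ", "タバコ"},
    "c": {"犬"},
}
print(primary_hidden_after_merge([True, False]))
print(schedules_needing_primary(schedules, "たばこ", ["タバコ"]))
```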

0
14 hours ago
トンヤ
Level: 364

I would like to help as well, but I'm not sure if I have any duplicates left.

For a while 堅苦しい read as かたくるしい and かたぐるしい came up at the same time in quizzes and I always got the reading wrong, until I removed one of them from my schedules. I haven't really had any other terms that drove me crazy like that. But there might still be duplicates that at least don't show up both on the same day.

0
14 hours ago
マイコー
Level: 304
WildAtelier wrote: "Does this mean that the option to add a different written version will be taken out to only have one version as well? I prefer to add the version that I come across while immersing, and I also prefer to add the version that includes all the kanji. So I have really liked being able to choose the version to add to my schedules. Will this still be available after all the consolidation?"

That's a good question. At the moment, the way the conversion system is set up is that if that pair happens to be split across renshuu materials, they will be consolidated both in the materials, and in your mastery data.

However, if they are not in the renshuu materials, then the system will assume that you added them, and will not touch them.

My intention is only to fix the renshuu-maintained materials, and any presumed duplication that came from that.

It might be worthwhile to have a preliminary report sent out to each user, and then a "would you like to merge these?" - that way, the final step is in the hands of the user.

Regardless of what the user chooses, though, the renshuu materials themselves will be fixed.


堅苦しい read as かたくるしい and かたぐるしい <-- those would not be considered duplicates. They are linked together in the dictionary display, but are considered separate.

The only duplicates are ones with the SAME underlying reading, but different kanji layout (and in 90% of cases, it's not different kanji, but rather, the presence or absence of a kanji, like the あさごはん example above)
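As a rough illustration of that rule (same reading, different written form), duplicate candidates could be found by grouping on the reading. The `entry_id` field is an assumption added here so that unrelated homophones are not grouped together; none of these names come from renshuu itself.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (entry_id, written form, reading); entry_id is a hypothetical dictionary key.
TermRow = Tuple[int, str, str]


def find_duplicate_groups(terms: Iterable[TermRow]) -> Dict[Tuple[int, str], List[str]]:
    """Group written forms by (entry, reading) and keep only the groups
    that contain more than one written form."""
    groups: Dict[Tuple[int, str], List[str]] = defaultdict(list)
    for entry_id, written, reading in terms:
        groups[(entry_id, reading)].append(written)
    return {key: forms for key, forms in groups.items() if len(forms) > 1}


# Illustrative data only.
terms = [
    (1, "朝ごはん", "あさごはん"),
    (1, "朝御飯", "あさごはん"),
    (2, "堅苦しい", "かたくるしい"),
    (2, "堅苦しい", "かたぐるしい"),
]
print(find_duplicate_groups(terms))
```

With this grouping, the two あさごはん spellings form one group, while the two readings of 堅苦しい stay separate because their readings differ.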

4
13 hours ago
カレン
Level: 347

I’d be happy to help out too! I’ve noticed there are tons of duplicates when building a schedule from advanced search.

0
13 hours ago
マイコー
Level: 304

Advanced search will not necessarily filter those out, so I'd like to see some examples of what you're seeing. The "duplicates" in the dictionary are correct and accurate - the issue I am addressing here is renshuu-maintained lists (which are 90% of what users study on renshuu, I'm guessing) using more than one version, which clogs up the lists and makes studying (slightly, but more than zero) less effective.

1
13 hours ago
むじな
Level: 502

I'm not sure if I've seen any duplicates, but I'd be glad to be a モルモット.

0
12 hours ago
カレン
Level: 347

マイコー wrote: "Advanced search will not necessarily filter those out, so I'd like to see some examples of what you're seeing. The "duplicates" in the dictionary are correct and accurate - the issue I am addressing here is renshuu-maintained lists (which are 90% of what users study on renshuu, I'm guessing) using more than one version, which clogs up the lists and makes studying (slightly, but more than zero) less effective."

For example, for the term こけつにいらずんばこじをえず:

[attached screenshot]
0
12 hours ago
Seycalia
Level: 431

I remember asking about this! I'd love to help and get my terms cleaned up :)

0
12 hours ago
Level: 102

I’d be happy to volunteer as well.

0
11 hours ago

I would love to assist with this.

0
11 hours ago
ゼズリエル
Level: 153

I also want to help.. ^^

0
10 hours ago