In com.crtomirmajer.wmd4j.WordMovers, the weighting is wrong when stopwords are set. A simple way to reproduce this is to pass a set of stopwords that never occur in the input and check the computed distances, which should then be unaffected.
Instead, the distance decreases even when neither tokenA nor tokenB is a stopword. stopwordWeight should be applied only when one or both tokens are stopwords. A quick fix is to change line 69 to:
stopwords.contains(tokenA) || stopwords.contains(tokenB) ? stopwordWeight : 1
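For illustration, the intended behaviour can be sketched as a standalone helper. This is only a sketch: the class name StopwordWeightSketch and the method pairWeight are hypothetical and not part of wmd4j, and I am assuming the weight is a double that multiplies the distance between a pair of tokens.

```java
import java.util.Set;

public class StopwordWeightSketch {

    // Hypothetical helper (not wmd4j API): returns the factor by which the
    // distance between tokenA and tokenB would be multiplied.
    // The reduced weight applies only if at least one token is a stopword;
    // otherwise the factor is 1 and the distance is left unchanged.
    public static double pairWeight(Set<String> stopwords,
                                    String tokenA,
                                    String tokenB,
                                    double stopwordWeight) {
        return stopwords.contains(tokenA) || stopwords.contains(tokenB)
                ? stopwordWeight
                : 1.0;
    }
}
```

With this rule, a stopword set that never matches any token always yields a factor of 1, so the distances should be identical to those computed without stopwords.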