Hi, I have seen the issue at #4.
But ExactSearch just seems to match a single word against a single word. It doesn't match a pattern in the middle of a longer string only at word boundaries, which is the problem the original issue reported.
I.e., with ExactSearch, "abc" will NOT match at all in "abcde abc zabc", but it will match if the string is exactly "abc" (so it's basically acting like a Map).
But with MultiPatternSearch, "abc" will match 3 times.
It would be good to have an option where it can match inside an arbitrarily long string, but only at word boundaries on either side (e.g. if there is whitespace or the start or end of the line next to the match). I'd be happy to add a specific boundary character between words if it helps.
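To make the three behaviours concrete, here is a minimal sketch using only the Go standard library. It does not call ExactSearch or MultiPatternSearch; `countSubstring` and `countWholeWord` are hypothetical helper names, and the whole-word version assumes a single-word pattern and whitespace as the only boundary.

```go
package main

import (
	"fmt"
	"strings"
)

// countSubstring counts every occurrence of word in text, ignoring boundaries.
// This mirrors what an unrestricted multi-pattern search reports.
func countSubstring(text, word string) int {
	return strings.Count(text, word)
}

// countWholeWord counts occurrences of word that are delimited by whitespace
// or by the start/end of text. It only handles single-word patterns.
func countWholeWord(text, word string) int {
	n := 0
	for _, field := range strings.Fields(text) {
		if field == word {
			n++
		}
	}
	return n
}

func main() {
	text := "abcde abc zabc"
	fmt.Println(countSubstring(text, "abc")) // 3: inside "abcde", "abc", "zabc"
	fmt.Println(countWholeWord(text, "abc")) // 1: only the standalone "abc"
}
```

The second count is the behaviour being asked for: one match, at the standalone "abc".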
Hope that makes sense!
Just to give an idea of a hacky test that gets me closer, in the middle of MultiPatternSearch if I do...
    // func (m *Machine) MultiPatternSearch(content []rune, returnImmediately bool) [](*Term) {
    // ...start of func
    // .. for _, word := range val {
    // ...then add this inside the loop
    // Accept the match only if the char before the word is whitespace (or the word
    // starts the string) and the char after the word is whitespace (or the word ends
    // the string). Note that "< 34" treats every code point below 34 as a boundary:
    // ASCII control characters, space, and '!'.
    start := pos - len(word) + 1 // index of the first char of the match
    if (start == 0 || content[start-1] < 34) && (pos+1 == contentLength || content[pos+1] < 34) {
        term := new(Term)
        term.Pos = start
        term.Word = word
        terms = append(terms, term)
        if returnImmediately {
            return terms
        }
    }
It naturally won't work for languages that go beyond simple ASCII, and it would need a switch in the func to decide whether to use it or not, but it's the sort of thing I meant.
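One way to lift the ASCII-only limitation might be to do the boundary test with `unicode.IsSpace` instead of comparing against a fixed code point. The sketch below is self-contained and does not touch the library: `isWordBounded` and `boundedMatches` are hypothetical names, and the naive scan in `boundedMatches` is only a stand-in for the automaton so the example can run on its own.

```go
package main

import (
	"fmt"
	"unicode"
)

// isWordBounded reports whether content[start:start+length] is delimited on
// both sides by Unicode whitespace or by the start/end of content.
func isWordBounded(content []rune, start, length int) bool {
	end := start + length // index just past the match
	if start < 0 || length <= 0 || end > len(content) {
		return false
	}
	leftOK := start == 0 || unicode.IsSpace(content[start-1])
	rightOK := end == len(content) || unicode.IsSpace(content[end])
	return leftOK && rightOK
}

// boundedMatches returns the start positions of word in content that pass
// isWordBounded. The naive scan stands in for the real multi-pattern search.
func boundedMatches(content, word []rune) []int {
	var starts []int
	for i := 0; i+len(word) <= len(content); i++ {
		if string(content[i:i+len(word)]) == string(word) && isWordBounded(content, i, len(word)) {
			starts = append(starts, i)
		}
	}
	return starts
}

func main() {
	fmt.Println(boundedMatches([]rune("abcde abc zabc"), []rune("abc"))) // [6]
	fmt.Println(boundedMatches([]rune("日本 語 日本語"), []rune("語")))        // [3]
}
```

Since the snippet above sets `term.Pos` to the start of the match, the same check could presumably be applied as a post-filter on the returned terms, e.g. `isWordBounded(content, term.Pos, len(term.Word))`, assuming `Word` is a `[]rune`. That would avoid changing the search loop at all, at the cost of collecting matches that are then thrown away.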