Nickname surrounded by single quotes gets taken as the middle name. #74
Comments
Currently the parser recognizes nicknames that are surrounded by parentheses or double quotes. This is controlled by a regex, defined here:
And the parse_nicknames() method runs it: python-nameparser/nameparser/parser.py Lines 385 to 395 in fb1475a
You could update the parser to look for single quotes by replacing that regex with something like:
I think the only reason I didn't include single quotes is that I wasn't sure how to write the regex so that the closing match only matches the same character that it found in the opening match. Also, there are some weird edge-case names like "ab'ad al am'an" that I wasn't sure how to weed out. My regex chops are not very strong. If you come up with a better regex, I'd be happy to put it in the library.
What about using groups? https://gist.github.com/bpeterso2000/11277541
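For anyone following along, here is a minimal sketch of what "using groups" could look like: a capture group remembers the opening quote character and a backreference (`\1`) requires the same character to close it. The pattern and the `find_quoted` helper are hypothetical illustrations, not the library's regex and not taken from the linked gist.

```python
import re

# Hypothetical pattern, not nameparser's own regex.
# Group 1 captures the opening quote; \1 demands the same character to close.
QUOTED = re.compile(r"""(['"])(.+?)\1""")


def find_quoted(text):
    """Return the text found between matching single or double quotes."""
    return [match.group(2) for match in QUOTED.finditer(text)]
```

With this pattern, `Benjamin 'Ben' Franklin` and `Benjamin "Ben" Franklin` both give `['Ben']`, and a mismatched pair like `"Ben'` gives nothing. It does not handle the edge case mentioned above, though: `ab'ad al am'an` gives `['ad al am']`, because nothing stops a single-quoted match from spanning whitespace.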
Thanks @boxabirds for the tip. That put me on the right track. Once I dug into it, I realized that the 3 different ways to indicate a nickname have slightly different rules. Single quotes cannot contain whitespace, but double quotes and parentheses can. Double quotes match the same character twice, but parentheses do not. So I ended up splitting them into 3 separate regexes and didn't need to use groups in that way after all. I appreciate the prodding, though, because that's the way it should work, and now it will be fixed in v1.0.
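A rough sketch of the three-regex approach described above, for illustration only. The pattern names, the `extract_nicknames` helper, and the word-boundary lookarounds on the single-quote pattern are all assumptions; the regexes that actually shipped in the library may differ.

```python
import re

# Hypothetical patterns illustrating the three rules; not nameparser's own.
# Single quotes: no whitespace inside, and the quotes must not touch a word
# character on the outside (so apostrophes inside a name are ignored).
SINGLE_QUOTES = re.compile(r"(?<!\w)'([^\s']+)'(?!\w)")
# Double quotes: the same character opens and closes; whitespace is allowed.
DOUBLE_QUOTES = re.compile(r'"([^"]+)"')
# Parentheses: different opening and closing characters; whitespace is allowed.
PARENTHESES = re.compile(r"\(([^()]+)\)")


def extract_nicknames(name):
    """Collect nickname candidates from a name string, one pattern at a time."""
    found = []
    for pattern in (DOUBLE_QUOTES, PARENTHESES, SINGLE_QUOTES):
        found.extend(pattern.findall(name))
    return found
```

Under these assumptions, `Benjamin 'Ben' Franklin` gives `['Ben']`, `Benjamin (Big Ben) Franklin` gives `['Big Ben']`, and `ab'ad al am'an` gives an empty list because neither apostrophe can open a whitespace-free quoted span.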
Isn't the general convention to surround a nickname in quotes (and put it before the surname)? Or is this possibly a limitation of nameparser?
Here are name examples (not real people):