Use softmax function instead of min-max normalization #99

Merged: 2 commits, Dec 29, 2022

Alex-Kopylov
Contributor

What do you think about passing the results to a softmax function instead of min-max normalization? I think it's a clearer approach because, for example, you can then set a threshold to filter out unidentified languages.

Are there any pitfalls that I'm missing? I've implemented this by slightly changing your code. I've also rounded the results.
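To make the difference concrete, here is a minimal sketch comparing the two normalizations. It is not the code from this PR: the function names, the sample scores, and the rounding precision are all assumptions chosen for illustration, and the scores are assumed to be summed log-probabilities per language.

```python
import math


def min_max_normalize(scores):
    """Rescale scores to [0, 1]; the best score becomes 1.0 and the worst 0.0."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so no language can be preferred.
        return {lang: 1.0 for lang in scores}
    return {lang: (s - lo) / (hi - lo) for lang, s in scores.items()}


def softmax(scores, ndigits=4):
    """Turn scores into rounded values that sum to roughly 1."""
    peak = max(scores.values())
    # Subtracting the maximum keeps exp() from underflowing to zero.
    exps = {lang: math.exp(s - peak) for lang, s in scores.items()}
    total = sum(exps.values())
    return {lang: round(e / total, ndigits) for lang, e in exps.items()}


def filter_by_threshold(confidences, threshold):
    """Keep only languages whose confidence reaches the threshold."""
    return {lang: c for lang, c in confidences.items() if c >= threshold}


# Hypothetical log-probability sums, not real model output.
log_probs = {"ENGLISH": -10.0, "GERMAN": -12.0, "FRENCH": -15.0}

print(min_max_normalize(log_probs))
print(softmax(log_probs))
print(filter_by_threshold(softmax(log_probs), 0.5))
```

With min-max, the top language always gets 1.0 regardless of how close the runner-up is, so a fixed threshold says little. With softmax, the values depend on the gaps between scores, which is what makes a cut-off meaningful.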

It passed black and mypy, but the tests fail with an error like:
INTERNALERROR> UnicodeEncodeError: 'charmap' codec can't encode characters in position 712-720: character maps to <undefined>
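A `'charmap'` encode error of this kind typically shows up when non-Latin text is written through a legacy single-byte code page such as cp1252 rather than UTF-8. Whether that is the actual cause here is an assumption; the sketch below only reproduces the general mechanism, and the helper name `can_encode` is hypothetical.

```python
def can_encode(text, encoding):
    """Return True if text can be encoded with the given codec."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


sample = "язык"  # Cyrillic text, as a multilingual test suite might print

print(can_encode(sample, "cp1252"))  # False
print(can_encode(sample, "utf-8"))   # True
```

If the local console encoding is the culprit, running the tests with the environment variable `PYTHONUTF8=1` (Python 3.7+) is one way to check.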

@pemistahl
Owner

Hi @Alex-Kopylov, thank you for your pull request. I thought that min-max normalization would be a reasonable choice but it is certainly possible that there is a better normalization method which I have not tried yet.

Why have you closed your PR already? The failing unit tests should be easy to fix, as far as I can see in the CI pipeline. I'm going to reopen the PR now and check whether the softmax normalization is a better fit for the confidence values.

Thanks again for your contribution. I appreciate this a lot. :)

@pemistahl pemistahl reopened this Dec 29, 2022
@Alex-Kopylov
Contributor Author

I closed it accidentally. Glad to hear that you're taking these changes into account. I'm going to experiment with different approaches and will let you know if I find anything interesting.

@pemistahl pemistahl added this to the Lingua 1.3.0 milestone Dec 29, 2022
@pemistahl pemistahl merged commit 3b9b57c into pemistahl:main Dec 29, 2022
@pemistahl pemistahl changed the title Softmax instead min-max normalization Use softmax function instead of min-max normalization Dec 29, 2022
@Alex-Kopylov Alex-Kopylov deleted the softmax branch December 30, 2022 09:37