[Bug]: Poor support for Hebrew #1114

Open
4 tasks done
shohamyamin opened this issue Nov 30, 2024 · 1 comment

@shohamyamin

Model Series

Qwen2.5

What are the models used?

Qwen2.5-Turbo-1M-Demo

What is the scenario where the problem happened?

When using Hebrew in the input.

Is this a known issue?

  • I have followed the GitHub README.
  • I have checked the Qwen documentation and cannot find an answer there.
  • I have checked the documentation of the related framework and cannot find useful information.
  • I have searched the issues and there is not a similar one.

Information about environment

I used Qwen/Qwen2.5-Turbo-1M-Demo on Hugging Face.

Log output

input: 
"היי אתה יכול לרשום לי פסקה בעברית על כנסת ישראל?"

output:
"แน่นอน! 以下是关于以色列议会的段落,但我将用希伯来语书写:

כנסת ישראל היא הגוף המחוקק העיקרי במדינת ישראל. היא כוללת 120 נציגים שנבחרים בבחירות כלליות אחת לשנתיים. הכנסתי的第一个职能是制定法律,监管国家的行政机构,并批准预算。ckeet以色列议会的成员被称为国会议员(MK),他们代表不同的政党或独立候选人。议会大厦位于耶路撒冷。议会的历史可以追溯到1949年,当时第一届议会选举举行。自那时以来,议会经历了许多变化和发展,反映了以色列社会的多样性和复杂性。"

Description

Steps to reproduce

This happens with Qwen/Qwen2.5-Turbo-1M-Demo and every Qwen model.
The problem can be reproduced with the following steps:

  1. Ask a Qwen model a question in Hebrew, such as: "היי אתה יכול לרשום לי פסקה בעברית על כנסת ישראל?" (meaning "Hi, can you write me a paragraph in Hebrew about the Israeli Knesset?").

Expected results

The results are expected to be:
The entire answer should be in Hebrew, without any Chinese characters or irrelevant English words.
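
To make the expected result checkable, a response can be scanned for letters from other scripts. The following is a minimal sketch, not part of any Qwen tooling: the function names `script_counts` and `is_hebrew_only` are hypothetical, and the script is guessed from the prefix of each character's Unicode name, which is a coarse heuristic rather than a proper Unicode script lookup.

```python
import unicodedata
from collections import Counter


def script_counts(text: str) -> Counter:
    """Count letters per coarse script label (hypothetical helper).

    The label is derived from the prefix of the Unicode character name,
    which is an approximation, not the official Unicode Script property.
    """
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue  # ignore digits, punctuation, whitespace, combining marks
        name = unicodedata.name(ch, "")
        if name.startswith("HEBREW"):
            counts["hebrew"] += 1
        elif name.startswith("CJK"):
            counts["cjk"] += 1
        elif name.startswith("THAI"):
            counts["thai"] += 1
        elif name.startswith("LATIN"):
            counts["latin"] += 1
        else:
            counts["other"] += 1
    return counts


def is_hebrew_only(text: str) -> bool:
    """Return True if the text has Hebrew letters and no letters from other scripts."""
    counts = script_counts(text)
    foreign = sum(n for script, n in counts.items() if script != "hebrew")
    return counts["hebrew"] > 0 and foreign == 0


if __name__ == "__main__":
    # Fragment taken from the logged output above: Hebrew followed by Chinese.
    mixed = "הכנסתי的第一个职能是制定法律"
    print(script_counts(mixed))
    print(is_hebrew_only(mixed))
```

This check is deliberately strict: it flags every Latin letter, so a response that legitimately includes an abbreviation in English would also fail and would need a manual look.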

Attempts to fix

I have tried several ways to fix this, including:

  1. Trying different system prompts
  2. Trying different questions

Anything else helpful for investigation

I find that this problem happens with all Qwen models. It seems that support for Hebrew is not good at all.

@leo-smi

leo-smi commented Dec 2, 2024

FREE PALESTINE
