Improve emoji consistency with older terminals #57
Comments
As an application author, I would suggest whiners raise an issue with their terminal (or OS) provider to get them to update their product. Users should be steered away from software, especially terminals, that does not conform to the established standards. Full stop. This includes projects that lag in updating to the newly published UCD tables (which should be done by the CI script anyway). 2¢, TC :)
@ohir You are absolutely correct; however, when dealing with macOS Terminal, that is one big gorilla to deal with 😢 Be aware that the majority of terminals actually don't handle the Unicode 14 clarifications. There are more wrong than right terminals. That's a lot of issues to create, and in the meantime we have a library our applications use deviating from how many terminals render. Being right isn't always the best result.
I am aware of that sad state of affairs. But workarounds (and chores) should belong to the user of the standards-implementing lib, in their app or in a wrapper lib.
No. We have a library that properly implements a standard others have got wrong.
You make many good points. I can't disagree with anything you are saying. The idea of a correcting wrapper is definitely worth discussing further; how could something like this be implemented? I think this is basically what I am after.
It depends on whether you're after emojis only or a full standards display (i.e. having Indic scripts measured correctly on non-standards-compliant input). For the former, I'd make an independent emoji-BNF-compliant (sub)parser to decide "suspicious" segments, plus a final bitmap-based filter to decide how a particular sequence will render given OS/terminal/UCD version info. Plus likely a bit of corner-case handling, like the black cat abomination and (Apple's) directional additions. Though if speed does not matter, the simple map check you proposed can be a simpler and still valid solution. The real hard bit is to fill that map with data – m×n×v. That said, a full corrections wrapper, UCD based, would of course be better. But let's not pollute uniseg issues discussing a wrapper; you can reach me at gmail as ohir.ripe TC
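The "simple map check" mentioned above could be sketched roughly as follows. Everything here is hypothetical (the table name, the helper, and the sample widths are invented for illustration; the real hard part, as noted, is gathering accurate data per terminal/OS/UCD version):

```go
package main

import "fmt"

// renderWidths is a hypothetical lookup of how a given emoji sequence
// actually renders on one particular terminal, keyed by the raw
// sequence. Filling it for every terminal × OS × UCD version
// combination is the hard part.
var renderWidths = map[string]int{
	// black cat: CAT (U+1F408) + ZWJ + BLACK LARGE SQUARE (U+2B1B)
	"\U0001F408\u200D\u2B1B": 2,
	// heart + VS16, as an older terminal might render it
	"\u2764\uFE0F": 1,
}

// sequenceWidth consults the table and falls back to a default width
// when the sequence is unknown.
func sequenceWidth(seq string, def int) int {
	if w, ok := renderWidths[seq]; ok {
		return w
	}
	return def
}

func main() {
	fmt.Println(sequenceWidth("\u2764\uFE0F", 2)) // 1: table entry wins
	fmt.Println(sequenceWidth("\U0001F600", 2))   // 2: unknown, default used
}
```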
For accurate rendering of emojis, it is important that the terminal and the library agree on the width of each emoji. Unicode 14 clarified that emoji presentation selected with variation selector 16 is double width. This change has created a problem.

Many terminals (such as macOS Terminal, Alacritty, Hyper, VS Code) do not support the latest Unicode standard. This means they may not display newer grapheme clusters correctly. An even bigger problem is that they render older emojis differently, especially those using variation selector 16.
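To make the variation selector 16 issue concrete, here is a small self-contained Go sketch (the helper `hasVS16` is illustrative only, not part of `uniseg`'s API). U+2764 alone is text presentation; appending U+FE0F requests emoji presentation, which Unicode 14 says is double width, while many terminals still render it in one cell:

```go
package main

import "fmt"

// heartVS16 is U+2764 (HEAVY BLACK HEART) followed by U+FE0F
// (VARIATION SELECTOR-16), which requests emoji presentation.
const heartVS16 = "\u2764\uFE0F"

// hasVS16 reports whether the string contains VARIATION SELECTOR-16.
// Illustrative helper, not part of uniseg's API.
func hasVS16(s string) bool {
	for _, r := range s {
		if r == '\uFE0F' {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasVS16(heartVS16)) // true: emoji presentation requested
	fmt.Println(hasVS16("\u2764"))  // false: text presentation
}
```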
One of the guiding principles of `uniseg` is @rivo's aim for perfection. `uniseg` feels like a reference implementation and helps identify problems with other implementations. But users don't care about perfection; they care about compatibility.

This puts `uniseg` in a difficult place. While `uniseg` is correct, from a compatibility perspective it produces different results from the majority of terminals. This creates a poor experience, as the only option is to tell the developer of the terminal to upgrade their handling of Unicode. With many terminals dependent on `xterm.js`, this is nearly impossible for them to fix.

Can we find a way to support older terminals without trying to support multiple Unicode versions?
A global option to override how variation selector 16 is handled is obviously one approach. The precedent has already been set with `EastAsianAmbiguousWidth`. It doesn't change much but would solve the biggest rendering difference. iTerm2 provides an advanced option specifically for this case, and WezTerm has an option for choosing which Unicode standard is used.

Another approach (for which I have created a proof of concept) is the ability to override the result for specific code points. Implemented as a global (with all the downsides that entails), this allows all dependencies that rely on `uniseg` to produce the same consistent results. While my implementation is crude, it should demonstrate the idea. The benefit is that compatibility overrides shift to the applications that use `uniseg`; the tech debt stays out of `uniseg`.
I'd certainly understand if this was marked as `WONTFIX`; clearly this is not a problem with `uniseg`, but it is something that creates problems for Go applications that rely on `uniseg` yet have users on older terminals. Maintaining my fork with the hack is fairly easy, but an issue to discuss this felt appropriate.