Leading - should not get converted to underscores. #1635

Closed
gilch opened this issue Jun 9, 2018 · 5 comments · Fixed by #2011

Comments

@gilch
Member

gilch commented Jun 9, 2018

If you define a converter function with a name like

(defn ->foo [] ...)

Then it becomes module-private and you can't import it with *. (Unless you override this with __all__.) This is a gotcha that Hy doesn't need.
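The underlying Python behavior can be reproduced without Hy. The following is a minimal sketch, assuming only that the mangled name ends up with a leading underscore; the module name `converters` and the functions `_foo` and `foo` are hypothetical stand-ins.

```python
import sys
import types

# Build a throwaway module in memory; "converters" is a hypothetical name.
mod = types.ModuleType("converters")
exec("def _foo(): return 'private'\ndef foo(): return 'public'", mod.__dict__)
sys.modules["converters"] = mod

# Without __all__, a star import skips every name that starts with "_".
namespace = {}
exec("from converters import *", namespace)
imported = sorted(n for n in namespace if n != "__builtins__")
print(imported)  # ['foo']

# Declaring __all__ overrides that rule.
mod.__all__ = ["_foo", "foo"]
namespace_all = {}
exec("from converters import *", namespace_all)
imported_all = sorted(n for n in namespace_all if n != "__builtins__")
print(imported_all)  # ['_foo', 'foo']
```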

I propose we revise the mangling rules to distinguish leading -s from leading _s. Any internal - will continue to mangle to _ as now, but if they're in the lead, it becomes XhyphenHminusXs, or preferably something shorter like XsubXs if we do the short names #1577. But leading _ won't change.

We could also distinguish non-leading _s from -s for full reversibility by converting them to Xlow_lineX (or XscoreX).

Dunder names would look weird if we had to write them like __init--, so trailing _s will use the same rule as for the leading _s. Then we'd write __init__ etc.

Examples:

=> (mangle "->>")
'hyx_XsubXXgtXXgtX'
=> (mangle "->foo")
'hyx_XsubXXgtXfoo'
=> (mangle "foo->bar")
'hyx_foo_XgtXbar'
=> (mangle "_spam-eggs")
'_spam_eggs'
=> (mangle "_spam_eggs")
'_hyx_spamXscoreXeggs'
=> (mangle "__init__")
'__init__'
=> (mangle "--init--")
'hyx_XsubXXsubXinitXsubXXsubX'
=> (mangle "_-_")
'_hyx_XsubX_'
=> (mangle "-_-")
'hyx_XsubXXscoreXXsubX'
=> (mangle "__")
'__'

In summary, special characters get mangled to X-quoted forms except for leading _s, trailing _s, and internal -, which all convert to _.
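The rules summarized above, together with the optional XscoreX rule for internal underscores, can be sketched in Python. This is not Hy's implementation: `mangle_sketch`, `quote`, and the `SHORT_NAMES` table are hypothetical, the hex fallback for unlisted characters is made up, and names that start with a digit are not handled.

```python
# Hypothetical short names; the thread only mentions "sub", "gt", and "score".
SHORT_NAMES = {"-": "sub", ">": "gt", "_": "score"}


def quote(ch):
    """X-quote one character, falling back to a made-up hex form."""
    return "X" + SHORT_NAMES.get(ch, "U{:x}".format(ord(ch))) + "X"


def mangle_sketch(name):
    core = name.strip("_")
    if not core:
        return name  # nothing but underscores, e.g. "__"
    # Leading and trailing underscores pass through unchanged.
    lead = name[: len(name) - len(name.lstrip("_"))]
    trail = name[len(name.rstrip("_")):]

    # Hyphens at either edge of the core are quoted; internal ones become "_".
    n_lead = len(core) - len(core.lstrip("-"))
    n_trail = len(core) - len(core.rstrip("-"))
    out, quoted = [], False
    for i, ch in enumerate(core):
        at_edge = i < n_lead or i >= len(core) - n_trail
        if ch == "-" and not at_edge:
            out.append("_")
        elif ch != "_" and ("a" + ch).isidentifier():
            out.append(ch)  # already legal inside a Python identifier
        else:
            out.append(quote(ch))  # edge "-", internal "_", or other symbol
            quoted = True
    return lead + ("hyx_" if quoted else "") + "".join(out) + trail


print(mangle_sketch("->foo"))     # hyx_XsubXXgtXfoo
print(mangle_sketch("__init__"))  # __init__
print(mangle_sketch("-_-"))       # hyx_XsubXXscoreXXsubX
```

Note that the core's trailing hyphens are quoted as well as its leading ones, which is what the `--init--` example implies.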

@Kodiologist
Member

I have to admit that I never considered the possibility of beginning a name with ->; when I've seen that digraph in Lisp names before, it's been in the middle of the name.

> I propose we revise the mangling rules to distinguish leading -s from leading _s. Any internal - will continue to mangle to _ as now, but if they're in the lead, it becomes XhyphenHminusXs, or preferably something shorter like XsubXs if we do the short names #1577. But leading _ won't change.

That sounds reasonable.

> We could also distinguish non-leading _s from -s for full reversibility by converting them to Xlow_lineX (or XscoreX).

But in that case, the user is no longer allowed to write a Python name that contains an internal underscore, like string.ascii_lowercase, with a real underscore. The rule that any Python name is also a valid Hy name with the same meaning will be broken, and mangle will no longer be idempotent. The same goes for #1634 (comment).

@gilch
Member Author

gilch commented Jun 10, 2018

> The rule that any Python name is also a valid Hy name with the same meaning will be broken

An important point, which makes the leading underscore problem something of a separate question from this. The tradeoff is that we don't get full invertibility. I'd prefer there be one-- and preferably only one --obvious way to do it. That is--always use hyphens internally. Always write the dunder names the same way, like __init__, not --init--. And while we're at it, always mangle ? to XqueryX or something, instead of is_, since Python isn't consistent anyway.

> no longer be idempotent

I don't think idempotent functions can be invertible generally (except some trivial cases, like identity or singleton sets), since mapping the result back to itself when it started as something else means it can't be 1-to-1. We can't have both. So which property is more important? I thought it was invertibility, since we're defining and using an inverse function, unmangle. Why is idempotence better for this use case?
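The claim about idempotent and one-to-one functions can be checked by brute force on a small set. The sketch below enumerates every function from a three-element set to itself and keeps those that are both idempotent and injective; the three-element domain is an arbitrary choice for illustration.

```python
from itertools import product

domain = range(3)
both = []
for images in product(domain, repeat=len(domain)):
    f = dict(zip(domain, images))  # f maps x to images[x]
    idempotent = all(f[f[x]] == f[x] for x in domain)
    injective = len(set(images)) == len(domain)
    if idempotent and injective:
        both.append(images)

print(both)  # [(0, 1, 2)], i.e. only the identity function
```

The general argument is short: if f(f(x)) = f(x) and f is one-to-one, then f(x) = x for every x.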

@Kodiologist
Member

> So which property is more important? I thought it was invertibility, since we're defining and using an inverse function, unmangle.

unmangle isn't a real inverse, because mangle is many-to-one. Its purpose is to produce a pretty "Hy-like" name for the input, with "hyx_" prefixes gone and "XfooX" replaced with the character it indicates etc.

> Why is idempotence better for this use case?

When I think about it, "idempotence" in this case is just a fancy way to say the earlier thing I said, "any Python name is also a valid Hy name with the same meaning"; i.e., every Python identifier mangles to itself. This property is equivalent to idempotence because mangle is also guaranteed to always return a Python identifier.

So to answer your question, I would rather have Python names work unchanged as Hy names than have unmangle be a real inverse.
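Both properties discussed here can be seen in a toy model. The function below is a simplified stand-in, not Hy's actual `mangle`: the name `mangle_toy` and the hex quoting scheme are assumptions made for illustration. It leaves valid Python identifiers alone, turns hyphens into underscores, and X-quotes anything else.

```python
def mangle_toy(name):
    """Toy mangler: Python identifiers pass through unchanged."""
    if name.isidentifier():
        return name
    body = name.replace("-", "_")
    if body.isidentifier():
        return body
    # Quote whatever still cannot appear in an identifier (made-up scheme).
    quoted = "".join(
        c if ("a" + c).isidentifier() else "X{:x}X".format(ord(c)) for c in body
    )
    return "hyx_" + quoted


samples = ["ascii_lowercase", "spam-eggs", "spam_eggs", "->foo", "__init__"]

# Every output is a Python identifier, so mangling twice changes nothing.
for s in samples:
    once = mangle_toy(s)
    assert once.isidentifier()
    assert mangle_toy(once) == once

# Many-to-one: two different Hy names share one Python name, so no true
# inverse can exist.
print(mangle_toy("spam-eggs") == mangle_toy("spam_eggs"))  # True
```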

@scauligi
Member

Welp, this is pretty much exactly what I proposed with #2005; I should have done my due diligence before creating a new issue.

+1 to mangle only affecting invalid Python identifiers; I can go ahead and draft up a PR.

@allison-casey
Contributor

I went through and categorized all the issues and made a bit of an index for myself. I'll post it in discussions after I have the chance to clean it up some.
