Typing for multi-dimensional arrays #513
Comments
It looks like the proposal of integer generics is also relevant here python/mypy#3345 (it looks almost identical to what you call In general, I am very supportive of this project (I have heard many times that static typing would be very helpful for data science, numerics and related fields, but current support in mypy and PEP 484 is very limited). The main obstacle however is the size of this project (it may require its own PEP). I will read your document (thanks for writing it), but already now it seems to me that it may make sense to start from features that will be useful in general (i.e. also outside of numeric stack) such as literal types and variadic generics. Also tagging @JukkaL here just in case. |
Yes, I expect a PEP will be necessary, especially if we want to standardize base types for typing multi-dimensional arrays in the typing module.
Indeed, this is probably the best place where the broader typing community can help. |
I've opened a sub-issue for discussing syntax for array typing: #516 |
Some update on the issue: Our (mypy core team) previous schedule for working on this was Q4 2018. However, we decided that some type system features needed to efficiently support NumPy (such as literal types and variadic generics) will also be useful in general, so we decided to implement general support for those features first. Literal types are almost there, and variadic generics are going to be added in the coming months. After that we will start working on dedicated NumPy support (around Q2); sorry for the delay. |
Sorry, I forgot to post notes from the latest Python typing meetup on numeric stack typing here. Here they are |
Are you specifically looking at numpy, or at the machine learning ecosystem with numpy/pytorch/...?
Pytorch doesn't seem to auto-cast when types are different, whereas NumPy does some upcasting (see https://stackoverflow.com/questions/56022497/numpy-pytorch-dtype-conversion-compatibility/56022918?noredirect=1#comment98689941_56022918) |
At all of them. Dimensionality/shape will be an additional abstraction orthogonal to container type and element type. |
Sorry I wasn't clear, I wanted to ask about the numerical stack part specifically. Do we have a current target among numpy / pytorch / tensorflow that would focus most of the effort, or are people looking at their favorite flavor (which seem incompatible with each other)? |
There are two separate big things required to support numerical libraries:
In the first one we ideally want to be as broad as possible; I think there are no particular "preferences". In the second, I think we should probably start with numpy, since it is the common denominator for many other libraries. |
@ilevkivskyi do you have any suggestions for how to track progress on (or, even better, contribute to) the development of these "numeric stack typing" features? Full support for the features described in your linked notes on numeric stack typing would be incredibly useful! |
@dmontagu The best way is to just follow this issue, also you can subscribe to |
Hey! I'm a student working on a thesis and I am very interested in contributing to this project as part of my research! Mainly, I want to statically check dimensionality alignment in numpy operations. Let me know how I can help out. |
@theodoretliu Hi! It is great to hear you are interested. Just to get a bit more info, how much time will you be able to spend on this? The best course of action is probably to implement support for the relevant type system features in one of the mainstream Python type checkers. I would of course propose mypy :-) as one of its maintainers, see https://github.com/python/mypy. If this sounds right to you, I can give you a more detailed plan and some guidance. |
I'd be willing to dedicate pretty significant time in the coming months. And yes, that sounds like a great course of action! |
Be sure to talk to Mark Mandoza to have input from our experience doing so
in Pyre :D
|
Mark Mendoza* ... my fingers are a bit dumb today, sorry.
|
A group of us at DeepMind are interested in working on this too. We've set up a mailing list at https://groups.google.com/g/python-shape-checkers to try and bring together all the conversations about this into one place. I've posted a summary there of what seems to be the current state of things, but stay tuned for updates! |
Hi @mrahtz, Thanks for the initiative! Indeed there are currently a lot of ongoing efforts in this direction. At Facebook we are currently working directly on this, and already support several use cases with Pyre, with support for variadic syntax, which has been polished with respect to the initial proposal at the Python Typing Summit. However, it would be very beneficial to get first-hand information on the state of each team that is working on this, since so far I have read about people working on it at Dropbox, Facebook, Google and now DeepMind. Also, please don't miss the Python Typing mailing list. |
You wanted this annotation:

    class float64:  # Custom annotation class
        def __getitem__(self, item):
            # Some value should be set to identify that float64[:], float64[:,:] or etc.
            return self

    float64 = float64()

    def for_loop(n: float64[:,:]):
        pass

Take it ;) |
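One way to fill in the placeholder comment in the snippet above is to have `__getitem__` return a new object that remembers how many slices were passed. This is only a minimal sketch: `DType` is a hypothetical helper class (not part of NumPy or `typing`), and it records the information at runtime only, so a static type checker gets nothing from it.

```python
class DType:
    """Hypothetical annotation helper: a dtype name plus a number of dimensions."""

    def __init__(self, name, ndim=None):
        self.name = name
        self.ndim = ndim  # None means "dimensionality not specified"

    def __getitem__(self, item):
        # float64[:] passes a single slice; float64[:, :] passes a tuple of slices.
        items = item if isinstance(item, tuple) else (item,)
        if not all(isinstance(i, slice) for i in items):
            raise TypeError("only ':' placeholders are supported in this sketch")
        return DType(self.name, ndim=len(items))

    def __repr__(self):
        if self.ndim is None:
            return self.name
        return f"{self.name}[{', '.join(':' * self.ndim)}]"


float64 = DType("float64")


def for_loop(n: float64[:, :]):
    pass


# The annotation object now carries the number of dimensions.
annotation = for_loop.__annotations__["n"]
print(annotation.ndim)
```

Returning a fresh object instead of `self` keeps `float64[:]` and `float64[:, :]` distinguishable, which the original snippet leaves as a to-do.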
Using "Annotated[]" would be an efficient way to declare the type here. However, to get proper static type checking on "Annotated[]" we need support in mypy/pyanalyze etc. To annotate and infer types with arithmetic from function calls like "np.reshape", we need to write code that defines custom rules (not just PEP 484) for working out the proper types. I suspect there is little support for custom "Annotated[]" types today, and it is not easy for users to define and statically check their own "Annotated[]" types, which is probably the solution for all kinds of dynamic types in Python, enabling symbolic execution of arbitrary Python code. |
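To make the "Annotated[]" idea concrete, here is a rough sketch of attaching shape metadata to a parameter and reading it back. `Shape` and `check_shapes` are hypothetical names invented for illustration, and the check happens at runtime on plain tuples; a static checker would ignore the metadata unless it had dedicated support for it.

```python
from typing import Annotated, get_type_hints


class Shape:
    """Hypothetical metadata marker carrying an expected shape; None is a wildcard."""

    def __init__(self, *dims):
        self.dims = dims


def check_shapes(func, **shapes):
    """Compare actual shapes (plain tuples) against Shape metadata on func's parameters."""
    hints = get_type_hints(func, include_extras=True)
    for name, actual in shapes.items():
        # Annotated[...] hints expose their extra arguments as __metadata__.
        metadata = getattr(hints.get(name), "__metadata__", ())
        for meta in metadata:
            if not isinstance(meta, Shape):
                continue
            if len(meta.dims) != len(actual):
                return False  # wrong number of dimensions
            if any(e is not None and e != a for e, a in zip(meta.dims, actual)):
                return False  # a fixed dimension size does not match
    return True


def matvec(a: Annotated[list, Shape(3, None)], x: Annotated[list, Shape(None)]):
    pass


print(check_shapes(matvec, a=(3, 4), x=(4,)))  # shapes agree with the annotations
print(check_shapes(matvec, a=(2, 4), x=(4,)))  # first dimension of `a` should be 3
```

Rules for operations such as `np.reshape`, where the output shape is computed from the arguments, would need extra logic on top of this, which is the part the comment above says goes beyond PEP 484.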
I'd like to open a discussion about typing for multi-dimensional arrays in general, and more specifically for NumPy. We have already been discussing this over in the NumPy issue tracker (numpy/numpy#7370) and recently opened a new repository to start writing type stubs (https://github.com/numpy/numpy_stubs).
To help guide discussion, I wrote a document outlining ideas for array shape typing.
To summarize:
- [...] float64) and shapes (e.g., a 3x4 array) for multi-dimensional arrays.
- [...] (N, M) to shape (N,) for arbitrary integers N and M. These dimension variables look very similar to TypeVar, if TypeVar supported integers as types.
- [...] (..., N) for an array with a last dimension of length N and any number of preceding dimensions. There are particular rules (broadcasting) that should be enforced for matching multiple arguments with variable numbers of dimensions.

This will likely require some new typing features (as well as type-checker support). Notably:
- [...] array.sum(axis=0).
- [...] NDArray[N] and NDArray[N, M].
- [...] DimensionVar as described in my doc).
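The "(N, M) to (N,)" idea can be approximated with features that already exist, using ordinary TypeVars instantiated with Literal integer types. This is only a sketch under that assumption: `Vector`, `Matrix` and `sum_last_axis` are hypothetical names, not the `NDArray` or `DimensionVar` proposed in the linked document.

```python
from typing import Generic, Literal, TypeVar, get_args

# Stand-ins for the proposed dimension variables: ordinary TypeVars,
# instantiated below with Literal integer types.
N = TypeVar("N")
M = TypeVar("M")


class Vector(Generic[N]):
    """Hypothetical 1-D array type parameterized by its length."""


class Matrix(Generic[N, M]):
    """Hypothetical 2-D array type parameterized by its two dimension sizes."""


def sum_last_axis(a: Matrix[N, M]) -> Vector[N]:
    """Maps shape (N, M) to shape (N,); only the signature matters in this sketch."""
    return Vector()


m: Matrix[Literal[3], Literal[4]] = Matrix()
v: Vector[Literal[3]] = sum_last_axis(m)

# The shape parameters are still visible at runtime as type arguments.
print(get_args(Matrix[Literal[3], Literal[4]]))
```

This approach needs a separate class for every number of dimensions and cannot express arithmetic on sizes, which is where variadic generics and integer-aware type variables would come in.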