Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LLVM updates & green commit tracking #1624

Open
sstamenova opened this issue Aug 22, 2022 · 14 comments
Open

LLVM updates & green commit tracking #1624

sstamenova opened this issue Aug 22, 2022 · 14 comments

Comments

@sstamenova
Copy link
Collaborator

There's an ongoing experiment to keep mhlo, torch-mlir and onnx-mlir in sync with respect to their llvm commit. These projects historically have different update cadences and rarely converge on a single commit. The current effort is to make sure that they converge at least sometimes with mhlo and torch-mlir having weekly updates and onnx-mlir having roughly monthly updates (though more frequent is possible for each of the projects).

The commit tracking is happening here: llvm/torch-mlir#1178 and the "green" commit is guaranteed to have a green build and lit tests for llvm, lld, lldb and mlir on Ubuntu and Windows, though we can expand the testing to including more platforms if necessary.

The latest update to the llvm commit in onnx-mlir put it in sync with torch-mlir, further updates should attempt to do the same and it would be beneficial for everyone if anyone working on an update tries to pick up a green commit from the experiment and comments here to let others know that an update is in progress.

@sstamenova sstamenova pinned this issue Aug 22, 2022
@sstamenova
Copy link
Collaborator Author

@AlexandreEichenberger

@tungld
Copy link
Collaborator

tungld commented Aug 23, 2022

PR #1628 updates LLVM with this green commits:

Week of 8/22/2022:
Green LLVM commit: e5d5146323ffaa13eb5185616c6ae5c36b69352d
Green MHLO commit: ace4030dd55fce2a74e46f71f1937feb61ed1e3f (branch greencommit/2022-08-22-e5d51463)

@AlexandreEichenberger
Copy link
Collaborator

PR #1628 merged

@imaihal
Copy link
Collaborator

imaihal commented Aug 24, 2022

@AlexandreEichenberger @tungld

It seems llvm/llvm-project@e5d5146 fails some installation tests (cmake --build . --target check-mlir) on Linux s390x. This seems to be fixed by llvm/llvm-project@df4e637 I confirmed that on buildbot and my own environment. The error is the same with this build bot results

These errors are not critical for onnx-mlir, but it might be better to update the commit again in some timing.

@tungld
Copy link
Collaborator

tungld commented Aug 24, 2022

This seems to be fixed by llvm/llvm-project@df4e637

Does this really fix the issue on s390x? I saw it just marked them as UNSUPPORTED and the comment says: // Bytecode currently does not support big-endian platforms

@ljfitz
Copy link
Contributor

ljfitz commented Dec 6, 2022

We're also finding the green commit approach useful and wonder if therre has been discussion on a rotation process between onnx-mlir fans to take turns updating onnx-mlir to green llvm-project commits? If so, we'd like to contribute.

We're in the process of preparing an update PR for the green commit from Nov. 14, and realize we may be unknowingly duplicating effort with others.

CC recent update PR authors: @chenchongsong @MikeHolman @tungld

@tungld
Copy link
Collaborator

tungld commented Dec 6, 2022

We're in the process of preparing an update PR for the green commit from Nov. 14, and realize we may be unknowingly duplicating effort with others.

There is no duplicating effort from our side as far as I know. Thanks!

@ljfitz
Copy link
Contributor

ljfitz commented Dec 8, 2022

I see contributors to the torch-mlir project have organized a rotation scheme for managing updates to green commits: https://github.com/llvm/torch-mlir/wiki/Weekly-LLVM-Update, as a result of discussion to see if there was enough support.

I see the topic of updating was discussed in https://github.com/onnx/onnx-mlir/wiki/Informal-meeting-agenda-and-notes#september-13rd-2022 and wonder if there was any support for adopting a more predictable approach?

@sstamenova
Copy link
Collaborator Author

I think that's a great idea. Considering how complex some of the updates get, I'd recommend every couple of weeks as a good cadence. We can probably have 1 or 2 people from our side participate in the rotation as well. @AlexandreEichenberger what do you think?

@sstamenova
Copy link
Collaborator Author

@AlexandreEichenberger : Can we add this to the agenda for the community meeting in the next couple of weeks if it hasn't been discussed yet?

@AlexandreEichenberger
Copy link
Collaborator

@christopherbate @gargaroff

We are trying to get two folks from each actively participating company to help out with the LLVM update. The process has been pretty smooth the last couple of iterations, largely because by doing it regularly, it reduces the amount of work to do for each iteration.

Would you consider another volunteer from NVIDIA and AMD? That would be greatly appreciated.

Big thanks for helping out.

You can select your preferred time here: https://github.com/onnx/onnx-mlir/wiki/LLVM-Update-Schedule

@gargaroff
Copy link
Collaborator

@AlexandreEichenberger I could do another bump in the week of 2023-05-01.

@sorenlassen
Copy link
Member

@chenchongsong @tungld I saw in PR #2252 you bumped the LLVM commit to a "non standard" commit

I think we need to be cautious about that because it brings us out of sync with torch-mlir, which I understood to be a key motivation for our LLVM updates process

hopefully it doesn't create any problems but in the future please post here in the master issue to build consensus before we diverge from the commits from llvm/torch-mlir#1178

@hamptonm1
Copy link
Collaborator

@chenchongsong @tungld I saw in PR #2252 you bumped the LLVM commit to a "non standard" commit

I think we need to be cautious about that because it brings us out of sync with torch-mlir, which I understood to be a key motivation for our LLVM updates process

hopefully it doesn't create any problems but in the future please post here in the master issue to build consensus before we diverge from the commits from llvm/torch-mlir#1178

@chenchongsong We should only be using the green commits mentioned by @ashay to keep things in sync. I actually think the recent issues @cjvolzka and @negiyas are seeing maybe related to this. I am going to revert the LLVM commit to d421f5226048e4a5d88aab157d0f4d434c43f208 to see if this solves the issue. The below issue just started to occur with the recent LLVM bump:

$ ./build/Debug/bin/onnx-mlir -mcpu=z14 --EmitONNXIR  bidaf-9-nocategorymaper-onnxbasic.mlir
onnx-mlir: /home1/negishi/src/dlc.git/llvm-project/llvm/include/llvm/Support/Casting.h:566: decltype(auto) llvm::cast(const From&) [with To = mlir::RankedTensorType; From = mlir::Type]: Assertion `isa<To>(Val) && "cast<Ty>() argument of incompatible type!"' failed.
./compile-bidaf9.sh: line 76: 642957 Aborted                 (core dumped) /home1/negishi/src/dlc.git/onnx-mlir.main/build/Debug/bin/onnx-mlir -mcpu=z14 --EmitONNXIR bidaf-9-nocategorymaper-onnxbasic.mlir

@gongsu832 gongsu832 unpinned this issue Jan 11, 2024
@hamptonm1 hamptonm1 pinned this issue Jan 11, 2024
@hamptonm1 hamptonm1 unpinned this issue Jan 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants